Why does a RUN command that deletes files inflate the image size?

Below is my Dockerfile (the last RUN statement deletes files from the TEMP directory). If anything, it should reduce the image size, but instead it increases it by 38 MB. What gives?

# escape=`
FROM microsoft/windowsservercore
SHELL ["powershell", "-Command", "$ErrorActionPreference = 'Stop'; $ProgressPreference = 'Continue'; $verbosePreference='Continue';"]
ADD http://go.microsoft.com/fwlink/?LinkId=829373 .\iisadmin.exe
RUN remove-item $env:temp\* -Recurse -ErrorAction Ignore

Output of docker history

IMAGE                                                                     CREATED             CREATED BY                                                                                                                                                                                                                                                    SIZE                COMMENT
sha256:e9ab8109c03677ec0175815fddd451456621c14aea1d5e15fb42c5a03b13cbd5   3 minutes ago       powershell -Command $ErrorActionPreference = 'Stop'; $ProgressPreference = 'Continue'; $verbosePreference='Continue'; remove-item $env:temp\* -Recurse -ErrorAction Ignore                                                                                    38 MB               
sha256:5f176a85b975fea1ea4e48ba59db9ece566f6396eb883346e31182e38a84bcef   7 minutes ago       powershell -Command $ErrorActionPreference = 'Stop'; $ProgressPreference = 'Continue'; $verbosePreference='Continue'; #(nop) ADD tarsum.v1+sha256:46f032ba7011d11e506afa9b9583a8ac07f800ebba6cb2b361b7631a454b455e in .\iisadmin.exe                          3.99 MB             
sha256:effc02de0d5e55627183a75e95a0427afa447d6fa92340dbdb707e41e6bdf8c3   7 minutes ago       powershell -Command $ErrorActionPreference = 'Stop'; $ProgressPreference = 'Continue'; $verbosePreference='Continue'; #(nop)  SHELL [powershell -Command $ErrorActionPreference = 'Stop'; $ProgressPreference = 'Continue'; $verbosePreference='Continue';]   41 kB               
sha256:015cd665fbddf0784e31e24ab98acc1b0c81852f26a05944060425db86ed9e49   3 days ago          Install update 10.0.14393.1358                                                                                                                                                                                                                                2.51 GB             
<missing>                                                                 6 months ago        Apply image 10.0.14393.0                                                                                                                                                                                                                                      7.68 GB             

Whenever you have a RUN command in your Dockerfile, Docker creates a new layer in your image. Essentially, an image is a stack of diffs, where each layer records what changed relative to the layer below it. So even if you delete files at the end, they are still contained in a previous layer that Docker needs in order to build up the current layer.

So if I have an ADD statement that adds something to the image, and I then need to execute that EXE for build purposes inside the container, it is impossible to shrink the image at that point, since ADD and RUN are two different layers. Does that mean whatever I add will always be part of the image, even though the file is no longer there?

Yes.
But if you have a relatively new version of Docker, you can build with docker build --squash (currently an experimental daemon feature). This merges the newly built layers into one, so deleted files no longer end up in the final layer.

Squash did remove the intermediate layers, but the last layer still remains (the one produced by the Remove-Item command). I still don’t get why Remove-Item adds to the image. The sizes shown are the difference between the previous layer and the current one, not cumulative, so how exactly can removing things from the image increase the difference in size between two layers?

IMAGE                                                                     CREATED             CREATED BY                                                                                                                                                                                                                                                    SIZE                COMMENT
sha256:fcf216ef5a7f0c56d8b024f1a3136e0aff3dba211f4d6da824425f3601ead6c1   34 seconds ago                                                                                                                                                                                                                                                                    38 MB               merge sha256:e9ab8109c03677ec0175815fddd451456621c14aea1d5e15fb42c5a03b13cbd5 to sha256:015cd665fbddf0784e31e24ab98acc1b0c81852f26a05944060425db86ed9e49
<missing>                                                                 9 hours ago         powershell -Command $ErrorActionPreference = 'Stop'; $ProgressPreference = 'Continue'; $verbosePreference='Continue'; remove-item $env:temp\* -Recurse -ErrorAction Ignore                                                                                    0 B
<missing>                                                                 9 hours ago         powershell -Command $ErrorActionPreference = 'Stop'; $ProgressPreference = 'Continue'; $verbosePreference='Continue'; #(nop) ADD tarsum.v1+sha256:46f032ba7011d11e506afa9b9583a8ac07f800ebba6cb2b361b7631a454b455e in .\iisadmin.exe                          0 B
<missing>                                                                 9 hours ago         powershell -Command $ErrorActionPreference = 'Stop'; $ProgressPreference = 'Continue'; $verbosePreference='Continue'; #(nop)  SHELL [powershell -Command $ErrorActionPreference = 'Stop'; $ProgressPreference = 'Continue'; $verbosePreference='Continue';]   0 B
<missing>                                                                 3 days ago          Install update 10.0.14393.1358                                                                                                                                                                                                                                2.51 GB
<missing>                                                                 6 months ago        Apply image 10.0.14393.0

Deleting files that were added in a previous layer does NOT reduce the image size. The new layer simply no longer shows those files. Any file that is modified is also copied in full into the new layer.

Having said that, remember that EVERY instruction in a Dockerfile (RUN, ADD, ENV, etc.) creates a separate layer. For this reason, the community has adopted the practice of bundling as many Linux command lines as possible into a single RUN instruction.
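Applied to the Dockerfile from the question, that single-RUN practice might look roughly like the sketch below. It is only a sketch under assumptions: it swaps ADD for a download with Invoke-WebRequest inside the RUN, and it assumes server_config.ps1 (the script name that shows up in the later docker history output) sits next to the Dockerfile and is what actually uses the EXE. The point is that the installer is fetched, used, and deleted before the layer is committed, so it is never stored in any layer. The RUN layer itself may still carry some overhead, as the sizes in this thread suggest.

```dockerfile
# escape=`
FROM microsoft/windowsservercore
SHELL ["powershell", "-Command", "$ErrorActionPreference = 'Stop'; $ProgressPreference = 'SilentlyContinue';"]

# Assumption: server_config.ps1 lives in the build context and uses iisadmin.exe.
COPY server_config.ps1 .\server_config.ps1

# Download, use, and delete everything inside ONE RUN instruction, so the
# layer is committed only after the temporary files are already gone.
RUN Invoke-WebRequest -Uri 'http://go.microsoft.com/fwlink/?LinkId=829373' -OutFile .\iisadmin.exe -UseBasicParsing; `
    .\server_config.ps1; `
    Remove-Item .\iisadmin.exe -Force; `
    Remove-Item .\server_config.ps1 -Force; `
    Remove-Item $env:temp\* -Recurse -ErrorAction Ignore
```

The small script still lives in the COPY layer, but the multi-megabyte EXE never does.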

It’s also worth mentioning that even simple Linux commands, like touch, chown or chmod, can cause files and directories to be replicated into the next layer. An innocent “chown -R” on a directory with hundreds of MB will cause all of its contents to be duplicated. The files are considered to have changed, so they are copied to the next layer.
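For Linux images, one way to sidestep the “chown -R” duplication is to set ownership at the moment the files are copied, using the --chown flag of COPY (Linux containers only, Docker 17.09 or later). A minimal sketch, where the alpine base image, the app user, and the ./site directory are all hypothetical:

```dockerfile
FROM alpine:3.19

# Hypothetical unprivileged user; busybox adduser also creates a group named "app".
RUN adduser -D app

# Ownership is applied while the files are written into this layer, so no
# later "RUN chown -R" is needed to rewrite (and duplicate) them.
COPY --chown=app:app ./site /srv/site
```

If the files are produced by a RUN step instead, doing the chown inside that same RUN has the same effect: the files are stored only once, with their final ownership.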

This is a major concern when building images. I’m glad the “--squash” parameter is on its way and look forward to using it.

I have lost all understanding of how the image size is calculated in the end. I understand the concept of layers and of recording changes between layers. I also understand that a new layer, even one that only deletes files, will not show a negative size. But how exactly can a layer created by a command that deletes files from the previous layer have a positive size delta? See the example below. It’s two layers: one starts with 9f7, and the layer on top of it is 11f. Layer 9f7 is 280 MB in size and contains a bunch of files in the workdir and temp folders. Layer 11f is created on top of layer 9f7 and deletes all content inside those folders. I could understand it if you said the total size of the image should not change and layer 11f should be 0 bytes in size. But why is it 63.1 MB?

sha256:11f02856c1019237f1daac1144ba630dc5efb5f72d4ec8c3b0b1e9ed12430f62   3 minutes ago       powershell -Command $ErrorActionPreference = 'Stop'; $ProgressPreference = 'Continue'; $verbosePreference='Continue'; del .\*; del $env:temp\*                                               63.1MB
sha256:9f7953ab69e972d2542ce5c473a764992557bd6393cfffd71f6b0356155a1e40   3 minutes ago       powershell -Command $ErrorActionPreference = 'Stop'; $ProgressPreference = 'Continue'; $verbosePreference='Continue'; .\server_config.ps1; del .\server_config.ps1; del .\iisadmin.exe;   280MB

The 63.1MB is probably some combination of:

  • whiteout files and directories that mark what’s deleted
  • assorted log files and other metadata generated while running the delete commands