Docker Caching - do I restore my cache when I pull an image?

Hi all,

This may be a naive question, but…

When I pull an image from the docker registry, do I also get the benefits of any docker caching (i.e. intermediary layers) when I try to re-build that same image (locally)?

e.g.

Given the Dockerfile

FROM alpine
COPY . .
RUN ls -la

Suppose it’s built remotely and published to a registry. If I then run docker pull deleteme:latest on my local machine and rebuild with docker build . --tag deleteme, will the build have access to the existing cache (I assume the cache entries are docker images?), assuming my local directory (checksum?) is identical to the one it was originally built from?

i.e.

$ docker build . --tag deleteme
Sending build context to Docker daemon
Step 1/3 : FROM alpine
 ---> cdf98d1859c1
Step 2/3 : COPY . .
 ---> Using cache
 ---> 45541c2f74df
Step 3/3 : RUN ls -la
 ---> Using cache
 ---> eda2a2c93544
Successfully built eda2a2c93544
Successfully tagged deleteme:latest
$ docker images
REPOSITORY   TAG        IMAGE ID        CREATED
deleteme     latest     eda2a2c93544    About a minute ago
<none>       <none>     d6f5782e1c09    2 minutes ago        <-- is cache?/shared?

(Obviously d6f5782e1c09 doesn’t match either 45541c2f74df or eda2a2c93544)

--cache-from seems interesting, but I’m not sure whether it works with an image pulled from a remote registry.

Follow-up question: If not, then where is the cache stored, and can it be shared somehow?

Kind regards,
Nick


Hello Nick, were you able to resolve this? I’m having the same question.


Yes, but Docker caches layers, not images. When you see:

Step 2/3 : COPY . .
 ---> Using cache
 ---> 45541c2f74df

It is using the layer 45541c2f74df from its cache. As long as nothing changes in the folder for the COPY command, the cache will be used. If the contents do change, this will invalidate the cache for that line and every line after that in the Dockerfile. So once you get a cache miss, the Dockerfile is rebuilt from that point forward. This is why it is recommended that you place things that change often (like source code) at the bottom of your Dockerfile and things that change less often (like installing dependencies) near the top.
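
To make the ordering advice concrete, here is a minimal sketch that writes an example Dockerfile into a scratch directory. The curl package and the /app path are placeholders I picked for illustration; they are not from the original question.

```shell
# Write an illustrative Dockerfile to a scratch directory (hypothetical layout).
workdir="$(mktemp -d)"
cat > "$workdir/Dockerfile" <<'EOF'
FROM alpine
# Changes rarely: install dependencies first so this layer stays cached.
RUN apk add --no-cache curl
# Changes often: copy the source last so edits only invalidate from here down.
COPY . /app
RUN ls -la /app
EOF
cat "$workdir/Dockerfile"
# To build it (requires a running Docker daemon):
#   docker build "$workdir" --tag deleteme
```

With this order, editing a source file should only rerun the COPY step and what follows it, while the dependency layer keeps coming from the cache.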

The images with the tag <none> <none> are called “dangling” images. They could contain cached layers and probably do if you are rebuilding an image. You can remove these images with the command:

docker image prune

Any dangling images that are not being used will be deleted.
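
If you would rather look before deleting, a small sketch like the following may help. The two function names are made up for this post; the docker subcommands and flags are the standard ones.

```shell
# List dangling images (the <none> <none> entries) without deleting anything.
list_dangling() {
  docker images --filter dangling=true
}

# Remove dangling images without the interactive confirmation prompt.
prune_dangling() {
  docker image prune --force
}
```

Run list_dangling first, check the output, and only then run prune_dangling.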

If you want to see the layers in an image, use the command:

docker inspect <image_name>

Near the end of the inspection output you will see a node called RootFS with the sha256 digests of all of the layers contained within the image. If you build a new image and its Dockerfile starts with the same lines, the layers from the existing image will be reused until the contents being added to the image are different.
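
As a rough sketch of how to check layer sharing yourself, the helpers below print only the RootFS layer digests of an image and then the digests two images have in common. The function names are hypothetical. As far as I understand, the short IDs shown by docker pull are prefixes of the compressed layer digests in the registry, RootFS lists digests of the uncompressed layers, and the IDs printed by docker build are image IDs, so do not expect those three to match each other.

```shell
# Print the RootFS layer digests of an image, one per line.
layers() {
  docker inspect --format '{{range .RootFS.Layers}}{{println .}}{{end}}' "$1"
}

# Print the layer digests that two images have in common.
shared_layers() {
  tmp="$(mktemp)"
  layers "$1" > "$tmp"
  # -F: fixed strings, -x: whole-line match, -f: read patterns from the file
  layers "$2" | grep -Fx -f "$tmp"
  rm -f "$tmp"
}
```

For example, shared_layers alpine redis:alpine should print the Alpine base layer, provided both images are built on the same Alpine release.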

I start all of my Dockerfiles with pretty much the same commands and only the app that gets copied is different, so most of the time all of those layers come from the cache of my other images. Docker takes care of all of this caching for you.

If you want to test this out for your original question about pulls, try:

docker pull alpine

followed by

docker pull redis:alpine

The second pull will say:

alpine: Pulling from library/redis
e6b0cf9c0882: Already exists <-- This is the layer from the Alpine image 
7c5ff11edca6: Pull complete 
14fa80ee9473: Pull complete 
4d4f6840431a: Pull complete 
9d4162ad1104: Pull complete 
b2c320096d0f: Pull complete 
Digest: sha256:a4e0b7bff7ecec0dc0be95d185d6c99323a92a51065d9563a5bafbc1cf6b3497
Status: Downloaded newer image for redis:alpine
docker.io/library/redis:alpine

Notice that e6b0cf9c0882 already exists. That is the layer from the Alpine image that you already pulled. So Docker is smart enough to only pull down layers from Docker Hub that you don’t already have and it will reuse them when building your own Docker images.
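
For the build steps after FROM, here is a hedged sketch of the --cache-from route mentioned in the question. The helper name is made up, and deleteme:latest is just the image name from the original post. My understanding is that the classic builder only treats a pulled image as build cache when it is named in --cache-from, and that BuildKit additionally needs the image to have been built with inline cache metadata (the BUILDKIT_INLINE_CACHE build argument), so treat this as something to try rather than a guarantee.

```shell
# Hypothetical helper: pull a published image and offer it to the builder as a cache source.
# $1 is the image reference (for example deleteme:latest), $2 is the build context directory.
build_from_pulled_cache() {
  image="$1"
  context="${2:-.}"
  # Fetch the published image; carry on if the pull fails (e.g. nothing published yet).
  docker pull "$image" || true
  # Ask the builder to consider the pulled image's layers as cache candidates.
  # BUILDKIT_INLINE_CACHE=1 asks BuildKit to embed cache metadata in the new image.
  docker build \
    --cache-from "$image" \
    --build-arg BUILDKIT_INLINE_CACHE=1 \
    --tag "$image" \
    "$context"
}
```

Calling build_from_pulled_cache deleteme:latest . from the project directory should show cached steps when the COPY inputs match what was published. The classic builder may warn that the build argument was not consumed, which is harmless.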

Hope that helps,

~jr

So silly that Docker makes this confusing. Your response only pulls and doesn’t build. And the layer id in the pulls has a different number of characters than the layer id in the build. Just silly. We need this to be much clearer.