Why does docker push base image layers that are already on docker hub?

I’ve created an image based on an image that is already published on Docker Hub:

estebanmatias92/hhvm 3.7.0-fastcgi 8f2e7309ce69

This image has the following history:

$ docker history estebanmatias92/hhvm:3.7.0-fastcgi
IMAGE               CREATED             CREATED BY                                      SIZE                COMMENT
8f2e7309ce69        3 weeks ago         /bin/sh -c #(nop) CMD ["hhvm" "--mode" "serve   0 B                 
2116df048ec4        3 weeks ago         /bin/sh -c #(nop) EXPOSE 9000/tcp               0 B                 
1b07751941c9        3 weeks ago         /bin/sh -c mkdir /var/run/hhvm                  0 B                 
2e417278871a        3 weeks ago         /bin/sh -c #(nop) COPY file:15df6ddfe60b3f2c5   500 B               
5a0fdf66ef9a        3 weeks ago         /bin/sh -c #(nop) COPY file:7027bf3a1a6376a39   1.925 kB            
bcc40f95c372        3 weeks ago         /bin/sh -c /usr/bin/update-alternatives --ins   5.879 kB            
ef0eb6a48f67        3 weeks ago         /bin/sh -c set -x     && git clone git://gith   105.3 MB            
1811eb2b8921        3 weeks ago         /bin/sh -c #(nop) ENV HHVM_VERSION=HHVM-3.7.0   0 B                 
5a5c97cbfda7        3 weeks ago         /bin/sh -c mkdir $PHP_INI_DIR                   0 B                 
5df616860635        3 weeks ago         /bin/sh -c #(nop) ENV PHP_INI_DIR=/etc/hhvm/    0 B                 
3915977dff9c        3 weeks ago         /bin/sh -c apt-get update && apt-get install    24.97 MB            
ec59dd9636e9        3 weeks ago         /bin/sh -c apt-get update && apt-get install    791.9 MB            
0e3c9f825252        3 weeks ago         /bin/sh -c #(nop) MAINTAINER "Matias Esteban"   0 B                 
bf84c1d84a8f        3 weeks ago         /bin/sh -c #(nop) CMD ["/bin/bash"]             0 B                 
64e5325c0d9d        3 weeks ago         /bin/sh -c #(nop) ADD file:085531d120d9b9b091   125.2 MB

My image uses FROM estebanmatias92/hhvm:3.7.0-fastcgi. I’ll leave out the details, as they don’t matter for this case. After I’ve built my image and try to push it to Docker Hub, it still pushes some of the base image’s layers:

ec59dd9636e9: Pushing [=============================================>     ] 249.5 MB/273.4 MB

As you see from the history above, this layer is already on docker hub. Why is it pushed again?

The reason for this is that images can be private.

So: imagine you eavesdrop on someone and obtain the layer IDs of their private content. If you could push an image to the Hub that points to these layer IDs, you could then gain access to the layers themselves.
So, in this case, the hub requires you to upload the actual content, in order to prove that you do have access to it.
Next time you push a different tag of the same image though, the shared layers will not have to be pushed.

There are certainly optimizations that can (and will) be worked on in the future to not push, for example, layers that are known to be from public images, but basically the answer to your question is: the first time you push an image under your name, you have to prove you have the layers you want to link to.
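
To make the rule concrete, here is a toy model in Python. It is only a sketch of the behavior described above, not how the Hub is implemented: the ToyRegistry class, its methods, and the repository name "user/foo" are all made up for illustration. The assumption it encodes is that content is stored once, but every repository keeps its own list of layers it has proven it owns.

```python
import hashlib


class ToyRegistry:
    """Hypothetical model: shared blob storage, per-repository layer links."""

    def __init__(self):
        self.blobs = {}  # digest -> content, stored once for all repositories
        self.links = {}  # repository name -> set of digests it may reference

    def has_layer(self, repo, digest):
        # A layer only counts as present if THIS repository has pushed it.
        return digest in self.links.get(repo, set())

    def push_layer(self, repo, content):
        """Push a layer; return (digest, uploaded) where uploaded says if bytes were sent."""
        digest = "sha256:" + hashlib.sha256(content).hexdigest()
        uploaded = not self.has_layer(repo, digest)
        if uploaded:
            # Uploading the content is the proof that the pusher really has it.
            self.blobs[digest] = content
            self.links.setdefault(repo, set()).add(digest)
        return digest, uploaded


registry = ToyRegistry()
base_layer = b"contents of a base image layer"

_, first = registry.push_layer("estebanmatias92/hhvm", base_layer)
_, second = registry.push_layer("user/foo", base_layer)  # new repository: sent again
_, third = registry.push_layer("user/foo", base_layer)   # same repository: skipped

print(first, second, third)  # True True False
```

In this model a layer ID alone never grants access, which is the point of the rule: only the act of uploading creates the link for a repository.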

Hope that helps.

Ok, thanks, that makes sense.

This makes me wonder what happens during download: If I base my image on another public image, will users of my image then have to download my copy (if there is any…) of the layers of the public base image again, even if they already have them?

During download you should never have to download a resource that you already have (e.g. if you have a layer with digest X, it is usable for any image that links to it without having to download it again).
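
As a rough illustration of that, the client only needs to compare the digests an image lists against what is already on disk. This is a minimal sketch with a made-up function name and made-up digests, not the actual Docker client logic:

```python
def layers_to_download(manifest_digests, local_digests):
    """Return the digests listed by the image that are not stored locally, in image order."""
    local = set(local_digests)
    return [digest for digest in manifest_digests if digest not in local]


# Hypothetical digests: the two base image layers are already on disk.
local_digests = ["sha256:base1", "sha256:base2"]
my_image = ["sha256:base1", "sha256:base2", "sha256:app1"]

print(layers_to_download(my_image, local_digests))  # ['sha256:app1']
```

Which repository a layer was originally pulled from plays no role here; only the digest is compared.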

When are these optimizations going to be implemented?
On some cloud continuous integration systems you end up pushing all the layers again, over and over.
Sometimes the layer that is being built is 10 MB and the base image is 600 MB…
That is a useless waste of bandwidth and time.

There is no reason why you would “push again over and over” all layers.
Once you push the base layers to a given image (first time pushing to “user/foo”), you never have to push them again.

There are still some problems with pushing private images to Docker Hub. I have loads of tags, and I’m only changing the top layer, which is 5 MB at most. But every push still uploads loads of layers.
Let me exemplify:

  • First pull image from Docker Hub
    docker pull kuekuune/heatmap-client:0.27.1.51
    0.27.1.51: Pulling from kuekuune/heatmap-client
    fdd5d7827f33: Already exists
    a3ed95caeb02: Already exists
    716f7a5f3082: Already exists
    7b10f03a0309: Already exists
    d8f1cf58f924: Already exists
    9df68f97f936: Already exists
    758357287b8a: Already exists
    60162a70eb35: Already exists
    3b652fca4381: Already exists
    a49d0a410c40: Already exists
    9124416e8857: Already exists
    ab273d36c0c2: Already exists
    18b678a0c607: Pull complete
    Digest: sha256:00c1442a3d190686ff2c70776d176bd2afc939861a825e09014d5a8ede22783c
    Status: Downloaded newer image for kuekuune/heatmap-client:0.27.1.51
  • Then push it right back to DockerHub
    docker push kuekuune/heatmap-client:0.27.1.51
    The push refers to a repository [docker.io/kuekuune/heatmap-client]
    387403934352: Layer already exists
    f2790b5754a4: Layer already exists
    1c78f1350876: Layer already exists
    aaedb5e569af: Pushed
    14bfeef71cab: Pushed
    796e2ee71356: Pushed
    7cbd3e76c6a5: Pushing [===========> ] 52.67 MB/223.9 MB
    288d77512d07: Pushed
    629a9320508f: Pushing [=======================================> ] 31.86 MB/39.98 MB
    5f70bf18a086: Layer already exists
    3f3324023e75: Layer already exists
    f0d7d68f89e5: Layer already exists
    917c0fc99b35: Pushing [=======> ] 19.55 MB/125.1 MB
  • As can be seen, only 6 layers are considered to exist in DockerHub.
  • In the end my build takes 3 minutes, and then I spend 10 minutes pushing 4 images to DockerHub.

Docker Engine information:
`docker version
Client:
Version: 1.10.3
API version: 1.22
Go version: go1.5.3
Git commit: 20f81dd
Built: Thu Mar 10 21:49:11 2016
OS/Arch: darwin/amd64

Server:
Version: 1.10.3
API version: 1.22
Go version: go1.5.3
Git commit: 20f81dd
Built: Thu Mar 10 21:49:11 2016
OS/Arch: linux/amd64`

I found one open bug about the v2 registry not recognizing layers: https://github.com/docker/distribution/issues/1411. Might this be the source of my troubles?

The optimizations now appear to be in place. Docker should no longer push base image layers that already exist on the registry. Instead, the client will perform what is referred to as a “cross-repository blob mount”, which makes a blob from a different repository available in the one being pushed.
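
For anyone curious what such a mount request looks like, here is a small Python sketch that only builds the URL and interprets the status code; it sends nothing over the network. The endpoint shape follows my reading of the Registry HTTP API V2 (a POST to the blob upload endpoint with "mount" and "from" query parameters). The host, repository names, digest, and both function names are hypothetical, and a real client would also need to authenticate.

```python
from urllib.parse import urlencode


def blob_mount_url(registry, target_repo, digest, source_repo):
    """Build the URL for a cross-repository blob mount request (to be sent as a POST)."""
    # Keep ':' and '/' unescaped so the digest and repository name stay readable.
    query = urlencode({"mount": digest, "from": source_repo}, safe=":/")
    return f"{registry}/v2/{target_repo}/blobs/uploads/?{query}"


def describe_mount_status(status_code):
    """Interpret the response status, as I understand the API."""
    if status_code == 201:
        return "mounted"          # blob linked into the target repository, no upload
    if status_code == 202:
        return "upload required"  # registry opened a regular upload session instead
    return "error"


# Hypothetical values for illustration only.
url = blob_mount_url(
    "https://registry.example.com", "user/foo", "sha256:abc123", "library/base"
)
print(url)
# https://registry.example.com/v2/user/foo/blobs/uploads/?mount=sha256:abc123&from=library/base
```

The fallback case matters: if the registry cannot or will not mount the blob, the client simply uploads the layer the usual way.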

I’m building an image based on microsoft/windowsservercore.
When pushing my image, I see the big fat layers of microsoft/windowsservercore being uploaded as well.
Initially my repository was private; after reading this thread I switched it to public.
This is a snapshot from the push:

The push refers to a repository [docker.io/asarafian/mininugetserver]
f86d94080e37: Layer already exists
ab2b05472a1a: Layer already exists
1390aff6c430: Layer already exists
01d4324cff0f: Layer already exists
e722c9bc5429: Pushing [> ] 3.768 MB/515.3 MB
aa119f26bc6b: Pushing [==> ] 12.97 MB/281 MB
227185834f0c: Layer already exists
09d11b626e02: Layer already exists
46e02c68ba30: Layer already exists
2013a7156691: Layer already exists
de57d9086f9a: Pushing [=> ] 36.6 MB/1.744 GB
f358be10862c: Pushing [> ] 2.755 MB/7.677 GB

As you can see, the last two layers are the big ones that it will try to push. Based on my understanding, that shouldn’t happen even on the first push, not just when updating the repository.

The history is:

IMAGE               CREATED             CREATED BY                                      SIZE                COMMENT
482229c83c20        53 minutes ago      cmd /S /C #(nop)  CMD ["cmd" "/S" "/C" "po...   41 kB
0a70c1d54c60        53 minutes ago      cmd /S /C #(nop)  ENV packagesPath=~/Packages   41 kB
eb4744e51270        53 minutes ago      cmd /S /C #(nop)  ENV apikey=mininugetserver    515 MB
4c19c500948d        53 minutes ago      cmd /S /C #(nop)  EXPOSE 80/tcp                 281 MB
b63e883f08e3        53 minutes ago      cmd /S /C powershell -NoProfile -NonIntera...   177 kB
ab445b6a0384        About an hour ago   cmd /S /C powershell -NoProfile -NonIntera...   47.4 kB
9e55e4be25c6        About an hour ago   cmd /S /C #(nop) ADD tarsum.v1+sha256:912b...   111 kB
1bb8e1cd2394        About an hour ago   cmd /S /C #(nop) COPY file:1158ad9458093d9...   41 kB
fff7d3803a6a        About an hour ago   cmd /S /C #(nop) ADD dir:d3b205d629ebb7e85...   1.74 GB
6ad1e575a6a8        4 days ago          cmd /S /C #(nop)  MAINTAINER Alex Sarafian      7.68 GB