When I build a Dockerfile on a local machine, save the image to a file, and gzip the generated file, the size is not the same as the size shown in the Tags section of a repository after an automated build.
When hub.docker.com does an automated build, what command does it run to compress the image?
Regards,
Cesar Jorge
Nothing as far as I’m aware. Automated build just does docker build and then docker pushes the result. How big of a difference are you seeing? Could it be related to network config? It’s not out of the question for a build from the same Dockerfile to have different results on different computers.
The automated build applies some kind of compression.
The difference grows with the image size.
An example with the centos:6 image:
The centos:6 tag listed at https://hub.docker.com/r/library/centos/tags/ has a size of 69 MB.
If I build the image locally:
git clone https://github.com/CentOS/sig-cloud-instance-images
cd sig-cloud-instance-images
git checkout CentOS-6
cd docker
docker build .
Sending build context to Docker daemon 39.03 MB
Step 1 : FROM scratch
 --->
Step 2 : MAINTAINER https://github.com/CentOS/sig-cloud-instance-images
 ---> Using cache
 ---> 176e1def595f
Step 3 : ADD centos-6-docker.tar.xz /
 ---> 0bd534633b30
Removing intermediate container d10caf05f06f
Step 4 : LABEL name "CentOS Base Image" vendor "CentOS" license "GPLv2" build-date "20160701"
 ---> Running in 656d28be4997
 ---> 96aeae3a64d7
Removing intermediate container 656d28be4997
Step 5 : CMD /bin/bash
 ---> Running in c62b3736d784
 ---> dbe51ffe4344
Removing intermediate container c62b3736d784
Successfully built dbe51ffe4344
docker save --output="/root/myimage.tar" dbe51ffe4344
du -h /root/myimage.tar
193M /root/myimage.tar
gzip -c /root/myimage.tar > /root/myimage.tar.gz
du -h /root/myimage.tar.gz
66M /root/myimage.tar.gz
(I think I misunderstood slightly in my original reply)
If you’re worried about the 69MB vs. 66MB difference here, that could be attributed to all sorts of things, e.g. different git revision.
But you are showing that when you built locally and gzipped you wound up with a 66M file. The Hub one is 69M. So, I’m not seeing where this idea that Docker Hub somehow adds compression is coming from. Can you provide a more definitive example of an image which is actually significantly smaller on Hub than when built locally?
Another example (without builds):
https://hub.docker.com/r/repositoryjp/centos/tags/
(tag 6.7, size shown on the web: 206 MB)
Pulling the image directly:
docker pull repositoryjp/centos:6.7
Status: Downloaded newer image for docker.io/repositoryjp/centos:6.7
sudo docker images
REPOSITORY TAG IMAGE ID CREATED VIRTUAL SIZE
docker.io/repositoryjp/centos 6.7 001712261f84 2 weeks ago 608.2 MB
sudo docker save --output="/root/myimage.tar" 001712261f84
Output of docker save (tar file):
sudo du -h /root/myimage.tar
595M /root/myimage.tar
Gzip output:
sudo gzip /root/myimage.tar
sudo du -h /root/myimage.tar.gz
198M /root/myimage.tar.gz
With ls (size in bytes):
sudo ls -l /root/myimage.tar.gz
-rw-r--r-- 1 root root 207177921 Jul 18 11:40 /root/myimage.tar.gz
207177921 / 1024 / 1024 = 197.58, which rounds to 198 MB
Could this be a display issue on the Tags web page?
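The arithmetic above divides by 1024 twice, which is also the unit du -h uses (powers of 1024). As a minimal sketch, the same byte count can be expressed in both binary and decimal units. The function names bytes_to_mb and bytes_to_mib are made up for illustration, and whether the Tags page reports decimal megabytes is an assumption, not something confirmed in this thread.

```shell
#!/bin/sh
# Convert a byte count to decimal megabytes (1 MB = 1,000,000 bytes).
# LC_ALL=C keeps the decimal separator a dot regardless of locale.
bytes_to_mb() {
    LC_ALL=C awk -v b="$1" 'BEGIN { printf "%.2f\n", b / 1000000 }'
}

# Convert a byte count to binary mebibytes (1 MiB = 1,048,576 bytes),
# the unit used by du -h and by the division by 1024 twice.
bytes_to_mib() {
    LC_ALL=C awk -v b="$1" 'BEGIN { printf "%.2f\n", b / 1048576 }'
}

size=207177921   # byte count reported by ls -l for myimage.tar.gz
echo "decimal MB: $(bytes_to_mb "$size")"
echo "binary MiB: $(bytes_to_mib "$size")"
```

This prints 207.18 for the decimal figure and 197.58 for the binary one. If the web page did use decimal units, part of the gap between 198 and 206 would come from the unit rather than from compression.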
Regards,
Cesar Jorge
It is possible that Docker Hub may have a bug, but you’re also making a lot of assumptions here. The difference between 198M and 206M is not large at all and could be attributed to a wide variety of factors. It could be attributed to:
- Difference in Docker versions
- Difference between gzip and Docker compression (this seems highly likely to be contributing, in my view: running gzip is NOT the exact same operation that Docker performs when it compresses images for a push, since Docker most likely relies on the built-in Go version)
- Difference in output of docker load vs. the artifacts bundled for docker push
- Difference in system library handling of compression vs. Go native
You are benchmarking apples and oranges here. Why be concerned about these small differences in image size?
We use many images and containers. This affects performance, because small differences become large when multiplied, and we work hard to minimize these image sizes.
We currently use Docker 1.9.1 on CentOS/RH 7.2, fully updated.
So, is the size shown on the Tags web page correct? Is the image stored on your Docker Hub servers the same size?
Does Docker use its own library to compress?
I’m not sure but I’d be somewhat surprised to find out that it’s very off.
It uses the Go gzip package (compress/gzip) to compress instead of the gzip binary, IIRC. Try benchmarking against that to see if the results are any different. Note also that gzip settings (in Go or the CLI) can be modified to optimize for compression speed or for optimal compression, so that could affect the results you are seeing as well. A difference of a few MB is not really surprising.
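To get a feel for how much the compression level alone moves the result, here is a minimal sketch that compresses the same tar at a few gzip CLI levels and prints the byte counts. It only varies the level of the gzip binary; it does not reproduce what Docker or Go's compress/gzip does during a push. The helper name gzip_size is made up for illustration, and the default path is the one used earlier in the thread.

```shell
#!/bin/sh
# Print the size in bytes of FILE compressed with gzip at LEVEL (1-9).
# Usage: gzip_size FILE LEVEL
gzip_size() {
    gzip -c "-$2" "$1" | wc -c | tr -d ' '
}

# Path to the tar produced by docker save; pass another path as the
# first argument if yours differs.
image_tar="${1:-/root/myimage.tar}"

if [ -f "$image_tar" ]; then
    # 1 = fastest, 6 = gzip default, 9 = best compression
    for level in 1 6 9; do
        echo "gzip -$level: $(gzip_size "$image_tar" "$level") bytes"
    done
fi
```

If the spread between levels 1 and 9 is already several MB for an image of this size, a gap of a few MB between a local gzip and the figure on the Hub is within what the settings alone could explain.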