Setup:
Containers: 0
Running: 0
Paused: 0
Stopped: 0
Images: 516
Server Version: 18.03.1-ce
Storage Driver: overlay2
Backing Filesystem: extfs
Supports d_type: true
Native Overlay Diff: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
Volume: local
Network: bridge host macvlan null overlay
Log: awslogs fluentd gcplogs gelf journald json-file logentries splunk syslog
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 773c489c9c1b21a6d78b5c538cd395416ec50f88
runc version: 4fc53a81fb7c994640722ac585fa9ca548971871
init version: 949e6fa
Security Options:
apparmor
seccomp
Profile: default
Kernel Version: 4.4.0-1060-aws
Operating System: Ubuntu 16.04.4 LTS
OSType: linux
Architecture: x86_64
CPUs: 4
Total Memory: 15.67GiB
Name: ip-10-60-5-65
ID: VJPH:S4RT:BP2J:D4AF:AF7L:ZJVP:QS7Y:YOSL:QPAC:EVMY:GCEW:JVA5
Docker Root Dir: /home/ubuntu/data/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
Labels:
Experimental: false
Insecure Registries:
127.0.0.0/8
Live Restore Enabled: false
WARNING: No swap limit support
Scenario:
We’ve got an automated docker build process that runs every few hours, feeds new data files into the build using --build-arg, creates a new image, tags it, and saves it as a tarball via docker save $IMAGE:$TAG | gzip > image.tgz.
For our use case, we have to publish the image to our own registry without using the docker daemon, so we’ve hand-rolled our own solution that unpacks the tarball and publishes each layer one by one. Each untarred layer directory in the image contains a layer.tar file.
In recent weeks, without any change to the underlying Dockerfile, we’ve noticed that one of those $LAYERNAME/layer.tar files is a symlink to the $OTHERLAYERNAME/layer.tar of a previous layer. Specifically, the final layer listed in manifest.json has a tar file that symlinks to the tar file of a layer a few entries earlier in the manifest.json ordering.
Ex: (output truncated)
$ ls -al $(find . | grep tar)
-rw-r--r-- 1 ubuntu 1718279092 2.0K Jun 3 22:32 ./77ff53ee0f3089958822132855087fc29fbf1c55aa87ace208bd4afd9c420898/layer.tar
lrwxr-xr-x 1 ubuntu 1718279092 77B Jun 4 16:03 ./96eace2f03af5a2b3a715dd77590c2cd33e35cb86af84461fd5440a48605cd29/layer.tar@ -> ../77ff53ee0f3089958822132855087fc29fbf1c55aa87ace208bd4afd9c420898/layer.tar
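To make the situation concrete, here is a minimal Python sketch of how an unpacker could follow such links before publishing. It assumes the tarball has already been extracted into a directory, and that manifest.json has the usual docker save shape (a JSON array whose first entry has a "Layers" list of relative paths). The function name resolve_layers is hypothetical and is not part of our actual tooling.

```python
import json
import os


def resolve_layers(image_dir):
    """Return (listed_path, real_path) pairs for the first image in manifest.json.

    Both paths are relative to image_dir. For a symlinked layer.tar,
    real_path points at the earlier layer's tar that holds the bytes.
    """
    with open(os.path.join(image_dir, "manifest.json")) as f:
        manifest = json.load(f)

    root = os.path.realpath(image_dir)
    resolved = []
    for listed in manifest[0]["Layers"]:
        # realpath follows the symlink chain to the regular file.
        real = os.path.realpath(os.path.join(root, listed))
        resolved.append((listed, os.path.relpath(real, root)))
    return resolved


if __name__ == "__main__":
    # "." is assumed to be the directory the tarball was extracted into.
    for listed, real in resolve_layers("."):
        marker = "" if listed == real else "  (symlink)"
        print(f"{listed} -> {real}{marker}")
```

With the two layers shown above, the entry for 96eace2f…/layer.tar would resolve to 77ff53ee…/layer.tar, so an uploader that reads through the resolved path gets the real tar bytes rather than the link itself.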
The Dockerfile does nothing special: we simply base off ubuntu:16.04, install python and some other libraries, copy in some data files, and finally set up an entrypoint script that starts a server to serve information from the data files. However, we have been unable to write other Dockerfiles following the same pattern that reproduce this symlinking behavior.
Questions:
- Where in the docker source code does it decide if a layer should create a new layer.tar?
- How can I write a Dockerfile in such a way that I could replicate this scenario?
- Is this effect a facet of layer caching?
- Is there a production-ready solution out there for pushing a docker tarball layer-by-layer to a private registry without using the docker daemon?