I recently experienced unexpected behaviour from docker import and docker build when starting with a static tarball.
The scenario can be reproduced with this series of commands:
echo hello >hello.txt
echo -e 'FROM hello\nENV foo=bar' >Dockerfile
tar -cf hello.tar hello.txt
docker import hello.tar hello
docker inspect -f '{{.Id}} {{.Created}}' hello
docker build .
docker build .
docker import hello.tar hello
docker inspect -f '{{.Id}} {{.Created}}' hello
docker build .
Explained:
- I begin with a tarball with a single file, created once and not updated.
- I use docker import to import this tarball as an image named hello. This new image has a particular Id (e.g. sha256:...) and Created timestamp.
- I have a Dockerfile which uses this imported hello image as its base and adds a single layer which sets an environment variable.
- Executing docker build once performs the necessary steps. Executing docker build a second time recognises the layers from the previous build and reports "---> Using cache" as expected.
- Re-importing the same tarball with the same contents and same file timestamps results in an image with a new Id and new Created timestamp. Unexpected when the source tarball is unchanged.
- Re-running docker build does not leverage the build cache even though the base layer should have identical contents to the base layer used last time.
I am assuming the changed Id and Created timestamp of the re-imported image are responsible for breaking the build cache. I feel that docker import should be deterministic, or at least accept an argument to specify a Created timestamp (or use the tarball timestamp), which would hopefully lead to a consistent sha256 Id hash (it is supposed to be a content-derived hash, right?).
Once docker import behaves deterministically, I presume docker build would then use the build cache as expected.
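To illustrate what I suspect is happening, here is a toy model that does not use Docker at all. It assumes the image Id is a hash over a config document that embeds the Created timestamp alongside the layer digest. The JSON layout and the toy_image_id helper are made up for illustration and are not Docker's real format. In this model the layer digest depends only on the tarball bytes, while the Id changes whenever the timestamp changes:

```shell
#!/bin/sh
# Toy model only: the "config" below is a hypothetical stand-in,
# not the real Docker image config format.
set -eu
workdir=$(mktemp -d)
cd "$workdir"

echo hello > hello.txt
tar -cf hello.tar hello.txt

# Content-derived digest: depends only on the bytes of the tarball.
layer_digest=$(sha256sum hello.tar | cut -d' ' -f1)

# Hypothetical helper: hash a minimal config made of a Created
# timestamp ($1) plus the layer digest.
toy_image_id() {
    printf '{"created":"%s","layer":"sha256:%s"}' "$1" "$layer_digest" \
        | sha256sum | cut -d' ' -f1
}

id_first=$(toy_image_id "2020-01-01T00:00:00Z")   # first "import"
id_second=$(toy_image_id "2020-01-01T00:05:00Z")  # re-import, later timestamp
id_pinned=$(toy_image_id "2020-01-01T00:00:00Z")  # re-import, timestamp pinned

echo "layer digest:        $layer_digest"
echo "id (first import):   $id_first"
echo "id (later re-import): $id_second"
echo "id (pinned created): $id_pinned"
```

If the real Id is computed along these lines, then pinning the Created timestamp (the third case) would be enough to give the same Id for the same tarball, which is the behaviour I was expecting from docker import.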
Before I raise this as a GitHub issue, are my expectations out of sync with the Docker image system?