Docker import breaks docker build cache

I recently experienced unexpected behaviour from docker import and docker build when starting with a static tarball.

The scenario can be reproduced with this series of commands:

echo hello >hello.txt
printf 'FROM hello\nENV foo=bar\n' >Dockerfile

tar -cf hello.tar hello.txt

docker import hello.tar hello
docker inspect -f '{{.Id}} {{.Created}}' hello
docker build .
docker build .

docker import hello.tar hello
docker inspect -f '{{.Id}} {{.Created}}' hello
docker build .

Explained:

  1. I begin with a tarball with a single file, created once and not updated.
  2. I use docker import to import this tarball as an image named hello. This new image has a particular Id (e.g. sha256:...) and Created timestamp.
  3. I have a Dockerfile which uses this imported hello image as its base and adds a single layer which sets an environment variable.
  4. Executing docker build once performs the necessary steps. Executing docker build a second time recognises the layers from the previous build and reports ---> Using cache as expected.
  5. Re-importing the same tarball, with the same contents and the same file timestamps, results in an image with a new Id and a new Created timestamp. This is unexpected, given that the source tarball is unchanged.
  6. Re-running docker build does not leverage the build cache even though the base layer should have identical contents to the base layer used last time.

I am assuming the changed Id and Created timestamp of the re-imported image are responsible for breaking the build cache. I feel that docker import should be deterministic, or at least accept an argument to specify a Created timestamp (or use the tarball timestamp), which would hopefully lead to a consistent sha256 Id (it is supposed to be a content-derived hash, right?).
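
To illustrate why I suspect the Created timestamp: as far as I understand the image specification, the Id is the SHA-256 of the image's configuration JSON, and that configuration records the creation time alongside the layer digests. The sketch below is only a toy model of that idea. The JSON is heavily simplified and is not Docker's real config format, config_id is a made-up helper name, and sha256sum from GNU coreutils is assumed to be installed.

```shell
#!/bin/sh
# Simplified model, NOT Docker's real config format: the "image Id" here is the
# SHA-256 of a tiny JSON document holding a created timestamp and a layer digest.
# Assumes sha256sum (GNU coreutils) is available.

# Digest of the layer contents; identical input always gives the same digest.
layer_digest=$(printf 'hello\n' | sha256sum | cut -d' ' -f1)

# Hypothetical helper: hash a config built from a timestamp plus the layer digest.
config_id() {
    printf '{"created":"%s","rootfs":{"diff_ids":["sha256:%s"]}}' "$1" "$layer_digest" \
        | sha256sum | cut -d' ' -f1
}

# Two "imports" of the same layer, five seconds apart.
id_first=$(config_id '2020-01-01T00:00:00Z')
id_second=$(config_id '2020-01-01T00:00:05Z')

echo "layer:         sha256:$layer_digest"
echo "first import:  sha256:$id_first"
echo "second import: sha256:$id_second"
```

In this model the layer digest is the same for both imports while the two Ids differ, purely because of the timestamp. On a real daemon, comparing the output of docker inspect -f '{{.RootFS.Layers}}' hello after each import should show whether the layer digest itself stays stable.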

Once docker import behaves deterministically, I presume docker build would then use the build cache as expected.

Before I raise this as a GitHub issue, are my expectations out of sync with the Docker image system?

As you have noticed, docker import does NOT examine (hash, …) the contents of the tarball before creating the image and its matching Id. I think the behavior is an acceptable design: you are injecting an outside object into the Docker system.

I see your point too, but I think this is a change request, not a bug.
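
In the meantime, one possible workaround is to skip the re-import whenever the tarball has not changed, so the existing image Id (and with it the build cache) stays valid. This is only a sketch: import_once is a made-up helper name, tagging the image with the tarball's checksum is my own convention rather than a Docker feature, and sha256sum from GNU coreutils is assumed to be installed.

```shell
#!/bin/sh
# Hypothetical helper, not part of Docker: import a tarball only when no image
# tagged with the tarball's checksum exists yet, so the image Id stays stable.
# Assumes sha256sum (GNU coreutils) is available.
import_once() {
    tarball=$1
    name=$2
    # The first 12 hex characters of the checksum serve as the tag.
    sum=$(sha256sum "$tarball" | cut -c1-12)
    if docker image inspect "$name:$sum" >/dev/null 2>&1; then
        echo "reusing existing image $name:$sum"
    else
        docker import "$tarball" "$name:$sum"
    fi
    # Point the tag used by the Dockerfile's FROM line at that image.
    docker tag "$name:$sum" "$name:latest"
}

# Usage: import_once hello.tar hello
```

Because an unchanged tarball maps to the same tag, a second call reuses the image from the first import instead of creating a new one with a fresh Created timestamp.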