Same Dockerfile and build context, different layer digests

roy2006 · November 16, 2023, 5:33pm

I’m building the following Dockerfile twice using buildx. For reasons I don’t understand, the layers produced by the RUN command have different digests. The two builds run within seconds of each other on the same host (Mac Intel, Docker Engine 24.06).

This is the Dockerfile:

FROM python:3.10-slim-bullseye 

RUN sh -c "echo hello"

These are the commands I’m using to run the build twice:

docker buildx build --build-arg BUILDKIT_STEP_LOG=1 --progress plain --no-cache --tag ca-test:1 . 
docker buildx build --build-arg BUILDKIT_STEP_LOG=1 --progress plain --no-cache --tag ca-test:2 .

The resulting layer digests for the first build are:

  "sha256:74c0af6e02274b54b88f851843ae69880a234694dede8ff9fb93bfa076af45ed",
  ...
  "sha256:9647f452a52939a4807a7534ecbac34470618d7822881739b99c076862f69fdf",
  "sha256:a12f7c086f54288199c551849f2ece71758541ca02bda42793a6e8efe754b91f"

While for the second build, they are (note the difference in the last layer):

  "sha256:74c0af6e02274b54b88f851843ae69880a234694dede8ff9fb93bfa076af45ed",
  ...
  "sha256:9647f452a52939a4807a7534ecbac34470618d7822881739b99c076862f69fdf",
  "sha256:cf80db41eb34c3282f4456b0b71bab7c0521078e60cccbc01de407197dd54b6d"
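
For anyone comparing two builds the same way, here is a minimal sketch in Python that finds the first layer where two digest lists diverge. The function name first_divergent_layer is made up for this illustration, and the lists contain only the digests quoted above (the elided middle layers are left out), so the printed index refers to this shortened list. Lists in this form can be read from the RootFS.Layers field of docker image inspect.

```python
def first_divergent_layer(layers_a, layers_b):
    """Return the index of the first layer whose digest differs, or None if identical."""
    for index, (digest_a, digest_b) in enumerate(zip(layers_a, layers_b)):
        if digest_a != digest_b:
            return index
    # One list may simply be longer than the other.
    if len(layers_a) != len(layers_b):
        return min(len(layers_a), len(layers_b))
    return None


# Only the digests quoted in the post; the elided layers are intentionally omitted.
build_1 = [
    "sha256:74c0af6e02274b54b88f851843ae69880a234694dede8ff9fb93bfa076af45ed",
    "sha256:9647f452a52939a4807a7534ecbac34470618d7822881739b99c076862f69fdf",
    "sha256:a12f7c086f54288199c551849f2ece71758541ca02bda42793a6e8efe754b91f",
]
build_2 = [
    "sha256:74c0af6e02274b54b88f851843ae69880a234694dede8ff9fb93bfa076af45ed",
    "sha256:9647f452a52939a4807a7534ecbac34470618d7822881739b99c076862f69fdf",
    "sha256:cf80db41eb34c3282f4456b0b71bab7c0521078e60cccbc01de407197dd54b6d",
]

divergent = first_divergent_layer(build_1, build_2)
print("first differing layer index:", divergent)
```

Here only the last entry differs, which matches the layer created by the RUN instruction.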

Any help would be greatly appreciated.

bluepuma77 · November 16, 2023, 7:47pm

Maybe the dive tool (link) can provide more insight into the layer and its contents.

roy2006 · November 16, 2023, 8:26pm

I’m familiar with dive, but not an expert. As far as I could see, it doesn’t provide any additional information about the impact of this layer on the content of the image (which is no impact whatsoever …).

Using ^u to filter out unmodified files, this layer doesn’t seem to have any impact on the content of the image (as expected).

And BTW - I was able to reproduce the same phenomenon on an Ubuntu host.

rimelek · November 16, 2023, 10:44pm

--no-cache

You disabled the cache so your new layer created by the RUN instruction will always be different.

roy2006 · November 17, 2023, 4:42am

Isn’t the digest of the layer computed based on its contents? If so, there’s no difference between the contents of the layers in the two builds. Also, using a different base image (e.g. alpine) does produce the same layer digest every time.

meyay · November 17, 2023, 7:06am

Docker builds are repeatable, but not reproducible.

If you build an image with the same Dockerfile twice, the content of the image can vary in package versions, and will vary in the data of files, and in the date of the image layer metadata.

For your Dockerfile example, if the cache were used, I would have expected Docker to detect no change and reuse all image layers from the build cache.
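
To illustrate the point about dates, here is a small Python sketch. A layer digest is a SHA-256 over the layer's tar archive, and a tar archive records metadata such as modification times next to the file data. The sketch builds two in-memory archives holding the same file content and changes only the mtime. The file name etc/example.conf and the timestamps are invented for the example. It does not show which file or attribute actually changes in the python:3.10-slim-bullseye case, only that identical file content is not enough for identical digests.

```python
import hashlib
import io
import tarfile


def layer_digest(mtime):
    """Build an in-memory tar with one fixed file and return its sha256 digest."""
    payload = b"hello\n"
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w", format=tarfile.PAX_FORMAT) as archive:
        info = tarfile.TarInfo(name="etc/example.conf")  # hypothetical file name
        info.size = len(payload)
        info.mtime = mtime  # the only value that varies between calls
        archive.addfile(info, io.BytesIO(payload))
    return "sha256:" + hashlib.sha256(buffer.getvalue()).hexdigest()


first = layer_digest(1700000000)
second = layer_digest(1700000005)  # same content, written five seconds later

print("same mtime, same digest:     ", layer_digest(1700000000) == first)
print("different mtime, same digest:", first == second)
```

If a RUN step touches any file or directory in the image, even without changing its data, the recorded timestamp alone can be enough to change the digest of the resulting layer.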

roy2006 · November 17, 2023, 8:15am

@meyay - I’m aware that the resulting image may vary between two builds because of timestamps or external dependencies such as packages. My question, however, was about layer digests, not the image, and the Dockerfile quoted above clearly doesn’t have any external dependencies.

This is even stranger considering that slight changes to the Dockerfile (a different base image, or copying a file to the root directory of the image) do produce the same layer digest.

rimelek · November 17, 2023, 7:21pm

As far as I know, the cache works based on what the parent layer was and what the command was that generated the layer. That’s why running apt-get update in a separate RUN instruction is not a good idea: the cache would never be invalidated unless a parent layer is also invalidated, since the command is the same. When you put it in front of apt-get install in the same instruction and you change the package list, that invalidates the layer and apt-get update runs again. I know, it’s an image layer, not a filesystem layer, but these are related.

Note that I’m not saying it works as I describe here, but this is how I imagine it now. You disabled the cache, which also means you want the commands in the RUN instructions to run again. That will result in some output, and Docker will not know in advance what the output will be. The same command could produce random output as well, so it will be stored in a different folder. After that, Docker could compute a hash, try to find the same content somewhere else, and drop the temporary folder, but it seems this is not what is happening.

Can you show an example file? I can’t imagine how that would work, as a different base image means different content in the image so every output could make different changes even if the commands are the same.

When you copy a file, Docker can check the content before copying, and the command is also the same. Since the filesystem layers are mounted on top of each other, it’s not a problem even when you mount it on top of totally different layers because the base images were different.
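
The mental model above (a cache keyed on the parent layer plus the command) can be sketched as a toy in Python. This is only an illustration of that description, not how BuildKit is actually implemented. The names cache_key and ToyBuildCache, the parent string, and the fake command are all invented for the example.

```python
import hashlib
import itertools


def cache_key(parent_key, instruction):
    """Key depends only on the parent and the instruction text, not on the output."""
    return hashlib.sha256(f"{parent_key}\n{instruction}".encode()).hexdigest()


class ToyBuildCache:
    """Toy model of a build cache; not BuildKit's real algorithm."""

    def __init__(self):
        self._results = {}
        self.executions = 0

    def run(self, parent_key, instruction, execute, use_cache=True):
        key = cache_key(parent_key, instruction)
        if use_cache and key in self._results:
            return self._results[key]  # cache hit: the command is not executed
        self.executions += 1
        result = execute()
        self._results[key] = result
        return result


# A fake command whose output changes on every execution, like apt-get update over time.
counter = itertools.count(1)
fake_update = lambda: f"package index v{next(counter)}"

cache = ToyBuildCache()
base = "base-image"  # hypothetical parent identifier
first = cache.run(base, "RUN apt-get update", fake_update)
cached = cache.run(base, "RUN apt-get update", fake_update)
forced = cache.run(base, "RUN apt-get update", fake_update, use_cache=False)

print(first, "|", cached, "|", forced)
print("executions:", cache.executions)
```

In this toy model the second call reuses the stale result because neither the parent nor the command text changed, while the --no-cache style call executes the command again and stores whatever it produces.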
