Local docker build somehow based on incorrect image

Docker version 24.0.7-ce, build 311b9ff0aa93

I’ve written a Dockerfile that produces a custom image based on opensuse/leap (the exact contents of which aren’t important).

I build this using docker buildx build -t my-base .

I run a container using docker run -it --rm my-base and inspect the filesystem. So far so good, everything is as I expect.

Now I run this:

docker buildx build --no-cache -t my-test - <<EOF
FROM my-base
EOF
docker run -it --rm my-test

Inspecting this container, it does not match the first container – it has similar contents but from a build made several days ago (where I’ve since changed the Dockerfile to do different things).

How is this possible? I’ve already tried docker buildx prune -a to clear cache and it still somehow manages to find the wrong image the second time. I’ve also tried to explicitly specify :latest tag in the FROM line and it makes no difference. I can’t specify a digest because local images don’t have one.

I do still have a tagged image which has the filesystem in the “wrong” state (and I don’t want to remove it at the moment), but the image repository is something completely different and the Dockerfile commands that were used to build it are also completely different, so I can’t see how it would get sucked into this build.

I don’t have any other images with the same repository name as either my-base or my-test, and I also tried completely removing these images prior to building both.
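For reference, the cleanup-and-rebuild sequence I’ve been running looks roughly like this (the exact ordering has varied between attempts):

docker rmi my-base my-test
docker image prune
docker buildx prune -a
docker buildx build -t my-base .
docker buildx build --no-cache -t my-test - <<EOF
FROM my-base
EOF
docker run -it --rm my-test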


Elsewhere someone suggested using the IMAGE ID from docker image ls in the FROM line, but trying that I just get:

ERROR: failed to solve: fc27666e10d4: pull access denied, repository does not exist or may require authorization: server message: insufficient_scope: authorization failed

(Again: local image, so it shouldn’t be trying to pull it from anywhere.)


I can see the issue by inspecting both the images – I’m expecting them to either have completely identical layers, or perhaps for my-test to have one more layer, but instead they’re almost completely unrelated (only the first layer is the same):

my-base:

            "Layers": [
                "sha256:442880a3cbe795a6b5c20f0af0ac2e771ed7e574d641cf22af260466f140b6bf",
                "sha256:021d3e623a9a5263f04fc5cdc1f189254dd8d6e127dfc05c0d7a722402a76852",
                "sha256:a6ef1a2a99b028ba344c02a78e11278b23d54e15065b7b965f090838ac5a4eca",
                "sha256:10d16b64d36e2d2c193dd84117c0f6441d2a367327de0a8b5286b9d8b21b194f",
                "sha256:0d4a67d3c825655253d9be3f856920e545f2f1a37eb266be05aea530c2e5069b",
                "sha256:a47702fc019246c0fb8b6c37357015a80fe4ca0f17f97032d1db012275c3916a",
                "sha256:f7606c98c9a608d354677cf44e258fd683e8655f0dacbb5517c85193c5754f08",
                "sha256:221b04d2088ebc66209367e306794872cd21805de8d3284d1e8592f0b77cd6d2",
                "sha256:8cc8e3d11d57e873feb836b24cc3d38c9b8af54bff4439a859bfddc54434ce2e",
                "sha256:44f3fc11d3cc8016f96f35ef0d76dd9d9004b66ee8690c0f3c18b923b21d9dbd",
                "sha256:b4993d99e2af256135d84d9bd8c471aeb95fc4dc8f74f0b255c9ad393f05b413",
                "sha256:326bcf2655645b345d25e951f4c8277f9e303b9a4b09d8e57d171d146e9fad18",
                "sha256:398f93ab0ae8412a0f39776d012d262f70c4cb3580ca6520f6d39ebd678804e4",
                "sha256:e8cd574bb35aadf502b0cec05673e6546f8df948de30950201249ab86265fdc9",
                "sha256:ea7732c596ddc906d03b60154d5b03308e0b705348c7a300637dfc26ca8122b4",
                "sha256:a43fa6f577e3a0dfc658abf95eba3c2ee874af0091615c865caf634f33fd1794",
                "sha256:b59232bb2b5790f2cda0e8a288c6274a244375bec0c3c7d5344fa4d352f2c616",
                "sha256:9a2e64d0349c1f3998b39e95f88517c58949edc8b80f7a82a7690e4fa385937e",
                "sha256:9429027a94f396ac40ac8eae68331b666238503acf92309d20955449a4ccab02"
            ]

my-test:

            "Layers": [
                "sha256:442880a3cbe795a6b5c20f0af0ac2e771ed7e574d641cf22af260466f140b6bf",
                "sha256:2ad73c2bb2fb09e985af5740aaec9571e4fe09efa0eb7d6b4c70315ef1d0b9ba",
                "sha256:e8faef621830e9b1fab6ac407debd5dbcda9fdfe499fbe6e8b7f5799a3435653",
                "sha256:2c3e195f75414053bf639242d1321b05e963acc8e0d4401415f77189d39040e3",
                "sha256:0a8c2cac59651fb097712dfe8609f3a9e9fe0684256fe4a9cbd15da4e1b00522",
                "sha256:e50a3c6b615be06adfaa20ba4ebf1b20c713b79912599ccc08b16016590efaf7",
                "sha256:3a16c86e22d17c72f4d9cc775c6e704c18741930251fc0059431d38e2c62a79d",
                "sha256:026c201dafb6f15b3cf9af45bcbd488f60e587d63ab5f1243fd189c630323fa1",
                "sha256:c432fb58d172128bd16b70b2453632087db7ea11e8d494155d985f89428121d1",
                "sha256:bcbbe35d1e36bb8ad374cd5534f5db6855c6ed888a123f7badcfa5d715a159a8",
                "sha256:6bf1f1e79ed2ec7c88677dc5cd300edcb2fa163b84e0e275ebca432b9fcf581d",
                "sha256:ff9c64e0e999b8105edc97d88e2b586b2f10664b425078a772aad08bc0d37248",
                "sha256:f19dc1a32b8c213a572dccd325959b513ff5f72691e799a3b6eab795a4cb55e6",
                "sha256:d1ef903603c86dc081af70b566bce12a42b6998bae84f343af3dc1751eab4934",
                "sha256:88dc97adcd7bcaa29509b5a35ca122715ad22c7b3f1086636ba12a9a084198a9",
                "sha256:de0805b7aac7b8b4fa8d9b2060176fe69d0d3848f3d5599d212b202c8f1bdae4",
                "sha256:7c84020757098cd396429f28df0bf5ad65fce35de5eef044c24da2c0ccf41a1a",
                "sha256:3448d2bc7e18a8ef22c530e0f49dc17b747f44fe71799a29030a2ac776d570a7",
                "sha256:9e4231aadb6e77901cb696503bacf4161bcc03e1a7f58ce8e1e2546d2bd047b1"
            ]
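(The layer lists above are from inspecting both images; something like this prints just the RootFS layers:)

docker image inspect --format '{{json .RootFS.Layers}}' my-base
docker image inspect --format '{{json .RootFS.Layers}}' my-test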

I’ve found that this particular issue does seem to be related to the repository name, in that I only get the above happening when both builds use my-base as the name. If I build them with any other name, then the first build completes immediately from cache (producing an identical image hash) but the second build then finds the correct filesystem.

As best I can tell, the incorrect filesystem is from some earlier build of my-base and does not match anything in any tagged images. What really bothers me is that I can’t find any way to get rid of it – even after docker rmi my-base, docker image prune, and docker buildx prune -a it somehow manages to find it again.

Is there some way to tell why these bad layers aren’t getting pruned? Or how to prevent them being picked up as part of a repository they should have been disconnected from after a subsequent build?

Regarding using the IMAGE ID in the FROM line, I’ve since discovered that this only works in non-BuildKit mode (i.e. with DOCKER_BUILDKIT=0 defined). This is obviously non-ideal – while it fixes the exact case above, there are some other cases where I would like to base a new image on an existing image while also using BuildKit features.
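For example, something along these lines builds successfully with the classic builder, while the BuildKit equivalent fails with the pull error shown earlier:

DOCKER_BUILDKIT=0 docker build -t my-test - <<EOF
FROM fc27666e10d4
EOF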

Is this a bug, or does BuildKit require different syntax to use image ids? Using the id would be preferable to avoid similar caching problems in the future.

I’ve tried using similar syntaxes with the full id (from docker images -q --no-trunc my-base), such as:

FROM cc887f1ec6e7
FROM sha256:cc887f1ec6e7a5f9061607a870dd1b31bedf03aedafe42dbcd775887de1e8be5
FROM my-base@sha256:cc887f1ec6e7a5f9061607a870dd1b31bedf03aedafe42dbcd775887de1e8be5
FROM my-base:latest@sha256:cc887f1ec6e7a5f9061607a870dd1b31bedf03aedafe42dbcd775887de1e8be5
FROM cc887f1ec6e7a5f9061607a870dd1b31bedf03aedafe42dbcd775887de1e8be5

All of these fail by stating that the repository was not found, except the last which fails with an error that 64-byte hex strings are not allowed.

You may have already solved it or moved on, but I’ll share some notes.

You can use the id as a digest, as described here: https://docs.docker.com/reference/cli/docker/image/pull/#pull-an-image-by-digest-immutable-identifier – that is, a repository name followed by @ and the digest.

It tried to pull because the reference was invalid – there was nothing indicating it was a digest. It was handled as a regular image name, and without any owner in the name it was treated as an official image on Docker Hub, and there is no official image with that name.

Do you have any way to reproduce it in a newer Docker version? If it can be reproduced in a newer version, it is more likely to be fixed. If you can’t try to reproduce it on a newer Docker, you can still open an issue, but among other issues it would probably not get a high priority, and it would require someone to check whether it is still a problem.

How did you install Docker? Would there be a newer version available to you in that environment?

If you want to open an issue, you can try the moby/moby repository on GitHub, or since you discovered it was related to BuildKit, you can try moby/buildkit (the concurrent, cache-efficient, and Dockerfile-agnostic builder toolkit) too.

Local images don’t have a digest (and as far as I understand it, there is no way to give them one, short of pushing to a registry and re-pulling). I explicitly do not want to push the original image anywhere; this is a purely local operation.
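This is easy to confirm – for a locally built image the DIGEST column just shows <none>:

docker images --digests my-base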

I have found that the subsequent build will use the correct image if I give it a unique tag, e.g.:

id=$(docker images -q my-base:latest)
docker tag my-base:latest my-base:$id
...
FROM my-base:$id

If I use my-base:latest directly then it will use the wrong (outdated) image. I have no idea where it’s even storing that, since I pruned both the images and the builder cache. Unlike in non-BuildKit mode, there does not appear to be any syntax in BuildKit that allows using $id alone without explicitly tagging the image first.
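Putting it together with the stdin build from earlier, the workaround looks roughly like this (the shell expands $id inside the heredoc before BuildKit ever sees the Dockerfile):

id=$(docker images -q my-base:latest)
docker tag my-base:latest my-base:$id
docker buildx build --no-cache -t my-test - <<EOF
FROM my-base:$id
EOF
docker run -it --rm my-test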

No, since I’m not sure how it got into that state, other than by iterating on the contents of the Dockerfile across multiple builds. My best guess is that BuildKit is choosing the oldest available image in the same repository instead of the newest, when multiple are available. Having said that, there shouldn’t be multiple available in the first place, unless it’s sneaking one out of some cache that didn’t get deleted even when I told it to.

You are right. I reacted too quickly to the digest part and focused on the correct syntax, without checking that it wouldn’t work with an image that was not pulled from a registry. I rarely use digests, so I was wrong on that one. Even if we could add a digest, it would just be a workaround, so the best thing would be to find the original issue, but I have to say I don’t really have any idea.