Docker Community Forums
How to debug buildkit caching?


I have a base image on which several frequently changing images are built, so caching is important.
This worked fine until around our switch to Docker 20.10.7; since then we see more frequent full rebuilds of this base image, even though the Dockerfile and its context are unchanged.

The "failed" builds seem to restart from the first step; there is no CACHED at step 2/34:

18:56:43 #5 [stage-0  1/34] FROM
18:56:43 #5 sha256:f2cd327f458c0c3e109c4d1b5604acc64e00ce97749420432a214153a0744f94
18:56:43 #5 CACHED
18:56:43 #7 [stage-0  2/34] ADD model/Dockerfile/om-datasci-base/.aws/ /root/.aws
18:56:43 #7 sha256:5d29f3c6631a503444d5e28e6bf29b111e976dc0460f47f878a6906443345f22
18:56:55 #7 DONE 12.7s

while a good build has it (and builds many steps in mixed/parallel order):

17:49:32 #7 [stage-0  2/34] ADD model/Dockerfile/om-datasci-base/.aws/ /root/.aws
17:49:32 #7 sha256:ef9e29390b92491ae211935095abbb187a523be3872ceaf363d3a2f55bbca1e9
17:49:32 #7 CACHED

AFAIK this file hasn't changed, so I'm not sure why the full rebuild happens. The hash itself is also confusing: the previous build was likewise CACHED, yet it shows a different hash. A side question, then: what does this sha256 actually refer to? I would assume it is a hash of the step, but although the step is CACHED, this hash does not appear in any previous build:

17:39:22 #7 [stage-0  2/34] ADD model/Dockerfile/om-datasci-base/.aws/ /root/.aws
17:39:22 #7 sha256:6a8e4df260e28eae328da6c69d7b00291ee5e505fe6d3cf31249e41d857d78fe
17:39:22 #7 CACHED
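If I understand correctly, the sha256 printed under each step is BuildKit's internal digest for that build step, not a layer ID, which may be why it differs between builds even when the step is CACHED. For ADD/COPY, the cache key is reportedly computed from the copied files' content and metadata (permissions count; modification times supposedly don't). A minimal, self-contained sketch (paths and file content are made up) showing that a permissions change alone alters the state ADD hashes, even though the content checksum stays the same:

```shell
tmp=$(mktemp -d)
printf 'aws_access_key_id=placeholder\n' > "$tmp/credentials"

# Content checksum: identical before and after chmod
sha256sum "$tmp/credentials"
chmod 600 "$tmp/credentials"
sha256sum "$tmp/credentials"

# ...but the file mode changed, and ADD/COPY include mode in their
# cache key, so the step's cache key would differ between the two states.
stat -c '%a' "$tmp/credentials"
rm -rf "$tmp"
```

So a CI checkout that restores files with different permissions could, as far as I can tell, invalidate this step even with identical file contents.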

I’m building with the command:

docker buildx build --progress=plain ${BUILD_ARGS} \
            --force-rm=true --build-arg http_proxy= \
            --build-arg https_proxy= \
            --build-arg REPO_HASH=${REPO_HASH} \
            --build-arg GIT_TREE=${GIT_TREE} \
            --build-arg GIT_BLOB=${GIT_BLOB} \
            --cache-from $tag \
            --cache-to=type=inline \
            -t $tag \
            --file=$dockerfile .

now, but we previously didn't use inline caching and observed similar behaviour.
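One thing I suspect (an assumption on my part): the inline cache exporter only supports mode=min, i.e. it embeds cache metadata for the final stage's layers only, and if that metadata is missing from or not pulled with the registry image, the build silently falls back to a full rebuild. A registry cache with mode=max also records intermediate steps; a sketch of what that could look like with our variables (the "-buildcache" ref name is made up):

```shell
# Export the build cache to a separate registry ref with mode=max so
# intermediate steps are cached too; $tag, $dockerfile, ${BUILD_ARGS}
# are the same variables as in the build command above.
docker buildx build --progress=plain ${BUILD_ARGS} \
    --cache-from type=registry,ref=${tag}-buildcache \
    --cache-to type=registry,ref=${tag}-buildcache,mode=max \
    -t $tag \
    --file=$dockerfile .
```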
We regularly prune the build cache (though no prune ran between the two builds above) using

docker buildx prune -f --filter until=48h
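To see what that prune actually removes, I've been inspecting the local cache store before and after it runs. Note that, if I read the filter right, until=48h deletes cache records not used for 48 hours, so any base image that is rebuilt less often than that will always miss the local cache and depend entirely on --cache-from:

```shell
# List local BuildKit cache records with sizes and last-used times,
# to check whether the base image's steps are still present before
# the scheduled prune deletes records unused for more than 48h.
docker buildx du --verbose | head -50
```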

Any ideas on how to work out why the image occasionally gets rebuilt from scratch?