Docker buildx hangs

I have one Dockerfile for amd64 and another for arm64. Let’s call the resulting images my-priv-registry/amd and my-priv-registry/arm. A CI pipeline builds the images whenever something is modified. amd.dockerfile and arm.dockerfile are identical:

FROM ubuntu:20.04

WORKDIR /install

ARG DEBIAN_FRONTEND=noninteractive

RUN apt-get update && apt-get install -y --no-install-recommends \
    python3 python3-pip python3-dev \
    git wget build-essential \
    curl zip unzip tar \
    vim automake libssl-dev \
    && rm -rf /var/lib/apt/lists/*

RUN python3 -m pip install -U pip

RUN apt-get update && apt-get install -y --no-install-recommends cmake \
    && rm -rf /var/lib/apt/lists/*

COPY --chown=1000:1000 boost.sh .
RUN ./boost.sh

... I stop here as the CI stops here too

The images are already in my Docker registry, and when I connect to the gitlab-runner host I can actually pull them. I can also list the tags I have locally on the host.

The configuration of my runner is:

concurrent = 8
check_interval = 0

[session_server]
  session_timeout = 1800
[[runners]]
  name = "my_executor"
  url = "https://my.server.io"
  token = "WontShowIt"
  executor = "shell"
  environment = [
    "GIT_STRATEGY=clone",
  ]
  shell = "bash"
  cache_dir = "/cache"
  [runners.custom_build_dir]
  [runners.cache]
    [runners.cache.s3]
    [runners.cache.gcs]
    [runners.cache.azure]

In my CI I build the image like this:

DOCKER_BUILDKIT=1 docker buildx build \
  --platform linux/arm64 \
  -t my-priv-registry/arm:test \
  -f arm.dockerfile \
  --cache-to=type=inline \
  --cache-from=type=registry,ref=my-priv-registry/arm:latest \
  install_scripts

However, I am getting the following error:

#12 [internal] load build context
#12 DONE 0.0s
#4 importing cache manifest from my-priv-registry/arm:latest
#4 ERROR: pull access denied, repository does not exist or may require authorization: server message: insufficient_scope: authorization failed
#12 [internal] load build context
#12 transferring context: 7.58kB done
#12 DONE 0.0s

and then it just gets stuck.
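
For reference, a minimal sketch of how the cache reference from the error can be checked on its own, outside of a build. `docker buildx imagetools inspect` asks the registry for the manifest of a ref without pulling any layers. The function name `check_cache_ref` and the `DOCKER` override variable are my own additions for illustration, not part of Docker.

```shell
#!/usr/bin/env bash
# DOCKER can be overridden (e.g. with "echo" for a dry run); it defaults to the real CLI.
DOCKER="${DOCKER:-docker}"

# Hypothetical helper: query the registry for the manifest of the cache ref.
check_cache_ref() {
  local ref="$1"
  # imagetools inspect reads the manifest from the registry without pulling layers
  "$DOCKER" buildx imagetools inspect "$ref"
}

# Usage:
# check_cache_ref my-priv-registry/arm:latest
```

If this fails with the same insufficient_scope message, that would suggest the problem is authorization for that ref rather than the build itself.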

Any idea why it is failing?
Some more info:

  • The amd image build actually works.
  • I suppose that means it is not a login problem.
  • I can connect to the gitlab-runner host and pull the image, and it works.
  • If I connect to the host and run the build myself it also hangs, so it doesn’t seem to be a gitlab-runner problem.
  • I’m running all the build jobs in parallel (three build jobs using buildx). Could that be the problem?
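
To rule out the parallel jobs as the cause, the builds could be serialized on the host. A rough sketch, assuming `flock` (from util-linux) is available on the runner. The lock path and the function name `run_serialized` are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical lock file shared by all build jobs on the same host.
LOCK_FILE="${LOCK_FILE:-/tmp/buildx.lock}"

# Run a command while holding an exclusive lock, so only one build runs at a time.
run_serialized() {
  # flock waits for the lock on LOCK_FILE, runs the command, and releases the lock when it exits
  flock "$LOCK_FILE" "$@"
}

# Usage:
# run_serialized docker buildx build --platform linux/arm64 -f arm.dockerfile install_scripts
```

If the hang disappears when the jobs run one at a time, the concurrency is at least part of the problem. This only works here because the shell executor runs all jobs on the same machine.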

I also tried importing the cache as:

--cache-from=type=local,src=/var/lib/docker/buildkit

but it still gets stuck.

I also tried a different builder

docker buildx create --name DockerImageBuilder --use

and a local cache

    --cache-from type=local,src=/tmp/docker-cache \
    --cache-to   type=local,dest=/tmp/docker-cache,mode=max \

Still hanging…
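
If the runner host is not arm64 itself, the arm64 build relies on emulation, so it is worth checking that the builder lists linux/arm64 among its platforms. A small sketch; `builder_supports` and the `DOCKER` override are hypothetical names of mine.

```shell
#!/usr/bin/env bash
# DOCKER can be overridden for a dry run; it defaults to the real CLI.
DOCKER="${DOCKER:-docker}"

# Hypothetical helper: succeeds if the current builder lists the given platform.
builder_supports() {
  local platform="$1"
  # --bootstrap starts the builder if needed; the output contains a "Platforms:" line
  "$DOCKER" buildx inspect --bootstrap | grep "Platforms:" | grep -q "$platform"
}

# Usage:
# builder_supports linux/arm64 && echo "arm64 available" || echo "arm64 missing"
```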

I restarted the host PC and… it worked for a few pipelines but now it is hanging again.
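
Until the cause is found, the build can at least be made to fail instead of blocking the pipeline. A sketch using `timeout` from coreutils together with `--progress=plain` so the full log is kept. The function name `run_with_deadline` and the 1800 second limit are arbitrary choices of mine.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: give a command a deadline instead of letting it hang forever.
run_with_deadline() {
  local seconds="$1"
  shift
  # timeout sends SIGTERM to the command once the deadline passes and then exits with 124
  timeout "$seconds" "$@"
}

# Usage:
# run_with_deadline 1800 docker buildx build --progress=plain \
#   --platform linux/arm64 -f arm.dockerfile install_scripts
```

An exit code of 124 then marks a hung build, which makes it easier to tell hangs apart from ordinary build failures in the CI logs.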