We have a local GitLab instance with 3 runners, which works fine when only a single build job is running.
Sadly, when we launch 3 build jobs in parallel using dind (Docker-in-Docker), they fail with a variety of errors:
- sometimes the job is unable to log in to Docker to pull the image used for the cache
- sometimes the login succeeds and the job then fails during the build
In both cases, however, it complains about the certificate:
failed to dial gRPC: cannot connect to the Docker daemon. Is 'docker daemon' running on this host?: x509: certificate signed by unknown authority (possibly because of "crypto/rsa: verification error" while trying to verify candidate authority certificate "docker:dind CA")
Suspecting that the certificates were being overwritten by the other build jobs, we separated the folder used for certificates so that it is unique to each runner. Sadly, the issue remains.
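One way to check whether the per-runner certificate folders really hold different CAs could be to compare fingerprints of the CA file from each job. The following is only a minimal sketch: the helper names are hypothetical, and the path in the comment assumes the certificates live under /certs/client (as they would with DOCKER_TLS_CERTDIR set to /certs), so it would need adjusting to the folders actually in use.

```python
import hashlib
from pathlib import Path


def cert_fingerprint(path):
    """Return the SHA-256 hex digest of the raw bytes of a certificate file."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def fingerprints_match(paths):
    """Return True if every file in paths has the same fingerprint."""
    return len({cert_fingerprint(p) for p in paths}) == 1


# Example call inside a job (assumed path, adjust to the per-runner folder):
#   print(cert_fingerprint("/certs/client/ca.pem"))
```

If a job prints a fingerprint that does not belong to the dind service it ends up talking to, that would be consistent with the "verification error" on the "docker:dind CA" certificate.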
We have also noticed that, with DOCKER_HOST="tcp://docker:2376", the address that docker resolves to appears to be random and often comes back with the same value across jobs, which again suggests they are using the same resources.
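To make that observation easier to reproduce, each job could print the address behind its DOCKER_HOST so the values can be compared across parallel jobs. This is a rough sketch only, assuming Python is available in the job image; the function names are hypothetical and it uses nothing beyond the standard library.

```python
import socket
from urllib.parse import urlparse


def parse_docker_host(docker_host):
    """Split a DOCKER_HOST value such as tcp://docker:2376 into (host, port)."""
    parsed = urlparse(docker_host)
    return parsed.hostname, parsed.port


def resolve_addresses(host, port):
    """Return the sorted, de-duplicated IP addresses that host resolves to."""
    infos = socket.getaddrinfo(host, port, proto=socket.IPPROTO_TCP)
    return sorted({info[4][0] for info in infos})


# Example call inside a job:
#   host, port = parse_docker_host(os.environ["DOCKER_HOST"])  # needs "import os"
#   print(host, port, resolve_addresses(host, port))
```

If two jobs running at the same time report the same address, they are most likely reaching the same dind service.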
I have found a guide on manually using a script to ensure each job connects to its own dind service (HERE), but since the article is over 5 years old, I wonder whether it is still applicable or whether I am doing something wrong.
Please share any advice or guidance on where to look.