Cannot push image to registry in docker swarm

I have this working in development (because everything is on one machine), but in production I have a docker swarm with manager and worker nodes. When I run docker commit it creates a snapshot, and running docker image ls confirms this, but when I try to push to a private registry (ECR), docker cannot find the image. I guess this is because the image is on the worker and the manager cannot find it, or the image is on the manager and the worker cannot find it. How can I fix this?

{ 
  Error: '(HTTP code 404) no such image - No such image: {account-id}.dkr.ecr.us-east-1.amazonaws.com/{repo}:{tag}'
    at /root/labs/node_modules/docker-modem/lib/modem.js:257:17
    at getCause (/root/labs/node_modules/docker-modem/lib/modem.js:287:7)
    at Modem.buildPayload (/root/labs/node_modules/docker-modem/lib/modem.js:256:5)
    at IncomingMessage.<anonymous> (/root/labs/node_modules/docker-modem/lib/modem.js:232:14)
    at Object.apply (/root/labs/node_modules/harmony-reflect/reflect.js:2064:37)
    at IncomingMessage.emit (events.js:187:15)
    at Object.apply (/root/labs/node_modules/harmony-reflect/reflect.js:2064:37)
    at IncomingMessage.EventEmitter.emit (domain.js:441:20)
    at endReadableNT (_stream_readable.js:1094:12)
    at Object.apply (/root/labs/node_modules/harmony-reflect/reflect.js:2064:37)
    at process._tickCallback (internal/process/next_tick.js:63:19)
  reason: 'no such image',
  statusCode: 404,
  json: { 
    message: 'No such image: {account-id}.dkr.ecr.us-east-1.amazonaws.com/{repo}:{tag}' 
  }
} 

Some people have said the way to solve this is to add a private registry. I see how a registry could make it easier for managers and workers (basically all nodes) to find an image. However, we are using docker commit, and this saves the image locally, unless docker commit updates the registry as well.

Hi.

You are not showing the commands you use to commit the docker image, tag the docker image, push the docker image or pull the docker image, so it is hard to assist you.

A docker commit just builds an image on the local machine. It does not push it to a registry and it does not make it available to other docker nodes.

Here’s an example from my Docker Swarm environment with 1 manager and 2 workers. I will push an image up to my private registry in AWS ECR on the manager node and then run a container from that image on a worker node. My private registry in AWS ECR is 641851146588.dkr.ecr.us-east-1.amazonaws.com.

🐳  root@172.16.129.75:[~] $ hostname
manager.example.com
🐳  root@172.16.129.75:[~] $ docker node ls
ID                            HOSTNAME              STATUS              AVAILABILITY        MANAGER STATUS      ENGINE VERSION
a49ha9p64gait59go1sc1iexl *   manager.example.com   Ready               Active              Leader              18.09.1
foxhv12a6u9z1e0y0gxvrnw6g     worker1.example.com   Ready               Active                                  18.09.1
brn0ata5azwe3y09xvobj1qho     worker2.example.com   Ready               Active                                  18.09.1
🐳  root@172.16.129.75:[~] $

On my manager node I log in to my AWS ECR private registry with a docker login command.

🐳  root@172.16.129.75:[~] $ docker login -u AWS -p **my-aws-ecr-password** https://641851146588.dkr.ecr.us-east-1.amazonaws.com
WARNING! Using --password via the CLI is insecure. Use --password-stdin.
WARNING! Your password will be stored unencrypted in /root/.docker/config.json.
Configure a credential helper to remove this warning. See
https://docs.docker.com/engine/reference/commandline/login/#credentials-store

Login Succeeded
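
The first warning above can be avoided by piping the password in with --password-stdin. Here is a minimal sketch, assuming a recent AWS CLI that provides the aws ecr get-login-password command; the function name ecr_login is only illustrative and not part of Docker or the AWS CLI.

```shell
# Log the local Docker client in to an ECR registry without putting the
# password on the command line.
# Assumption: the AWS CLI is installed and already has credentials configured.
# Usage: ecr_login <aws-region> <registry-host>
ecr_login() {
  aws ecr get-login-password --region "$1" |
    docker login --username AWS --password-stdin "$2"
}

# Example call (not executed here):
#   ecr_login us-east-1 641851146588.dkr.ecr.us-east-1.amazonaws.com
```

ECR authorization tokens expire after 12 hours, so a login like this has to be repeated periodically on every node that pushes or pulls.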

On my manager node I tag an image and then push it up to my private registry in AWS ECR, which is 641851146588.dkr.ecr.us-east-1.amazonaws.com. I will push it to a repository named gforghetti/apache, which I created beforehand from the AWS ECR console, and I will give the image a tag of latest. So the full image name will be 641851146588.dkr.ecr.us-east-1.amazonaws.com/gforghetti/apache:latest.

🐳  root@172.16.129.75:[~] $ docker image ls
REPOSITORY          TAG                 IMAGE ID            CREATED             SIZE
gforghetti/apache   latest              504948365413        12 hours ago        133MB
🐳  root@172.16.129.75:[~] $ docker image tag 504948365413 641851146588.dkr.ecr.us-east-1.amazonaws.com/gforghetti/apache:latest
🐳  root@172.16.129.75:[~] $ docker image push 641851146588.dkr.ecr.us-east-1.amazonaws.com/gforghetti/apache:latest
The push refers to repository [641851146588.dkr.ecr.us-east-1.amazonaws.com/gforghetti/apache]
b29cbb1c94ef: Pushed
34c671894ba8: Pushed
6b718024b06b: Pushed
eae00250502c: Pushed
9464fb202df9: Pushed
e1049375b47d: Pushed
f8a0a368bee8: Pushed
3c816b4ead84: Pushed
latest: digest: sha256:7237a2b69ecfa8b458b43b34c0f56d7940241e4e38ab604cc8647cd16f4437c2 size: 1993
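
If you need to do the tag and push steps from a script, they could be wrapped roughly like this. It is only a sketch: the helper names ecr_image_ref and tag_and_push are made up for illustration, and the repository is assumed to already exist in ECR.

```shell
# Build a fully qualified ECR image name from its parts.
# Usage: ecr_image_ref <account-id> <region> <repository> <tag>
ecr_image_ref() {
  printf '%s.dkr.ecr.%s.amazonaws.com/%s:%s\n' "$1" "$2" "$3" "$4"
}

# Tag a local image (ID or name) with a registry name, then push it.
# The push is skipped if the tag step fails.
# Usage: tag_and_push <local-image> <full-image-name>
tag_and_push() {
  docker image tag "$1" "$2" && docker image push "$2"
}

# Example call (not executed here):
#   tag_and_push 504948365413 \
#     "$(ecr_image_ref 641851146588 us-east-1 gforghetti/apache latest)"
```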

Now on the worker1 node I can run a container from that image stored in my private registry in AWS ECR.

🐳  root@172.16.129.76:[~] $ hostname
worker1.example.com

On the worker1 node I need to login to my private registry in AWS ECR in order to pull images from it and run containers from them.

🐳  root@172.16.129.76:[~] $ docker login -u AWS -p **my-aws-ecr-password** https://641851146588.dkr.ecr.us-east-1.amazonaws.com
WARNING! Using --password via the CLI is insecure. Use --password-stdin.
WARNING! Your password will be stored unencrypted in /root/.docker/config.json.
Configure a credential helper to remove this warning. See
https://docs.docker.com/engine/reference/commandline/login/#credentials-store

Login Succeeded

I can then run a container from the image stored in my private registry in AWS ECR.

🐳  root@172.16.129.76:[~] $ docker container run -it --name apache -p 80:80 641851146588.dkr.ecr.us-east-1.amazonaws.com/gforghetti/apache:latest
Unable to find image '641851146588.dkr.ecr.us-east-1.amazonaws.com/gforghetti/apache:latest' locally
latest: Pulling from gforghetti/apache
5e6ec7f28fb7: Pull complete
566e675a8212: Pull complete
ef5a8026039b: Pull complete
22ecb0106557: Pull complete
91cc511c603e: Pull complete
702a920b29ec: Pull complete
4479f1020334: Pull complete
0c74fd3c1e98: Pull complete
Digest: sha256:7237a2b69ecfa8b458b43b34c0f56d7940241e4e38ab604cc8647cd16f4437c2
Status: Downloaded newer image for 641851146588.dkr.ecr.us-east-1.amazonaws.com/gforghetti/apache:latest
AH00558: httpd: Could not reliably determine the server's fully qualified domain name, using 172.17.0.2. Set the 'ServerName' directive globally to suppress this message
AH00558: httpd: Could not reliably determine the server's fully qualified domain name, using 172.17.0.2. Set the 'ServerName' directive globally to suppress this message
[Sat Feb 02 15:08:27.442598 2019] [mpm_event:notice] [pid 1:tid 140121858180288] AH00489: Apache/2.4.38 (Unix) configured -- resuming normal operations
[Sat Feb 02 15:08:27.442728 2019] [core:notice] [pid 1:tid 140121858180288] AH00094: Command line: 'httpd -D FOREGROUND'
🐳  root@172.16.129.76:[~] $

Wow! Thank you so much for your response and information.

Here are the commands we are running:

docker commit {containerId} {registry/repo/tag}
docker push {registry/repo/tag}

This fails in docker swarm, but not in standalone mode.

I see that you are manually logging into hosts, and I can see how this would solve the missing image issue, but this is a little impractical for us. We actually use a node module to manage working with the api dockerode, and that connects to the manager, and the image is on the worker after the commit, so the manager doesn’t know what to do.

It’s possible for us to figure out which host the node is on and log into it, but seems like there might be a way to use the manager to do this for us.

What about filters and constraints for the push command, or anything else. Since everything is ok up to docker push, if we could tell it which node to use that would probably solve the issue.

What do you think?

I’m starting to understand more about what you are trying to do.
However, I do not fully understand your “workflow”.

It sounds to me (correct me if I am wrong) like your flow looks like this:

  • You are building your docker image on a Docker Swarm worker node by running a docker commit command on a container.
  • You are then trying to push that docker image to the registry from the manager node?
  • You then will run your container in swarm mode as a service?

My question then is this: Why don’t you push the image from the worker node?
Is it because the worker node is also not logged into the registry?

I’m also confused by this statement, please clarify:

  • We actually use a node module to manage working with the api dockerode.

Are you calling the Docker API from Node.js to build your image in a container, commit the container to an image, and then push it to the registry?

I need more details on your “workflow” - step by step, what is being done on each node, etc.

Also what version of Docker are you running?
Docker CE?
Docker EE?
Docker EE with Universal Control Plane (UCP)?

All we really want is a way to programmatically save the filesystem changes that have occurred in a container on a worker when running in swarm mode.

$ docker --version
Docker version 18.06.1-ce, build e68fc7a

Our workflow is mostly programmatic:

-1) we use docker swarm to create a service (via the manager node) using the API, which creates a container on a worker node,
-2) changes happen to files on that container,
-3) we commit to save the state of those changes, which happens locally on the worker node,
-4) we need to push the newly committed snapshot to the registry via API, but the manager doesn’t know where to find the image/what worker saved it locally.

On the consumer product side of things, users type into a terminal on a website which connects to a container on the backend. We use docker commit to save the state of the container, so they can come back and have the same filesystem, database, etc…

Don’t worry too much about dockerode, it’s just a library we use to programmatically use docker. It’s effectively a wrapper around the api, and the commands I’ve given you are the translations of the api requests.

Again, thanks for your feedback, it’s been super helpful. And please let me know if I can provide more information.

I think I found a way to do what you want from the manager node.
It requires you to configure the dockerd daemons on the worker nodes to listen on another socket (in addition to unix:///var/run/docker.sock). You need to edit the /lib/systemd/system/docker.service file on each worker node and add a socket connection for tcp://the-worker-node-ip-address:2376. That will make the worker node listen for Docker API requests on both the local socket and its IP address via TCP. Port 2376 is the encrypted port. You should use TLS.

Then from the manager node you can issue docker commands to the workers using the -H parameter. Refer to this URL for more details -> https://docs.docker.com/engine/reference/commandline/dockerd/
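
For the TLS case, a client-side wrapper could look something like the sketch below. It assumes the worker's daemon is listening on port 2376 with TLS verification enabled and that the client certificates sit in one directory; the DOCKER_TLS_DIR variable and the docker_on_worker name are hypothetical, chosen only for this example.

```shell
# Run a docker CLI command against a worker's TLS-protected API endpoint.
# Assumption: ca.pem, cert.pem and key.pem are in $DOCKER_TLS_DIR
# (hypothetical variable; falls back to ~/.docker).
# Usage: docker_on_worker <worker-host> <docker arguments...>
docker_on_worker() {
  host=$1
  shift
  certs=${DOCKER_TLS_DIR:-$HOME/.docker}
  docker --tlsverify \
    --tlscacert "$certs/ca.pem" \
    --tlscert "$certs/cert.pem" \
    --tlskey "$certs/key.pem" \
    -H "tcp://$host:2376" "$@"
}

# Example call (not executed here):
#   docker_on_worker worker1.example.com container ls
```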

I ran a quick test on my swarm using the unencrypted port.

I edited the /lib/systemd/system/docker.service file on the worker1 node and appended -H tcp://172.28.128.4 to the ExecStart statement. After doing that I had to run a systemctl daemon-reload command followed by a systemctl restart docker command.

ExecStart=/usr/bin/dockerd -H fd:// -H tcp://172.28.128.4

On the Manager node I started a Docker service.

root@manager:~# docker service create --with-registry-auth --name my-customer-3 --label my-customer-3 alpine:latest sleep 1000000
acav6xgeh46qiiy1dwk110ecj
overall progress: 1 out of 1 tasks
1/1: running   [==================================================>]
verify: Service converged

I display the Docker service and it is running on worker1.

root@manager:~# docker service ps my-customer-3
ID                  NAME                IMAGE               NODE                DESIRED STATE       CURRENT STATE            ERROR               PORTS
uhymzvd7uqw4        my-customer-3.1     alpine:latest       worker1             Running             Running 14 seconds ago
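
Since your workflow is programmatic, you probably do not want to read the NODE column by eye. A small sketch of how the node name could be extracted on the manager using the --filter and --format options of docker service ps; the function name service_nodes is just for illustration.

```shell
# Print the node name(s) where tasks of a service are currently meant to run.
# Must be run against a swarm manager.
# Usage: service_nodes <service-name>
service_nodes() {
  docker service ps \
    --filter desired-state=running \
    --format '{{.Node}}' "$1"
}

# Example call (not executed here):
#   service_nodes my-customer-3
```

For a service with several replicas this prints one line per task, so you would still need to pick the task you care about.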

From the Manager node I issue a docker container ls command to the worker1 node (via the -H worker1.example.com argument) to display the container created by the service.

root@manager:~# docker -H worker1.example.com container ls -f name=my-customer-3
CONTAINER ID        IMAGE               COMMAND             CREATED             STATUS              PORTS               NAMES
68878e77ce21        alpine:latest       "sleep 1000000"     48 seconds ago      Up 48 seconds                           my-customer-3.1.uhymzvd7uqw4btajio96ocx4d

Then from the Manager node I issue a docker container commit command to the worker1 node (via the -H worker1.example.com argument) to create a docker image from that container.

root@manager:~# docker -H worker1.example.com container commit 68878e77ce21 gforghetti/my-customer-3:1.0.0
sha256:1fdcb5fa4e6dc7d50d2f23a410017eb217b00523fe02da64bb22e10fa021301b

Then from the Manager node I issue a docker image push command to the worker1 node (via the -H worker1.example.com argument) to push the image up to the registry. The manager node must be logged into the registry.

root@manager:~# docker -H worker1.example.com image push gforghetti/my-customer-3:1.0.0
The push refers to repository [docker.io/gforghetti/my-customer-3]
503e53e365f3: Mounted from library/alpine
1.0.0: digest: sha256:b867752c5d592c7f3e4a641284dd2a87a01823f5fff83ba96fd6bae664956291 size: 528
root@manager:~#
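
The commit and push steps above could be combined into one helper that you call from the manager. This is a rough sketch, not a hardened implementation; the name commit_and_push is made up for this example, and the worker is assumed to expose its API over TCP as described earlier.

```shell
# Commit a container on a remote worker and push the resulting image,
# both via the worker's Docker API. The push is skipped if the commit fails.
# Usage: commit_and_push <worker-host> <container-id> <image-name>
commit_and_push() {
  docker -H "$1" container commit "$2" "$3" &&
    docker -H "$1" image push "$3"
}

# Example call (not executed here):
#   commit_and_push worker1.example.com 68878e77ce21 gforghetti/my-customer-3:1.0.0
```

Like my quick test, this talks to the worker over the unencrypted connection. For production you would add the TLS options and port 2376.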

Now this all being said, you really should consider running Docker Enterprise Edition UCP with DTR.

We are considering Docker Enterprise Edition UCP with DTR. I’m in the process of applying your solution. Will post an update within a day or so.