Docker swarm deploy hanging, can't deploy more to my Swarm

I've hit a weird error in my Docker Swarm setup. I have around 62 stacks with 110+ services, split across 3 Docker nodes with 8 CPUs, 16 GB of memory and 320 GB of disk space.

The main problem is that I have tried everything to get it working, but when I deploy stacks, run an update command or change replica counts, the process just hangs like this:

swarmpit_app
overall progress: 0 out of 1 tasks
1/1: new       [=====>                                             ]

What I have tried so far: running a manager node in drain mode did not work, so for now I run it like I always do, with managers that also run containers. I have also added more servers to my swarm, so it now has 4 workers + 3 manager nodes, and still nothing works.

My Docker instances run Docker 20.10.8 in staging and 20.10.5 in production, and now I'm very scared of hitting this issue in production. The only differences between my production and staging setups are the Docker version number and that production had 2 more servers from the beginning.

I hope someone can help me debug this issue and maybe explain why it happens.

I have tried removing some of my stacks, and it now returns this error:

swarmpit_app
overall progress: 0 out of 1 tasks
1/1: no suitable node (scheduling constraints not satisfied on 4 nodes; insuffi…

If only the error message weren't cut off at “insuffi…”, we could read what bothers the scheduler ^^.

Generally, though, the error message indicates that the scheduler is not able to find a node that matches the placement constraints (and/or resource constraints) for the requested resources.

You might want to check your placement constraints AND whether any node has more CPU/memory left than requested (for the sum of all services and their replicas in the stack).
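
To make those checks concrete, here is a rough sketch of how they could be done from a manager node. The function names are made up for this post; the docker subcommands and format fields are standard ones, but note that `docker node inspect` reports a node's total capacity, not what is left after existing reservations.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical; run on a swarm manager.

# Print task state and error for one service without truncating the message.
full_task_errors() {
  docker service ps --no-trunc \
    --format '{{.Name}} | {{.CurrentState}} | {{.Error}}' "$1"
}

# Print the placement constraints and resource reservations/limits of a service.
service_requirements() {
  docker service inspect \
    --format '{{json .Spec.TaskTemplate.Placement}} {{json .Spec.TaskTemplate.Resources}}' "$1"
}

# Print total CPU (nano-CPUs), memory (bytes), availability and labels per node.
node_capacity() {
  for node in $(docker node ls --quiet); do
    docker node inspect \
      --format '{{.Description.Hostname}} cpus={{.Description.Resources.NanoCPUs}} mem={{.Description.Resources.MemoryBytes}} avail={{.Spec.Availability}} labels={{json .Spec.Labels}}' \
      "$node"
  done
}
```

For example, `full_task_errors swarmpit_app` should print the whole "no suitable node" message instead of the version cut off at “insuffi…”.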

A small update on this issue: after removing 4 stacks, my numbers now look like this.

58 Stacks
102 Services / Running 178 containers
200 Networks ( 4 nodes total )

Now the funny part is that when I remove Swarmpit and replace it with Portainer, I can start Portainer, but Swarmpit is still hanging.

So could it be some limit on my networks, the number of services or something like that?

Because when I increased the number of stacks, I hit the wall again.

I found out what is happening here: I ran out of IPs in the ingress network. The default ingress network for Docker Swarm is a /24 = 254 IPs.

So when I change it to a /16 I have 65k IPs, and the problem is now gone. It just requires a full reconfiguration of your Docker Swarm cluster, so I created a new setup and copied the settings from the old setup to the new one.
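
For anyone wondering where those numbers come from, here is a small sketch of the arithmetic. The `usable_ips` helper is only an illustration, not a Docker command: it takes a prefix length and subtracts the network and broadcast addresses from the size of the block. As far as I understand, on the ingress network each node, each service VIP and each task of a service that publishes ports takes one address, so a few hundred containers can exhaust a /24.

```shell
#!/bin/sh
# Usable host addresses in an IPv4 subnet:
# 2^(32 - prefix) minus the network and broadcast addresses.
usable_ips() {
  prefix=$1
  echo $(( (1 << (32 - prefix)) - 2 ))
}

usable_ips 24   # prints 254
usable_ips 16   # prints 65534
```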

docker network rm ingress
docker network create \
  --driver overlay \
  --ingress \
  --subnet=10.11.0.0/16 \
  --gateway=10.11.0.2 \
  --opt com.docker.network.driver.mtu=1200 \
  ingress
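
To confirm which subnet the ingress network actually uses, before or after recreating it, something like the following should work. The `ingress_subnet` name is made up for this post; it only wraps `docker network inspect` with a format template.

```shell
#!/bin/sh
# Sketch only: print the subnet(s) configured on an overlay network.
# Defaults to the network named "ingress" when no argument is given.
ingress_subnet() {
  docker network inspect \
    --format '{{range .IPAM.Config}}{{.Subnet}} {{end}}' "${1:-ingress}"
}
```

Keep in mind that Docker will not let you remove the ingress network while services that publish ports are still using it, so those services have to be removed first and redeployed afterwards.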