Docker Community Forums


Docker service scale is buggy


(Doron) #1

I have a Docker Swarm cluster of 3 nodes, all of them managers.
I deployed a single service called ‘web’, then scaled it to 100 replicas using:
docker service scale web=100
The result was as expected: Swarm spread the replicas across the nodes.
Then I scaled the service to 1000. This time, however, the scale command seemed to get stuck.
When I run docker service ls, it shows only 249 of the 1000 replicas running:
docker service ls
ID             NAME   MODE         REPLICAS   IMAGE       PORTS
d6qf3rq32ni0   web    replicated   249/1000   web-image

When I run docker service ps web, it lists all 1000 tasks, but some of them are in the ‘New’ state.
How can I debug why the scale command failed and why tasks are stuck in the ‘New’ state?

docker service ps web | head
ID             NAME       IMAGE               NODE    DESIRED STATE   CURRENT STATE          ERROR   PORTS
m27xstnqh0rm   web.1      doronbl/swarm:web   node1   Running         Running 24 hours ago
ny5887bf5c9v   web.2      doronbl/swarm:web   node2   Running         Running 24 hours ago
j5lyujpqva73   web.3      doronbl/swarm:web   node3   Running         Running 24 hours ago
j90vbfotvyfq   web.4      doronbl/swarm:web   node1   Running         Running 24 hours ago
inqcjnh2dvqn   web.5      doronbl/swarm:web   node2   Running         Running 24 hours ago


k10qb5ao36qk   web.996    doronbl/swarm:web           Running         New 24 hours ago
zqblpz403nzl   web.997    doronbl/swarm:web           Running         New 24 hours ago
rdsifppy3ehw   web.998    doronbl/swarm:web           Running         New 24 hours ago
brzkv911dpnm   web.999    doronbl/swarm:web           Running         New 24 hours ago
n1vl8xmq01h1   web.1000   doronbl/swarm:web           Running         New 24 hours ago
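
One way to see how the tasks are spread across states is to tally the CURRENT STATE column instead of scrolling through 1000 rows. The sketch below is only an illustration: count_states is a hypothetical helper name, and it assumes each input line starts with the state word, as produced by the --format option of docker service ps.

```shell
#!/bin/sh
# count_states (hypothetical helper): read one CURRENT STATE value per line
# on stdin (e.g. "New 24 hours ago") and print how many tasks are in each
# state, keyed on the first word of the line.
count_states() {
    awk '{ n[$1]++ } END { for (s in n) print s, n[s] }' | sort
}

# Intended usage on a manager node (requires a running swarm):
#   docker service ps web --format '{{.CurrentState}}' | count_states
```

Running docker service ps --no-trunc web also shows the full ERROR column, which can help when tasks fail outright. Tasks that never leave ‘New’, as here, have not been assigned to a node at all, so the ERROR column may stay empty.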

UPDATE:
I found the issue: my overlay network was defined with a /24 subnet, so it did not have enough IP addresses to allocate one to every container.
Creating a /16 overlay network allowed me to scale the service to 1000 tasks.
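
The arithmetic behind this can be sketched in a few lines of shell. usable_hosts is a hypothetical helper name, and the network name and subnet in the trailing comment are made-up examples, not the ones used in this cluster.

```shell
#!/bin/sh
# usable_hosts (hypothetical helper): number of usable host addresses in an
# IPv4 subnet with the given prefix length, i.e. 2^(32 - prefix) minus the
# network and broadcast addresses.
usable_hosts() {
    prefix=$1
    echo $(( (1 << (32 - prefix)) - 2 ))
}

echo "/24: $(usable_hosts 24) usable addresses"   # prints "/24: 254 usable addresses"
echo "/16: $(usable_hosts 16) usable addresses"   # prints "/16: 65534 usable addresses"

# Creating a larger overlay network might look like this (name and subnet
# are assumptions for illustration):
#   docker network create --driver overlay --subnet 10.10.0.0/16 web-net
```

A /24 can never hold 1000 tasks. Swarm also takes a few addresses on the overlay network for itself (for example, the service's virtual IP), which appears consistent with the count stalling at 249, slightly below 254.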

I’ve found the hint for my problem here:


(Satwikbanerjee) #2

Are you sure you are not hitting a resource ceiling on the nodes?
Can you paste the available resource stats from the nodes?
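
For anyone wanting to gather those numbers, a rough sketch follows. format_resources is a hypothetical helper name; it assumes input lines of the form "hostname nanocpus memorybytes", which is what docker node inspect can print with the format string shown in the comment. It reports each node's advertised capacity only, so any reservations made by running services would still have to be subtracted by hand.

```shell
#!/bin/sh
# format_resources (hypothetical helper): convert "hostname nanocpus bytes"
# lines into CPU cores and MiB of memory for easier reading.
format_resources() {
    awk '{ printf "%s cpus=%.1f mem_mib=%d\n", $1, $2 / 1e9, $3 / 1048576 }'
}

# Intended usage on a manager node (requires a running swarm):
#   docker node inspect \
#     --format '{{.Description.Hostname}} {{.Description.Resources.NanoCPUs}} {{.Description.Resources.MemoryBytes}}' \
#     $(docker node ls -q) | format_resources
```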