What happens when a swarm worker node with running stacks times out of the swarm and rejoins?

egrimeshd · April 25, 2019, 3:35pm

Hello,

We have an issue where dockerd on a worker node redeployed the stacks running on that node. From the logs, it looks like the worker node was having trouble communicating with the swarm just before the stacks were redeployed. What happens when a swarm worker node with running stacks times out of the swarm and then rejoins? Would this cause the stacks to be redeployed?

Docker version 18.09.2, build 6247962
Ubuntu 18.04.1 LTS

Thanks,

Erik

bryceryan · April 25, 2019, 6:20pm

In general, if a node goes offline, the orchestration layer will redeploy any tasks that were running there to other suitable workers. Those new tasks on the new nodes will stay put until they exit or are evicted. The presence of a new node will not cause an overall redeployment of happily running tasks from other nodes onto the “new” node.

All this assumes there’s sufficient capacity in the cluster to survive the loss of a worker node, and that it’s possible to run the displaced services/tasks on some other node(s). I mention this because some folks deploy under-sized clusters which can’t really handle the loss of a worker, or they heavily constrain specific services to run on only a single node or a very small set of nodes. In cases like that, the displaced tasks might not be able to run at all when the node stops, errors out, is rebooted, or is otherwise unavailable.
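
To make the behavior described above concrete, here is a minimal toy model in Python. It is not Swarm's actual scheduler and uses no Docker API; the names `Node`, `Task`, `reconcile`, `capacity`, and `allowed_nodes` are all hypothetical and exist only for illustration. It assumes a very simplified rule set: tasks on an unavailable node lose their placement, pending tasks go to the least-loaded eligible node, and running tasks are never moved.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class Node:
    name: str
    capacity: int            # hypothetical: max number of tasks the node can hold
    available: bool = True


@dataclass
class Task:
    service: str
    allowed_nodes: Optional[Set[str]] = None  # None means no placement constraint
    node: Optional[str] = None                # None means pending (not placed)


def reconcile(nodes: List[Node], tasks: List[Task]) -> None:
    """One toy scheduling pass over the cluster state (illustrative only)."""
    by_name = {n.name: n for n in nodes}

    # Tasks whose node is unavailable lose their placement.
    for task in tasks:
        if task.node is not None and not by_name[task.node].available:
            task.node = None

    # Place pending tasks; tasks that already have a node are left alone.
    for task in tasks:
        if task.node is not None:
            continue
        load = {n.name: sum(1 for t in tasks if t.node == n.name) for n in nodes}
        candidates = [
            n for n in nodes
            if n.available
            and load[n.name] < n.capacity
            and (task.allowed_nodes is None or n.name in task.allowed_nodes)
        ]
        if candidates:
            task.node = min(candidates, key=lambda n: load[n.name]).name


nodes = [Node("worker1", capacity=2), Node("worker2", capacity=2)]
tasks = [
    Task("web", node="worker1"),                             # unconstrained
    Task("db", allowed_nodes={"worker1"}, node="worker1"),   # pinned to worker1
]

nodes[0].available = False           # worker1 drops out of the swarm
reconcile(nodes, tasks)
after_outage = [t.node for t in tasks]
print(after_outage)                  # ['worker2', None]

nodes[0].available = True            # worker1 comes back
reconcile(nodes, tasks)
after_return = [t.node for t in tasks]
print(after_return)                  # ['worker2', 'worker1']
```

In this sketch the unconstrained `web` task moves to `worker2` and stays there after `worker1` returns, while the pinned `db` task sits pending during the outage and is only placed again once its one eligible node is back.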

egrimeshd · April 25, 2019, 9:15pm

Hmm. What happens when a stack is constrained to a specific node, the node becomes unavailable, and then becomes available again? Does the swarm detect that the stack is already running on the node when it returns and leave it alone, or does the stack get removed and created fresh on the node?
