Graceful restart of swarm manager leader

jnordberg · August 30, 2021, 3:35am

I have a 3-node swarm in which every node is a manager, and I’m wondering what the best practice is for taking the leader offline for maintenance without service disruption.

For example:

$ docker node ls
ID                            HOSTNAME   STATUS    AVAILABILITY   MANAGER STATUS   ENGINE VERSION
xxxxxxxxxxxxxxxxxxxxxxxxx *   node1      Ready     Active         Leader           20.10.8
xxxxxxxxxxxxxxxxxxxxxxxxx     node2      Ready     Active         Reachable        20.10.8
xxxxxxxxxxxxxxxxxxxxxxxxx     node3      Ready     Active         Reachable        20.10.8

^ I want to restart node1 there.

meyay · August 30, 2021, 5:26am

The control plane will see no outage if one of three manager nodes is unavailable: the remaining two still hold a Raft majority and will elect a new leader.
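As a rough illustration of the quorum arithmetic behind that statement: Raft needs a majority of managers to stay reachable, so a swarm with N managers can lose (N - 1) / 2 of them, rounded down. The function name `tolerated_failures` is made up for this sketch.

```shell
#!/bin/sh
# Sketch: how many managers may be offline while a Raft majority survives.
# tolerated_failures is a hypothetical helper name, not a Docker command.
tolerated_failures() {
    # Integer division rounds down, matching floor((N - 1) / 2).
    echo $(( ($1 - 1) / 2 ))
}

# With the 3-manager swarm from the question, one manager may be down.
tolerated_failures 3
```

By the same arithmetic, a fourth manager adds no extra tolerance, which is why odd manager counts are usually recommended.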

Assuming all deployed containers are swarm services that use no placement constraint unique to node1, keep their volumes on globally available storage (CIFS, NFSv4, Portworx), and have an additional replica running on one of the other nodes, it should be enough to drain the node:

docker node update --availability drain node1

Make sure to wait until the last container is drained before you begin your maintenance task.
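One possible way to script that wait is to poll `docker node ps` until no task on the node reports a current state of Running. This is only a sketch: the helper names `count_running_tasks` and `wait_for_drain` and the 5-second default interval are assumptions, and containers started outside swarm (plain `docker run`) are not covered by it.

```shell
#!/bin/sh
# Sketch: block until swarm reports no running tasks on a drained node.
# count_running_tasks and wait_for_drain are hypothetical helper names.

count_running_tasks() {
    # CurrentState looks like "Running 5 minutes ago" or "Shutdown 2 seconds ago".
    # grep -c prints 0 and exits non-zero when nothing matches, hence "|| true".
    docker node ps "$1" --format '{{.CurrentState}}' | grep -c '^Running' || true
}

wait_for_drain() {
    node=$1
    interval=${2:-5}   # seconds between polls (assumed default)
    while [ "$(count_running_tasks "$node")" -gt 0 ]; do
        sleep "$interval"
    done
}
```

Run from a manager after the drain command, for example `wait_for_drain node1`, it returns once the node has no running tasks left.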

Once maintenance is done, set the node active again:

docker node update --availability active node1

Global services will immediately be scheduled on node1 again. Replicated services won’t be rebalanced to node1 until they are redeployed.
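If you don't want to wait for the next regular deployment, one way to trigger that redeployment is `docker service update --force`, which restarts a service's tasks even when nothing changed and lets the scheduler spread them across nodes again. The sketch below applies it to every replicated service; the helper name `rebalance_services` is hypothetical, and note that a forced update restarts tasks according to each service's update settings, so single-replica services will see a short interruption.

```shell
#!/bin/sh
# Sketch: force a rolling redeploy of all replicated services so the
# scheduler can place tasks on the reactivated node again.
# rebalance_services is a hypothetical helper name.

rebalance_services() {
    docker service ls --filter mode=replicated --format '{{.Name}}' |
    while IFS= read -r svc; do
        [ -n "$svc" ] || continue
        # </dev/null keeps the command from consuming the service list on stdin.
        docker service update --force "$svc" </dev/null
    done
}
```

Run it from a manager as `rebalance_services` once node1 is active again, ideally during a quiet period.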

Finally, if you have services with a desired replica count of 1 running on node1, there is no way around service disruption while the service is redeployed to a different node. The downtime can range from a few seconds up to 2 minutes.
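To see in advance which services fall into that category, something along these lines could list replicated services whose desired replica count is 1. It is a sketch only: `single_replica_services` is a made-up helper name, and it lists such services swarm-wide, not just those currently placed on node1.

```shell
#!/bin/sh
# Sketch: list replicated services with a desired replica count of 1.
# single_replica_services is a hypothetical helper name.

single_replica_services() {
    # .Replicas is printed as "running/desired", e.g. "1/1" or "0/1".
    docker service ls --filter mode=replicated --format '{{.Name}} {{.Replicas}}' |
    awk '{ split($2, r, "/"); if (r[2] == 1) print $1 }'
}
```

Scaling those services to 2 before the drain (`docker service scale <service>=2`) is one option, provided the application can run more than one instance.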
