Can a cluster with 2 managers handle failure?

danehammer · June 29, 2016, 6:23pm

From the docs here, it would appear the expectation is that a 3-manager setup can handle 2 failures. But in my experimentation, a 2-manager setup does not recover when I kill the node the primary manager is running on. I assume this means the secondary manager does not have connectivity to a majority of the managers it knows about, so it can’t elect a leader (assuming Raft works like ZooKeeper’s consensus protocol, which I’m more familiar with). Is this correct? Can someone edify my understanding?

junius · September 16, 2017, 5:40am

The docs claim that a deployment across 3 availability zones can survive the failure of 2 of them. I am also curious how Swarm supports this. What happens if there is a network partition?

For example, az1 is still alive, but its network links to az2 and az3 are broken, while az2 and az3 can still talk to each other. The swarm nodes in az2 and az3 will definitely keep working, as they have the majority. But according to the doc, the swarm nodes in az1 will keep working as well? Once the network recovers, how does swarm handle the conflict?

The doc link may be out of date; it is now under “Superseded products and tools”. Docker swarm managers use the Raft protocol; see the swarm raft page. With 3 swarm managers, when 2 are down, the remaining manager no longer has a majority, so the swarm can no longer be managed. With 2 swarm managers, losing either one has the same effect. We should not use 2 swarm managers in production.
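To make the majority arithmetic concrete, here is a minimal sketch in Python. It assumes only the standard Raft rule that a majority of the known managers must be reachable. The function names (`quorum`, `tolerated_failures`, `has_quorum`) are made up for illustration and are not part of any Docker API.

```python
def quorum(managers: int) -> int:
    """Smallest number of managers that forms a majority."""
    return managers // 2 + 1


def tolerated_failures(managers: int) -> int:
    """How many managers can be lost while a majority remains."""
    return (managers - 1) // 2


def has_quorum(total: int, reachable: int) -> bool:
    """True if the managers on one side of a failure or partition
    still make up a majority of all known managers."""
    return reachable >= quorum(total)


if __name__ == "__main__":
    for n in range(1, 8):
        print(f"{n} managers: quorum {quorum(n)}, "
              f"tolerates {tolerated_failures(n)} failure(s)")

    # The partition from the question: 3 managers, one per availability zone.
    print("az1 alone has quorum:", has_quorum(3, 1))
    print("az2 + az3 have quorum:", has_quorum(3, 2))
```

Under this rule, 2 managers tolerate 0 failures (no better than 1 manager), and 3 managers tolerate 1 failure, not 2. In the partition example, az1 on its own is a minority, so its manager cannot elect a leader or commit changes to the swarm state. Only the az2 + az3 side can make changes, so there should be no conflicting state to merge when the network recovers; the az1 manager catches up from the leader's log.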
