Existing swarm failed and unable to add managers after reinitialisation of swarm

darrenstokes · December 6, 2017, 11:01am

We had some issues with an existing three-manager swarm: the managers all became disconnected and the swarm complained about not having enough managers to form a quorum.

We have since removed 2 of the managers from the swarm and forced a swarm initialisation on the first manager.

Now we’re in a situation where we can’t add more managers to the swarm.

When we add a second manager, it appears in docker node ls as active, but the manager status column is empty. It says neither reachable nor unreachable.

When we run docker node inspect against the second host, we get a heartbeat failure on the primary manager.

We had previously been making DNS changes but believe these have all been switched back, and DNS is resolving for all the manager hostnames. They’re all on the same network and there are no firewalls denying traffic between managers.
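A quick way to double-check the DNS side from each manager is a small script like the one below. It is only a sketch: the hostnames in MANAGER_HOSTS are placeholders (not our real ones), and resolve_host and print_dns_report are helper names made up for this example. It shows what each name resolves to on the machine where it runs, so the output can be compared across managers to spot a stale record.

```python
import socket

# Placeholder hostnames (assumption): replace with the real manager names.
MANAGER_HOSTS = [
    "manager1.example.internal",
    "manager2.example.internal",
    "manager3.example.internal",
]


def resolve_host(hostname):
    """Return the sorted, de-duplicated IP addresses for hostname, or [] on failure."""
    try:
        infos = socket.getaddrinfo(hostname, None)
    except socket.gaierror:
        # The name did not resolve on this machine.
        return []
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})


def print_dns_report(hostnames):
    """Print one line per hostname showing what it resolves to locally."""
    for host in hostnames:
        addrs = resolve_host(host)
        print(host, "->", ", ".join(addrs) if addrs else "DOES NOT RESOLVE")


# Usage (run on every manager and compare the output):
# print_dns_report(MANAGER_HOSTS)
```

If two managers print different addresses for the same name, one of them is probably still seeing an old DNS record or a leftover /etc/hosts entry.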

I cannot understand why the managers are failing to communicate as we see nothing of use in the docker logs (or syslog) even with debug logging enabled on the first manager.

I believe the nodes communicate on UDP port 7946, and we did have an issue where the managers weren’t listening on these ports, but this now seems to have been resolved.
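For reference, the Docker documentation lists TCP 2377 for cluster management, TCP and UDP 7946 for communication among nodes, and UDP 4789 for overlay network traffic. The TCP side can be probed between managers with a sketch like the following. The function names tcp_port_open and print_port_report are made up for this example, and the host names passed in would be whatever the managers are actually called. It only covers TCP: a plain connect cannot confirm that the UDP ports are reachable.

```python
import socket

# TCP ports used by swarm mode according to the Docker documentation.
SWARM_TCP_PORTS = {
    2377: "cluster management",
    7946: "node-to-node communication",
}


def tcp_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and resolution failures.
        return False


def print_port_report(hosts):
    """Print the TCP reachability of each swarm port on each host."""
    for host in hosts:
        for port, purpose in SWARM_TCP_PORTS.items():
            state = "open" if tcp_port_open(host, port) else "CLOSED/UNREACHABLE"
            print(f"{host}:{port} ({purpose}): {state}")


# Usage (hypothetical hostnames, run from each manager towards the others):
# print_port_report(["manager1.example.internal", "manager2.example.internal"])
```

Running it from each manager towards the others matters, because a port can be reachable in one direction and blocked in the other.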

Anyone have any suggestions on where to start looking at this one?
