Docker 19.03.12: The swarm does not have a leader after swarm upgrade

Hello,

I have been seeing some strange behavior with Docker since the last update.

Can you help me with this?
It is not my first package upgrade, and the issue has been reproduced on a fresh new stack.

Upgraded from 18.09.9 to 19.03.12

OS : Ubuntu 16.04 Server

Docker packages
docker-ce=5:18.09.9~3-0~ubuntu-bionic
docker-ce-cli=5:19.03.11~3-0~ubuntu-bionic
containerd.io=1.2.13-2

Details

A problem was identified with Docker version 19.03.12
The managers were upgraded to 19.03.12
When you try to add a manager to a cluster that has an active leader, an error message appears
All known workarounds were tried

Error Message

  • The swarm does not have a leader
  • Error response from daemon: rpc error: code = Unknown desc = The swarm does not have a leader. It’s possible that too few managers are online. Make sure more than half of the managers are online.
  • Log from a non-leader manager: docker msg=“error reading the kernel parameter net.ipv4.vs.expire_nodest_conn”

Case

  • As soon as you run the docker swarm join --token command on the non-leader managers, the leader manager becomes unavailable after a few minutes
    -> You are forced to re-run docker swarm init --force-new-cluster --advertise-addr xx.xx.xx.xx --listen-addr xx.xx.xx.xx:2377 to get an operational leader back

  • The leader sees the worker nodes on version 19.03.12. No problems with the workers
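When reproducing this, it may help to capture the manager state from a still-working manager before and after the join attempt. A minimal sketch, assuming a running swarm (these are standard docker CLI commands; the output depends on your cluster):

```shell
# List only the manager nodes; the MANAGER STATUS column shows
# Leader / Reachable / Unreachable for each one.
docker node ls --filter role=manager

# Summarize the manager count and this node's swarm state
# using docker info's Go-template output.
docker info --format '{{.Swarm.Managers}} managers, local state: {{.Swarm.LocalNodeState}}'
```

Comparing this output before and after running docker swarm join would show whether the leader drops to Unreachable at the moment the new manager joins.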

References applied

I can't add links in the post.

Hello.

The subject is still open.

Does nobody have enough knowledge to help me?

Thanks

It sounds like you have lost quorum. Docker swarm needs at least 50% + 1 managers active in order for the swarm to be usable. If you have lost enough managers to fall below that number, then your swarm cannot work until you bring up the down managers.

If you feel like you should be at that threshold, then it is likely that you lost managers previously and never removed them from the swarm. For example: you have three managers, one goes offline, and you add another one, so you now have 3/4 managers online. Swarm doesn’t know the fourth one is gone until you remove it with docker node rm on one of the other managers. If that happens again, you’d be at 3/5, which is exactly the minimum needed on a 5-node cluster. If it happens once more, you’ll be at 3/6, and that drops below the threshold swarm needs to maintain consensus.
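The quorum arithmetic in that example can be checked directly. A small shell sketch (the quorum function here is illustrative arithmetic, not a Docker command):

```shell
# Raft quorum: more than half of the managers must be online.
# quorum(N) = floor(N/2) + 1 is the smallest "more than half" count.
quorum() { echo $(( $1 / 2 + 1 )); }

for n in 3 4 5 6; do
  echo "managers=$n quorum=$(quorum "$n")"
done
# → managers=3 quorum=2
#   managers=4 quorum=3
#   managers=5 quorum=3   (3 of 5 online is exactly enough)
#   managers=6 quorum=4   (3 of 6 online has lost quorum)
```

This is why leaving dead managers registered is dangerous: each stale entry raises N, and therefore the quorum, without adding any live voter.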

The easiest way to recover is to run docker swarm init --force-new-cluster on one of the remaining managers. This creates a new swarm with the old service/config/secret data, and that node will be the only manager on the new cluster. On each of your other managers, you can run docker swarm leave --force and then join them to the new manager as managers until you have the desired number. You should run an odd number of managers; three is common.
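Those recovery steps can be sketched as a command sequence (the IP address and the join token below are placeholders, not values from this thread):

```shell
# 1) On one surviving manager: rebuild the cluster from this node's
#    local state. Services, configs and secrets are kept; the old
#    manager quorum is discarded and this node becomes the only manager.
docker swarm init --force-new-cluster --advertise-addr xx.xx.xx.xx

# 2) Still on that node: print a fresh manager join token.
docker swarm join-token manager

# 3) On each of the other former managers: leave the dead swarm,
#    then rejoin using the token printed in step 2.
docker swarm leave --force
docker swarm join --token SWMTKN-1-xxxx xx.xx.xx.xx:2377
```

After rejoining, docker node ls on the new leader should show each manager as Reachable; any leftover entries for the old, dead nodes can be cleaned up with docker node rm.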