Node takes a long time to join swarm

anshelmi · June 10, 2022, 12:05pm

Joining a node to a Docker swarm gives the following error:
“Error response from daemon: Timeout was reached before node joined. The attempt to join the swarm will continue in the background. Use the “docker info” command to see the current swarm status of your node.”

The node eventually joins the swarm, but only after a long time (several minutes).

This is an issue since we want to automate the building of the swarm.

What could cause this delay? Is it possible to extend the timeout period? What would be the best way to debug this?

meyay · June 11, 2022, 6:40pm

By any chance, are you trying to create a swarm for nodes over a WAN connection? Swarm uses the Raft consensus algorithm under the hood. Raft requires low-latency networks for stable operation.

If you want to join nodes in edge/WAN locations to a swarm cluster, it’s not going to work reliably. You might want to look at Portainer and its Edge Agent if you want a single point of control for such a scenario. It’s not going to give you a swarm cluster, but it will allow you to control all instances from a single Portainer instance…

anshelmi · June 13, 2022, 7:48am

The ping times from one manager server to two different worker servers are as follows:

64 bytes from worker1 (xxx.xxx.xxx.xxx): icmp_seq=1 ttl=64 time=0.157 ms
64 bytes from worker1 (xxx.xxx.xxx.xxx): icmp_seq=2 ttl=64 time=0.252 ms
64 bytes from worker1 (xxx.xxx.xxx.xxx): icmp_seq=3 ttl=64 time=0.207 ms
64 bytes from worker1 (xxx.xxx.xxx.xxx): icmp_seq=4 ttl=64 time=0.289 ms
64 bytes from worker1 (xxx.xxx.xxx.xxx): icmp_seq=5 ttl=64 time=0.238 ms

64 bytes from worker2 (xxx.xxx.xxx.xxx): icmp_seq=2 ttl=64 time=0.277 ms
64 bytes from worker2 (xxx.xxx.xxx.xxx): icmp_seq=3 ttl=64 time=0.392 ms
64 bytes from worker2 (xxx.xxx.xxx.xxx): icmp_seq=4 ttl=64 time=0.289 ms
64 bytes from worker2 (xxx.xxx.xxx.xxx): icmp_seq=5 ttl=64 time=0.303 ms

Joining workers to the swarm does not time out, but the timeout occurs when we try to join a second manager to the swarm.

We are running Docker version 20.10.2.

meyay · June 13, 2022, 3:39pm

The only thing that comes to mind is that firewalls are blocking the required ports (or security groups, if the nodes are in the cloud).

See: Getting started with swarm mode | Docker Documentation.
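
A quick way to test the firewall theory is to probe the swarm ports from the node that fails to join. The swarm documentation lists TCP 2377 (cluster management), TCP and UDP 7946 (communication among nodes) and UDP 4789 (overlay network traffic). The sketch below only covers the two TCP ports, since a plain connect cannot verify UDP. The function names are made up for illustration, and it assumes you point it at an existing manager that is already part of the swarm.

```python
import socket

# TCP ports listed in the swarm mode documentation. UDP 7946 and UDP 4789
# are also required but cannot be verified with a simple TCP connect.
SWARM_TCP_PORTS = {
    2377: "cluster management (manager nodes)",
    7946: "communication among nodes",
}


def tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


def check_swarm_ports(host, ports=SWARM_TCP_PORTS, timeout=3.0):
    """Probe each TCP port on host and return a {port: reachable} dict."""
    return {port: tcp_port_open(host, port, timeout) for port in ports}
```

Run from the second manager, `check_swarm_ports("manager1")` (where `manager1` is a placeholder for the address of the first manager) returns a dict such as `{2377: True, 7946: False}`. A `False` entry points at a firewall or security group rule, or at nothing listening on that port.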

Note: you didn’t share the ping result of the second manager instance that fails to join. The output above says nothing about the connectivity of the affected node.
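
On the automation side of the original question: the error message says the join continues in the background, so a provisioning script could poll the node state instead of treating the timeout as fatal. This is only a sketch and not a fix for the underlying delay. It assumes `docker info --format '{{.Swarm.LocalNodeState}}'` reports `pending` while the join is in progress and `active` once it has completed. The function names and the default timeout are arbitrary choices for illustration.

```python
import subprocess
import time


def local_swarm_state():
    """Ask the local Docker daemon for this node's swarm state."""
    result = subprocess.run(
        ["docker", "info", "--format", "{{.Swarm.LocalNodeState}}"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


def wait_for_swarm_join(get_state=local_swarm_state, timeout=600.0, interval=5.0):
    """Poll until the node reports 'active'.

    Returns True once the node is active, False if the timeout expires first,
    and raises RuntimeError if the daemon reports the 'error' state.
    """
    deadline = time.monotonic() + timeout
    while True:
        state = get_state()
        if state == "active":
            return True
        if state == "error":
            raise RuntimeError("swarm join failed: node state is 'error'")
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

Calling `wait_for_swarm_join()` right after the join command lets the script wait up to ten minutes for the background join. `get_state` is a parameter so the loop can be tried out without a Docker daemon.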
