"Bulk sync to node XXX timed out" in Docker Swarm

I'm having a problem with one of my worker nodes. I have 1 manager and 9 worker nodes, and the workers are on a different IP range from the manager node. My problem is that a service deployed on node A (the node having trouble) does not seem to communicate with the other nodes in my swarm. I have opened connections both ways between node A and all of my swarm nodes, but it doesn't seem to be working.
All I get is this error message when I view the Docker logs: time="2022-06-03T21:55:22.399445386+06:30" level=error msg="Bulk sync to node eb8646bd85de timed out".

I'm using Docker version 20.10.16 on node A and 19.03.9 on the manager node.
All my machines are running CentOS 7.

Have you tried searching for this message? There are multiple reports related to it. They say it might be that a required port is not accessible. I realize that not all of the related issues have a solution, so I'll quote the part that may be more helpful:

What I would do:

  • Go to the node (SSH) and check if Docker is running properly (you have probably done that already)
  • Check if Docker Swarm is listening on its ports: Open protocols and ports between the hosts
    You can use netstat to check the ports (see the example commands after this list)
  • Try to check if you can access those ports locally. You can use telnet or netcat for that.
  • Try to check the ports from another node…
  • Configure your firewall if that is the problem
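
For reference, these are the ports a default swarm setup needs open between all nodes, plus a rough sketch of how I would check and open them. This assumes CentOS 7 with firewalld and the default Swarm ports; NODE_A_IP and MANAGER_IP are placeholders for your own addresses:

```bash
# Default ports a swarm needs open between all nodes (unmodified setup):
#   2377/tcp      - cluster management (only manager nodes listen here)
#   7946/tcp+udp  - node-to-node communication / gossip (the path the
#                   "bulk sync" messages travel over)
#   4789/udp      - overlay network data traffic (VXLAN)

# On node A: is dockerd listening on the gossip port?
sudo netstat -tulpn | grep 7946

# From another node: can node A be reached on the swarm ports?
# (NODE_A_IP is a placeholder; UDP checks with nc are only a rough indicator.)
nc -zv  NODE_A_IP 7946
nc -zvu NODE_A_IP 7946
nc -zvu NODE_A_IP 4789
nc -zv  MANAGER_IP 2377    # from node A towards the manager

# If the ports are blocked and you use firewalld (the CentOS 7 default),
# open them permanently and reload:
sudo firewall-cmd --permanent --add-port=2377/tcp --add-port=7946/tcp \
     --add-port=7946/udp --add-port=4789/udp
sudo firewall-cmd --reload
```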

To expand on @rimelek's post: make sure that the subnets (what you call "different IP range") allow low-latency network connections among all nodes. Swarm uses Raft for cluster membership and coordination, which relies on low-latency networking for stable operation - everything else will be brittle.

For instance: running swarm cluster nodes in different availability zones inside a region of a cloud provider works like a charm, but running swarm cluster nodes in different regions is brittle, and it is even worse if the nodes are spread across different cloud providers… What all those scenarios have in common is that the nodes are in different subnets (as in your scenario).
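
If you want a quick sanity check of the latency between your subnets, a rough sketch like this, run from node A, is usually enough for a first impression (the IPs are placeholders for your manager and worker addresses):

```bash
# Rough round-trip latency check from node A to the other swarm nodes
# (replace the IPs with your actual manager/worker addresses).
for host in 10.0.1.10 10.0.2.11 10.0.2.12; do
  echo "--- $host ---"
  ping -c 5 -q "$host" | tail -n 2   # packet loss and rtt min/avg/max/mdev
done
```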