Swarm manager IP is wrong

Hi,

I’m having trouble setting up my Docker swarm. The nodes do not reconnect after I restart Docker, and the overlay network does not work once they are connected.

I suspect the manager is advertising the wrong address. On the worker, docker info lists 10.18.0.5:2377 under “Manager Addresses”. That is the “src” address of eth0, a network adapter I’m not using for the swarm. It should be my manager’s private IP, 10.133.31.190.

Does anyone know what is causing this issue and how I can fix this?


Some info about my setup:

I created the swarm using: docker swarm init
Due to the issue described above, I recreated the cluster with: docker swarm init --force-new-cluster --advertise-addr 10.133.31.190 --listen-addr eth1
I joined a worker with: docker swarm join --token the-token-here 10.133.31.190:2377 --advertise-addr 10.133.37.180

I have 2 nodes (DigitalOcean VPS):

  • manager01
    • role: manager
    • ip: 10.133.31.190
  • manager02
    • role: worker (not manager!)
    • ip: 10.133.37.180
  • private network through adapter eth1

manager01: docker info (snippet)

Swarm: active
 NodeID: lzlnkjvo5w24dh96uup118jlm
 Is Manager: true
 ClusterID: ncsckxolbsscv36u3mhnzwkr3
 Managers: 1
 Nodes: 2
 Default Address Pool: 10.0.0.0/8
 SubnetSize: 24
 Orchestration:
  Task History Retention Limit: 5

manager02: docker info (snippet)

Swarm: pending
 NodeID: 5f8hoitagnendhiuc2kr9cdxc
 Is Manager: false
 Node Address: 10.133.37.180
 Manager Addresses:
  10.18.0.5:2377                 ## << I think this is the issue!
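
To make this check repeatable, here is a minimal sketch that reads docker info output and prints every entry under “Manager Addresses” that differs from the address I expect. The function name unexpected_managers is hypothetical (it is not part of Docker), and the parsing assumes the indented IPv4 ip:port layout shown in the snippet above; IPv6 entries would be ignored.

```shell
#!/bin/sh
# Hypothetical helper: read `docker info` output on stdin and print each
# entry under "Manager Addresses:" that is NOT the expected ip:port ($1).
unexpected_managers() {
    awk -v want="$1" '
        # Start of the manager list.
        /^ *Manager Addresses:/ { in_list = 1; next }
        # Inside the list: lines whose first field looks like IPv4:port.
        in_list && $1 ~ /^[0-9.]+:[0-9]+$/ {
            if ($1 != want) print $1
            next
        }
        # Any other line ends the list.
        { in_list = 0 }
    '
}

# Usage on a node (assumes the manager should be 10.133.31.190:2377):
#   docker info 2>/dev/null | unexpected_managers 10.133.31.190:2377
```

No output means every listed manager address matches the expected one.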

docker version

Client:
 Version:           18.09.5
 API version:       1.39
 Go version:        go1.10.8
 Git commit:        e8ff056
 Built:             Thu Apr 11 04:44:24 2019
 OS/Arch:           linux/amd64
 Experimental:      false

Server: Docker Engine - Community
 Engine:
  Version:          18.09.5
  API version:      1.39 (minimum version 1.12)
  Go version:       go1.10.8
  Git commit:       e8ff056
  Built:            Thu Apr 11 04:10:53 2019
  OS/Arch:          linux/amd64
  Experimental:     false

manager01: ip route show

default via 188.166.64.1 dev eth0 onlink
10.18.0.0/16 dev eth0  proto kernel  scope link  src 10.18.0.5
10.133.0.0/16 dev eth1  proto kernel  scope link  src 10.133.31.190
172.17.0.0/16 dev docker0  proto kernel  scope link  src 172.17.0.1
172.18.0.0/16 dev docker_gwbridge  proto kernel  scope link  src 172.18.0.1
188.166.64.0/18 dev eth0  proto kernel  scope link  src 188.166.84.156
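
As a cross-check on the route table above, here is a minimal sketch that pulls the “src” address for a given adapter out of ip route show output, so the value passed to --advertise-addr can be derived rather than typed by hand. The function name src_for_dev is my own (not part of Docker or iproute2), and it assumes the one-route-per-line format shown above. When an adapter has several routes with a src, only the first is printed.

```shell
#!/bin/sh
# Hypothetical helper: read `ip route show` output on stdin and print the
# first "src" address listed for the device named in $1.
src_for_dev() {
    awk -v dev="$1" '
        {
            found_dev = 0
            src = ""
            # Scan keyword/value pairs on this route line.
            for (i = 1; i < NF; i++) {
                if ($i == "dev" && $(i + 1) == dev) found_dev = 1
                if ($i == "src") src = $(i + 1)
            }
            # Routes without a src (such as the default route) are skipped.
            if (found_dev && src != "") { print src; exit }
        }
    '
}

# Usage on the manager (assumes eth1 is the private adapter):
#   ip route show | src_for_dev eth1
```

With the route table above, eth1 should yield 10.133.31.190 and eth0 should yield 10.18.0.5, the address that shows up under “Manager Addresses” on the worker.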

manager01: ifconfig eth1

eth1 Link encap:Ethernet HWaddr 76:7f:79:27:d4:87
inet addr:10.133.31.190 Bcast:10.133.255.255 Mask:255.255.0.0
inet6 addr: fe80::747f:79ff:fe27:d487/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:179048 errors:0 dropped:0 overruns:0 frame:0
TX packets:114873 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:32907687 (32.9 MB) TX bytes:14378778 (14.3 MB)

Same problem here, with only one node. In our case, the node address in docker info is also wrong:

Node Address: 185.2.252.101
Manager Addresses:
185.2.252.101:2377

From time to time this causes all services on the node to stop, with many daemon errors like this in journalctl:

addrConn.createTransport failed to connect to {185.2.252.101:2377 0 }. Err :connection error: desc = "transport: Error while dialing dial tcp 185.2.252.101:2377: connect: no route to host

After a couple of minutes, the services start again.

The IP of the node is 185.2.252.201, as can be seen in ip a:

inet 185.2.252.201/16

Hi ikkentim,

I am having the same issue. How did you fix it?