Services in a swarm running on different nodes are not reachable through the overlay network

Hi!
Please, I need some help here! I can’t find what I’m doing wrong.

I have 3 nodes: 1 VDS, 2 VPS
Installed on each:
OS: Ubuntu 16.04.1 LTS
Docker: 1.13.0.0

I have created a swarm:
on master

docker swarm init --advertise-addr EXTERNAL_IP

on 2 other nodes

docker swarm join ... (copied output from prev command)

Then on master

docker network create -d overlay test-nw

docker service create --name ping1 --network test-nw alpine ping docker.com
docker service create --name ping2 --network test-nw alpine ping docker.com
docker service create --name ping3 --network test-nw alpine ping docker.com

Then I tried to look up each service from each node, and only the service running on the same node resolves:

/ # nslookup ping1
nslookup: can't resolve '(null)': Name does not resolve
Name:      ping1
Address 1: 10.0.0.2
/ # nslookup ping2
nslookup: can't resolve '(null)': Name does not resolve
nslookup: can't resolve 'ping2': Name does not resolve
/ # nslookup ping3
nslookup: can't resolve '(null)': Name does not resolve
nslookup: can't resolve 'ping3': Name does not resolve

on second node:

/ # nslookup ping1
nslookup: can't resolve '(null)': Name does not resolve
nslookup: can't resolve 'ping1': Name does not resolve
/ # nslookup ping2
nslookup: can't resolve '(null)': Name does not resolve
Name:      ping2
Address 1: 10.0.0.4
/ # nslookup ping3
nslookup: can't resolve '(null)': Name does not resolve
nslookup: can't resolve 'ping3': Name does not resolve

on third:

/ # nslookup ping1
nslookup: can't resolve '(null)': Name does not resolve
nslookup: can't resolve 'ping1': Name does not resolve
/ # nslookup ping2
nslookup: can't resolve '(null)': Name does not resolve
nslookup: can't resolve 'ping2': Name does not resolve
/ # nslookup ping3
nslookup: can't resolve '(null)': Name does not resolve
Name:      ping3
Address 1: 10.0.0.6
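
The transcripts above are easier to compare across nodes when reduced to one line per lookup. Here is a minimal sketch, assuming BusyBox-style nslookup output like the one pasted here. summarize_lookups is a hypothetical helper name, not part of Docker or BusyBox.

```shell
#!/bin/sh
# Summarize BusyBox nslookup transcripts read from stdin.
# Prints one line per lookup: "<name> resolved <ip>" or "<name> FAILED".
# The '(null)' line is ignored: in these transcripts it shows up whenever
# no server argument is passed, whether or not the lookup succeeds.
summarize_lookups() {
  awk -v q="'" '
    # A successful lookup prints "Name:" followed by "Address 1:".
    /^Name:/ { name = $2; next }
    /^Address 1:/ && name != "" { print name, "resolved", $3; name = ""; next }
    # A failed lookup names the service in the fourth field, quoted.
    /does not resolve/ && $0 !~ /\(null\)/ {
      n = $4
      gsub(q, "", n)      # drop the surrounding quotes
      sub(/:$/, "", n)    # drop the trailing colon
      print n, "FAILED"
    }
  '
}

# Usage (paste a saved transcript, hypothetical file name):
#   summarize_lookups < node1-transcript.txt
```

Running it on the transcript from each node gives three short lists that can be compared side by side.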

As far as I know, each container has its own IP (10.0.0.3, 10.0.0.5, 10.0.0.7), and IPVS maps each service IP to its container IP: 10.0.0.2 to 10.0.0.3, 10.0.0.4 to 10.0.0.5, and 10.0.0.6 to 10.0.0.7. But on each node, only the IP of the container running on it and the IP of its service are reachable; the other IPs are not.
For example, on the second node, 10.0.0.2, 10.0.0.3, 10.0.0.6, and 10.0.0.7 are not reachable (not only by ping; I have also tried running a simple HTTP server on each).
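
The service-IP-to-container-IP mapping described above can be checked directly rather than inferred. The sketch below only prints the commands to run. vip_check_commands is a hypothetical helper, and it assumes the default VIP endpoint mode, in which the special name tasks.<service> resolves to the task (container) IPs through Docker's embedded DNS at 127.0.0.11.

```shell
#!/bin/sh
# Print, for each service name given, the two commands that show
# its virtual IP and its task IPs. Nothing is executed here.
vip_check_commands() {
  for svc in "$@"; do
    # Run on a manager node: lists the service VIP per network.
    printf "docker service inspect --format '{{json .Endpoint.VirtualIPs}}' %s\n" "$svc"
    # Run inside any container attached to the overlay network:
    # lists the IPs of the tasks behind the service.
    printf 'nslookup tasks.%s 127.0.0.11\n' "$svc"
  done
}

# Usage:
#   vip_check_commands ping1 ping2 ping3
```

If the VIP from the first command matches what nslookup returns for the plain service name, the DNS side is consistent and the problem is more likely in the data path between nodes.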

In all the manuals this task looks simple. What am I doing wrong?

Thanks.

What does /etc/resolv.conf on the host where these are running say? Is the host’s DNS server in the 10.0.0.x block?

Check the Docker daemon log (either using journalctl or in /var/log/docker, etc.) for clues – to me this looks like issues I’ve seen before where the default subnet for networks conflicts with the IP range of the DNS server, and messes with the Docker networking stack.

One thing you can try immediately: define the network’s subnet to be outside the 10.0.0.x range, e.g., docker network create --driver overlay --subnet 11.0.0.0/24 test-nw, or something like that.
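
The resolv.conf question above can be answered mechanically. This is a minimal sketch: nameservers_in_prefix is a hypothetical helper that does a plain string-prefix match, which is enough for a /24 such as 10.0.0.x but is not a general CIDR check.

```shell
#!/bin/sh
# List nameservers in a resolv.conf-style file whose address
# starts with the given textual prefix (e.g. "10.0.0.").
nameservers_in_prefix() {
  file=$1
  prefix=$2
  # index(...) == 1 means the address begins with the prefix.
  awk -v p="$prefix" '$1 == "nameserver" && index($2, p) == 1 { print $2 }' "$file"
}

# Usage:
#   nameservers_in_prefix /etc/resolv.conf 10.0.0.
```

No output means none of the host's resolvers sit in the overlay's default range. As a side note, 11.0.0.0/8 is publicly routable address space, so a subnet from a private range (for example 10.10.0.0/24, an arbitrary choice) is a safer candidate for this test.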

nameserver 213.133.98.98
nameserver 213.133.100.100
nameserver 213.133.99.99
nameserver 2a01:4f8:0:a0a1::add:1010
nameserver 2a01:4f8:0:a111::add:9898
nameserver 2a01:4f8:0:a102::add:9999

With the 11.0.0.x network, same results =\

/ # nslookup ping1
nslookup: can't resolve '(null)': Name does not resolve

nslookup: can't resolve 'ping1': Name does not resolve
/ # nslookup ping2
nslookup: can't resolve '(null)': Name does not resolve

nslookup: can't resolve 'ping2': Name does not resolve
/ # nslookup ping3
nslookup: can't resolve '(null)': Name does not resolve

Name:      ping3
Address 1: 11.0.0.6

In journalctl -u docker there are a lot of info messages like:

level=info msg="memberlist: Suspect web-033c16cead74 has failed, no acks received"
level=info msg="memberlist: Suspect jira-fa1ae190d203 has failed, no acks received"

and there are only two errors:

level=error msg="Error in responding to bulk sync from node 172.31.1.100:  failed to send a TCP message during bulk sync: dial tcp 172.31.1.100:7946: i/o timeout"

level=error msg="periodic bulk sync failure for network c1bhrwo1gb5l0i8t6yrm5aukc: bulk sync failed on node jira-fa1ae190d203: failed to send a TCP message during bulk sync: dial tcp 172.31.1.100:7946: i/o timeout"

P.S. main, jira, and web are the hostnames of the nodes.
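
Both errors above name the same peer address and port. To see at a glance which peers the gossip layer cannot reach, the daemon log can be reduced to a count per address. This is a minimal sketch that assumes the log lines have the "dial tcp IP:PORT: i/o timeout" shape shown above; timed_out_peers is a hypothetical helper name.

```shell
#!/bin/sh
# Read Docker daemon log lines on stdin and print each peer
# address that timed out, followed by how often it occurred.
timed_out_peers() {
  sed -n 's|.*dial tcp \([0-9.]*:[0-9]*\): i/o timeout.*|\1|p' \
    | sort | uniq -c | awk '{ print $2, $1 }'
}

# Usage:
#   journalctl -u docker --no-pager | timed_out_peers
```

In the log above the only such peer is 172.31.1.100:7946. That address is in a private range (172.16.0.0/12), so one possibility, not confirmed anywhere in this thread, is that a node advertises an address the others cannot route to. If so, passing --advertise-addr to docker swarm join on that node would be worth trying.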

Hm, very strange results…
In the first “docker exec” ping2 resolved, but in the second it did not; on the other two nodes the result is the same as the first time (with the 10.0.0.x network).
(The ping1 service is on node main, ping2 on web, ping3 on jira.)

alex@main:~$ docker exec -it ping1.1.bfo6e8pq5afyx1s50atw1l0g3 /bin/sh
/ # nslookup ping1
nslookup: can't resolve '(null)': Name does not resolve

Name:      ping1
Address 1: 11.0.0.2
/ # nslookup ping2
nslookup: can't resolve '(null)': Name does not resolve

Name:      ping2
Address 1: 11.0.0.4
/ # nslookup ping3
nslookup: can't resolve '(null)': Name does not resolve

nslookup: can't resolve 'ping3': Name does not resolve
/ # exit
alex@main:~$ docker exec -it ping1.1.bfo6e8pq5afyx1s50atw1l0g3 /bin/sh
/ # nslookup ping1
nslookup: can't resolve '(null)': Name does not resolve

Name:      ping1
Address 1: 11.0.0.2
/ # nslookup ping2
nslookup: can't resolve '(null)': Name does not resolve

nslookup: can't resolve 'ping2': Name does not resolve
/ # nslookup ping3
nslookup: can't resolve '(null)': Name does not resolve

nslookup: can't resolve 'ping3': Name does not resolve

Nope, on the first “docker exec” it randomly resolves the other nodes or not, and on the following runs it never resolves them.
UPD: double nope, it is always random.
Moreover, even when all the service names resolve, they may not respond to ping.
P.S. The firewall is uninstalled for the purity of the experiment :smile:

alex@main:~$ docker exec -it ping1.1.bfo6e8pq5afyx1s50atw1l0g3 /bin/sh
/ # nslookup ping1
nslookup: can't resolve '(null)': Name does not resolve

Name:      ping1
Address 1: 11.0.0.2
/ # nslookup ping2
nslookup: can't resolve '(null)': Name does not resolve

Name:      ping2
Address 1: 11.0.0.4
/ # nslookup ping3
nslookup: can't resolve '(null)': Name does not resolve

Name:      ping3
Address 1: 11.0.0.6
/ # nslookup ping1
nslookup: can't resolve '(null)': Name does not resolve

Name:      ping1
Address 1: 11.0.0.2
/ # nslookup ping2
nslookup: can't resolve '(null)': Name does not resolve

Name:      ping2
Address 1: 11.0.0.4
/ # nslookup ping3
nslookup: can't resolve '(null)': Name does not resolve

Name:      ping3
Address 1: 11.0.0.6
/ # ping 11.0.0.2
PING 11.0.0.2 (11.0.0.2): 56 data bytes
64 bytes from 11.0.0.2: seq=0 ttl=64 time=0.050 ms
64 bytes from 11.0.0.2: seq=1 ttl=64 time=0.071 ms
^C
--- 11.0.0.2 ping statistics ---
2 packets transmitted, 2 packets received, 0% packet loss
round-trip min/avg/max = 0.050/0.060/0.071 ms
/ # ping 11.0.0.4
PING 11.0.0.4 (11.0.0.4): 56 data bytes
64 bytes from 11.0.0.4: seq=0 ttl=64 time=0.052 ms
64 bytes from 11.0.0.4: seq=1 ttl=64 time=0.164 ms
^C
--- 11.0.0.4 ping statistics ---
2 packets transmitted, 2 packets received, 0% packet loss
round-trip min/avg/max = 0.052/0.108/0.164 ms
/ # ping 11.0.0.6
PING 11.0.0.6 (11.0.0.6): 56 data bytes
^C
--- 11.0.0.6 ping statistics ---
5 packets transmitted, 0 packets received, 100% packet loss

Sometimes it works perfectly:

/ # nslookup ping1 127.0.0.11
Server:    127.0.0.11
Address 1: 127.0.0.11

Name:      ping1
Address 1: 11.0.0.2
/ # nslookup ping2 127.0.0.11
Server:    127.0.0.11
Address 1: 127.0.0.11

Name:      ping2
Address 1: 11.0.0.4
/ # nslookup ping3 127.0.0.11
Server:    127.0.0.11
Address 1: 127.0.0.11

Name:      ping3
Address 1: 11.0.0.6
/ # ping ping1
PING ping1 (11.0.0.2): 56 data bytes
64 bytes from 11.0.0.2: seq=0 ttl=64 time=0.039 ms
64 bytes from 11.0.0.2: seq=1 ttl=64 time=0.053 ms
^C
--- ping1 ping statistics ---
2 packets transmitted, 2 packets received, 0% packet loss
round-trip min/avg/max = 0.039/0.046/0.053 ms
/ # ping ping2
PING ping2 (11.0.0.4): 56 data bytes
64 bytes from 11.0.0.4: seq=0 ttl=64 time=0.031 ms
64 bytes from 11.0.0.4: seq=1 ttl=64 time=0.050 ms
^C
--- ping2 ping statistics ---
2 packets transmitted, 2 packets received, 0% packet loss
round-trip min/avg/max = 0.031/0.040/0.050 ms
/ # ping ping3
PING ping3 (11.0.0.6): 56 data bytes
64 bytes from 11.0.0.6: seq=0 ttl=64 time=0.037 ms
64 bytes from 11.0.0.6: seq=1 ttl=64 time=0.077 ms
^C
--- ping3 ping statistics ---
2 packets transmitted, 2 packets received, 0% packet loss
round-trip min/avg/max = 0.037/0.057/0.077 ms

And after 30 seconds:

/ # nslookup ping1 127.0.0.11
Server:    127.0.0.11
Address 1: 127.0.0.11

Name:      ping1
Address 1: 11.0.0.2
/ # nslookup ping2 127.0.0.11
Server:    127.0.0.11
Address 1: 127.0.0.11

nslookup: can't resolve 'ping2': Name does not resolve
/ # nslookup ping3 127.0.0.11
Server:    127.0.0.11
Address 1: 127.0.0.11

Name:      ping3
Address 1: 11.0.0.6
/ # ping ping1
PING ping1 (11.0.0.2): 56 data bytes
64 bytes from 11.0.0.2: seq=0 ttl=64 time=0.053 ms
64 bytes from 11.0.0.2: seq=1 ttl=64 time=0.061 ms
^C
--- ping1 ping statistics ---
2 packets transmitted, 2 packets received, 0% packet loss
round-trip min/avg/max = 0.053/0.057/0.061 ms
/ # ping ping2
ping: bad address 'ping2'
/ # 
/ # ping ping3
ping: bad address 'ping3'
/ # 
/ # ping 11.0.0.2
PING 11.0.0.2 (11.0.0.2): 56 data bytes
64 bytes from 11.0.0.2: seq=0 ttl=64 time=0.049 ms
64 bytes from 11.0.0.2: seq=1 ttl=64 time=0.055 ms
^C
--- 11.0.0.2 ping statistics ---
2 packets transmitted, 2 packets received, 0% packet loss
round-trip min/avg/max = 0.049/0.052/0.055 ms
/ # ping 11.0.0.4
PING 11.0.0.4 (11.0.0.4): 56 data bytes
^C
--- 11.0.0.4 ping statistics ---
3 packets transmitted, 0 packets received, 100% packet loss
/ # ping 11.0.0.6
PING 11.0.0.6 (11.0.0.6): 56 data bytes
^C
--- 11.0.0.6 ping statistics ---
3 packets transmitted, 0 packets received, 100% packet loss

Yeah, making sure the ports you need (for Gossip and VXLAN) are open is the first step. However, the random drops in the wide-open example are extremely troubling. You might want to try upgrading to 1.13.1, which just came out. cc @mavenugo
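
For reference, swarm mode uses TCP 2377 for cluster management, TCP and UDP 7946 for node-to-node gossip, and UDP 4789 for VXLAN overlay traffic. The sketch below prints matching firewall rules. ufw is only an assumption here (the thread does not say which firewall was in use), and swarm_ufw_rules is a hypothetical helper that prints the commands without running them.

```shell
#!/bin/sh
# Print ufw rules for the ports swarm mode uses between nodes.
# Nothing is applied; review the output before running it as root.
swarm_ufw_rules() {
  # 2377/tcp: cluster management
  # 7946/tcp, 7946/udp: gossip between nodes
  # 4789/udp: VXLAN data path for overlay networks
  for rule in 2377/tcp 7946/tcp 7946/udp 4789/udp; do
    echo "ufw allow $rule"
  done
}

# Usage:
#   swarm_ufw_rules          # inspect
#   swarm_ufw_rules | sh     # apply, as root
```

Even with no firewall on the hosts, the hosting provider's network may filter some of these ports between machines, UDP 4789 in particular, so a check between each pair of nodes is still worthwhile.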

That said, you might want to re-run the experiment using a protocol other than ICMP (ping). With HTTP or another protocol the results might be more encouraging. I do know that there’s some weirdness around supporting ICMP in swarm mode, where the built-in L4 load balancer for service replicas does some magic.
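
A non-ICMP re-run could look like the sketch below. The service names web1, web2, and web3 and the nginx:alpine image are assumptions for illustration, and http_probe is a hypothetical helper meant to run inside a container attached to the overlay network. It defaults to BusyBox-style wget with a 5-second timeout.

```shell
#!/bin/sh
# Setup, run once on a manager (hypothetical service names):
#   docker service create --name web1 --network test-nw nginx:alpine
#   docker service create --name web2 --network test-nw nginx:alpine
#   docker service create --name web3 --network test-nw nginx:alpine

# Probe each service over HTTP and print "<name> OK" or "<name> FAIL".
# FETCH overrides the fetch command; it must exit 0 on success.
http_probe() {
  for svc in "$@"; do
    # Unquoted on purpose so the default splits into command and flags.
    if ${FETCH:-wget -q -T 5 -O /dev/null} "http://$svc/" >/dev/null 2>&1; then
      echo "$svc OK"
    else
      echo "$svc FAIL"
    fi
  done
}

# Usage, inside a container on test-nw:
#   http_probe web1 web2 web3
```

Repeating the probe every few seconds from each node would show whether TCP traffic flaps the same way the ICMP results above do.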