I’ve recently tried to deploy a new swarm (Docker version 27.1.1, build 6312585) on guest machines hosted, for now, on a single bare-metal host.
The virtual machines (KVM/QEMU) are running Debian 12 (Bookworm).
They are configured with bridged networking; here is the relevant snippet of the host’s /etc/network/interfaces:
auto br0
iface br0 inet static
address <HOST_IP>
... #(LAN CONFIG)
bridge_ports <HOST_INTERFACE>
bridge_stp off # disable Spanning Tree Protocol
bridge_waitport 0 # no delay before a port becomes available
bridge_fd 0 # no forwarding delay
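For reference, the bridge wiring can be double-checked on the host with iproute2. This is just a guarded sketch: it prints a note instead of failing if br0 doesn’t exist on the machine where you run it.

```shell
#!/bin/sh
# Sanity-check the host bridge: does br0 exist, and are the guest
# interfaces (tap/vnet devices) actually enslaved to it?
if ip link show br0 >/dev/null 2>&1; then
  ip -br addr show br0            # bridge state and its address
  bridge link show | grep br0 || true   # ports enslaved to br0
else
  echo "bridge br0 not present on this machine"
fi
```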
Because parts of the default swarm address pool (10.0.0.0/8) are used in our WAN, I needed to change that pool (see the command below).
Here is the command I used to initialize the swarm:
docker swarm init --advertise-addr <MASTER_IP> --default-addr-pool 192.168.0.0/16 --default-addr-pool-mask-length 24
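After the init, I verified that the custom pool was actually applied. A guarded sketch (it degrades to a note when no daemon is reachable; I’m assuming here that recent docker info versions print “Default Address Pool” and “SubnetSize” lines under the Swarm section when run on a manager):

```shell
#!/bin/sh
# Confirm the swarm took the custom address pool instead of 10.0.0.0/8.
if docker info >/dev/null 2>&1; then
  docker info 2>/dev/null | grep -iE 'default address pool|subnetsize' \
    || echo "no address-pool lines found (node may not be a swarm manager)"
else
  echo "docker daemon not reachable from here"
fi
```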
And here is the command I used to join a worker to said swarm:
docker swarm join --token <WORKER_INV_TKN> <MASTER_IP>:2377 --advertise-addr <WORKER_IP> --listen-addr <WORKER_IP>
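After the join, membership can be confirmed from the manager (guarded the same way; the format fields are the standard ones accepted by docker node ls):

```shell
#!/bin/sh
# List swarm members as seen by the manager; every node should be Ready/Active.
if docker node ls >/dev/null 2>&1; then
  docker node ls --format '{{.Hostname}}: {{.Status}} / {{.Availability}}'
else
  echo "not a reachable swarm manager"
fi
```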
The worker seems to join the swarm (containers are dispatched to it).
But I’m facing timeouts when containers running on the master try to communicate with containers running on the worker, even when they are on the same overlay network.
For diagnostic purposes, the firewall is disabled on both guests.
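Even with the guests’ firewalls down, the host or hypervisor could still be filtering the swarm ports, so I probed them explicitly. Assumptions in this sketch: OpenBSD-style nc is installed, and TARGET is set to the real manager IP (the default below is a TEST-NET placeholder). The UDP results are only indicative, since a UDP probe can’t distinguish “open” from “silently dropped”.

```shell
#!/bin/sh
# Probe the ports Swarm needs between every pair of nodes:
#   2377/tcp      cluster management (managers only)
#   7946/tcp+udp  node-to-node gossip
#   4789/udp      VXLAN overlay data path (usual culprit for container timeouts)
TARGET="${TARGET:-192.0.2.10}"   # placeholder (TEST-NET); set TARGET=<MASTER_IP>
command -v nc >/dev/null 2>&1 || { echo "nc not installed"; exit 0; }
for p in 2377 7946; do
  nc -z -w 2 "$TARGET" "$p" && echo "tcp/$p open" || echo "tcp/$p unreachable"
done
for p in 7946 4789; do
  nc -u -z -w 2 "$TARGET" "$p" && echo "udp/$p no error (indicative only)" \
    || echo "udp/$p unreachable"
done
```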
Here is the output of traceroute in both directions (no intermediate hops, from what I can tell).
worker => master:
traceroute to <MASTER_IP> (<MASTER_IP>), 30 hops max, 60 byte packets
1 <MASTER_FQDN> (<MASTER_IP>) 0.794 ms 0.659 ms 0.590 ms
master => worker
traceroute to <WORKER_IP> (<WORKER_IP>), 30 hops max, 60 byte packets
1 <WORKER_FQDN> (<WORKER_IP>) 0.704 ms 0.562 ms 0.492 ms
I ran docker node inspect <WORKER_ID> --pretty
on the master guest, and the output was unexpected:
...
Status:
State: Ready
Availability: Active
Address: <HOST_IP>
...
Note the “<HOST_IP>”, which I didn’t expect to appear in any of the swarm config / settings.
The same thing happens when joining a worker node from AND to another bare-metal machine (the worker always ends up with this same <HOST_IP> as Status.Address in all three cases, which is very weird).
I would expect the following output instead:
...
Status:
State: Ready
Availability: Active
Address: <WORKER_IP>
...
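To compare what the manager has recorded for every node in one shot, the field behind the pretty “Status / Address” line can be pulled with a Go template (a guarded sketch; .Status.Addr and .Description.Hostname are the JSON fields shown by a plain docker node inspect):

```shell
#!/bin/sh
# Print the address the manager has recorded for each node
# (the value rendered as "Status / Address" by --pretty).
if docker node ls >/dev/null 2>&1; then
  for id in $(docker node ls -q); do
    docker node inspect "$id" \
      --format '{{.Description.Hostname}} -> {{.Status.Addr}}'
  done
else
  echo "not a reachable swarm manager"
fi
```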
When joining the other guest as a second manager, I obtain the following result:
...
Status:
State: Ready
Availability: Active
Address: <HOST_IP>
Manager Status:
Address: <OTHER_MASTER_IP>:2377
Raft Status: Reachable
Leader: No
...
This is even weirder: the “Manager Status.Address” is correct, but the “Status.Address” is still wrong.
The inspection output for the first master node seems fine:
...
Status:
State: Ready
Availability: Active
Address: <MASTER_IP>
Manager Status:
Address: <MASTER_IP>:2377
Raft Status: Reachable
Leader: Yes
...
I’m pretty sure this bridge config is messing things up, but another factor could also be in play:
the hosting bare-metal machine (<HOST_IP>) is currently also participating in another swarm for the time being.
I’ve tried to be as brief as possible, so ask if you need more info to help with the diagnosis.
Thank you for your time.