Docker Swarm over Tailscale: custom network doesn't work but ingress does

I’m trying to connect two servers over a Tailscale VPN (which is WireGuard-based), but the additional overlay network doesn’t work. I pushed my minimal Swarm setup here - GitHub - sssemil/docker_swarm_test. Here’s my earlier post describing a related issue - Docker Swarm custom network doesn't work across servers, while ingress does - but now I’ve added Tailscale and connected the swarm over the tailnet.

The test setup is simple: two Ubuntu 22.04 server VMs running Docker 24.0.5 - one manager and one worker. Both are connected to Tailscale. I tried disabling ufw completely, but that didn’t help.
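
For completeness, the swarm itself is created over the tailnet, roughly along these lines (a sketch - the real Tailscale IPs are the 100.x addresses that appear under Peers below; the IPs and join token here are placeholders):

# on the manager, advertise its Tailscale IP so swarm control and overlay traffic go over the tailnet
docker swarm init --advertise-addr <manager-tailscale-ip>

# on the worker, join via the manager's Tailscale IP (2377 is the swarm management port)
docker swarm join --token <worker-token> <manager-tailscale-ip>:2377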

Here’s the compose file:

version: '3.7'
services:

  test1:
    image: alpine:latest
    command: nc -l -k 4441
    ports:
      - "4441:4441"
    deploy:
      placement:
        constraints:
          - node.labels.test1 == true
    networks:
      - test-network

  test2:
    image: alpine:latest
    command: nc -l -k 4442
    ports:
      - "4442:4442"
    deploy:
      placement:
        constraints:
          - node.labels.test2 == true
    networks:
      - test-network

networks:
  test-network:
    driver: overlay
    attachable: true
    driver_opts:
      encrypted: "false"
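
The placement constraints assume the nodes carry matching labels, so before deploying I label the nodes and then deploy the stack, roughly like this (node names and the compose file name are placeholders; the stack name test matches the container and network names in the output below):

# label the nodes so each service lands on the intended host
docker node update --label-add test1=true <manager-node-name>
docker node update --label-add test2=true <worker-node-name>

# deploy the stack from the compose file above
docker stack deploy -c docker-compose.yml test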

Here’s the manager:

ubuntu@manager:~$ docker ps
CONTAINER ID   IMAGE           COMMAND           CREATED       STATUS       PORTS     NAMES
61bd6cf0279c   alpine:latest   "nc -l -k 4441"   4 hours ago   Up 4 hours             test_test1.1.dyaohbkk0svbrvebtgsaqg5e6
ubuntu@manager:~$ docker exec -it test_test1.1.dyaohbkk0svbrvebtgsaqg5e6 ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
63: eth0@if64: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 1450 qdisc noqueue state UP
    link/ether 02:42:0a:00:00:0a brd ff:ff:ff:ff:ff:ff
    inet 10.0.0.10/24 brd 10.0.0.255 scope global eth0
       valid_lft forever preferred_lft forever
65: eth2@if66: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 1500 qdisc noqueue state UP
    link/ether 02:42:ac:12:00:03 brd ff:ff:ff:ff:ff:ff
    inet 172.18.0.3/16 brd 172.18.255.255 scope global eth2
       valid_lft forever preferred_lft forever
67: eth1@if68: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 1424 qdisc noqueue state UP
    link/ether 02:42:0a:00:01:0b brd ff:ff:ff:ff:ff:ff
    inet 10.0.1.11/24 brd 10.0.1.255 scope global eth1
       valid_lft forever preferred_lft forever
ubuntu@manager:~$ docker exec -it test_test1.1.dyaohbkk0svbrvebtgsaqg5e6 ping 10.0.0.9
PING 10.0.0.9 (10.0.0.9): 56 data bytes
64 bytes from 10.0.0.9: seq=0 ttl=64 time=0.773 ms
64 bytes from 10.0.0.9: seq=1 ttl=64 time=0.603 ms
64 bytes from 10.0.0.9: seq=2 ttl=64 time=1.219 ms
^C
--- 10.0.0.9 ping statistics ---
3 packets transmitted, 3 packets received, 0% packet loss
round-trip min/avg/max = 0.603/0.865/1.219 ms
ubuntu@manager:~$ docker exec -it test_test1.1.dyaohbkk0svbrvebtgsaqg5e6 ping 10.0.1.10
PING 10.0.1.10 (10.0.1.10): 56 data bytes
^C
--- 10.0.1.10 ping statistics ---
4 packets transmitted, 0 packets received, 100% packet loss
ubuntu@manager:~$ docker network inspect test_test-network
[
    {
        "Name": "test_test-network",
        "Id": "oksvej7a7n5lz21zi9ylk9kax",
        "Created": "2023-11-13T12:03:57.81744055Z",
        "Scope": "swarm",
        "Driver": "overlay",
        "EnableIPv6": false,
        "IPAM": {
            "Driver": "default",
            "Options": null,
            "Config": [
                {
                    "Subnet": "10.0.1.0/24",
                    "Gateway": "10.0.1.1"
                }
            ]
        },
        "Internal": false,
        "Attachable": true,
        "Ingress": false,
        "ConfigFrom": {
            "Network": ""
        },
        "ConfigOnly": false,
        "Containers": {
            "61bd6cf0279ccd44cd95f7e7034623682a4e2ab87f7658b111f6e12d10b84f5c": {
                "Name": "test_test1.1.dyaohbkk0svbrvebtgsaqg5e6",
                "EndpointID": "89fabf9c77e231f3e02cc8cc9606ac98261c890f9ef132ebf8ce3aa598a01f1b",
                "MacAddress": "02:42:0a:00:01:0b",
                "IPv4Address": "10.0.1.11/24",
                "IPv6Address": ""
            },
            "lb-test_test-network": {
                "Name": "test_test-network-endpoint",
                "EndpointID": "55b0b5f174ffbf0da60e3eeb4fe850cf195e1f7857bbd5883eb5f95c26ed0ad5",
                "MacAddress": "02:42:0a:00:01:04",
                "IPv4Address": "10.0.1.4/24",
                "IPv6Address": ""
            }
        },
        "Options": {
            "com.docker.network.driver.overlay.vxlanid_list": "4097",
            "encrypted": "false"
        },
        "Labels": {
            "com.docker.stack.namespace": "test"
        },
        "Peers": [
            {
                "Name": "86c0a2ca6299",
                "IP": "100.106.126.134"
            },
            {
                "Name": "1462d73a4365",
                "IP": "100.82.135.143"
            }
        ]
    }
]
ubuntu@manager:~$ docker network inspect ingress
[
    {
        "Name": "ingress",
        "Id": "dz8gpht33ig7yz7cwr3h1qmco",
        "Created": "2023-11-13T11:05:09.25371142Z",
        "Scope": "swarm",
        "Driver": "overlay",
        "EnableIPv6": false,
        "IPAM": {
            "Driver": "default",
            "Options": null,
            "Config": [
                {
                    "Subnet": "10.0.0.0/24",
                    "Gateway": "10.0.0.1"
                }
            ]
        },
        "Internal": false,
        "Attachable": false,
        "Ingress": true,
        "ConfigFrom": {
            "Network": ""
        },
        "ConfigOnly": false,
        "Containers": {
            "61bd6cf0279ccd44cd95f7e7034623682a4e2ab87f7658b111f6e12d10b84f5c": {
                "Name": "test_test1.1.dyaohbkk0svbrvebtgsaqg5e6",
                "EndpointID": "9346b4595e1c2badb8e2c5aafdb0079ae2f229c0b282bf3ae696b710ac1e359d",
                "MacAddress": "02:42:0a:00:00:0a",
                "IPv4Address": "10.0.0.10/24",
                "IPv6Address": ""
            },
            "ingress-sbox": {
                "Name": "ingress-endpoint",
                "EndpointID": "05c20f541e2d7ad72296966b72a2dfa267fd3db2f5a675bcefea4d82e9b9e6fa",
                "MacAddress": "02:42:0a:00:00:02",
                "IPv4Address": "10.0.0.2/24",
                "IPv6Address": ""
            }
        },
        "Options": {
            "com.docker.network.driver.overlay.vxlanid_list": "4096"
        },
        "Labels": {},
        "Peers": [
            {
                "Name": "86c0a2ca6299",
                "IP": "100.106.126.134"
            },
            {
                "Name": "1462d73a4365",
                "IP": "100.82.135.143"
            }
        ]
    }
]

And here’s the worker:

ubuntu@worker:~$ docker network ls
NETWORK ID     NAME                DRIVER    SCOPE
0f342e100ada   bridge              bridge    local
0a29b742750f   docker_gwbridge     bridge    local
55fb038036ed   host                host      local
dz8gpht33ig7   ingress             overlay   swarm
bf47e3c3f2a3   none                null      local
oksvej7a7n5l   test_test-network   overlay   swarm
ubuntu@worker:~$ docker network inspect ingress
[
    {
        "Name": "ingress",
        "Id": "dz8gpht33ig7yz7cwr3h1qmco",
        "Created": "2023-11-13T11:06:37.843025886Z",
        "Scope": "swarm",
        "Driver": "overlay",
        "EnableIPv6": false,
        "IPAM": {
            "Driver": "default",
            "Options": null,
            "Config": [
                {
                    "Subnet": "10.0.0.0/24",
                    "Gateway": "10.0.0.1"
                }
            ]
        },
        "Internal": false,
        "Attachable": false,
        "Ingress": true,
        "ConfigFrom": {
            "Network": ""
        },
        "ConfigOnly": false,
        "Containers": {
            "48e9c717cd3c48c5ad36a59f1f05df0ef7c13f96308cae7d928375c330ff114a": {
                "Name": "test_test2.1.vz4nwz8uytjhiozx3kw4fqua5",
                "EndpointID": "97012788394d19341de59f4a389eecb82ba721ae3c231047838645324ca5b134",
                "MacAddress": "02:42:0a:00:00:09",
                "IPv4Address": "10.0.0.9/24",
                "IPv6Address": ""
            },
            "ingress-sbox": {
                "Name": "ingress-endpoint",
                "EndpointID": "724aa77031e9b7146f17bbe3fa318e7b301286e464e0c1a77522815864768c10",
                "MacAddress": "02:42:0a:00:00:03",
                "IPv4Address": "10.0.0.3/24",
                "IPv6Address": ""
            }
        },
        "Options": {
            "com.docker.network.driver.overlay.vxlanid_list": "4096"
        },
        "Labels": {},
        "Peers": [
            {
                "Name": "1462d73a4365",
                "IP": "100.82.135.143"
            },
            {
                "Name": "86c0a2ca6299",
                "IP": "100.106.126.134"
            }
        ]
    }
]
ubuntu@worker:~$ docker network inspect test_test-network
[
    {
        "Name": "test_test-network",
        "Id": "oksvej7a7n5lz21zi9ylk9kax",
        "Created": "2023-11-13T11:06:38.152700523Z",
        "Scope": "swarm",
        "Driver": "overlay",
        "EnableIPv6": false,
        "IPAM": {
            "Driver": "default",
            "Options": null,
            "Config": [
                {
                    "Subnet": "10.0.1.0/24",
                    "Gateway": "10.0.1.1"
                }
            ]
        },
        "Internal": false,
        "Attachable": true,
        "Ingress": false,
        "ConfigFrom": {
            "Network": ""
        },
        "ConfigOnly": false,
        "Containers": {
            "48e9c717cd3c48c5ad36a59f1f05df0ef7c13f96308cae7d928375c330ff114a": {
                "Name": "test_test2.1.vz4nwz8uytjhiozx3kw4fqua5",
                "EndpointID": "1d4aaa257220ac7ecfdbc4e35ed03baa7373f20595b578c88246a522b25fb85e",
                "MacAddress": "02:42:0a:00:01:0a",
                "IPv4Address": "10.0.1.10/24",
                "IPv6Address": ""
            },
            "lb-test_test-network": {
                "Name": "test_test-network-endpoint",
                "EndpointID": "b1017f4e2c5b0ca82221be3ba7c431536009975895dacfb538bc942b54aabf0d",
                "MacAddress": "02:42:0a:00:01:09",
                "IPv4Address": "10.0.1.9/24",
                "IPv6Address": ""
            }
        },
        "Options": {
            "com.docker.network.driver.overlay.vxlanid_list": "4097",
            "encrypted": "false"
        },
        "Labels": {
            "com.docker.stack.namespace": "test"
        },
        "Peers": [
            {
                "Name": "1462d73a4365",
                "IP": "100.82.135.143"
            },
            {
                "Name": "86c0a2ca6299",
                "IP": "100.106.126.134"
            }
        ]
    }
]
ubuntu@worker:~$ docker ps
CONTAINER ID   IMAGE           COMMAND           CREATED       STATUS       PORTS     NAMES
48e9c717cd3c   alpine:latest   "nc -l -k 4442"   5 hours ago   Up 5 hours             test_test2.1.vz4nwz8uytjhiozx3kw4fqua5
ubuntu@worker:~$ docker exec -it test_test2.1.vz4nwz8uytjhiozx3kw4fqua5 ping 10.0.0.10
PING 10.0.0.10 (10.0.0.10): 56 data bytes
64 bytes from 10.0.0.10: seq=0 ttl=64 time=0.832 ms
64 bytes from 10.0.0.10: seq=1 ttl=64 time=1.318 ms
64 bytes from 10.0.0.10: seq=2 ttl=64 time=1.798 ms
^C
--- 10.0.0.10 ping statistics ---
3 packets transmitted, 3 packets received, 0% packet loss
round-trip min/avg/max = 0.832/1.316/1.798 ms
ubuntu@worker:~$ docker exec -it test_test2.1.vz4nwz8uytjhiozx3kw4fqua5 ping 10.0.1.11
PING 10.0.1.11 (10.0.1.11): 56 data bytes
^C
--- 10.0.1.11 ping statistics ---
2 packets transmitted, 0 packets received, 100% packet loss

As you can see, ping works with the ingress IPs but not with the test-network IPs.

Hello eguee,

I was doing something similar yesterday and noticed today that it wasn’t working properly. While searching for an answer I found your post, and the fix I ended up with should help you as well.

The solution is to change the MTU under the network:

    driver_opts:
      com.docker.network.driver.mtu: 1280

Docker defaults to an MTU of 1500, while Tailscale uses 1280, so the oversized Docker packets are being silently dropped.
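
In your compose file that goes under the network’s existing driver_opts, e.g.:

networks:
  test-network:
    driver: overlay
    attachable: true
    driver_opts:
      encrypted: "false"
      com.docker.network.driver.mtu: 1280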

Packet size limits can also cause connection problems on certain types of networks.

Tailscale uses an MTU of 1280. If there are other interfaces which might send packets larger than this, those packets might get dropped silently. This can be verified by using tcpdump.

In order to solve this, we can set the MTU at the LAN level to a lower value, or use MSS (Maximum Segment Size) clamping…
Troubleshooting guide · Tailscale Docs (unable-to-make-a-tcp-connection-between-two-nodes)
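
To check for this, you can watch the overlay traffic on the Tailscale interface of both hosts while pinging, something like this (tailscale0 is the default interface name on Linux, but yours may differ):

# VXLAN is the overlay data plane (UDP 4789); ESP (IP protocol 50) shows up when the overlay is encrypted
sudo tcpdump -ni tailscale0 'udp port 4789 or ip proto 50'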


Hi @extwhiskey, thanks for the tip! This didn’t help, but I will keep the option, since it should improve performance.

Here’s what I did:

Running ping 10.0.1.8 -i 0.1 -s 1000 from test1 to test2’s IP on the test network; here’s what tcpdump shows:

...
09:21:30.545778 IP managervm.xxx.ts.net > workervm.xxx.ts.net: ESP(spi=0x93132b84,seq=0x216), length 1084
09:21:30.646006 IP managervm.xxx.ts.net > workervm.xxx.ts.net: ESP(spi=0x93132b84,seq=0x217), length 1084
09:21:30.746171 IP managervm.xxx.ts.net > workervm.xxx.ts.net: ESP(spi=0x93132b84,seq=0x218), length 1084
...

Same, but to test2’s ingress IP:

...
09:23:15.741878 IP workervm.xxx.ts.net.46629 > managervm.xxx.ts.net.4789: VXLAN, flags [I] (0x08), vni 4096
IP 10.0.0.22 > 10.0.0.19: ICMP echo reply, id 122, seq 41, length 1008
09:23:15.841279 IP managervm.xxx.ts.net.44570 > workervm.xxx.ts.net.4789: VXLAN, flags [I] (0x08), vni 4096
IP 10.0.0.19 > 10.0.0.22: ICMP echo request, id 122, seq 42, length 1008
09:23:15.842143 IP workervm.xxx.ts.net.46629 > managervm.xxx.ts.net.4789: VXLAN, flags [I] (0x08), vni 4096
IP 10.0.0.22 > 10.0.0.19: ICMP echo reply, id 122, seq 42, length 1008
...

Odd, since I disabled encryption… So I removed the encrypted option entirely, and now it works! I guess the mere presence of the key enables encryption, even when it’s set to “false”? And for some reason ESP traffic doesn’t make it through over Tailscale?
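
For reference, the networks section that works for me now is simply this (the encrypted key dropped entirely, the MTU option kept from the suggestion above):

networks:
  test-network:
    driver: overlay
    attachable: true
    driver_opts:
      com.docker.network.driver.mtu: 1280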

Doing a simple ping is not enough. You should use a payload of >= 1500 bytes to make sure you don’t have any MTU issues.
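
For example, from inside one of the containers (busybox ping in Alpine supports -s; a 1500-byte payload plus ICMP and IP headers gives a 1528-byte packet, well above Tailscale’s 1280 MTU; the target here is test2’s test-network address from the output above):

# oversized payload to exercise the path MTU on the overlay
ping -c 3 -s 1500 10.0.1.10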