What is the canonical way to use IPv6 with docker?

Hi,

I am following up on my issue from June, Update to docker-ce 28.2.2 breaks bridge networking to container, which was not solved back then. After half a year, I now want to upgrade to current docker-ce and current Debian, and I now have docker-ce 5:29.1.3-1~debian.13~trixie and containerd.io 2.2.1-1~debian.13~trixie.

I have rebuilt networking with IPv4 and NAT only so that my services are up. Now I want to do it correctly, so I need someone who is familiar with using IPv6 and docker. Please tell me if I want to do something that is wrong or will not work; I am willing to adapt.

My docker host is on 192.168.196.111/24, with the IPv6 address 2001:db8:1:196::6f:100/64:

2: enp1s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 52:54:00:10:95:dc brd ff:ff:ff:ff:ff:ff
    altname enx5254001095dc
    inet 192.168.196.111/24 brd 192.168.196.255 scope global enp1s0
       valid_lft forever preferred_lft forever
    inet6 2001:db8:1:196:5054:ff:fe10:95dc/64 scope global deprecated dynamic mngtmpaddr noprefixroute 
       valid_lft 85930sec preferred_lft 0sec
    inet6 2001:db8:1:196::6f:100/64 scope global 
       valid_lft forever preferred_lft forever
    inet6 fe80::5054:ff:fe10:95dc/64 scope link proto kernel_ll 
       valid_lft forever preferred_lft forever

From the IPv4 network, I have reserved 192.168.196.128/28 and from the IPv6 network 2001:db8:1:196:6f::/80 for the docker containers. The gateway has routes for those two subnets to 192.168.196.111 and 2001:db8:1:196::6f:100, respectively. When I tcpdump on enp1s0 and ping addresses from the two subnets, I see those packets coming in on enp1s0 with the correct target MAC address, so I can be sure that the routing works. ip_forward and IPv6 forwarding are on for all interfaces, and rp_filter is off.
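
For reference, the forwarding settings on the docker host and the routes on the gateway boil down to roughly this (the gateway is not necessarily a Linux box, so take its route syntax as illustration only):

# on the docker host
sysctl -w net.ipv4.ip_forward=1
sysctl -w net.ipv6.conf.all.forwarding=1
sysctl -w net.ipv4.conf.all.rp_filter=0

# on the gateway, in Linux syntax
ip route add 192.168.196.128/28 via 192.168.196.111
ip -6 route add 2001:db8:1:196:6f::/80 via 2001:db8:1:196::6f:100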

If I were working with KVM VMs, I’d have the VMs attached to a macvlan interface on the host, just configure the IP addresses in the VM and be fine. I’d like to have that for my docker containers as well, so that I can treat them similarly to my KVM VMs. Some of the services I’d like to run are not web services, so I can’t take the usual reverse proxy approach.

I’d like to have the IPv6 address configured inside the container, and while I’m at it, I’d like to have the IPv4 addresses without NAT as well, so that accessing the container works the same way for both protocols.

I understand this might be stretching it because the IP ranges overlap, but IP routing should handle this well. I would at least be surprised if docker did not allow me to do this. It’s all network namespaces, IP routing and NAT, isn’t it?

My test application is the mosquitto MQTT broker; it’s small and easy enough to be low-hanging fruit. With IPv4 only and NAT, this is what works:

volumes:
  data:
  log:

services:
  mosquitto_app:
    image: eclipse-mosquitto:latest
    ports:
      - "1883:1883"
    volumes:
      - ${PWD}/mosquitto.conf:/mosquitto/config/mosquitto.conf
      - data:/mosquitto/data
      - log:/mosquitto/log
    restart: always
    logging:
      driver: json-file
      options:
        max-size: "10m"
        max-file: "3"

I THINK that what I want is:

networks:
  br196:
    name: bridge
    external: true

volumes:
  data:
  log:

services:
  mosquitto_app:
    image: eclipse-mosquitto:latest
    networks:
      br196:
        ipv4_address: 192.168.196.132
        ipv6_address: 2001:db8:1:196:6f:0:6f:203
    volumes:
      - ${PWD}/mosquitto.conf:/mosquitto/config/mosquitto.conf
      - data:/mosquitto/data
      - log:/mosquitto/log
    restart: always
    logging:
      driver: json-file
      options:
        max-size: "10m"
        max-file: "3"

The bridge for br196 would need to have the IPv4 address 192.168.196.129 to be used as the gateway; for IPv6 I’d like to have the usual fe80::1 on br196.

I can’t seem to figure out the correct network definition and how to make docker grok this. Here is what I tried:

mh@corte:~/mosquitto $ docker network create br196 --ipv4 --ipv6 --subnet "196.168.196.128/28" --subnet "2001:db8:1:196:16f::/80" --gateway 192.168.196.129 --gateway fe80::1
no matching subnet for gateway 192.168.196.129

mh@corte:~/mosquitto $ docker network create br196 --ipv4 --ipv6 --subnet "196.168.196.128/28" --subnet "2001:db8:1:196:16f::/80" --gateway 192.168.196.129/28 --gateway fe80::1/80
invalid argument "192.168.196.129/28" for "--gateway" flag: invalid string being converted to IP address: 192.168.196.129/28

Usage:  docker network create [OPTIONS] NETWORK

Run 'docker network create --help' for more information

mh@corte:~/mosquitto $ docker network create br196 --ipv4 --ipv6 --subnet "196.168.196.128/28" --subnet "2001:db8:1:196:16f::/80" --gateway fe80::1
no matching subnet for gateway fe80::1

mh@corte:~/debian $ docker network create br196 --ipv4 --ipv6 --subnet "196.168.196.128/28" --subnet "2001:db8:1:196:16f::/80" 
8dac062020194226cc7a23e055baeac9f196e2548a5f4e1b89d3f1d4b119641e

mh@corte:~/debian $ docker compose up
Error response from daemon: invalid config for network br196: invalid endpoint settings:
no configured subnet contains IP address 192.168.196.132

This looks like docker is trying to be smarter than me and is imposing restrictions that are not actually needed. Which of my wishes am I not going to get with docker?

Or am I better off just running the containers with macvtap? How would I configure that?

Greetings, Marc

P.S.: I have been doing IP routing for 30 years and have 15 years of operational IPv6 experience. I know what I want to do, and I think that’s a clean way to do things. I just need somebody to tell me how far docker is going to play along.

Apart from testing, I have never configured IPv6 for my docker hosts.

Though, I can confirm that an IPv6-enabled bridge network with default values will use NAT mode for IPv6 and a random ULA subnet for the bridge network. Outgoing traffic will use one of the host’s IPv6 addresses.
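
If you want to see which ULA subnet the daemon picked, inspecting a throwaway network should show it (ulatest is just a placeholder name):

docker network create --ipv6 ulatest
docker network inspect -f '{{json .IPAM.Config}}' ulatest
docker network rm ulatest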

This behavior can be changed by using the option com.docker.network.bridge.gateway_mode_ipv6 (see: Bridge network driver options) and setting it to the desired gateway mode. An identical setting exists for IPv4 as well.
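
For example, a routed network could be created roughly like this (network name and subnet are placeholders, not a tested recommendation):

docker network create --ipv6 \
  --subnet 2001:db8:2::/64 \
  -o com.docker.network.bridge.gateway_mode_ipv4=routed \
  -o com.docker.network.bridge.gateway_mode_ipv6=routed \
  routed_test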

The smallest subnets I ever used with IPv6 were /64 masks, since the last 64 bits are used for the interface identifier. I would recommend starting with a /64 subnet before trying exotic masks like /80 that might introduce new problems.

I just saw your note about working for 15 years with IPv6, so you definitely know this, but I will keep it around for everyone else:

I really hope you use 2001:db8 as a placeholder to obscure your real fixed GUA or ULA addresses. According to RFC 3849, “IPv6 Address Prefix Reserved for Documentation”, the prefix 2001:db8::/32 is meant to be used as a placeholder in docs, instructions, blog posts and the like, but not in a real network configuration.

I played around with it.
My main router has the ULA fd00:0:0:200::1

docker network create --ipv6 --subnet fd00:0:0:200::/64 -o com.docker.network.bridge.gateway_mode_ipv6=routed ipv6_test

docker run -ti --rm --network ipv6_test alpine ping -c4 fd00:0:0:200::1
PING fd00:0:0:200::1 (fd00:0:0:200::1): 56 data bytes
64 bytes from fd00:0:0:200::1: seq=0 ttl=64 time=0.132 ms
64 bytes from fd00:0:0:200::1: seq=1 ttl=64 time=0.098 ms
64 bytes from fd00:0:0:200::1: seq=2 ttl=64 time=0.095 ms
64 bytes from fd00:0:0:200::1: seq=3 ttl=64 time=0.071 ms

--- fd00:0:0:200::1 ping statistics ---
4 packets transmitted, 4 packets received, 0% packet loss
round-trip min/avg/max = 0.071/0.099/0.132 ms

Note: When I tried with a network that has the subnet fd01::/64 and created an IPv6 route for fd01::/64 in my main router via the IPv6 ULA fd00:0:0:200:x:x:x:x of the docker host, it didn’t work. The network uses fd01::1 as gateway (which doesn’t exist and isn’t provided by docker), and it does not allow setting an IPv6 address outside the specified subnet.

Update:

I missed it yesterday: the IP fd01::1 does exist, on the bridge interface of the network. I can ping it from the container and from other devices (using a route in the main router via the fe80::/64 link-local address of the docker host). I am able to ping a test nginx container from the host on fd01::2, but I am not able to ping it from other devices.
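
In Linux syntax, that route in the main router would look roughly like this (the link-local address and the interface name are placeholders; with a link-local next hop the outgoing interface has to be specified):

ip -6 route add fd01::/64 via fe80::xxxx:xxxx:xxxx:xxxx dev lan0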

me@other-host:~$ traceroute6 fd01::1
traceroute to fd01::1 (fd01::1) from fd00::200:yyyy:yyyy:yyyy:yyyy, 30 hops max, 24 byte packets
 1  router.lan (fd00:0:0:200::1)  0.3723 ms  0.2504 ms  0.2389 ms
 2  fd01::1 (fd01::1)  0.3671 ms  0.3383 ms  0.1789 ms

me@other-host:~$ traceroute6 fd01::2
traceroute to fd01::2 (fd01::2) from fd00::200:yyyy:yyyy:yyyy:yyyy, 30 hops max, 24 byte packets
 1  router.lan (fd00:0:0:200::1)  0.4671 ms  0.3261 ms  0.2260 ms
 2  fd00::200:xxxx:xxxx:xxxx:xxxx(fd00::200:xxxx:xxxx:xxxx:xxxx)  0.3737 ms  0.2824 ms  0.1872 ms
 3  * * *
 4  * * *
 5  * * *
 6  * * *

My test with fd00:0:0:200::1 was also incorrect, as the ping never went to my main router.

It is getting even funnier: when the nginx container publishes a port, it becomes pingable, and the container’s published ports become accessible from other devices:

Create container with published ports:

docker run -d --rm --network ipv6_test -p 8080:80 nginx

Access from other device:

me@other-host:~$ ping -c4 fd01::2
PING fd01::2(fd01::2) 56 data bytes
64 bytes from fd01::2: icmp_seq=1 ttl=63 time=0.429 ms
64 bytes from fd01::2: icmp_seq=2 ttl=63 time=0.416 ms
64 bytes from fd01::2: icmp_seq=3 ttl=63 time=0.653 ms
64 bytes from fd01::2: icmp_seq=4 ttl=63 time=0.594 ms

--- fd01::2 ping statistics ---
4 packets transmitted, 4 received, 0% packet loss, time 3055ms
rtt min/avg/max/mdev = 0.416/0.523/0.653/0.102 ms
me@other-host:~$ curl [fd01::2]
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
<style>
html { color-scheme: light dark; }
body { width: 35em; margin: 0 auto;
font-family: Tahoma, Verdana, Arial, sans-serif; }
</style>
</head>
<body>
<h1>Welcome to nginx!</h1>
<p>If you see this page, the nginx web server is successfully installed and
working. Further configuration is required.</p>

<p>For online documentation and support please refer to
<a href="http://nginx.org/">nginx.org</a>.<br/>
Commercial support is available at
<a href="http://nginx.com/">nginx.com</a>.</p>

<p><em>Thank you for using nginx.</em></p>
</body>
</html>

Note: traceroute6 still fails.