Separate containers using UDP get separate NAT addresses.
Actual behavior
Both containers are allocated the same source port in the NAT. The NAT has no way to tell which container a reply packet is destined for.
Information
the output of:
Version 1.12.0-rc2-beta17 (build: 9779)
ff18c0c63c5ff3c4a4a925d191d5592d655779d7
host distribution and version ( OSX 10.10.x, OSX 10.11.x, Windows, etc )
OSX El Capitan
Steps to reproduce the behavior
On the host network interface of the mac, run tcpdump
In container A, run
netcat -p 8888 -u 1.1.1.1 8888
Type a few lines to send packets
Repeat netcat operation for container B
In the tcpdump output, note that all packets originate from the same IP and port. There is no distinction between the two containers. For bidirectional protocols, the NAT is unable to return replies to the correct container.
14:34:54.276409 IP hostname.61990 > 1.1.1.1.ddi-udp-1: UDP, length 5
14:35:52.245710 IP hostname.61990 > 1.1.1.1.ddi-udp-1: UDP, length 5
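If netcat is not available in the container image, a rough Python equivalent of the netcat step can be used. This is only a sketch: `send_probe` is a name made up for this example, and the default payload is arbitrary.

```python
import socket

def send_probe(dest, src_port=8888, payload=b"test\n"):
    """Send one UDP datagram to dest from a fixed local source port.

    Returns the local port the datagram was actually sent from.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        # Pin the source port, like `netcat -p 8888`.
        sock.bind(("0.0.0.0", src_port))
        sock.sendto(payload, dest)
        return sock.getsockname()[1]
```

Calling `send_probe(("1.1.1.1", 8888))` in container A and then in container B should produce the same kind of traffic as the netcat commands above.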
This NAT implementation is broken for UDP. Every combination of private source address and private source port should be mapped to a unique external source address and port. (It is not necessary, and in fact not desirable, to map to a separate source port per destination address: that is so-called “symmetric NAT”, as opposed to “full cone”. “Full cone” is greatly preferable for NAT traversal techniques.)
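To make the expected mapping concrete, here is a toy model of the behavior described above. It is a sketch only, not the Docker or xhyve code: the class name `ConeNat`, the starting port, and the addresses in the usage note are all invented for illustration.

```python
import itertools

class ConeNat:
    """Toy NAT: one external port per (private_ip, private_port) pair."""

    def __init__(self, external_ip, first_port=50000):
        self.external_ip = external_ip
        self._next_port = itertools.count(first_port)
        self._out = {}  # (private_ip, private_port) -> external port
        self._in = {}   # external port -> (private_ip, private_port)

    def translate_outbound(self, private_ip, private_port):
        key = (private_ip, private_port)
        if key not in self._out:
            # First packet from this private endpoint: allocate a fresh
            # external port. The destination plays no part in the mapping.
            port = next(self._next_port)
            self._out[key] = port
            self._in[port] = key
        return (self.external_ip, self._out[key])

    def translate_inbound(self, external_port):
        # A reply is routed back purely by the external port it arrives on.
        return self._in.get(external_port)
```

With hypothetical container addresses 172.17.0.2 and 172.17.0.3 both sending from port 8888, `translate_outbound` returns two different external ports, so `translate_inbound` can route each reply to the right container. If both were given the same external port, the reverse lookup would be ambiguous, which is the failure reported here.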
Actually, I have to withdraw that assertion. The problem is not within the Linux VM; it’s in how the Linux VM is bridged to the host networking.
Looking at tcpdump inside the xhyve VM, you can see that the two containers have been NAT’ted to separate source ports:
root@moby:~# tcpdump -i eth0 host 1.1.1.1
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes
16:22:12.838729 IP 192.168.65.2.15547 > 1.1.1.1.8888: UDP, length 5
16:22:15.672820 IP 192.168.65.2.15548 > 1.1.1.1.8888: UDP, length 5
Listening at the same time on the mac’s native ethernet, you can see that these two packets have had their source port altered again, and both to the same port:
$ sudo tcpdump -i en4 host 1.1.1.1
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on en4, link-type EN10MB (Ethernet), capture size 262144 bytes
11:22:41.324018 IP hostname.56249 > 1.1.1.1.ddi-udp-1: UDP, length 5
11:22:43.382804 IP hostname.56249 > 1.1.1.1.ddi-udp-1: UDP, length 5
So the first layer of address translation, which happens inside the Linux VM, is fine. It’s the xhyve NAT that’s broken.
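The comparison between the two captures can be automated with a small script that counts distinct source endpoints in tcpdump output. This is a sketch: `distinct_sources` is a made-up helper, and the regular expression assumes the default `IP src.port > dst.port:` line format with a numeric source port.

```python
import re

# Matches the source endpoint in lines like
# "16:22:12.838729 IP 192.168.65.2.15547 > 1.1.1.1.8888: UDP, length 5"
_SRC = re.compile(r"IP (\S+)\.(\d+) > ")

def distinct_sources(lines):
    """Return the set of (source host, source port) pairs seen in the capture."""
    sources = set()
    for line in lines:
        match = _SRC.search(line)
        if match:
            sources.add((match.group(1), int(match.group(2))))
    return sources
```

Fed the two packet lines from the capture inside the VM, this yields two source endpoints (ports 15547 and 15548). Fed the two lines from the en4 capture, it yields only one (port 56249), which shows the two flows being collapsed at the second translation layer.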