Packet Loss Between Containers

I’m having issues with networking between containers in my Docker setup. I noticed the issue by exec’ing into a container and trying to ping another container on the same Docker network.

The packet loss appears seemingly at random. When it happens it affects all containers and all pinging/communication between containers, and I cannot for the life of me figure out why; it’s almost like Docker’s internal DNS starts to die over time. Creating a new Docker network and moving all containers to it immediately resolves the issue, but only for a finite amount of time. It could be days, it could be months.
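To separate a DNS failure from genuine packet loss, this is roughly what I run when the problem shows up (a sketch; it assumes the caddy and audiobookshelf containers from the network inspect output further down, and that the image ships ping and nslookup):

root@nas:~# docker exec -it caddy ping -c 5 audiobookshelf             # resolves the name via Docker's DNS, then pings
root@nas:~# docker exec -it caddy ping -c 5 172.19.0.14                # pings the same container by IP, bypassing DNS
root@nas:~# docker exec -it caddy nslookup audiobookshelf 127.0.0.11   # queries Docker's embedded DNS directly

When pinging by IP is clean but pinging by name fails or stalls, that points at name resolution rather than actual packet loss on the bridge.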

I have considered that this may be a Caddy issue, but I don’t think it relates to Caddy for 2 reasons:

  1. Exec’ing into one container and pinging another should bypass Caddy’s actual functionality entirely and should “just work” as far as Docker is concerned.
  2. When the issue appears, it affects other containers as well, e.g. audiobookshelf <-> jellyfin.

System details:

Debian 12 (Bookworm) running OpenMediaVault 7.4.13, with docker-ce 5:27.3.1-1~debian.12~bookworm.

Docker network inspect:

root@nas:~# docker network inspect caddy_network
[
    {
        "Name": "caddy_network",
        "Id": "587d44cbc57255715c708a3f05017a26f37050d15bcd3eb21def8b821df0a211",
        "Created": "2024-11-17T17:35:00.780975271-07:00",
        "Scope": "local",
        "Driver": "bridge",
        "EnableIPv6": false,
        "IPAM": {
            "Driver": "default",
            "Options": null,
            "Config": [
                {
                    "Subnet": "172.19.0.0/16",
                    "Gateway": "172.19.0.1"
                }
            ]
        },
        "Internal": false,
        "Attachable": false,
        "Ingress": false,
        "ConfigFrom": {
            "Network": ""
        },
        "ConfigOnly": false,
        "Containers": {
            "977e732afe8b57070ea8f97770ed1e4ed5c82ab08af328996ff7bc45598d618b": {
                "Name": "audiobookshelf",
                "EndpointID": "64aab0e717bbb62ecfc76cf952330809e87db30f9a24dd65a17f520850cc9678",
                "MacAddress": "02:42:ac:13:00:02",
                "IPv4Address": "172.19.0.14/16",
                "IPv6Address": ""
            },
            "b77172af768c69a4aa670c9baa15b2224270723ff5a25890ff4bd61b3bbcf1d2": {
                "Name": "caddy",
                "EndpointID": "2ad1a8b38161ee4726d172e16902b72aa5c63e3cc730809c2631b2070c33c0ca",
                "MacAddress": "02:42:ac:13:00:0a",
                "IPv4Address": "172.19.0.10/16",
                "IPv6Address": ""
            }
        },
        "Options": {},
        "Labels": {}
    }
]

Things that come to mind:

  • A collision between the container network IP ranges and the physical network IP ranges (a quick check is sketched after this list)
  • A container with --privileged or the NET_ADMIN capability running wild (like a VPN container?)
  • The host has a failover IP (CARP/VRRP), or its DNS server or default gateway IP is a failover IP.
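For the first point, comparing the Docker subnets with what the host routes looks roughly like this (a sketch, using the caddy_network name from your inspect output):

root@nas:~# docker network inspect -f '{{range .IPAM.Config}}{{.Subnet}}{{"\n"}}{{end}}' caddy_network
root@nas:~# ip -4 addr show   # addresses assigned to the host's interfaces
root@nas:~# ip route          # a 172.19.0.0/16 route on anything other than the Docker bridge (docker0/br-...) would be a collision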

Though, there is not really enough information here to get a real idea of what’s going on.

You might want to check the daemon logs: Read the daemon logs | Docker Docs
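On a systemd-based Debian host like yours the daemon logs land in the journal, so (assuming the standard docker.service unit installed by the docker-ce package) something along these lines:

root@nas:~# journalctl -u docker.service -f                    # follow the daemon log live
root@nas:~# journalctl -u docker.service --since "1 hour ago"  # or review a recent window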

Thanks for the guidance. I’m not seeing any issues related to the 3 points you made:

  • There are no other IP ranges on my network using 172.19.x.x.
  • I went through all my containers and none are privileged, nor do any have any cap_add entries whatsoever (a one-liner that does this check is sketched after this list).
  • As far as I’m aware I’ve never set up failover IPs on the host machine, so that should not be the case.
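For anyone wanting to reproduce the check, a one-liner along these lines prints the privileged flag and any added capabilities for every running container (a sketch; not necessarily how I originally went through them):

root@nas:~# docker ps -q | xargs docker inspect -f '{{.Name}}: privileged={{.HostConfig.Privileged}} cap_add={{.HostConfig.CapAdd}}'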

With regards to the daemon logs, they are pretty much getting spammed with “No non-localhost DNS nameservers are left in resolv.conf. Using default external servers” messages. Could that be a clue?
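In case it is relevant, comparing what the host and a container are configured with can be done along these lines (a sketch, using the caddy container from the network above; 127.0.0.11 is Docker’s embedded DNS resolver inside containers):

root@nas:~# cat /etc/resolv.conf                         # nameservers configured on the host
root@nas:~# docker exec -it caddy cat /etc/resolv.conf   # what the container is handed (normally nameserver 127.0.0.11)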

Linked issue with potential cause for anyone with a similar issue:
