I noticed failed network operations when adding packages to an Ubuntu 18.04 image, then discovered that my host networking had also failed. It appears to be caused by an extra default route that docker installs after some period of network activity (typically 10-30 seconds).
Reproducible using a simple bash container and just pinging an external host for a while.
Routes before:
default via 192.168.99.254 dev enp0s31f6
default via 192.168.99.254 dev enp0s31f6 proto dhcp src 192.168.99.193 metric 100
172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1 linkdown
192.168.99.128/25 dev enp0s31f6 proto kernel scope link src 192.168.99.193 metric 100
Run a basic bash container (docker run -it --rm bash) and ping an external host; after ~30 seconds the routes are now:
0.0.0.0 dev vetha636f44 scope link
default dev vetha636f44 scope link
default via 192.168.99.254 dev enp0s31f6
default via 192.168.99.254 dev enp0s31f6 proto dhcp src 192.168.99.193 metric 100
169.254.0.0/16 dev vetha636f44 proto kernel scope link src 169.254.240.111
172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1
192.168.99.128/25 dev enp0s31f6 proto kernel scope link src 192.168.99.193 metric 100
Note the new routes on dev veth* … external networking is broken at this point; it can be recovered by removing the default dev veth* route.
Once the bad route is removed, networking is restored, but exiting the container and restarting it will cause the same failure again.
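For reference, the recovery step can be sketched like this (the veth name vetha636f44 is taken from the route listing above and will differ on every container start):

```shell
# Remove the bogus routes that appeared on the veth interface.
# Substitute your own veth* name from `ip route`.
ip route del default dev vetha636f44 scope link
ip route del 0.0.0.0 dev vetha636f44 scope link
```

Both commands need root; the second entry may not exist on every occurrence, in which case it can be skipped.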
I don't see this on an almost identical system running docker-ce 24.0.5, build ced0996.
Default routes should not be added, but I don't think 169.254.0.0/16 should be added to the routes either. That is the link-local range, usually self-assigned when a machine's DHCP client could not obtain an address, and I think I have seen it used for other special cases as well. It should not depend on starting a container. If it does, my guess is that the container could not get an IP address from the original range for some reason and got another one somehow, but I have no idea how or why.
Where is your Debian 12 host machine? Is it in a cloud? I think I saw a similar issue before, but I only have a vague memory about that, I think that issue was related to a cloud provider but I am not sure.
What happens if you create a custom docker network and use that instead of the default bridge network?
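A custom network can be tried with something like the following (the network name testnet is just an example):

```shell
# Create a user-defined bridge network and run the container on it
docker network create --driver bridge testnet
docker run -it --rm --network testnet bash

# Clean up afterwards
docker network rm testnet
```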
It is a local Debian 12 machine. I also saw some issues with cloud instances while searching, but the solutions didn't seem applicable here.
I tried a non-default bridge network, same issue as before. Using --network=host is OK though and might be my fallback … weird that my other server works fine with default (bridge) networking; I haven't found a significant difference yet.
Using host networking can't be a fallback. You can compare your two machines: what was installed on one that wasn't on the other, or how the configuration differs.
Thanks, that link appears to be a very similar problem. Their solution was blacklisting in connman, which I'm not using (NetworkManager instead), and when I investigated blacklisting in NM it seems the veth* devices are already unmanaged?
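For anyone checking the same thing, NetworkManager's view of the devices can be inspected like this (the device name below is just an example):

```shell
# List devices and their management state; veth* entries
# should show "unmanaged" in the STATE column
nmcli device status

# If one is not, it can be explicitly set unmanaged
nmcli device set vetha636f44 managed no
```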
Comparing with my working host, the primary eth0 was not managed by NM either, so I tried that change but saw no improvement. It turns out network activity in the container is not required: I can start bash and do nothing, and eventually those extra routes for veth* appear, despite the fact that NM isn't managing them.
Something must be starting them for the docker containers, though; I will need to identify what … it is always about 30 seconds after the container starts.
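One way to catch the culprit (a sketch; which service shows up is the thing being investigated) is to watch the journal and the route table while a container starts:

```shell
# In one terminal: follow system logs for anything touching veth devices
journalctl -f | grep -i veth

# In another: watch for the bad route to appear (~30 s after start)
watch -n 1 "ip route | grep veth"

# Then start the container
docker run -it --rm bash
```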
I meant comparing installed packages, running services, maybe the content of /etc. This is one way to write all the content of /etc into a single file so you can use the diff command to see the difference between the two servers' etc folders.
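As a sketch of that comparison (filenames are placeholders), on each server:

```shell
# Installed packages, one line each
dpkg -l | awk '{print $2, $3}' > packages-$(hostname).txt

# Checksums of everything under /etc, sorted so diff lines up
find /etc -type f -exec md5sum {} + 2>/dev/null | sort -k2 > etc-$(hostname).txt

# Copy both files to one machine, then:
diff packages-serverA.txt packages-serverB.txt
diff etc-serverA.txt etc-serverB.txt
```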
It turns out the problem was connman! … I didn't think it was running, but apparently it was, likely because I had installed multiple desktop environments on this system.
After fixing the blacklist in /etc/connman/main.conf and restarting the daemon, it is working.
It is interesting that many of the issues found via google remain unresolved. Thanks for your assistance in moving this into the right column!
Just installed a clean Debian 12 with Docker and ran into the very same issue. After some seconds the connection drops, and it only returns if I stop the Docker container(s).
This issue is driving me insane!
What should the value of "NetworkInterfaceBlacklist" be?
EDIT: Figured it out. What worked for me was:
# Open '/etc/connman/main.conf', uncomment NetworkInterfaceBlacklist and change to:
NetworkInterfaceBlacklist = vmnet,vboxnet,docker,veth
# Restart connman daemon
systemctl restart connman.service
# Start your container - issue should be gone
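To confirm the fix took effect, a quick check (a sketch; the container name probe is arbitrary):

```shell
# Start a throwaway container and wait past the ~30 s failure window
docker run -d --rm --name probe bash sleep 90
sleep 45

# No "default dev veth*" route should appear anymore
ip route | grep 'default dev veth' || echo "routes clean"

docker rm -f probe
```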