Connectivity Issue with Docker Containers During Daemon Upgrades

Hello everyone,

I’m currently facing an issue with Docker on my virtual machine and would appreciate your insights. Here are the specifics:

Environment: Docker version 26.0.2 running on a virtual machine.
Problem: When upgrading the Docker daemon, there’s a disruption in connectivity to containers within the same network.

During Docker daemon upgrades (with live restore enabled), there’s a period where containers lose connectivity, impacting service availability.

Could this be due to a brief disruption in network connectivity and DNS resolution for containers? It seems the embedded DNS resolver in the Docker daemon, responsible for these services, might be temporarily unavailable during upgrades.

Any suggestions on how to mitigate this issue? I’m open to ideas and best practices to ensure smoother upgrades and maintain continuous connectivity for my containers.

Thank you for your help!

Docker on a single node is not HA (“high availability”).

If CPU, RAM, disk, network or power supply die, it’s gone.

Same for OS and underlying software upgrades.

If you want HA, you should look into multiple machines (with Docker Swarm or k8s).


I’m not sure, because I have always expected at least a very short downtime on a single machine, but yes, that could easily be the reason. The best practice depends on what you want to achieve. @bluepuma77 pointed out what you would need to do to reduce downtime. Even with a cluster, some connections could still break, but web browsers often retry and the next request should succeed, so the user might see a “not found” or similar message and, a second later, the website loading from another instance.

Short connection issues should probably be handled by the application, but I would not upgrade the OS or Docker while you have the most visitors, or while an important process that must not be interrupted is running.
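To illustrate what "handled by the application" could look like, here is a minimal Python sketch of a retry wrapper with exponential backoff. The function name `call_with_retry` and all of its parameters are my own invention for this example, not part of Docker or any library, and the delays are guesses you would tune for your service.

```python
import random
import time


def call_with_retry(operation, attempts=5, base_delay=0.2, max_delay=5.0,
                    retry_on=(OSError,), sleep=time.sleep):
    """Call operation() and retry transient network errors with backoff.

    OSError covers ConnectionError, TimeoutError and socket.gaierror
    (failed DNS lookups), which are the errors you would expect to see
    while the daemon and its DNS resolver are briefly unavailable.
    """
    for attempt in range(attempts):
        try:
            return operation()
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            # Exponential backoff, capped at max_delay, with some jitter
            # so that many clients do not retry at the same instant.
            delay = min(max_delay, base_delay * (2 ** attempt))
            sleep(delay * random.uniform(0.5, 1.0))
```

You would wrap the call that talks to the other container, for example `call_with_retry(lambda: fetch_from_backend())`, where `fetch_from_backend` is a placeholder for your own function. This only hides short interruptions; it does not help if the upgrade itself fails.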

If you just want to avoid users noticing error messages, you can prepare a static maintenance web page on a fallback web server and use an external proxy service to redirect traffic to the fallback port or IP, so people can see that maintenance is in progress. Since Docker upgrades can fail, that would not be a bad idea at all.
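As a rough sketch of such a fallback, the following uses only the Python standard library and runs outside Docker, so it keeps working while the daemon is down. The port 8081, the page text and the `Retry-After` value are assumptions for the example; how you point your external proxy at it depends on the proxy you use.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

MAINTENANCE_HTML = (
    b"<!doctype html><html><head><title>Maintenance</title></head>"
    b"<body><h1>Maintenance in progress</h1>"
    b"<p>We will be back shortly.</p></body></html>"
)


class MaintenanceHandler(BaseHTTPRequestHandler):
    """Answer every GET with the same static page and a 503 status."""

    def do_GET(self):
        # 503 tells clients and crawlers that the outage is temporary.
        self.send_response(503)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(MAINTENANCE_HTML)))
        self.send_header("Retry-After", "120")  # seconds, an arbitrary guess
        self.end_headers()
        self.wfile.write(MAINTENANCE_HTML)

    def log_message(self, format, *args):
        pass  # keep the console quiet


def make_server(host="0.0.0.0", port=8081):
    """Create the fallback server; call .serve_forever() on the result."""
    return HTTPServer((host, port), MaintenanceHandler)
```

Start it before the upgrade with `make_server().serve_forever()`, switch the proxy to the fallback port, and switch back once the containers are reachable again.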

Note that my answer is mostly based on guesses. A better answer would require a better understanding of what exactly happens when you see connection issues between containers, which was part of your question :slight_smile:
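One way to find out would be to run a small probe inside one container during the upgrade and record whether name resolution or the TCP connection fails. This is only a sketch: the function names `probe` and `watch` are made up for this post, and it checks plain TCP reachability, nothing application-specific.

```python
import socket
import time


def probe(host, port, timeout=1.0):
    """Return 'ok', 'dns_failed' or 'connect_failed' for one attempt."""
    try:
        # Inside a container on a user-defined network this lookup goes
        # through Docker's embedded DNS resolver.
        infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror:
        return "dns_failed"
    family, socktype, proto, _canonname, addr = infos[0]
    try:
        with socket.socket(family, socktype, proto) as sock:
            sock.settimeout(timeout)
            sock.connect(addr)
    except OSError:
        return "connect_failed"
    return "ok"


def watch(host, port, interval=0.5, rounds=None):
    """Probe repeatedly and print a timestamped line on each state change."""
    last = None
    done = 0
    while rounds is None or done < rounds:
        state = probe(host, port)
        if state != last:
            print(f"{time.strftime('%H:%M:%S')} {host}:{port} -> {state}")
            last = state
        done += 1
        time.sleep(interval)
```

For example, `watch("db", 5432)` would watch a container named `db` (a hypothetical name and port) until you stop it. If the output shows `dns_failed` during the upgrade, the embedded resolver is the likely culprit; if it shows `connect_failed`, the problem is on the network path or in the target container.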
