My team is facing an issue with Docker on Windows 11.
Context:
We have a monorepo with several microservices, and locally we run Docker containers for Postgres, Hasura, and Weaviate via a couple of docker-compose configs.
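Roughly, we bring the stack up like this (file names and ports below are only illustrative, not our exact setup):

```
# file names are made up, but this is the shape of it
docker compose -f docker-compose.infra.yml -f docker-compose.services.yml up -d
# the DB ports are published to the host, e.g. Postgres on localhost:5432
docker compose ps
```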
The application we develop requires some data to be seeded into all of our databases (we use our own API to import this data; we don't use any native DB tools for restoring/importing backups or dumps).
Problem:
The problem is that on Windows 11, after the data import is done and we start the application, the Docker containers become unreachable within the first few calls. There is no significant load and no logs indicating a problem, but requests to and between them fail.
It is almost as if something temporarily blocks the network traffic to these containers (we disabled all firewalls and antivirus software). Restarting the deployments, restarting Docker, and in some cases even restarting the machine (if done quickly enough after they become unreachable) didn't make those containers reachable again.
Software Versions:
Docker Desktop version: v4.39.0
OS: Windows 11 Home 24H2 26100.3476
It is going to be tough to provide the exact context you would need to reproduce this, because it involves (too) much of our codebase, but I'm hoping someone may have faced this or a similar issue and knows how to solve it, or can at least point us in the right direction. We have now battled with this for two days and have exhausted all ideas for tests we could run to learn more about it and fix it. Any help would be much appreciated!
If I have to use my imagination, since the issue starts after you load the data, I would investigate filesystem access and disk IO and speed, even if that doesn't seem to make sense at first when the problem appears to be reaching containers over the network. Docker Desktop has a special way to load data from the host, so if you bind-mounted a folder into which you then imported data (even if through a container), I guess that could affect performance.
How much data are we talking about? Is it large, or does it contain a large number of files?
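One quick way to check whether a Windows bind mount is involved at all is to look at a container's mounts: "bind" entries with a Windows path (C:\... or /mnt/c/...) go through the host filesystem, while "volume" entries stay inside the VM. The container name here is just an example:

```
docker inspect -f '{{ json .Mounts }}' postgres
```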
For performance reasons, I usually recommend storing data inside a WSL distribution, which means the data is on the same filesystem, in the same VM, as the containers when Docker Desktop's WSL integration is used. The WSL distribution of Docker Desktop and your chosen distro are isolated, but still basically in the same virtual machine. That assumes, of course, that you are using the WSL 2 backend (I see it added as a tag), not Hyper-V.
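As a rough illustration (names and paths are just examples), you could either let Docker manage the database files in a named volume, or keep the project and imported files under the distro's own filesystem (e.g. \\wsl$\Ubuntu\home\<user>\project) instead of a folder under C:\:

```
# a named volume lives inside Docker Desktop's VM, no Windows bind mount involved
docker volume create pgdata
docker run -d --name pg -e POSTGRES_PASSWORD=postgres -v pgdata:/var/lib/postgresql/data postgres:16
```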
We will keep this advice in mind for larger datasets, for sure. I don't think the problem we have now is performance-related, though, because within the internal Docker network the containers talk to each other just fine, and we can reproduce the issue with very small amounts of imported data, too.
We might have been able to work around the problem by using the internal Docker network (replacing localhost with a container IP), and this seems to work at the moment. I have instructed my team to post the details here as soon as we are sure it works.
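For reference, this is roughly how we looked up the container addresses on the internal Docker network (the container name is a placeholder):

```
docker network ls
docker inspect -f '{{ range .NetworkSettings.Networks }}{{ .IPAddress }} {{ end }}' postgres
```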
I’m also not sure how “localhost” is related to the issue. Every container has its own localhost. Docker Desktop supports a kind of fake host network which allows a process in the container to access a process on the host’s localhost, and also allows a process or a user on the host to access a port on a container’s localhost. Is this what you used, or did you mean that you forwarded a port from the host to the container and that using “localhost:PORT” stopped working while “HOSTIP:HOSTPORT” worked?
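In other words, assuming a published port such as “5432:5432” (a made-up example), does the first check below fail while the second one succeeds?

```
Test-NetConnection -ComputerName localhost -Port 5432
Test-NetConnection -ComputerName 192.168.1.50 -Port 5432   # the host's own IP, example address
```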
I think we have found the solution for this problem.
After changing the environment variables that pointed to “localhost” to the explicit IP of the Docker Linux VM, we are no longer facing this problem.
How did we get the IP of the Docker Linux VM?
We ran the ipconfig command in the terminal and took the IPv4 address of the “Ethernet adapter vEthernet (WSL (Hyper-V firewall))”.
After retrieving this IP address, use it everywhere instead of localhost for the problematic services.
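If it helps anyone, the same address can also be pulled out with PowerShell instead of reading the ipconfig output by eye; the adapter alias can differ between machines, and DB_HOST is just an example variable name:

```
# grab the IPv4 address of the WSL vEthernet adapter (alias varies by Windows version)
$wslIp = (Get-NetIPAddress -AddressFamily IPv4 | Where-Object InterfaceAlias -like 'vEthernet (WSL*').IPAddress
# point the services at it instead of localhost
$env:DB_HOST = $wslIp
```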