We’re using Docker in Swarm mode to host a number of services. Recently we’ve hit an issue where connections to some services time out intermittently (sometimes as often as every other request).
We’ve upgraded the environment to the latest version of Docker (currently Docker version 17.03.0-ce, build 3a232c8), done a staggered reboot of all servers (trying to maintain uptime if possible even though this environment is technically a test environment) and tried stopping / starting services as well, but the issue still persists.
I’m confident the issue is not related to the services running in Docker, as we’re seeing it across various services that had been running without problems until recently. I think it’s more likely an environmental issue, or a problem with Docker’s internal routing in the overlay network, but I’m not sure how to prove or solve this.
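In case it helps anyone reproduce this, here is a minimal sketch of how the failure rate could be measured per service. The helper names (fetch, count_failures, report) are only illustrative, and any URLs passed in would be placeholders for the published ports of the affected services, not our real endpoints. Note that it counts any connection error or HTTP error status as a failure, not only timeouts.

```python
import urllib.request


def fetch(url, timeout=3.0):
    """Issue one GET request; raises an OSError subclass on timeout or connection failure."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.status


def count_failures(request_fn, attempts):
    """Call request_fn `attempts` times and return how many calls raised an error."""
    failures = 0
    for _ in range(attempts):
        try:
            request_fn()
        except OSError:
            # socket timeouts, refused connections and urllib's URLError/HTTPError
            # are all OSError subclasses, so they are counted here.
            failures += 1
    return failures


def report(urls, attempts=20):
    """Print the failure count for each URL (the URLs are supplied by the caller)."""
    for url in urls:
        failed = count_failures(lambda: fetch(url), attempts)
        print(f"{url}: {failed}/{attempts} requests failed")
```

Calling report() with one URL per service, once against each node, should show whether the failures follow a particular node or service or are spread evenly, which would point more towards the routing layer.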
Any advice on how to diagnose or solve this would be greatly appreciated!