I have a swarm (Docker 1.12.1 Engines in swarm mode) with 3 manager nodes and 4 workers. Occasionally, one of the workers stops serving requests through the ingress load balancer and answers every request with “Connection refused”. The only way I’ve found to get it working again is to remove and re-create the service from one of the managers, e.g.:
docker service rm nginx
docker service create --name nginx -p 32000:80 nginx
How do I debug this, or get the node to accept ingress requests again for a running service? I don’t understand what causes the failure, or how to restore service without completely removing and re-creating the service.
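To make the symptom concrete, here is a minimal sketch of how the failing node can be pinpointed by probing the published port on each node in turn. The function names `probe` and `check_nodes` and the hostnames in the usage line are placeholders I made up for illustration; only port 32000 comes from the command above. It only tests whether a TCP connection can be opened, and relies on bash's /dev/tcp and the coreutils `timeout` command being available.

```shell
# Sketch only: probe and check_nodes are hypothetical helper names.

probe() {
  # Succeeds (exit 0) if a TCP connection to host $1 on port $2
  # can be opened within 3 seconds. Uses bash's /dev/tcp pseudo-device.
  timeout 3 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
}

check_nodes() {
  # Usage: check_nodes PORT HOST [HOST...]
  # Prints one line per host: "<host>:<port> ok" or "<host>:<port> FAILED".
  port="$1"
  shift
  for node in "$@"; do
    if probe "$node" "$port"; then
      echo "$node:$port ok"
    else
      echo "$node:$port FAILED"
    fi
  done
}
```

With placeholder hostnames, a call would look like `check_nodes 32000 worker1 worker2 worker3 worker4`. Since the ingress load balancer should accept connections on the published port on every node, any node reported as FAILED is the one to look at more closely.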