I have a set of services that I am deploying to Docker swarm on a single node via a docker-compose file. One particular service cannot be reached by the others for 20+ minutes after deployment.
- OS Windows Server 2016 Version 1607 (OS Build 14393.2828)
- Docker version 18.09.3, build 142dfcedca
- Windows containers
Reproduction steps
- 1 Init the swarm via
docker swarm init
- 2 Deploy services via
docker stack deploy -c docker-compose.yml stack1
- 3 Observe service status via
docker service ls
The service is running (Replicated 1/1).
- 4 Observe service is running via
docker ps
and
docker logs <myServiceContainerId>
- 5 Observe the container on the overlay network via
docker network inspect <networkName>
under Containers.
- 6 Choose another container and run
docker exec -i <containerId> cmd
- 7 From within this container run
ping tasks.myservice
- 8 Ping consistently fails until one of the following is done:
  - 1 Observe myservice's IP from the failed ping and run ping <myServiceIP>. This succeeds, and I can then run ping tasks.myservice successfully going forward.
  - 2 Wait 20+ minutes for the issue to resolve itself and re-run ping tasks.myservice from within another container. It succeeds.
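The manual re-running of ping in step 8 could be automated to measure how long the name stays unreachable. The following is only a sketch, assuming Python is available in the container doing the probing (the setup above does not state this). The names wait_until_reachable and ping_once are hypothetical helpers, not part of Docker.

```python
import platform
import subprocess
import time


def ping_once(host):
    """Send a single ping and report whether the ping command exited with 0."""
    # Windows ping takes -n for the count; most other platforms take -c.
    flag = "-n" if platform.system() == "Windows" else "-c"
    result = subprocess.run(
        ["ping", flag, "1", host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def wait_until_reachable(host, check=ping_once, attempts=5, interval=1.0,
                         sleep=time.sleep):
    """Call check(host) up to `attempts` times, pausing `interval` seconds
    between tries. Return the 1-based attempt that succeeded, or None."""
    for attempt in range(1, attempts + 1):
        if check(host):
            return attempt
        if attempt < attempts:
            sleep(interval)
    return None


# Example call (values are illustrative): probe every 10 seconds for
# up to 30 minutes.
# wait_until_reachable("tasks.myservice", attempts=180, interval=10.0)
```

The check is passed in as a parameter so the loop can be exercised without a network. On Windows, ping can exit with code 0 even when the reply is "Destination host unreachable", so the exit-code test is only an approximation of reachability.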
I also discovered that with the above setup, ping tasks.stackname_myservice returns successfully. If I change the docker-compose to point to tasks.stackname_myservice instead of tasks.myservice, I can ping tasks.myservice successfully but not tasks.stackname_myservice.
Is there something I am missing that is causing this odd behavior?
Example of the docker-compose:
services:
  myService:
    image: containerRegistry/image:tag
  otherService:
    image: containerRegistry/image:tag
    environment:
      - MyServiceAddress=tasks.myservice