The containers can communicate via the overlay network just fine. But if I try to nslookup by container ID, no dice; if I try by container name, no dice either. If I inspect the network, I see the containers are on there. I’m at a loss, because the only bugs I see reported are about service discovery being unstable in swarm, not about it failing to work at all. What am I missing?
As it turns out, service discovery does not work on the default ingress network, for whatever reason. I had to create a user-defined overlay network and put the containers on that for service discovery to work. If this is in the docs I completely missed it (I don’t think it is).
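For anyone landing here later, this is roughly what that workaround looks like on the command line. It is only a sketch: the function names and the DOCKER override (there so the commands can be dry-run with DOCKER=echo) are my own additions for illustration, and the network, service, and image names are whatever you pass in.

```shell
#!/bin/sh
# DOCKER is a hypothetical override for dry runs; it defaults to the real CLI.
DOCKER="${DOCKER:-docker}"

# Create a user-defined overlay network.
# --attachable additionally lets standalone containers join it.
create_overlay() {
    # $1 = network name
    $DOCKER network create --driver overlay --attachable "$1"
}

# Create a service attached to that network instead of relying on the defaults.
create_service_on() {
    # $1 = network name, $2 = service name, $3 = image
    $DOCKER service create --name "$2" --network "$1" "$3"
}
```

Calling create_overlay app-net and then create_service_on app-net web nginx on a swarm manager should give you a service that other services on app-net can look up by name.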
I don’t use docker-compose or stacks, and I definitely didn’t pass any option to docker service create to alter the default behavior. It’s possible that compose creates a user-defined overlay, I guess, so maybe you didn’t notice. Not sure; I have very little experience with compose. My colleague actually recommended this and said he had run into the same thing before, and I sort of didn’t want to believe him because it seems strange.
I do it by hand all the time, and get service discovery reliably. The default ingress network does service name resolution automatically. It does not do container name resolution.
So there is something unusual about your situation. Did you at any point reconfigure the ingress network?
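To check the service-name versus container-name distinction, you can run the lookup from inside one of the task containers. This is a sketch under assumptions: the function names and the DOCKER override are made up for illustration, and it assumes the image actually ships nslookup. In swarm mode the bare service name normally resolves to the service's virtual IP, while tasks.<service name> returns the individual task IPs.

```shell
#!/bin/sh
# DOCKER is a hypothetical override for dry runs; it defaults to the real CLI.
DOCKER="${DOCKER:-docker}"

# Look up a service by name from inside a running container.
lookup_service() {
    # $1 = container ID or name, $2 = service name
    $DOCKER exec "$1" nslookup "$2"
}

# Look up the individual tasks behind a service.
lookup_tasks() {
    # $1 = container ID or name, $2 = service name
    $DOCKER exec "$1" nslookup "tasks.$2"
}
```

If lookup_service fails for a name you know is a running service, the two services are probably not sharing a network that provides discovery.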
When you create your services using docker stack deploy -c <compose file name> <stack name>, an overlay network named <stack name>_default is automatically created for any services in the compose file that don’t explicitly map to another network defined in that file. So using the docker stack command enables service discovery in this way. Similarly with the docker compose command.
When setting up services by hand, the services are connected to the network named default, which IIRC is not an overlay network, so this doesn’t set up service discovery.
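If you want to see the stack-created network for yourself, something like the following should do it. Treat it as a sketch: the function names and the DOCKER override are hypothetical helpers, and the compose file and stack name are placeholders you supply.

```shell
#!/bin/sh
# DOCKER is a hypothetical override for dry runs; it defaults to the real CLI.
DOCKER="${DOCKER:-docker}"

# Deploy a stack from a compose file.
deploy_stack() {
    # $1 = compose file, $2 = stack name
    $DOCKER stack deploy -c "$1" "$2"
}

# List networks whose name matches <stack name>_default.
show_stack_network() {
    # $1 = stack name
    $DOCKER network ls --filter "name=${1}_default"
}
```

After deploy_stack docker-compose.yml demo, show_stack_network demo should list demo_default with the overlay driver, provided at least one service in the file falls back to the default network.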
Thank you so much for this. I’ve been struggling with this for two days now without so much as a hint of it via Google. I too must have missed this in the docs (if it’s in there). I’ve been setting this all up via Ansible, which also does not implicitly create the network. So this had me badly stumped until I finally stumbled across your post. Thank you, thank you, thank you!