Docker CE swarm service discovery just does not work whatsoever

I am using Docker version 17.06.2-ce, build 3dfb8343b139d6342acfd9975d7f1068b5b1c3d3, and docker swarm service discovery just doesn’t work at all.

[:~] > docker service ls
ID                  NAME                       MODE                REPLICAS            IMAGE                                                                      PORTS
kzp3mrevetn9        music-testing             replicated          5/5                 dkr.ecr.us-east-1.amazonaws.com/aws/music:testing            *:30000->3000/tcp
x2r1qogwkv7x        nginx                      replicated          3/3                 nginx:latest                                                               *:30001->55555/tcp
re31ik4t4cpz        primary_frontend-testing   global              3/3                dkr.ecr.us-east-1.amazonaws.com/aws/primary_frontend:master   *:8035->80/tcp

Yet if I go into one of the primary frontend containers:

[:~] > docker exec -it primary_frontend-testing.svkmp85iptccku1h7ib76z06v.qbzla0ww65ehapo7i356gdq9w bash -l
3ed586eadecc:/usr/local/apache2# nslookup nginx 127.0.0.11
Server:    127.0.0.11
Address 1: 127.0.0.11 ip-127-0-0-11.ec2.internal

nslookup: can't resolve 'nginx': Name does not resolve
3ed586eadecc:/usr/local/apache2# nslookup music-testing 127.0.0.11
Server:    127.0.0.11
Address 1: 127.0.0.11 ip-127-0-0-11.ec2.internal

nslookup: can't resolve 'music-testing': Name does not resolve
3ed586eadecc:/usr/local/apache2# nslookup google.com 127.0.0.11
Server:    127.0.0.11
Address 1: 127.0.0.11 ip-127-0-0-11.ec2.internal

Name:      google.com
Address 1: 172.217.15.78 iad23s63-in-f14.1e100.net
Address 2: 2607:f8b0:400d:c03::8a qu-in-x8a.1e100.net

The containers can communicate via the overlay network just fine. If I try to nslookup by container ID no dice, if I try by container name no dice either. If I inspect the network:

[:~] > docker network inspect -v ingress | grep nginx
                "Name": "nginx.1.dcrfggpfhcfa1wiyh2fdy4bcj",
            "nginx": {
                        "Name": "nginx.3.qeukxhcug3bz8mj8euy374j2e",
                        "Name": "nginx.1.dcrfggpfhcfa1wiyh2fdy4bcj",
                        "Name": "nginx.2.2iedjs60264zypl56t8dv0p7a",

I see the containers are on there. I’m at a loss because the only bugs I see are related to service discovery instability in swarm, not that it doesn’t work whatsoever. What am I missing?

As it turns out, service discovery for some reason does not work on the default ingress network. I had to create a user-defined overlay network and put containers on that for service discovery to work. If this is in the docs I completely missed it (i don’t think it is in the docs).

If the containers are running could it be that your endpoint mode is set to dns-rr rather than the default vip?

I don’t recall having to create a separate network when I created my stack.

I don’t use docker-compose or stacks and I definitely didn’t pass any option to docker service create to alter the default behavior. It’s possible that compose creates a user-defined overlay I guess, so maybe you didn’t notice. Not sure, i have very little experience with compose. My colleague actually recommended this and said he ran into the same thing before and I sort of didn’t want to believe him cause it seems strange.

when you use compose or stack it does create a network implicitly.

Yea I’m just using the docker service commands by hand. That explains that.

I do it by hand all the time, and get service discovery reliably. The default ingress network does service name resoltion automaticallly. It does not do container name resolution.

So there is something unusual about your situation. Did you at any point reconfigure the ingress network?

Why would you do it by hand rather than using the docker-compose / stack deploy? Is there an advantage of doing the services manually?

I do it individually while I am still designing the stack, and for unit testing services.

1 Like

When you create your services using docker stack deploy -c <compose file name> <stack name> you will automatically get created for you an overlay network named <stack name>_default for any services in the compose file that doesn’t explicitly map to another network created in that file. So using the docker stack command enables service discovery in this way. Similarly with the docker compose command.

When setting up services by hand, the services are connected to the network named default which IIRC is not an overlay network, this doesn’t set up service discovery.

HTH

1 Like

Thanks jmcdonaghberklee for your response. It’s helpful.

Hello jmcdonaghberklee

Did you solve this issue?
I have a similar issue. Have a look there

Could you help me?
Maybe we could start a Zoom session, so that i can show you my issue

best regards
Ramon

Thank you so much for this. I’ve been struggling with this for two days now without so much as a hint of this via Google. I too must have missed this in the docs (if it’s in there). I’ve been setting this all up via ansible, which also does not implicitly create the network. So this had me badly stumped until I finally stumbled across your post. Thank you thank you thank you :slight_smile: