Container DNS Resolution in Ingress Overlay Network

Expected behavior

DNS resolution inside a container should return the same set of addresses for a service regardless of which node in the swarm the container is running on.

Actual behavior

DNS results appear to be split: containers on different nodes end up with different sets of addresses for the same service. Usually a lookup only returns the addresses of the tasks running on the local node, rather than those of the entire swarm.

Containers on the Leader usually show the entire result set; however, in this particular example they did not.

Sometimes the behavior does not occur at all, but dynamically scaling services up and down seems to exacerbate the issue.

Additional Information

In this particular case, we’re set up with a swarm of 3 managers and 3 workers. However, I have also experimented locally, and I hit this issue any time another node joins the swarm; as little as 1 manager and 1 worker has been enough to reproduce it.

Here’s the current setup of our AWS swarm

docker node ls

Output:

ID                           HOSTNAME                                      STATUS  AVAILABILITY  MANAGER STATUS
21mxfguzwtolu2j1d2aj8yy09    ip-192-168-34-102.us-west-2.compute.internal  Ready   Active        
2ouwmt4yusc3i3bm7664y4g67    ip-192-168-34-101.us-west-2.compute.internal  Ready   Active        
7skp2j2exwxlj2evgyic76l17    ip-192-168-33-16.us-west-2.compute.internal   Ready   Active        
a9ca38s8jjqapsyqq7hq8xs2d    ip-192-168-34-250.us-west-2.compute.internal  Ready   Active        Leader
cqe8ebej4dagx15gj9i32ebrv    ip-192-168-33-158.us-west-2.compute.internal  Ready   Active        Reachable
e5qcyskt9um0untan9hoyg69l *  ip-192-168-34-251.us-west-2.compute.internal  Ready   Active        Reachable

To simplify matters, we have two services created and running. I’ve configured the desire service with an endpoint mode of DNSRR in order to avoid VIP issues; this is somewhat necessary because we have situations requiring callbacks to specific IP addresses.
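
For reference, the service is created along these lines; the overlay network name here is just a placeholder for whatever network the service is attached to, and our actual create command may differ slightly, but the relevant part is the DNSRR endpoint mode:

docker network create --driver overlay appnet
docker service create \
  --name desire \
  --network appnet \
  --endpoint-mode dnsrr \
  <custom ecr repository>/desire:1.0.0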

docker service inspect desire

Output:

[
    {
        "ID": "brpeop27lyeuxgmbkpjbxlaxi",
        "Version": {
            "Index": 547
        },
        "CreatedAt": "2016-08-31T19:23:55.258811173Z",
        "UpdatedAt": "2016-08-31T19:23:55.258811173Z",
        "Spec": {
            "Name": "desire",
            "TaskTemplate": {
                "ContainerSpec": {
                    "Image": "<custom ecr repository>/desire:1.0.0"
                },
                "Resources": {
                    "Limits": {},
                    "Reservations": {}
                },
                "RestartPolicy": {
                    "Condition": "any",
                    "MaxAttempts": 0
                },
                "Placement": {}
            },
            "Mode": {
                "Replicated": {
                    "Replicas": 12
                }
            },
            "UpdateConfig": {
                "Parallelism": 1,
                "FailureAction": "pause"
            },
            "Networks": [
                {
                    "Target": "ecldlbxrrepztpk8lhct6nszr"
                }
            ],
            "EndpointSpec": {
                "Mode": "dnsrr"
            }
        },
        "Endpoint": {
            "Spec": {}
        },
        "UpdateStatus": {
            "StartedAt": "0001-01-01T00:00:00Z",
            "CompletedAt": "0001-01-01T00:00:00Z"
        }
    }
]

I’ve scaled the desire service to 12 replicas in order to highlight the issue and ensure an even spread across the nodes.
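
The scaling itself was just the usual command, something along the lines of:

docker service scale desire=12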

docker service ls

Output:

ID            NAME       REPLICAS  IMAGE                                                                   COMMAND
bn4w5rfpe2mt  awareness  1/1       <custom ecr repository>/awareness:1.0.0  
brpeop27lyeu  desire     12/12     <custom ecr repository>/desire:1.0.0 

I’m going to pick a container running on the current node; however, to demonstrate the spread across nodes, I’ll show the full task list first.

docker service ps desire

Output:

ID                         NAME       IMAGE                                 NODE                                          DESIRED STATE  CURRENT STATE          ERROR
6q06y8l2e6gyhd1b8f7gz0ae5  desire.1   <custom ecr repository>/desire:1.0.0  ip-192-168-33-158.us-west-2.compute.internal  Running        Running 5 minutes ago  
b8jbog3047os34l2x4q52it0t  desire.2   <custom ecr repository>/desire:1.0.0  ip-192-168-34-251.us-west-2.compute.internal  Running        Running 5 minutes ago  
541u33w2hj5el6tunmwhvny1k  desire.3   <custom ecr repository>/desire:1.0.0  ip-192-168-34-102.us-west-2.compute.internal  Running        Running 5 minutes ago  
7vs122a1578vtje9mwlkbtlaz  desire.4   <custom ecr repository>/desire:1.0.0  ip-192-168-33-158.us-west-2.compute.internal  Running        Running 5 minutes ago  
7xu05ai16ik1c007wzks8m4qq  desire.5   <custom ecr repository>/desire:1.0.0  ip-192-168-34-251.us-west-2.compute.internal  Running        Running 5 minutes ago  
dhhubwb4kxnbxovcubt9v2786  desire.6   <custom ecr repository>/desire:1.0.0  ip-192-168-34-102.us-west-2.compute.internal  Running        Running 5 minutes ago  
0sqdqyqvtjwfl0np5o91nnwqz  desire.7   <custom ecr repository>/desire:1.0.0  ip-192-168-34-250.us-west-2.compute.internal  Running        Running 5 minutes ago  
21brjwzuk3tcq0io0z99b7cds  desire.8   <custom ecr repository>/desire:1.0.0  ip-192-168-34-101.us-west-2.compute.internal  Running        Running 5 minutes ago  
6cbl1cfj7wntleg4kf0nuatm9  desire.9   <custom ecr repository>/desire:1.0.0  ip-192-168-34-101.us-west-2.compute.internal  Running        Running 5 minutes ago  
db0h0xljv05mh8jw0w1v42qnu  desire.10  <custom ecr repository>/desire:1.0.0  ip-192-168-33-16.us-west-2.compute.internal   Running        Running 5 minutes ago  
borwe5p4jceyngr9d07k0fxcx  desire.11  <custom ecr repository>/desire:1.0.0  ip-192-168-34-250.us-west-2.compute.internal  Running        Running 5 minutes ago  
0wj1z9dqwo61079k9zn3pr9gp  desire.12  <custom ecr repository>/desire:1.0.0  ip-192-168-33-16.us-west-2.compute.internal   Running        Running 5 minutes ago 

Now I’ll pick a single container on the node I’m currently on to demonstrate the results.

docker ps

Output:

CONTAINER ID        IMAGE                                    COMMAND                  CREATED             STATUS              PORTS                NAMES
d2f70f3ffbd0        <custom ecr repository>/desire:1.0.0     "/usr/sbin/desire"       8 minutes ago       Up 8 minutes        80/tcp               desire.5.7xu05ai16ik1c007wzks8m4qq
3d3fd83909a1        <custom ecr repository>/desire:1.0.0     "/usr/sbin/desire"       8 minutes ago       Up 8 minutes        80/tcp               desire.2.b8jbog3047os34l2x4q52it0t
612d960d97e4        docker4x/controller:aws-v1.12.1-beta5    "loadbalancer run --l"   19 hours ago        Up 19 hours                              editions_controller
9afba9b67a42        docker4x/shell-aws:aws-v1.12.1-beta5     "/entry.sh /usr/sbin/"   19 hours ago        Up 19 hours         0.0.0.0:22->22/tcp   modest_thompson
636401cb0a50        docker4x/guide-aws:aws-v1.12.1-beta5     "/entry.sh"              19 hours ago        Up 19 hours                              prickly_almeida

In this case, d2f70f3ffbd0 looks good enough for my tastes, so let’s exec into it with a shell.
Command:

docker exec -it d2f70f3ffbd0 sh
/ # 

Now, a simple nslookup to query desire. I’m expecting all 12 addresses.

nslookup desire

Output:

nslookup: can't resolve '(null)': Name does not resolve

Name:      desire
Address 1: 10.255.0.13 d2f70f3ffbd0
Address 2: 10.255.0.10 desire.2.4muoyz3tyimf1vm8t1cx3fwza.ingress
Address 3: 10.255.0.11 desire.1.6q06y8l2e6gyhd1b8f7gz0ae5.ingress
Address 4: 10.255.0.18 residual.1.7x9yalx33utte9zl3jsryl7bl.ingress # this is interesting, since no other services are up, and it doesn't even have the same name.
Address 5: 10.255.0.19 desire.4.7vs122a1578vtje9mwlkbtlaz.ingress

Now let’s compare that with the results from a container on the Leader. First I need to SSH over to that node.
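
The SSH hop itself is roughly the following; the key path and the Leader’s address are placeholders, and this assumes the standard Docker for AWS setup where you log in as the docker user:

ssh -i <path to key> docker@<leader address>

That lands me at: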

Welcome to Docker!
~ $ 

Just to double check.

docker node ls

Output:

ID                           HOSTNAME                                      STATUS  AVAILABILITY  MANAGER STATUS
21mxfguzwtolu2j1d2aj8yy09    ip-192-168-34-102.us-west-2.compute.internal  Ready   Active        
2ouwmt4yusc3i3bm7664y4g67    ip-192-168-34-101.us-west-2.compute.internal  Ready   Active        
7skp2j2exwxlj2evgyic76l17    ip-192-168-33-16.us-west-2.compute.internal   Ready   Active        
a9ca38s8jjqapsyqq7hq8xs2d *  ip-192-168-34-250.us-west-2.compute.internal  Ready   Active        Leader
cqe8ebej4dagx15gj9i32ebrv    ip-192-168-33-158.us-west-2.compute.internal  Ready   Active        Reachable
e5qcyskt9um0untan9hoyg69l    ip-192-168-34-251.us-west-2.compute.internal  Ready   Active        Reachable

Time to pick a container again.

docker ps

Output:

CONTAINER ID        IMAGE                                   COMMAND                  CREATED             STATUS              PORTS                NAMES
8323da3da8b8        <custom ecr repository>/desire:1.0.0    "/usr/sbin/desire"       28 minutes ago      Up 28 minutes       80/tcp               desire.7.0sqdqyqvtjwfl0np5o91nnwqz
bde4723b51cd        <custom ecr repository>/desire:1.0.0    "/usr/sbin/desire"       28 minutes ago      Up 28 minutes       80/tcp               desire.11.borwe5p4jceyngr9d07k0fxcx
918d8cb49af1        docker4x/controller:aws-v1.12.1-beta5   "loadbalancer run --l"   19 hours ago        Up 19 hours                              editions_controller
b00a0e9c452b        docker4x/shell-aws:aws-v1.12.1-beta5    "/entry.sh /usr/sbin/"   19 hours ago        Up 19 hours         0.0.0.0:22->22/tcp   desperate_stonebraker
f7e0d5ccfd3e        docker4x/guide-aws:aws-v1.12.1-beta5    "/entry.sh"              19 hours ago        Up 19 hours                              small_archimedes

8323da3da8b8 looks like a good one.

docker exec -it 8323da3da8b8 sh
/ #

And then, another nslookup attempt.

nslookup desire

Output:

nslookup: can't resolve '(null)': Name does not resolve

Name:      desire
Address 1: 10.255.0.9 8323da3da8b8
Address 2: 10.255.0.10 desire.2.4muoyz3tyimf1vm8t1cx3fwza.ingress
Address 3: 10.255.0.17 desire.11.borwe5p4jceyngr9d07k0fxcx.ingress

These two nodes are getting two different sets of results. Scaling to 12 in this case may be a little extreme, but the issue is apparent even with a single service of 3 replicas spread across 2 nodes.

Steps to reproduce the behavior

Since the behavior exists in swarm overlay networking in general, the following steps reproduce it in my local environment (a rough command sketch follows the list).

  1. Initialize a Swarm
  2. Add another node to the Swarm
  3. Create a persistent service, and scale it to 3
  4. Attach to a container on each node and run nslookup for the service name.
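
In command form, the reproduction looks roughly like this; the join token, addresses, network name, and service name are all placeholders, and I’m using alpine with a long sleep as a stand-in for a persistent service:

# on node 1
docker swarm init --advertise-addr <node 1 ip>
# on node 2, using the worker join token printed by the init
docker swarm join --token <worker token> <node 1 ip>:2377
# back on node 1: create an overlay network and a DNSRR service on it
docker network create --driver overlay testnet
docker service create --name persist --network testnet --endpoint-mode dnsrr alpine sleep 999999
docker service scale persist=3
# then, on each node, exec into a local persist.* container and look the name up
docker ps
docker exec -it <container id> nslookup persist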

I have a similar or possibly the same issue. I’m running beta5 in a 5-node cluster (t2.medium / 3 managers, 2 workers). For me it’s not just DNS, it’s swarm networking in general, and it seems to be the communication between managers and workers. If I start services on just the managers, everything is fine (DNS and other communication between services work).
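
For completeness, the foo network used below is an overlay network created beforehand, along the lines of:

docker network create --driver overlay foo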

docker service create --name foo1 --constraint="node.role == manager" --network foo alpine sleep 999999
docker service create --name foo2 --constraint="node.role == manager" --network foo alpine sleep 999999
docker service create --name foo3 --constraint="node.role == manager" --network foo alpine sleep 999999
docker service create --name foo4 --constraint="node.role == manager" --network foo alpine sleep 999999
docker service create --name foo5 --constraint="node.role == manager" --network foo alpine sleep 999999

~ $ docker -H 192.168.33.31 exec -it d414 sh

/ # nslookup foo1
Name:      foo1
Address 1: 10.0.0.2 ip-10-0-0-2.eu-west-1.compute.internal
/ # nslookup foo2
Name:      foo2
Address 1: 10.0.0.4 ip-10-0-0-4.eu-west-1.compute.internal
/ # nslookup foo3
Name:      foo3
Address 1: 10.0.0.6 ip-10-0-0-6.eu-west-1.compute.internal
/ # nslookup foo4
Name:      foo4
Address 1: 10.0.0.8 ip-10-0-0-8.eu-west-1.compute.internal
/ # nslookup foo5
Name:      foo5
Address 1: 10.0.0.10 ip-10-0-0-10.eu-west-1.compute.internal

Now, if I start one service on a worker node, it’s not able to communicate with the services running on the manager nodes:

~ $ docker service create --name foo6 --constraint="node.role == worker" --network foo alpine sleep 999999
cssbynaied4ui30dcnjtbiger
~ $ docker service ps foo6
ID                         NAME    IMAGE   NODE                                          DESIRED STATE  CURRENT STATE          ERROR
8gm65mt0fhfmgxe48fug7d9tk  foo6.1  alpine  ip-192-168-34-187.eu-west-1.compute.internal  Running        Running 3 seconds ago  
~ $ docker -H 192.168.34.187 ps
CONTAINER ID        IMAGE                                  COMMAND             CREATED             STATUS              PORTS               NAMES
df0c0d7a9f79        alpine:latest                          "sleep 999999"      57 seconds ago      Up 55 seconds                           foo6.1.8gm65mt0fhfmgxe48fug7d9tk
72e8ae907480        docker4x/guide-aws:aws-v1.12.1-beta5   "/entry.sh"         2 hours ago         Up 2 hours                              pedantic_mirzakhani
ea6f7762860e        docker4x/guide-aws:aws-v1.12.1-beta5   "/entry.sh"         24 hours ago        Up 2 hours                              romantic_perlman
c78ffedd6c14        docker4x/guide-aws:aws-v1.12.1-beta5   "/entry.sh"         44 hours ago        Up 2 hours                              pensive_easley
~ $ docker -H 192.168.34.187 exec -it df0 sh
/ # ping foo1
ping: bad address 'foo1'
/ # ping foo2
ping: bad address 'foo2'
/ # ping foo3
ping: bad address 'foo3'
/ # ping foo4
ping: bad address 'foo4'
/ # ping foo5
ping: bad address 'foo5'
/ # ping foo6
PING foo6 (10.0.0.12): 56 data bytes
64 bytes from 10.0.0.12: seq=0 ttl=64 time=0.034 ms

If I create one more service on another worker, it’s not able to communicate with any of the other services either. So it seems that services running on workers are not able to communicate over swarm overlay networks at all.
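
A quick sanity check I’d run at this point is whether the swarm ports are actually open between the workers and the managers: swarm mode needs TCP 2377 (cluster management), TCP/UDP 7946 (node and network discovery) and UDP 4789 (the overlay VXLAN data path). Roughly, from a worker host (the manager IP is a placeholder, and nc can only meaningfully verify the TCP ports):

nc -zv <manager private ip> 2377
nc -zv <manager private ip> 7946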

I also had a similar issue today. It seemed OK until the ELB changed the worker node instances. Then my containers on an overlay network could no longer resolve names across Docker nodes.

We are experiencing the same issue - it’s currently preventing us from using Docker for AWS.

Is this issue fixed?