Docker Community Forums


[RESOLVED] Service name resolution broken on alpine and docker 1.11.1-cs1


(Arnaud de Mouhy) #1

Hello,

It seems that service/container name resolution on alpine (edge or latest) has been broken since the upgrade to docker 1.11.1-cs1 on Docker Cloud.

FYI, services ‘alpine’ and ‘linked-service’ are in the same stack.

With docker 1.9.1-cs2

$ docker-cloud container exec alpine-1 ping linked-service
PING linked-service (10.7.0.3): 56 data bytes
64 bytes from 10.7.0.3: seq=0 ttl=64 time=0.072 ms
64 bytes from 10.7.0.3: seq=1 ttl=64 time=0.067 ms
64 bytes from 10.7.0.3: seq=2 ttl=64 time=0.059 ms

$ docker-cloud container exec alpine-1 cat /etc/resolv.conf
search xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx.local.dockerapp.io
nameserver 127.0.0.1
nameserver 8.8.8.8
nameserver 8.8.4.4

$ docker-cloud container exec alpine-1 nslookup linked-service
nslookup: can't resolve '(null)': Name does not resolve

Name:      linked-service
Address 1: 10.7.0.3

With docker 1.11.1-cs1

$ docker-cloud container exec alpine-1 ping linked-service
ping: bad address 'linked-service'

$ docker-cloud container exec alpine-1 cat /etc/resolv.conf
search xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx.local.dockerapp.io
nameserver 127.0.0.11
options ndots:0

$ docker-cloud container exec alpine-1 nslookup linked-service
nslookup: can't resolve '(null)': Name does not resolve

nslookup: can't resolve 'linked-service': Name does not resolve

$ docker-cloud container exec alpine-1 nslookup linked-service.xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx.local.dockerapp.io
nslookup: can't resolve '(null)': Name does not resolve

Name:      linked-service.xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx.local.dockerapp.io
Address 1: 10.7.0.45 linked-service-1.stack.xxxxxxxx.dockercloud

(Fernando Mayo) #2

Seems to be related to the options ndots:0 that Docker 1.11 is injecting in /etc/resolv.conf. As a workaround, deleting that line makes it work.

We are going to fix this on our side ASAP.

Thanks for reporting.
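
As an illustration of the workaround above, here is a minimal sketch of a filter that drops the ndots option from resolv.conf content. The function name strip_ndots is hypothetical and not part of Docker or Docker Cloud; the sketch assumes a POSIX shell with awk, which the stock alpine image provides through busybox.

```shell
#!/bin/sh
# strip_ndots (hypothetical helper): read resolv.conf text on stdin and write
# it to stdout with every "ndots:N" token removed from "options" lines.
# Other options are kept; an "options" line left empty is dropped entirely.
strip_ndots() {
    awk '
        $1 == "options" {
            out = "options"
            for (i = 2; i <= NF; i++) {
                if ($i !~ /^ndots:/) {
                    out = out " " $i
                }
            }
            if (out != "options") {
                print out
            }
            next
        }
        { print }
    '
}

# Example use inside a container (assumes write access to /etc/resolv.conf):
#   strip_ndots < /etc/resolv.conf > /tmp/resolv.conf.new
#   cat /tmp/resolv.conf.new > /etc/resolv.conf
```

The example writes the result back with cat rather than mv on purpose: a later post in this thread reports that replacing the file by renaming fails, while editing it in place works.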


(Dvirf) #3

Hi Fernando and Arnaud,

We have the same problem, and it prevents us from using Docker Cloud as a production service (along with High Availability concerns).
Let’s stick with Arnaud’s example:
If we ping linked-service, we get:

ping: bad address 'linked-service'

If we append .xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx.local.dockerapp.io to linked-service, we get a valid response.
For example, we can execute:

# Take the first search domain from resolv.conf and build the full name from it
search=$(awk '$1 == "search" {print $2; exit}' /etc/resolv.conf)
ping "linked-service.${search}"

and then we get a response.

This is of course a workaround, not an acceptable long-term solution.


(Fernando Mayo) #4

We are aware of this and we are currently working to send a fix to production in the next 24 hours. Thanks for reporting.


(Fernando Mayo) #5

@dvirf @dehy This is now fixed in production. If you redeploy your services based on alpine they should resolve correctly. Can you please confirm?


(Localguest) #6

This is not resolved yet in production; I am still facing the same issue on Docker Cloud. Removing the “ndots:0” from /etc/resolv.conf fixes the issue.

edit: I am using a Debian 8 container “official nginx image”


(Fernando Mayo) #7

@localguest we inject ndots:1 into /etc/resolv.conf, which solves the issue. Did you redeploy the service? Can you share the contents of /etc/resolv.conf when you deploy that image?


(Localguest) #8

@fermayo we did try redeploying the service. Here is the /etc/resolv.conf:
# docker exec -it 43de66613a95 cat /etc/resolv.conf
search 5783f1b8-6951-4b57-b73e-aaf2004e249b.local.dockerapp.io
nameserver 127.0.0.11
options ndots:1 ndots:0

I also noticed that the “service hostname” does not get resolved but the container hostname does, e.g. for a service named app, app does not resolve but app-1 does.


(Tyler Ruppert) #9

@fermayo Is this still open? I am seeing this issue as well currently for Docker 1.11.1-cs1

resolv.conf contents:
# docker exec cba228f69dbf cat /etc/resolv.conf
search 616b7045-9fbb-4ba3-bac2-0edf6357747f.local.dockerapp.io
nameserver 127.0.0.11
options ndots:1 ndots:0

attempt to ping by service name:
# docker exec cba228f69dbf ping zookeeper
ping: bad address 'zookeeper'

ping by full address:
# docker exec cba228f69dbf ping zookeeper.616b7045-9fbb-4ba3-bac2-0edf6357747f.local.dockerapp.io
PING zookeeper.616b7045-9fbb-4ba3-bac2-0edf6357747f.local.dockerapp.io (10.7.0.5): 56 data bytes
64 bytes from 10.7.0.5: seq=0 ttl=64 time=0.265 ms


(Levjj) #10

I don’t think this is fixed yet. After upgrading a docker-cloud node to docker 1.11.1-cs1 on an Ubuntu 14.04.4 LTS host, service names were not resolved anymore.

Removing the line options ndots:1 ndots:0 solved this issue for me.


(Ryan Kennedy) #11

The latest Docker Cloud release is now available with support for Docker Engine 1.11.2-cs5, which introduces service discovery and DNS improvements, along with more reliable networking between containers.

For more information on this release and how to upgrade nodes to Docker Engine 1.11.2-cs5, check out: Docker Cloud Release Notes (09/27/2016)


(Tyler Ruppert) #12

After upgrading docker, I am still seeing this same issue. I can reach the container by its full name, but not by the service name.

# docker -v
Docker version 1.11.2-cs5, build d364ea1

# docker exec f0d2000b3ef2 cat /etc/resolv.conf
search 1b6ea601-1204-4a53-beff-b7bfb060b9b8.local.dockerapp.io
nameserver 127.0.0.11
options ndots:1 ndots:0

# docker exec f0d2000b3ef2 ping zookeeper
ping: bad address 'zookeeper'

# docker exec f0d2000b3ef2 ping zookeeper-1.Core.1eee9199
PING zookeeper-1.Core.1eee9199 (10.7.0.5): 56 data bytes
64 bytes from 10.7.0.5: seq=0 ttl=64 time=0.057 ms
64 bytes from 10.7.0.5: seq=1 ttl=64 time=0.078 ms

(Emediadevelopers) #13

I confirm that this issue still exists with 1.11.2-cs5:

# cat /etc/resolv.conf
search 8af608a6-cb6f-4cb9-8f8b-0a9b160c0087.local.dockerapp.io
nameserver 127.0.0.11
options ndots:1 ndots:0

# nslookup hw 
Server:        127.0.0.11
Address:    127.0.0.11#53

Non-authoritative answer:
*** Can't find hw: No answer

# nslookup hw.8af608a6-cb6f-4cb9-8f8b-0a9b160c0087.local.dockerapp.io
Server:        127.0.0.11
Address:    127.0.0.11#53

Non-authoritative answer:
Name:    hw.8af608a6-cb6f-4cb9-8f8b-0a9b160c0087.local.dockerapp.io
Address: 10.7.0.36

If I remove the “ndots” options from /etc/resolv.conf, it works. However I believe those are there for a reason.


(Laen) #14

How did you edit the resolv.conf? It appears to be generated on the host, and loopback mounted into the docker container, but I can’t figure out where it’s generated.

Thanks.


(Imjosh2) #15

You can edit it temporarily in the terminal or you can use an entrypoint script to change it when the container starts.
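
A minimal sketch of the entrypoint approach, under some assumptions: the container starts as a user allowed to write /etc/resolv.conf, the image has a POSIX shell with sed and mktemp (busybox provides both), and the function name drop_ndots_in_place is made up for this example. It overwrites the file's contents instead of renaming a new file over it, because the file appears to be mounted into the container from the host.

```shell
#!/bin/sh
# drop_ndots_in_place FILE (hypothetical helper): delete every "options" line
# that mentions ndots from FILE, keeping all other lines.
drop_ndots_in_place() {
    file="$1"
    [ -f "$file" ] || return 1
    tmp="$(mktemp)" || return 1
    # Build the filtered copy in a temporary file first.
    sed '/^options.*ndots:/d' "$file" > "$tmp"
    # Overwrite the contents in place; the target itself is never renamed
    # or replaced, only rewritten.
    cat "$tmp" > "$file"
    rm -f "$tmp"
}

# In an entrypoint script, the function would be followed by something like:
#   drop_ndots_in_place /etc/resolv.conf
#   exec "$@"    # hand over to the image's real command
```

Note that this removes the whole options line, so any other options on the same line are lost as well.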

Are you oshpark laen, by chance?


(Laen) #16

Ah. Silly me. Since I had no ‘vi’ in the docker image, I did a “grep -v options resolv.conf > a; mv a resolv.conf” which failed. Sure enough, I am able to edit it in place. Thanks!

I am oshpark laen! Have we met?


(Patrice FERLET) #17

Right now, with docker 1.12.6, I have the same problem with alpine. I will need to rebuild my images because I cannot modify the resolv.conf file at startup: the default user isn’t authorized to do so.


(Schiluveri) #18

@fermayo, can you confirm whether this is fixed and integrated to latest version of Docker?

I’m running Version 17.06.0-ce-mac19 (18663), Channel: stable (c98c1c25e0) and I still face this problem using tomcat:8.5.16-jre8-alpine image.


(Jhovell) #19

I’m also still seeing this issue with Alpine 3.6.2 and docker 17.03.2-ce.

The behavior seems to be that the first DNS resolution works; after that, all resolutions fail until some timeout (perhaps a minute) expires, and then the next resolution works again.

/ # nslookup consul-agent-8500
nslookup: can't resolve '(null)': Name does not resolve

Name: consul-agent-8500
Address 1: 172.31.3.173 ip-172-31-3-173.us-west-2.compute.internal
Address 2: 172.31.4.29 ip-172-31-4-29.us-west-2.compute.internal
/ # nslookup consul-agent-8500
nslookup: can't resolve '(null)': Name does not resolve
nslookup: can't resolve 'consul-agent-8500': Name does not resolve

# cat /etc/alpine-release
3.6.2

# docker --version
Docker version 17.03.2-ce, build 7392c3b/17.03.2-ce

# cat /etc/resolv.conf
search service.consul
nameserver 172.17.0.1
nameserver 172.31.0.2
options timeout:2 attempts:5

Does anyone have any idea what would cause this?
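
One way to pin down the works-then-fails pattern described above is to resolve the same name repeatedly and log each attempt with a timestamp. The sketch below is only a diagnostic aid, not a fix. The function name probe_dns and the RESOLVER variable are made up for this example, and it assumes getent is available in the image.

```shell
#!/bin/sh
# probe_dns NAME [COUNT] [INTERVAL] (hypothetical helper): resolve NAME COUNT
# times, INTERVAL seconds apart, printing one line per attempt.
# RESOLVER is an assumption of this sketch: it defaults to "getent hosts" and
# can be overridden with any command that takes the name as its last argument.
probe_dns() {
    name="$1"
    count="${2:-10}"
    interval="${3:-5}"
    i=1
    while [ "$i" -le "$count" ]; do
        # Deliberately unquoted so that "getent hosts" splits into two words.
        if ${RESOLVER:-getent hosts} "$name" >/dev/null 2>&1; then
            status=ok
        else
            status=FAIL
        fi
        printf '%s attempt=%s name=%s %s\n' "$(date +%H:%M:%S)" "$i" "$name" "$status"
        i=$((i + 1))
        # Sleep between attempts, but not after the last one.
        if [ "$i" -le "$count" ]; then
            sleep "$interval"
        fi
    done
    return 0
}

# Example: 20 attempts, 5 seconds apart
#   probe_dns consul-agent-8500 20 5
```

If the failures cluster into windows of a fixed length, that length can be compared against the timeout and attempts values in /etc/resolv.conf.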


(Deeptigrover) #20

Hey,
Have you figured out the problem? I am having the same issue with Alpine 3.6. I have been stuck on a build since yesterday. I tried other repositories, from dl-1 to dl-5, but none of them work. I am getting the same error:
WARNING: Ignoring http://dl-5.alpinelinux.org/alpine/v3.6/main/x86_64/APKINDEX.tar.gz: DNS lookup error
WARNING: Ignoring http://dl-5.alpinelinux.org/alpine/v3.6/community/x86_64/APKINDEX.tar.gz: DNS lookup error
ERROR: unsatisfiable constraints:
tar (missing):
required by: world[tar]

Any suggestions please?