DNS issues with local resolver and containers on the same host

This particular issue might have been asked before in one form or another - I'm actually not quite sure, so I'll just open a discussion here.

I am running a (really) small stack of two Raspberry Pi hosts (call them “A” and “B”) with Hypriot OS and Docker 20.10.1. Host “A” operates a recursive DNS resolver as a Docker Compose stack attached to a custom bridged network, publishing / mapping port 53 (UDP). The host IP is announced to my local network as the DNS server by my internet router.

Other containers running on the same host “A”, attached to other custom bridged networks, are not able to communicate with the containerized DNS resolver; in other words, they cannot resolve any domain names. The Docker debug logs show the following:

Dec 26 14:56:41 A dockerd[22481]: time="2020-12-26T14:56:41.054599831+01:00" level=debug msg="Name To resolve: heise.de."
Dec 26 14:56:41 A dockerd[22481]: time="2020-12-26T14:56:41.055160347+01:00" level=debug msg="[resolver] query heise.de. (A) from 172.29.0.4:58117, forwarding to udp:192.168.178.46"
Dec 26 14:56:45 A dockerd[22481]: time="2020-12-26T14:56:45.055511318+01:00" level=debug msg="[resolver] read from DNS server failed, read udp 172.29.0.4:58117->192.168.178.46:53: i/o timeout"
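For completeness, the path involved here is Docker's embedded resolver (127.0.0.11) forwarding to whatever DNS the host itself is configured with, which in this setup is the host's own IP. Something like the following should confirm that on host “A” (the container name is just a placeholder):

# Inside a container on a user-defined bridge, /etc/resolv.conf points at
# Docker's embedded resolver:
docker exec <container> cat /etc/resolv.conf   # expect "nameserver 127.0.0.11"

# The embedded resolver forwards to the host's configured DNS, which here is
# the host IP announced by the router (192.168.178.46):
cat /etc/resolv.conf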

Containers on host “B”, however, have no issues at all resolving domain names; everything works as expected.

Any insights into what is actually going on here would be helpful…


Hi!

Same problem here (all other hosts resolve properly, DNS server communicated via the DHCP router, etc.). I can see that the requests reach the DNS server, but the response can't get back, and I get the same error as you. I thought the problem could be somewhere in iptables, but I haven't found a solution, nor am I an expert in iptables.

I will install the DNS server without Docker and see whether the containers can resolve then, but it looks like a “problem/feature” in the container-to-container communication.


@hmarlo - Thanks for your answer. Yes, I can also see the request being processed by the containerized DNS resolver (see below); however, the response does not seem to reach the requesting instance (container):

Dec 27 11:29:49 dnsmasq[781]: query[A] heise.de from 172.20.238.1
Dec 27 11:29:50 dnsmasq[781]: forwarded heise.de to 172.20.238.2
Dec 27 11:29:50 dnsmasq[781]: dnssec-query[DS] heise.de to 172.20.238.2
Dec 27 11:29:50 dnsmasq[781]: reply heise.de is no DS
Dec 27 11:29:50 dnsmasq[781]: validation result is INSECURE
Dec 27 11:29:50 dnsmasq[781]: reply heise.de is 193.99.144.80

A possible solution (tested yesterday) might be to attach the other containers to the DNS stack's bridged network and explicitly set its gateway as DNS, roughly as sketched below. That is rather a work-around, though, which introduces too many dependencies between otherwise unrelated containers / stacks, and I'd rather avoid it…
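Roughly, that work-around looks like this; the network name is made up here, and the gateway address is the one visible in the dnsmasq log above:

# Work-around sketch (not my preferred solution): attach a client container
# to the DNS stack's network and use that network's gateway as its DNS
# server. "dns_stack_default" is a placeholder for the actual network name.
docker run --rm -it \
  --network dns_stack_default \
  --dns 172.20.238.1 \
  alpine nslookup heise.de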

Hi again!

I tested installing the DNS server locally (without Docker) and it worked, of course.

I found something more: trying to resolve from a sibling container with dig @192.168.1.30 -p 53 google.com returned:

;; reply from unexpected source: 172.18.0.1#53, expected 192.168.1.30#53

I searched a little bit more and found this answer on GitHub that suggests using the fully qualified IP in the Docker port binding, so I changed my ports from:

---
version: '3.7'
services:
  unbound:
    image: mvance/unbound-rpi:1.13.0
    hostname: unbound
    restart: unless-stopped
    ports:
      - 53:53/udp
      - 53:53/tcp
    volumes: [...]

To:

---
version: '3.7'
services:
  unbound:
    image: mvance/unbound-rpi:1.13.0
    hostname: unbound
    restart: unless-stopped
    ports:
      - 192.168.1.30:53:53/udp
      - 192.168.1.30:53:53/tcp
    volumes: [...]

And then it worked.

Probably adding some mangling to iptables (when the request comes in from that other network, rewrite the source IP of the response) could save us from having to specify the host IP. Also, I do not know of any shortcut for that IP binding on the ports, since the default is 0.0.0.0 and not the host IP.

Could you test this?

Edit: I'm now testing other things to avoid having to know in advance the IP of the host where the container will be placed.
Edit: Running with docker run ... --net=host did not work; same error, the source address of the response is not being translated.
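Regarding the iptables idea: I don't have a working rule, but these are the kinds of things one could inspect to see how (or whether) the reply's source address is being translated (conntrack-tools may need to be installed first):

# Inspect the DNAT rule Docker created for the published port 53:
sudo iptables -t nat -S DOCKER | grep -- '--dport 53'

# Inspect the per-network masquerading rules:
sudo iptables -t nat -S POSTROUTING

# Watch the connection-tracking entries for DNS to see which addresses the
# kernel expects on the return path:
sudo conntrack -L -p udp --dport 53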


Changing the port mapping to include the fully qualified IP actually works, thanks for the hint. This is also rather a work-around, but it introduces no further dependencies on other (otherwise unrelated) containers. In fact, the DNS stack I am operating here defines a custom network with a fixed IP range, and it even sets a fixed IP address for the upstream DNS resolver I am using (Unbound) in order to configure the exposed Pi-hole container correctly, so this will work for now.
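For anyone recreating this, the setup boils down to roughly the following; the subnet and addresses match the logs above, but the commands themselves are illustrative (my stack is defined in Compose):

# Illustrative sketch only, not my actual Compose files: a user-defined
# network with a fixed subnet, Unbound on a fixed address, and the
# published port bound to the host IP (the work-around discussed above).
docker network create --subnet 172.20.238.0/24 dns_net
docker run -d --name unbound --network dns_net --ip 172.20.238.2 \
  mvance/unbound-rpi:1.13.0
docker run -d --name pihole --network dns_net --ip 172.20.238.3 \
  -p 192.168.178.46:53:53/udp -p 192.168.178.46:53:53/tcp \
  pihole/pihole
# Pi-hole is then configured to use 172.20.238.2 (Unbound) as its upstream.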

That said, I'm still curious about what is actually happening with the original setup. A work-around is… well - a work-around. I'd rather have this fixed with a proper solution… :wink:

Steps to reproduce the issue:

1. Be in a network that prohibits external DNS queries, disable external DNS communication, or just use a hostname that is only available locally in step 3.

2. Set up a local DNS server/forwarder (e.g. systemd-resolved) so that the local address ends up in /etc/resolv.conf.

3. Start any container (without --network host) and try to resolve a hostname (e.g. podman run --rm -it fedora curl -v ifconfig.me).

Describe the results you received:
curl: (6) Could not resolve host: ifconfig.me

Describe the results you expected:
No error (some IP address)

Additional information you deem important (e.g. issue happens only occasionally):
The contents of /etc/resolv.conf are:

search virt
nameserver 8.8.8.8
nameserver 8.8.4.4
nameserver 2001:4860:4860::8888
nameserver 2001:4860:4860::8844
nameserver 10.0.2.3
options edns0

This would normally work (although I might not even want to send my DNS requests elsewhere, because I might have services that are only available on the local network), but I am in a network that prohibits external DNS queries, so it doesn't.

If I leave just the slirp4netns nameserver there (echo nameserver 10.0.2.3 > /etc/resolv.conf), it works in the VM where I am trying to reproduce this issue. However, on my original host, where I discovered this, 10.0.2.3 is still inaccessible (even though the version and the command line of slirp4netns are identical, apart from the PID argument).
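For what it's worth, a per-container override avoids editing the host's /etc/resolv.conf at all, assuming the podman version at hand supports --dns:

# Point a single rootless container at the slirp4netns resolver directly
# instead of rewriting /etc/resolv.conf on the host:
podman run --rm -it --dns 10.0.2.3 fedora curl -v ifconfig.me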

Output of podman version:

Version: 1.3.1
RemoteAPI Version: 1
Go Version: go1.12.2
OS/Arch: linux/amd64

Output of podman info --debug:

debug:
  compiler: gc
  git commit: ""
  go version: go1.12.2
  podman version: 1.3.1
host:
  BuildahVersion: 1.8.2
  Conmon:
    package: podman-1.3.1-1.git7210727.fc30.x86_64
    path: /usr/libexec/podman/conmon
    version: 'conmon version 1.12.0-dev, commit: c9a4c48d1bff85033b7fc9b62d25961dd5048689'
  Distribution:
    distribution: fedora
    version: "30"
  MemFree: 2884521984
  MemTotal: 4133556224
  OCIRuntime:
    package: runc-1.0.0-93.dev.gitb9b6cc6.fc30.x86_64
    path: /usr/bin/runc
    version: |-
      runc version 1.0.0-rc8+dev
      commit: e3b4c1108f7d1bf0d09ab612ea09927d9b59b4e3
      spec: 1.0.1-dev
  SwapFree: 644870144
  SwapTotal: 644870144
  arch: amd64
  cpus: 4
  hostname: fedora30.virt
  kernel: 5.0.9-301.fc30.x86_64
  os: linux
  rootless: true
  uptime: 19m 40.53s
registries:
  blocked: null
  insecure: null
  search:
  - docker.io
  - registry.fedoraproject.org
  - quay.io
  - registry.access.redhat.com
  - registry.centos.org
store:
  ConfigFile: /home/nert/.config/containers/storage.conf
  ContainerStore:
    number: 0
  GraphDriverName: overlay
  GraphOptions:
  - overlay.mount_program=/usr/bin/fuse-overlayfs
  GraphRoot: /home/nert/.local/share/containers/storage
  GraphStatus:
    Backing Filesystem: xfs
    Native Overlay Diff: "false"
    Supports d_type: "true"
    Using metacopy: "false"
  ImageStore:
    number: 1
  RunRoot: /tmp/1000
  VolumePath: /home/nert/.local/share/containers/storage/volumes

Additional environment details (AWS, VirtualBox, physical, etc.):
I am trying this in a Fedora 30 VM, clean install, as that is the easiest and cleanest reproducer I can get. I cannot reproduce the issue related to my local environment in there.

Hi again! It is working for me using network_mode: host (or --network=host if run with docker run) and binding the DNS server interface to the host IP (192.168.1.30). However, this has the same problem as before: we need to know the interface IP beforehand.

@hmarlo - Yep, I thought about network_mode: host as well, but dropped that solution as I do not want to expose the upstream service (Unbound), and using network_mode: host together with a custom bridged network (in order to hide the upstream service) is not possible.

I checked another detail in the meantime and removed an A-record configuration for my local network from Unbound's configuration, but that only changed the error message issued by dig (;; reply from unexpected source: 172.30.0.1#53, expected 192.168.178.46#53). There must be some kind of issue with iptables and NAT, but I am not a network expert and do not really know what to look for, nor could I say whether this is actually misbehaviour or rather based on my own misunderstanding…
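If anyone wants to trace this further, watching the DNS traffic while a container on another bridge queries the published resolver should show where the reply's source address stops being rewritten; the network name below is a placeholder:

# Terminal 1: watch DNS packets on all interfaces on host "A":
sudo tcpdump -ni any udp port 53

# Terminal 2: query the published resolver from a container on a different
# bridge network and compare the reply's source address with 192.168.178.46:
docker run --rm --network some_other_network alpine \
  nslookup heise.de 192.168.178.46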