<Edited 5/26 11:31 with better steps to reproduce>
Description of issue
I’ve been trying to fix an issue where, on some devices in the field, DNS works on the host but doesn’t work in the container. This only happens on a small percentage of the devices, all of which have local DNS servers on the LAN and 8.8.8.8/8.8.4.4 blocked. For example, I can curl something from the host successfully, but the same curl command fails in the container due to a DNS lookup failure.
From what I’ve been able to piece together, resolv.conf on the host and in the container don’t match when this failure shows up (I have seen this in 100% of the failures). When I run into this problem, any of the following (reboot, Docker service restart, Docker container restart) gets the resolv.conf files in sync and the DNS issue goes away. My pet theory is that DHCP is slow to set up a lease on these devices and Docker starts up before DHCP finishes setting up DNS, though I have been unable to confirm this. This would result in Docker using a default resolv.conf, with the host only getting its DNS settings from DHCP afterwards.
Any idea on how to make sure that the container tracks any DHCP changes in the OS?
Here is an example of a device in a problem state. Note that the first resolv.conf is the host and the Docker version that follows does not match.
[REMOVED]:~ $ cat /etc/resolv.conf
# Generated by resolvconf
nameserver 192.168.101.9
[REMOVED]:~ $ docker exec -it [REMOVED] cat /etc/resolv.conf
# Generated by resolvconf
nameserver 8.8.8.8
nameserver 8.8.4.4
Given that I can see the resolv.conf mismatch, I restart the container. After the restart, the resolv.conf files match.
[REMOVED]:~ $ docker restart [REMOVED]
[REMOVED]
[REMOVED]:~ $ cat /etc/resolv.conf
# Generated by resolvconf
nameserver 192.168.101.9
[REMOVED]:~ $ docker exec -it [REMOVED] cat /etc/resolv.conf
# Generated by resolvconf
nameserver 192.168.101.9
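The manual comparison above could be scripted. The following is only a sketch, not a tested fix: the function names are made up for illustration, the container name is passed in as a placeholder argument, and the restart step relies solely on the observation that restarting the container brought the two files back in sync. It compares the set of nameserver entries and ignores their order.

```shell
#!/usr/bin/env bash
# Sketch: detect a host/container resolv.conf mismatch (hypothetical helper names).

# Print the nameserver addresses found in resolv.conf text on stdin, sorted and de-duplicated.
nameservers() {
    awk '$1 == "nameserver" { print $2 }' | sort -u
}

# Return 0 if two resolv.conf texts list the same set of nameservers.
same_nameservers() {
    [ "$(printf '%s\n' "$1" | nameservers)" = "$(printf '%s\n' "$2" | nameservers)" ]
}

# Compare the host file with the one inside the container; restart the container on mismatch.
# $1 is the container name (placeholder, supply your own).
check_container_dns() {
    local container="$1" host_conf container_conf
    host_conf="$(cat /etc/resolv.conf)"
    container_conf="$(docker exec "$container" cat /etc/resolv.conf)" || return 2
    if same_nameservers "$host_conf" "$container_conf"; then
        echo "resolv.conf in sync"
    else
        echo "resolv.conf mismatch, restarting $container"
        docker restart "$container"
    fi
}
```

Calling `check_container_dns <container name>` from a periodic job would be one way to use it, assuming a container restart is acceptable on these devices.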
How do I ensure that Docker and the host keep in sync with regards to network setup?
OS Version/build
[REMOVED]:~ $ cat /etc/os-release
PRETTY_NAME="Raspbian GNU/Linux 8 (jessie)"
NAME="Raspbian GNU/Linux"
VERSION_ID="8"
VERSION="8 (jessie)"
ID=raspbian
ID_LIKE=debian
HOME_URL="http://www.raspbian.org/"
SUPPORT_URL="RaspbianForums - Raspbian"
BUG_REPORT_URL="RaspbianBugs - Raspbian"
App version
[REMOVED]:~ $ docker version
Client:
Version: 18.06.3-ce
API version: 1.38
Go version: go1.10.3
Git commit: d7080c1
Built: Wed Feb 20 02:34:35 2019
OS/Arch: linux/arm
Experimental: false
Server:
Engine:
Version: 18.06.3-ce
API version: 1.38 (minimum version 1.12)
Go version: go1.10.3
Git commit: d7080c1
Built: Wed Feb 20 02:30:23 2019
OS/Arch: linux/arm
Experimental: false
Steps to reproduce
I don’t have a clean reproduction of this. Like I said above, I think this is a race condition between DHCP setting up DNS and Docker setting up the container version of resolv.conf.
Edit:
I’ve been able to reproduce something that looks extremely similar to what I see in the field.
- Baseline: start with a working system where the OS and Docker resolv.conf files are in sync
- Edit /etc/resolv.conf to use another DNS server (8.8.8.8 on my network, vs. my default 192.168.1.1)
- Unplug the RPi power
- Unplug the RPi Ethernet
- Plug in power and note the time
- Wait a few minutes. This is to simulate a slow local DHCP server
- Plug in Ethernet and note the time. Ideally, resolv.conf will be updated on the OS and then Docker will update the container resolv.conf. Docker does not appear to update the container resolv.conf in this case
- Check /etc/resolv.conf on Docker and the OS. The OS file should be more recent, as seen in ls -l
- Check the OS and Docker /etc/resolv.conf files. The Docker resolv.conf will have the “modified” DNS server of 8.8.8.8, but the OS will have its DNS updated to 192.168.1.1.
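The timestamp check in the steps above can be sketched as a small script. This is a sketch under assumptions: the function names are invented for illustration, the container name is a placeholder argument, and it assumes a `stat` that supports `-c %Y` (GNU coreutils style) is available both on the host and inside the container image.

```shell
#!/usr/bin/env bash
# Sketch: report whether the host resolv.conf is newer than the container's copy.

# Return 0 if the first modification time (epoch seconds) is strictly newer than the second.
host_is_newer() {
    [ "$1" -gt "$2" ]
}

# $1 is the container name (placeholder, supply your own).
report_resolv_age() {
    local container="$1" host_mtime container_mtime
    host_mtime="$(stat -c %Y /etc/resolv.conf)"
    # Assumes the image ships a stat that understands -c %Y.
    container_mtime="$(docker exec "$container" stat -c %Y /etc/resolv.conf)" || return 2
    if host_is_newer "$host_mtime" "$container_mtime"; then
        echo "host resolv.conf is newer than the container's copy"
    else
        echo "container resolv.conf is at least as recent as the host's"
    fi
}
```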
How I start the container
docker run --name [REMOVED] -v /home/pi/certs:/usr/src/app/certs/ -p [REMOVED]:[REMOVED] -dit --log-opt mode=non-blocking --log-opt max-size=10M --log-opt max-file=5 --restart unless-stopped [REMOVED]