DNS in container for other containers in different subnets

Hi Guys,

Looking for your advice on what looks like a very simple problem: getting a containerized DNS server working for other containers in different Docker subnets.
The scenario is as follows:
1. A local DNS server runs in a container (pihole/dnsmasq, but it doesn’t matter).
2. Port 53 is exposed via a dedicated bridge network.
3. Everything works fine for the Docker host and the outside world.
4. This DNS server should be used by all containers.

In more detail:
The problem is only with the last point; I can’t find any way to get it working.
The networks involved are:
172.20.0.0/24 is where the DNS server lives, let’s say at .5 (172.20.0.5).
Ports are mapped from the host into this bridge subnet (the DNS server uses more ports that I either don’t want to expose at host level or proxy to for other reasons: pihole 443, 80, etc.).
192.168.0.5/24 is the external IP of the Docker host.
All hosts can use this DNS server via 192.168.0.5.

Containers in other Docker subnets (anything other than 172.20.0.0/24) can’t reach this DNS server at 172.20.0.5, because Docker isolates bridge networks from each other.
Using the Docker host IP 192.168.0.5 instead seems sensible, but the query is DNATed to 172.20.0.5 and the reply comes back to the client container (e.g. 172.17.0.10/24) without the NAT being reversed, so it is dropped for having the wrong source IP.

To me this is a bug rather than a feature: the reply should pass through the network stack and, as with typical Linux/iptables connection tracking, get the original destination IP (192.168.0.5) set as its source.

A workaround could be a manual iptables rule that masquerades any traffic from containers going to the DNS IP (either the Docker host IP or the internal container IP, though the latter would need a forwarding rule too).
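
To make that idea concrete, here is a rough sketch of what such a rule might look like, built as an argument list in Python so it can be inspected before anyone runs it. The client subnet 172.17.0.0/24 is an assumption based on the 172.17.0.10/24 address above, and `masquerade_rule` is a hypothetical helper name. Nothing here is executed against iptables, and whether the rule is sufficient depends on the rest of the host’s rule set.

```python
import shlex

DNS_IP = "172.20.0.5"          # DNS container address from the post
CLIENT_NET = "172.17.0.0/24"   # assumed subnet of the client containers


def masquerade_rule(client_net, dns_ip, proto):
    """Return the argv for a POSTROUTING MASQUERADE rule (hypothetical helper)."""
    return [
        "iptables", "-t", "nat", "-A", "POSTROUTING",
        "-s", client_net, "-d", dns_ip,
        "-p", proto, "--dport", "53",
        "-j", "MASQUERADE",
    ]


# DNS uses both UDP and TCP on port 53, so build one rule per protocol.
rules = [masquerade_rule(CLIENT_NET, DNS_IP, proto) for proto in ("udp", "tcp")]
for rule in rules:
    print(shlex.join(rule))
```

With the source masqueraded, the DNS container would see the query as coming from the host side of its own bridge and reply there, which should give connection tracking the chance to reverse both translations.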

What are your thoughts on it?

Thanks!

I assume you are aware that by default the host’s /etc/resolv.conf is injected into the containers, so whatever DNS servers are listed there are available within your containers?

You can also override the DNS setting from the CLI (--dns=192.168.0.5) or in a docker-compose.yml (dns: 192.168.0.5).
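
To check which servers a container actually ended up with, something like the following could be run inside it. This is only a sketch: `nameservers` is a hypothetical helper and the sample text is made up for illustration. Note that on user-defined networks the file typically shows Docker’s embedded resolver at 127.0.0.11 rather than the upstream server.

```python
def nameservers(resolv_conf_text):
    """Return the addresses from 'nameserver' lines, skipping comments and blanks."""
    servers = []
    for line in resolv_conf_text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", ";")):
            continue
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "nameserver":
            servers.append(fields[1])
    return servers


# Hypothetical file contents, as they might look after --dns=192.168.0.5.
sample = """# illustrative contents only
nameserver 192.168.0.5
options ndots:0
"""
print(nameservers(sample))
```

Inside a container the real file could be read with `nameservers(open("/etc/resolv.conf").read())`.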

Yup, I’ve tried that, along with other workarounds.
The problem is how networking is handled for containers reaching out to the Docker host IP.
I don’t have a real tcpdump at hand, but it looked more or less like this:

172.17.0.10 -> 192.168.0.5:53 # client container in a different subnet sending a DNS query to the Docker host IP
172.17.0.10 -> 172.20.0.5:53 # same packet after DNAT, because :53 of 172.20.0.5 is exposed on the host as 192.168.0.5:53
172.20.0.5:53 -> 172.17.0.10 # reply with the DNS response; note it uses the original src IP as dst and its own IP as src (ok)
172.20.0.5:53 -> 172.17.0.10 # same reply after passing the Docker host’s iptables; it went through as ESTABLISHED/RELATED from the iptables point of view, but somehow skipped the NAT table
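
For comparison, here is a toy model of what I would expect connection tracking to do with that flow. It is a simplified sketch, not how netfilter is actually implemented; the client source port 40000 and the names `Packet`, `dnat` and `un_dnat` are made up for illustration.

```python
from typing import NamedTuple


class Packet(NamedTuple):
    src: str
    sport: int
    dst: str
    dport: int


HOST = ("192.168.0.5", 53)     # published address on the Docker host
BACKEND = ("172.20.0.5", 53)   # DNS container behind the DNAT rule

conntrack = {}  # reply-direction tuple -> original destination


def dnat(pkt):
    """Rewrite the destination and remember the mapping for replies."""
    if (pkt.dst, pkt.dport) != HOST:
        return pkt
    out = pkt._replace(dst=BACKEND[0], dport=BACKEND[1])
    conntrack[(out.dst, out.dport, out.src, out.sport)] = HOST
    return out


def un_dnat(reply):
    """Restore the original destination as the reply's source, if tracked."""
    orig = conntrack.get((reply.src, reply.sport, reply.dst, reply.dport))
    if orig is None:
        return reply
    return reply._replace(src=orig[0], sport=orig[1])


query = Packet("172.17.0.10", 40000, "192.168.0.5", 53)   # port 40000 is assumed
forwarded = dnat(query)
reply = Packet(forwarded.dst, forwarded.dport, forwarded.src, forwarded.sport)

print(un_dnat(reply))  # reply with source 192.168.0.5:53, which the client expects
print(reply)           # reply with source 172.20.0.5:53, as in the trace above
```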

This packet is dropped by the resolver library behind nslookup on the querying host, because the response comes from an unexpected IP.
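
In place of a tcpdump, the reply’s source address can be checked from inside a client container with a few lines of Python. This is a minimal sketch: `build_query` and `reply_source` are hypothetical helper names, and example.com is just a placeholder name to look up. The socket is left unconnected on purpose so that a reply from any source address is accepted and can be reported.

```python
import socket
import struct


def build_query(name, txid=0x1234):
    """Build a minimal DNS query for an A record (RFC 1035 wire format)."""
    # Header: ID, flags (recursion desired), 1 question, no other records.
    header = struct.pack("!HHHHHH", txid, 0x0100, 1, 0, 0, 0)
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", 1, 1)  # QTYPE=A, QCLASS=IN


def reply_source(server, name="example.com", timeout=2.0):
    """Send one query over UDP and return the address the reply came from."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_query(name), (server, 53))
        _data, (addr, _port) = sock.recvfrom(512)
        return addr
```

Calling `reply_source("192.168.0.5")` from the client container and getting back 172.20.0.5 instead of 192.168.0.5 would match the behavior described above.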

The workaround I’ve found is to set the DNS on the Docker host to 127.0.0.1; a different set of iptables rules is then applied at host level.
The same behavior should be implemented for traffic addressed to the host IP, or there should at least be a way to control it.