While I’m still learning about Docker, I haven’t found an obvious answer to this that isn’t a hack of some form - whether that’s me looking in the wrong place or a lack of understanding…
The issue I’m having is with a Pi-Hole configuration, but from what I can tell it is Docker networking related rather than Pi-Hole specific, hence my question.
Setup
Separate container instances of pi-hole running on 2 different hosts.
keepalived has been configured to provide a floating IP across them (the issue is reproducible without this, but it’s included for completeness).
The hosts running pi-hole also run a number of other container based services.
No common “pi-hole network” has been set up or associated with the other containers on the hosts. While this is a common answer on Google and would fix the problem, putting them all in the same network group starts to defeat the point of containers from an isolation perspective.
Scenario
Containers running on the same host as the primary Pi-Hole instance fail resolution requests. Looking at the Pi-Hole log I can see the requests and responses (including retries), but the responses don’t get back to the originating container.
Pointing the DNS request at a Pi-Hole instance on an alternative host works.
When the environment is run with DHCP handing out multiple DNS IPs (and not using keepalived), things “appear” to work, but that’s down to clients retrying against the alternative host after failures. Not great from a resilience perspective, as they only have access to 50% of the DNS server pool. (In fact, I hadn’t noticed this behaviour until putting keepalived in the loop, which gives the clients the impression of a single DNS server.)
What I think is happening is that DNS requests are being routed out of Docker to the virtual IP and then forwarded back in through Docker to Pi-Hole. The problem comes when Pi-Hole tries to send the response: Docker realises it’s crossing between the two networks and blocks it, rather than letting the response loop back at host level.
So, is there a simple way to deal with this that doesn’t involve either joining all the containers on the same network or bumping Pi-Hole up to host networking?
Since you are talking about multiple hosts, are you actually using Docker Swarm?
Since every container has its own network namespace, processes are still isolated. Containers in different compose projects can’t access each other by default, so in that case creating a common network is not optional. That doesn’t mean all of your containers have to be on the same network, just that if you have multiple containers (services) in a compose project, you need to attach the common network to the containers that need to access the service which was originally in a different network.
A network doesn’t have to have a /24 subnet, so if you want to make sure that Container B and Container C can access Container A, but Container B and Container C can’t access each other, you could create networks like “net-a-b” and “net-a-c”, attach “net-a-b” to Container A and Container B, and attach “net-a-c” to Container A and Container C (see the sketch below). I never needed such a strict network policy, but it is possible. In Kubernetes, network plugins can make it even easier without multiple subnets.
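A minimal compose sketch of that layout (service names and images are just placeholders, not anything from your setup):

version: "3"
services:
  container-a:
    image: nginx:alpine   # placeholder for the shared service
    networks:
      - net-a-b
      - net-a-c
  container-b:
    image: alpine         # can reach container-a over net-a-b
    networks:
      - net-a-b
  container-c:
    image: alpine         # can reach container-a over net-a-c, but not container-b
    networks:
      - net-a-c
networks:
  net-a-b:
  net-a-c: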
Can you tell us more about how you run the containers (swarm, compose, etc…) and how you made Pi-Hole available (through a Docker network or port forwards from the host network)? I couldn’t work out how your containers are running, so I can’t really give you a better answer yet. Although the fact that the request arrived indicates it is not just about Docker networks.
I get the isolation between containers but hadn’t factored in the slightly different route through the network for request vs response. I’d figured that the response would take the inverse route of the request in a simple home LAN.
On the basis that everything needs to get to DNS, is there any benefit to adding all the other containers to the Pi-Hole network vs just using the host network? (Assuming external tags for the other containers not in this compose - see the sketch below.) On the surface it feels like a band-aid that doesn’t add any extra isolation to Pi-Hole compared to giving it host networking.
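For reference, this is roughly what I mean by external tags in the other compose projects (the service and network names here are just assumptions for illustration, not what I actually run):

version: "3"
services:
  other-service:            # placeholder for one of the other containers
    image: alpine
    networks:
      - pihole-net
networks:
  pihole-net:
    external: true          # pre-existing network, e.g. created by the Pi-Hole
                            # project or via `docker network create pihole-net`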
So in answer to your questions…
No, this isn’t swarm - just compose. Until recently this lot ran on a bunch of Pi 4s (volumes held on an NFS share so I could bounce containers between Pis without data loss), and between not wanting to muddy learning Docker with learning Swarm and wanting to avoid the overhead, it stayed that way. As a side note, last I saw, Pi-Hole didn’t play too nicely with Swarm (for starters, it was never designed to run multiple instances on a single host - though I know you can configure Swarm to avoid that), but that may have changed.
Compose file included below, but Pi-Hole was set up with the default network (bridge) and ports. The only original addition was the NET_ADMIN capability, based on a recommendation from somewhere in the Pi-Hole setup guidance.
keepalived is running as a container and adds the virtual IP (.224) to the host.
I still think it’s about how Docker deals with networks, but with a subtlety around where things sit in the network stack. For example, does it check the source IP (say, net-a) when a packet comes in on the host interface (as a forwarded keepalived packet)? Or does it not worry about it then, but catch it when the response tries to head back from net-b to net-a? I’m not a networking expert though - just trying to build on the knowledge I have and learn as I go.
docker-compose:
version: "3"
services:
  pihole:
    container_name: pihole
    image: pihole:local
    build: .
    hostname: "pihole_${RUNNING_ON:?err}"
    ports:
      - "53:53/tcp"
      - "53:53/udp"
      - "67:67/udp"
      - "8081:80/tcp"
    environment:
      TZ: 'Europe/London'
      WEBPASSWORD: 'pihole'
      PIHOLE_UID: 801
      PIHOLE_GID: 801
    volumes:
      # Volume structure done this way to allow a single container/compose config to easily
      # run on multiple hosts in parallel (i.e. multi-deployment).
      # Will error unless RUNNING_ON has been set before running docker-compose. Suggest
      # setting it via CLI or script, unless only ever running on one host, in which case
      # .env will work.
      - '/containers/volumes/pihole/${RUNNING_ON:?err}/etc-pihole/:/etc/pihole/'
      - '/containers/volumes/pihole/${RUNNING_ON:?err}/etc-dnsmasq.d/:/etc/dnsmasq.d/'
      - '/containers/volumes/pihole/fixPermissions.sh:/etc/cont-init.d/30-permissions.sh:ro'
      - '/containers/volumes/pihole/fixes/:/fixes/'
    cap_add:
      - NET_ADMIN
    restart: unless-stopped
  keepalived:
    container_name: pihole-vip
    image: pihole-vip:local
    build: ./vip
    environment:
      DNS_ALL_SERVERS: '10.0.0.210 10.0.0.20'
      DNS_MASTER: '10.0.0.210'
      DNS_VIP: '10.0.0.224'
      ADMIN_PORT: '8081'
    restart: unless-stopped
    network_mode: host
    cap_add:
      - NET_ADMIN
Sorry for the delay, I couldn’t come back until now.
Without the isolation, processes inside containers would listen on host ports. That could be inconvenient and also a security risk in some cases.
Only one container could use a given port, so if you have just two webserver containers listening on port 80, you would need to change the port in one server’s configuration file, if that is possible. It usually is, but it is an additional change you need to make.
You could have some more complex containers in which multiple services are running and one expects the other to be listening on a specific port internally.
Running a container on the host network lets it see all of the host’s interfaces, which could lead to a situation where the container dumps your network traffic. An attacker could use that too.
Sometimes there are admin services inside the container listening on the container’s IP address. Those could be accessed by the host only, but if you run the container on the host network, that port could be publicly available from other machines without proper firewall rules. One example that comes to my mind is Traefik.
You can have entire containers that must be available only locally on the host, to other containers. Running the container on the host network can make them available from outside, and even if it doesn’t, I would find it harder to manage ports instead of container hostnames. See the next point.
Running containers on a Docker network (a user-defined bridge, like the one Docker Compose creates) allows you to refer to other containers by their container names or, in the case of Docker Compose, by their service names. You can also create aliases (custom hostnames) on each user-defined bridge, so even if your container or service is called “pihole”, you can add an alias called “dns” and all containers can use that to access Pi-Hole instead of the original hostname. You can also change it later: remove the “dns” alias from pihole and add it to another DNS server without changing anything in the other containers. Some of your containers could use pihole while others use something else. For example, some of your containers could be just test containers; you would add those to a “dns-test” network, and the processes inside those containers could still use the “dns” alias, which would point to a different (test) pihole.
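As a rough sketch of the alias idea (the service names, images and network name are just examples):

version: "3"
services:
  pihole:
    image: pihole/pihole:latest
    networks:
      dns-net:
        aliases:
          - dns             # clients use "dns" instead of the service name
  client:
    image: alpine
    networks:
      - dns-net             # inside this container, "dns" resolves to pihole
networks:
  dns-net: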
I’m not sure what you mean by checking the source IP. Software inside the container can check the source IP, but Docker will not do anything with it. At least not the official Docker CE. The only problem I’ve had regarding source IPs is that when you have a proxy container, other containers will see the proxy as the source, so you need to use the PROXY protocol or HTTP headers like X-Forwarded-For, and target services have to read the IP address from that standard header.
Do you mean that the compose file doesn’t show pihole actually using the default bridge now? Or by “default network” do you mean the default network of the compose project?
How do your containers send their requests to pihole? Sorry if you already mentioned it, I’m in a hurry today too.
There is one thing that comes to my mind, and that is our “jolly joker” network issue: the value of the MTU. When there is a difference between the MTU of the host and that of the containers, it can cause issues, and using the same MTU has helped some people already. The default is usually 1500. I never really know when it should be bigger or smaller, but it’s worth a try.
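If you want to try it, the MTU can be set on a user-defined network in compose through a driver option (1450 is just an example value - match whatever your host interface actually uses):

networks:
  default:
    driver_opts:
      com.docker.network.driver.mtu: "1450"   # match the host interface's MTU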