I’m trying (and currently failing) to set up a Docker host to run a collection of containers. I want to put firewall rules in place on the host to restrict the ability of containers to communicate between themselves. I am using IPVLAN L3 networking to give each container a ‘real’ IP address on the network.
I have the IPVLAN L3 network set up and working, with all the necessary static routes added. However, no matter what I do, I can’t seem to get the firewall on the host (either FirewallD or iptables) to block the traffic going to or from the containers. It appears to me that the network traffic is simply bypassing the firewall somehow. I have the “iptables” parameter set to false, which has stopped the exposed ports of containers on the default bridge network, but it has done nothing at all for the IPVLAN L3 network.
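For reference, an IPVLAN L3 network of the kind described here is typically created like this (the parent interface `eth0`, the subnet, and the names are placeholders, not values taken from my actual setup):

```shell
# Create an IPVLAN network in L3 mode; parent interface and subnet are examples.
docker network create -d ipvlan \
  --subnet=10.10.10.0/24 \
  -o parent=eth0 \
  -o ipvlan_mode=l3 \
  ipvlan_l3_net

# Attach a container with a fixed address on that subnet.
docker run -d --name web --network ipvlan_l3_net --ip 10.10.10.10 nginx
```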
Regarding iptables: are you really willing to set iptables rules for each and every container you create? You would need to hook into Docker’s event stream to create/remove iptables rules that forward the host port to the target container.
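To illustrate what hooking into the event stream would involve, a rough sketch (the chain name `DOCKER-CUSTOM`, the port, and the rule logic are purely illustrative; `docker events` and `docker inspect` are the real commands):

```shell
#!/bin/sh
# Watch container lifecycle events and add iptables rules per container.
# Chain DOCKER-CUSTOM and the dport-80 rule are examples only.
docker events --filter 'type=container' \
  --format '{{.Status}} {{.Actor.Attributes.name}}' |
while read -r status name; do
  case "$status" in
    start)
      ip=$(docker inspect -f \
        '{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' "$name")
      iptables -A DOCKER-CUSTOM -d "$ip" -p tcp --dport 80 -j ACCEPT
      ;;
    die)
      # Remove the matching rule again (IP lookup for a dead container
      # needs to be cached beforehand; omitted for brevity).
      ;;
  esac
done
```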
As far as I know, L3 works with a separate private subnet that needs to be routed to your host’s subnet.
macvlan and ipvlan L2 can be bridged into the host’s subnet - both are easy to set up. While macvlan child interfaces have a unique mac address, with ipvlan L2 the child interfaces share the parent interface’s mac address. In both cases the gateway ip would be the one of an external router that implements the firewall rules for traffic that crosses the subnet border.
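For comparison, both bridged modes are created the same way (parent interface, subnet, gateway, and network names are placeholders for your LAN values):

```shell
# macvlan: each container child interface gets its own mac address on the LAN.
docker network create -d macvlan \
  --subnet=192.168.1.0/24 --gateway=192.168.1.1 \
  -o parent=eth0 macvlan_net

# ipvlan L2: child interfaces share the parent interface's mac address.
docker network create -d ipvlan \
  --subnet=192.168.1.0/24 --gateway=192.168.1.1 \
  -o parent=eth0 -o ipvlan_mode=l2 ipvlan_l2_net
```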
Anyway, I never managed to get ipvlan L3 up and running, do you mind sharing what you did to make ipvlan L3 work?
Note: from what I remember we had maybe 2 or 3 posts about ipvlan L3 in the last 6 years… it’s an unusual topic.
I just watched https://www.youtube.com/watch?v=eVfOmy71NK0 and it seems this is taken care of. As long as external devices know the route (either through the subnets router, or by a route on the device) they are able to communicate with the ipvlan l3 child interfaces.
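In other words, something along these lines on the external device (or on the subnet’s router), where the container subnet and the docker host’s LAN address are placeholders:

```shell
# Route the containers' private subnet via the docker host's LAN address.
ip route add 10.10.10.0/24 via 192.168.1.20
```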
Though, I still don’t understand why this magic works for external devices (the video does not explain why this works, or I missed it?), as it appears that no routing rule exists to route traffic from the parent ip (= lan subnet) to the child ips (= private subnet). This shouldn’t work anyway, since ipvlan parent and child interfaces cannot communicate with each other. I can’t wrap my head around why this works.
The video also shows that an ipvlan child interface + a route need to be added to the host (to my surprise with an arbitrary ip - why does it work like this?), so the host can communicate with the child interfaces of the containers. This works around the limitation mentioned earlier.
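From memory, the host-side workaround shown in the video looks roughly like this (interface names, the arbitrary address, and the container subnet are assumptions on my part):

```shell
# Create an ipvlan L3 child interface on the host itself.
ip link add ipvl0 link eth0 type ipvlan mode l3
# The video assigns an arbitrary address here; seemingly any /32 will do.
ip addr add 192.168.250.1/32 dev ipvl0
ip link set ipvl0 up
# Route the containers' subnet through the new child interface.
ip route add 10.10.10.0/24 dev ipvl0
```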
However the magic works, it seems to be responsible for the routing, and the reason the traffic remains invisible to iptables (and even Wireshark in the video).
I have watched that video also, although I didn’t find anything especially new/interesting covered in it.
The L3 mode of IPVLAN was quite straightforward to set up, but it requires changes in the network so that other clients have a route to the subnet where the Docker containers are running. In my case, that was just adding a static route to the Docker subnet I had configured, with the next hop being the IP address of the Docker host. From the host it appears that the traffic bounces out of the host, hits the router, then turns around and comes back again. This is because, by default, the host doesn’t (appear to) have a route configured to the Docker subnet (where the containers are), yet somehow when the traffic hits the network interface, it seems to know how to get to the Docker containers.
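The detour can be checked from the host itself (the container address is a placeholder):

```shell
# Show which route the host would use for a container address;
# with no local route, it falls back to the default gateway.
ip route get 10.10.10.10
# A traceroute then shows the first hop going out to the LAN router
# before the traffic turns around and comes back.
traceroute 10.10.10.10
```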
The biggest thing that I seem to be learning is that the IPVLAN and MacVLAN network drivers seem to bypass the firewall filter of the Docker host, but I don’t understand how or why. I’d love to understand more about how these two network drivers actually work, although that’s going to be largely academic at this point.
As a result of some further reading on my part, I have discovered that using the bridge driver with the direct routing option to disable NAT is likely to give roughly equivalent functionality to the IPVLAN driver in L3 mode, but because it creates a bridge on the host, it should also result in the host performing routing etc. as normal/expected, and I assume that means the firewall rules will also apply. I have not tested this yet, so I’ll report back once I have.
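The NAT-free bridge setup I’m describing would be created roughly like this (subnet and network name are placeholders; `com.docker.network.bridge.enable_ip_masquerade` is the documented driver option that disables masquerading):

```shell
# Bridge network with masquerading (NAT) disabled, so container traffic
# is routed directly and should traverse the host's filter rules.
docker network create -d bridge \
  --subnet=10.20.20.0/24 \
  -o com.docker.network.bridge.enable_ip_masquerade=false \
  routed_bridge
```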
I did exactly that, but it fails. I run my VMs with Ubuntu on Proxmox. I am not able to ping a container with an ipvlan l3 interface, or make the container ping the external host that has a route to the ipvlan l3 network.
I wish I could expand on it, but I know nothing about it, except that it provides kernel level routing between namespaces. I got stuck in my research here.
It sounds like your Proxmox hypervisor is causing issues there for some reason. I’m surprised, since I’m doing my work on an XCP-ng guest, however there will be some significant differences to KVM and Xen Server. I guess now there is the joy of trying to work out what’s happening in Proxmox that’s causing that behaviour.
Thanks for the nudge - I had seen that, thought it was a good idea, but hadn’t tried it. I’ve just given it a go, and despite the kernel documentation mentioning iptables connection tracking, the traffic still doesn’t seem to hit the firewall filter rules (meaning that all the traffic is just allowed through unconditionally).
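For anyone following along, the l3s variant is selected the same way as l3 (parent interface, subnet, and network name are placeholders):

```shell
# ipvlan in L3S mode - like L3, but the kernel docs say traffic passes
# through netfilter connection tracking, which in theory should let
# iptables see it.
docker network create -d ipvlan \
  --subnet=10.10.10.0/24 \
  -o parent=eth0 \
  -o ipvlan_mode=l3s \
  ipvlan_l3s_net
```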
In both cases (on the host/inside the vm) the same bridge (promiscuous mode is on) is used, so it shouldn’t be causing the problem. Both test vms (the docker host with the ipvlan l3 network, and one with the route for the l3 network via the docker host) are running on the same Proxmox hypervisor, so the traffic should not leave the bridge. The virtio vnics don’t have many properties to tinker with. Of course, macvlan and ipvlan l2 work like a charm. Since ipvlan l3 is not on my priority list, I’ll give up on understanding what the root cause might be.
I think at this point it’s worth opening an issue in the moby project to ask the devs what behavior to expect for l3s, since l3s is mentioned, but not explained, in the docker ipvlan docs.