MACVLAN - curl from container doesn't work until ping target

Context: in order to build a purple team lab, I would like to reproduce a “fake WAN area” with realistic IP addresses from several countries.

My main idea is to spin up a Docker container in a random country and have it perform some actions against the main target.

This is my topology:

  • VM pfsense: 10.1.1.114
  • VM DOCKER server: 10.1.1.54 (gateway: 10.1.1.114) aka “EVILKEVINS”
  • VM web server: 10.1.1.39 (gateway: 10.1.1.114)

Static routes have been added on pfsense (37.19.202.0/24 & 102.38.229.0/24, via 10.1.1.54)

On the Docker host, I would like to create containers in these subnets:

  • 37.19.202.0/24
  • 102.38.229.0/24

But I don’t want Docker to MASQUERADE the containers’ IPs (no NAT).

When a container runs curl http://10.1.1.39:8000, I need the web server to see a connection from 37.19.202.X (or 102.38.229.X).

After reading other blogs, I think I need to use Docker macvlan.
So this is what I’ve done:

echo "Creating virtual interfaces"
sudo ip link add link eth1 name eth1.37 type vlan id 37
sudo ip link add link eth1 name eth1.102 type vlan id 102

echo "Bringing up virtual interfaces"
sudo ip link set eth1.37 up
sudo ip link set eth1.102 up


echo "Creating macvlan interface for network 37"
sudo ip link add macvlan_NET37 link eth1.37 type macvlan mode bridge
sudo ip addr add 37.19.202.1/24 dev macvlan_NET37
sudo ip link set macvlan_NET37 up

echo "Creating macvlan interface for network 102"
sudo ip link add macvlan_NET102 link eth1.102 type macvlan mode bridge
sudo ip addr add 102.38.229.1/24 dev macvlan_NET102
sudo ip link set macvlan_NET102 up


echo "Enabling IP forwarding"
sudo sysctl -w net.ipv4.ip_forward=1

echo "Disabling rp_filter"
sudo sysctl -w net.ipv4.conf.all.rp_filter=0
sudo sysctl -w net.ipv4.conf.eth1.rp_filter=0

echo "Enabling proxy ARP"
sudo sysctl -w net.ipv4.conf.all.proxy_arp=1
sudo sysctl -w net.ipv4.conf.eth1.proxy_arp=1

echo "iptables forward accept"
sudo iptables -P FORWARD ACCEPT

echo "Enabling promiscuous mode"
sudo ip link set eth1 promisc on


echo "Creating Docker network vlan_37_net"
docker network create -d macvlan --subnet=37.19.202.0/24 --gateway=37.19.202.1 -o macvlan_mode=bridge -o parent=eth1.37 vlan_37_net

echo "Creating Docker network vlan_102_net"
docker network create -d macvlan --subnet=102.38.229.0/24 --gateway=102.38.229.1 -o macvlan_mode=bridge -o parent=eth1.102 vlan_102_net

echo "Starting containers"
docker run -d --name conteneur_a --ip=37.19.202.5 --network=vlan_37_net evilkevin
docker run -d --name conteneur_b --ip=102.38.229.5 --network=vlan_102_net evilkevin
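
As a sanity check on the setup above, the sketch below reads back the sysctl values that the script sets. It is only an illustration: the check_sysctl helper and the SYSCTL_ROOT variable are hypothetical names, not part of the original script. It also looks at rp_filter on the VLAN and macvlan interfaces, which the script does not touch. The kernel documentation states that the higher of the "all" and per-interface rp_filter values is used, so a non-zero per-interface default could still apply there.

```shell
#!/bin/sh
# Root of the sysctl tree. Overridable (an assumption made for this sketch)
# so the check can also be pointed at a copy of the tree.
SYSCTL_ROOT="${SYSCTL_ROOT:-/proc/sys}"

# check_sysctl PATH EXPECTED
# PATH uses slashes (net/ipv4/ip_forward) so that interface names
# containing dots, such as eth1.37, stay intact.
check_sysctl() {
    file="$SYSCTL_ROOT/$1"
    if [ ! -r "$file" ]; then
        echo "MISSING $1"
        return 0
    fi
    actual=$(cat "$file")
    if [ "$actual" = "$2" ]; then
        echo "OK $1=$actual"
    else
        echo "DIFF $1=$actual (expected $2)"
    fi
}

# Values the script sets explicitly.
check_sysctl net/ipv4/ip_forward 1
check_sysctl net/ipv4/conf/all/proxy_arp 1
check_sysctl net/ipv4/conf/eth1/proxy_arp 1

# rp_filter on every interface involved, including those the script leaves alone.
for dev in all eth1 eth1.37 eth1.102 macvlan_NET37 macvlan_NET102; do
    check_sysctl "net/ipv4/conf/$dev/rp_filter" 0
done
```

A DIFF line for eth1.37 or macvlan_NET37 would not prove anything by itself, but it would be one more setting worth ruling out.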

Tests, from the container:

curl -k https://10.1.1.114 → works immediately, I get the pfSense login form
curl http://10.1.1.39:8000 → TIMEOUT
ping 10.1.1.39 → works, I receive ICMP replies from the web server VM
(curl again, just after ping) curl http://10.1.1.39:8000 → now, after the ping, it works!

TCPDUMP EXECUTED ON DOCKER SERVER

On the docker host (before executing ping from container):

CURL EXECUTED IN THE CONTAINER: THE SYN/ACK STAYS ON THE DOCKER SERVER AND IS NOT FORWARDED TO THE CONTAINER

20:44:26.521351 ARP, Request who-has 10.1.1.39 tell 10.1.1.54, length 28
20:44:26.521511 ARP, Reply 10.1.1.39 is-at 00:0c:29:dd:f2:1b, length 46
20:44:26.521520 IP 37.19.202.5.56116 > 10.1.1.39.8000: Flags [S], seq 997435537, win 64240, options [mss 1460,sackOK,TS val 2126385371 ecr 0,nop,wscale 7], length 0
20:44:26.521639 IP 10.1.1.39.8000 > 37.19.202.5.56116: Flags [S.], seq 642962559, ack 997435538, win 65160, options [mss 1460,sackOK,TS val 334644767 ecr 2126385371,nop,wscale 6], length 0
20:44:27.530370 IP 37.19.202.5.56116 > 10.1.1.39.8000: Flags [S], seq 997435537, win 64240, options [mss 1460,sackOK,TS val 2126386381 ecr 0,nop,wscale 7], length 0
20:44:27.530523 IP 10.1.1.39.8000 > 37.19.202.5.56116: Flags [S.], seq 642962559, ack 997435538, win 65160, options [mss 1460,sackOK,TS val 334645776 ecr 2126385371,nop,wscale 6], length 0
20:44:28.536164 IP 10.1.1.39.8000 > 37.19.202.5.56116: Flags [S.], seq 642962559, ack 997435538, win 65160, options [mss 1460,sackOK,TS val 334646782 ecr 2126385371,nop,wscale 6], length 0
20:44:29.546371 IP 37.19.202.5.56116 > 10.1.1.39.8000: Flags [S], seq 997435537, win 64240, options [mss 1460,sackOK,TS val 2126388397 ecr 0,nop,wscale 7], length 0
20:44:29.546518 IP 10.1.1.39.8000 > 37.19.202.5.56116: Flags [S.], seq 642962559, ack 997435538, win 65160, options [mss 1460,sackOK,TS val 334647792 ecr 2126385371,nop,wscale 6], length 0
20:44:31.576166 IP 10.1.1.39.8000 > 37.19.202.5.56116: Flags [S.], seq 642962559, ack 997435538, win 65160, options [mss 1460,sackOK,TS val 334649822 ecr 2126385371,nop,wscale 6], length 0
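
The capture above shows the SYN being retransmitted and the SYN/ACK being resent, with no final ACK from the container. A small filter like the following can summarise that pattern from tcpdump text. handshake_summary is a hypothetical helper name, and the sketch only matches the flag strings as they appear in this capture.

```shell
#!/bin/sh
# handshake_summary: read tcpdump text on stdin and count
# SYN segments, SYN/ACK segments and all other TCP segments.
handshake_summary() {
    awk '
        index($0, "Flags [S],")  { syn++; next }      # initial SYN or a retransmission
        index($0, "Flags [S.],") { synack++; next }   # SYN/ACK from the server
        index($0, "Flags [")     { other++ }          # ACK, PSH, FIN, ...
        END { printf "syn=%d synack=%d other=%d\n", syn, synack, other }
    '
}

# Example (capture.txt is an assumed file holding the lines above):
#   handshake_summary < capture.txt
```

Fed the ten lines above, it reports syn=3 synack=5 other=0: the handshake never gets past the second step.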

PING FROM INSIDE THE CONTAINER TO THE WEB SERVER: REPLIES REACH THE CONTAINER, WORKS WELL

20:44:31.605031 IP 37.19.202.5 > 10.1.1.39: ICMP echo request, id 7, seq 1, length 64
20:44:31.605139 IP 10.1.1.39 > 37.19.202.5: ICMP echo reply, id 7, seq 1, length 64
20:44:31.605289 IP 10.1.1.39 > 37.19.202.5: ICMP echo reply, id 7, seq 1, length 64
20:44:31.605333 IP 10.1.1.114 > 10.1.1.39: ICMP redirect 37.19.202.5 to host 10.1.1.54, length 92
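
The last line of this capture is an ICMP redirect from pfSense (10.1.1.114) telling the web server to reach 37.19.202.5 via 10.1.1.54. One hypothesis, which nothing in this thread confirms, is that the web server's return path changes once it accepts that redirect. A possible way to check, assuming the web server is a Linux VM with iproute2 installed, is to compare the next hop it would use before and after the ping. next_hop is a hypothetical helper name.

```shell
#!/bin/sh
# next_hop: read `ip route get` output on stdin and print the gateway
# that follows the word "via", or "direct" when no gateway is used.
next_hop() {
    awk '
        { for (i = 1; i < NF; i++) if ($i == "via") { print $(i + 1); found = 1; exit } }
        END { if (!found) print "direct" }
    '
}

# On the web server (10.1.1.39), once before and once after the ping
# from the container:
#   ip route get 37.19.202.5 | next_hop
```

If the answer changes from 10.1.1.114 to 10.1.1.54 after the ping, one experiment would be to add a temporary static route on the web server (ip route add 37.19.202.0/24 via 10.1.1.54) and see whether curl then works without a ping first.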

RE-EXECUTING CURL IN THE CONTAINER, NOW WORKS AS EXPECTED…

20:44:31.691616 IP 37.19.202.5.43424 > 10.1.1.39.8000: Flags [S], seq 376612757, win 64240, options [mss 1460,sackOK,TS val 2126390542 ecr 0,nop,wscale 7], length 0
20:44:31.691708 IP 10.1.1.39.8000 > 37.19.202.5.43424: Flags [S.], seq 287789093, ack 376612758, win 65160, options [mss 1460,sackOK,TS val 334649937 ecr 2126390542,nop,wscale 6], length 0
20:44:31.691738 IP 37.19.202.5.43424 > 10.1.1.39.8000: Flags [.], ack 1, win 502, options [nop,nop,TS val 2126390542 ecr 334649937], length 0
20:44:31.691813 IP 37.19.202.5.43424 > 10.1.1.39.8000: Flags [P.], seq 1:79, ack 1, win 502, options [nop,nop,TS val 2126390542 ecr 334649937], length 78
20:44:31.691874 IP 10.1.1.39.8000 > 37.19.202.5.43424: Flags [.], ack 79, win 1017, options [nop,nop,TS val 334649937 ecr 2126390542], length 0
20:44:31.694289 IP 10.1.1.39.8000 > 37.19.202.5.43424: Flags [P.], seq 1:156, ack 79, win 1017, options [nop,nop,TS val 334649940 ecr 2126390542], length 155
20:44:31.694313 IP 37.19.202.5.43424 > 10.1.1.39.8000: Flags [.], ack 156, win 501, options [nop,nop,TS val 2126390545 ecr 334649940], length 0
20:44:31.694372 IP 10.1.1.39.8000 > 37.19.202.5.43424: Flags [P.], seq 156:513, ack 79, win 1017, options [nop,nop,TS val 334649940 ecr 2126390545], length 357
20:44:31.694385 IP 37.19.202.5.43424 > 10.1.1.39.8000: Flags [.], ack 513, win 501, options [nop,nop,TS val 2126390545 ecr 334649940], length 0
20:44:31.694436 IP 10.1.1.39.8000 > 37.19.202.5.43424: Flags [F.], seq 513, ack 79, win 1017, options [nop,nop,TS val 334649940 ecr 2126390545], length 0
20:44:31.694534 IP 37.19.202.5.43424 > 10.1.1.39.8000: Flags [F.], seq 79, ack 514, win 501, options [nop,nop,TS val 2126390545 ecr 334649940], length 0
20:44:31.694598 IP 10.1.1.39.8000 > 37.19.202.5.43424: Flags [.], ack 80, win 1017, options [nop,nop,TS val 334649940 ecr 2126390545], length 0

In the container, if I wait a few minutes, new curl requests stop working and time out again. I am forced to ping the web server again to make curl requests succeed…
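
To put a number on "a few minutes", a loop like the one below could be run inside the container right after a successful ping. It is a rough sketch: probe_until_fail is a hypothetical helper, and the reported time ignores how long each command itself takes. Knowing the interval might help match the behaviour against a cache timeout somewhere on the path.

```shell
#!/bin/sh
# probe_until_fail INTERVAL MAX_TRIES CMD...
# Run CMD every INTERVAL seconds. Print the approximate elapsed time at the
# first failure, or report that no failure occurred within MAX_TRIES runs.
probe_until_fail() {
    interval="$1"
    max="$2"
    shift 2
    n=0
    while [ "$n" -lt "$max" ]; do
        if ! "$@" >/dev/null 2>&1; then
            echo "failed after $((n * interval))s"
            return 0
        fi
        n=$((n + 1))
        sleep "$interval"
    done
    echo "no failure in $max tries"
}

# Example: probe every 30 seconds, for at most 20 tries.
#   probe_until_fail 30 20 curl -s --connect-timeout 3 http://10.1.1.39:8000
```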

Why doesn’t it work as expected on the first attempt? How can I fix it without having to ping first? Thanks.

Additionally, curl to the web server is 100% successful directly from the Docker server, without any ping, as you would normally expect.

The Linux kernel does not allow direct communication between a MACVLAN parent interface and its child interfaces. This applies regardless of whether you use it with Docker or not.

From what I remember, the gateway IPs must belong to the router, and the router must have the routes between the networks.

But that doesn’t explain why, from inside the container, I can curl a target (a VM in the same subnet as the parent) ONLY if I ping it first.

ARP proxying seems to work, because I see the SYN/ACK on the host. I don’t know why the ping unblocks the situation and lets the SYN/ACK be forwarded to the container (a cache somewhere?).
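
To narrow down where the SYN/ACK stops, it may help to look at the Ethernet destination of those frames and not only at the IP addresses, since eth1 is in promiscuous mode and may also capture frames addressed to other machines. The sketch below assumes tcpdump's -e output layout (timestamp, source MAC, ">", destination MAC followed by a comma). synack_dst_macs is a hypothetical helper name.

```shell
#!/bin/sh
# synack_dst_macs: read `tcpdump -e -n` text on stdin and print the distinct
# destination MAC addresses of the SYN/ACK frames it contains.
synack_dst_macs() {
    awk '
        index($0, "Flags [S.],") {
            mac = $4              # 4th field: destination MAC with a trailing comma
            sub(/,$/, "", mac)
            print mac
        }
    ' | sort -u
}

# On the Docker host, while curl runs in the container
# (-c stops the capture after 20 packets):
#   sudo tcpdump -e -n -i eth1 -c 20 'tcp port 8000' | synack_dst_macs
```

The result can be compared with the MAC address of eth1 on the Docker host, of the container's interface, and of pfSense, to see which machine the web server is really sending its replies to.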

I’ve also seen that running traceroute to the target before curl unblocks the communication…

So you are pinging another VM from a container in a Docker host VM. Have you tried checking the logs and traffic on the pfSense VM? I don’t think Docker does anything with your traffic based on pings, and all I found when searching for HTTP requests that only work after a ping were router and firewall issues.

Even if HTTP works from your VM and not from inside a container, I would start with the gateway. If you find anything there, it can help you understand what you would need to do to make it work from inside a container.

Ok, thanks for your reply !

You said something that is not quite right: HTTP GET works every time from the host AND from inside the container (BUT, to make it succeed from inside the container, I have to ping the target from inside the container before running curl…).

Although I am currently looking into pfSense as you suggested, I think the problem is on the Docker host: ARP cache/request/forwarding.

I understood that perfectly. But if you need to ping first, that means it doesn’t always work. I don’t know of anything in Docker that would be conditional in that way. Either the iptables rules work or they don’t. So even if there is something that you should configure in Docker, or something that Docker should fix in the engine to be fully compatible, there must be something external that lets traffic through only after a successful ping.

If the ping/http target is in the same subnet, the router shouldn’t be involved at all.

If the problem was permanent, I would have recommended checking whether promiscuous mode can be enabled on the switch on hypervisor level.

Generally: it looks like you add another bridge interface on the same VLAN ID, assign it an IP, and use it as the gateway for the Docker macvlan network. I am not sure whether the kernel restriction still applies here. In all my attempts with macvlan, I used gateway IPs from the network’s router. Still, I doubt this is related to your problem either, since the target device is in the same subnet. But it might be a problem once routing is necessary. Update: the target device is not in the same subnet, so this might already be the cause of your issue.
