I’ve been working with Docker Swarm and have run into an issue regarding service discovery. Specifically, one of my services is utilizing the “host network”, and I’ve learned from a discussion on this GitHub issue (Update service with --network=host failed · Issue #27 · docker/for-linux · GitHub) that I’m unable to simultaneously include my service in the overlay network.
This situation has created a significant roadblock for me because it has prevented me from using Docker Swarm’s DNSRR feature. Previously, I leveraged DNSRR for service discovery, particularly for identifying the IPs of active tasks. I am seeking a solution or feature that allows me to query all the tasks currently running under the service, including their private and/or public IPs.
Furthermore, when my service was attached to the overlay network, I was able to directly access other services using their DNS names. However, now that my service isn’t a part of the overlay network, I am compelled to use private IPs, which isn’t optimal.
Could anyone point me in the direction of a solution or workaround for these challenges? Any advice or insights would be greatly appreciated.
--network=host declares the absence of network isolation. You cannot mix it with networks that require network isolation.
However, the long syntax for port publishing allows publishing ports with mode: host (instead of the default mode: ingress) for services attached to overlay networks. It binds the host port on a node where at least one replica is running, and that port behaves as it would with --network=host (e.g. it retains source IPs).
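In a stack file that looks roughly like this (service name, image and port numbers are just placeholders):

```yaml
version: "3.8"
services:
  web:
    image: nginx:alpine     # placeholder image
    ports:
      - target: 80          # container port
        published: 8080     # host port, bound on nodes that run a task
        protocol: tcp
        mode: host          # bypasses the ingress routing mesh, keeps source IPs
```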
Hello @meyay! Thanks for the response. Yes, this is apparently the best way to do it, BUT the service I’m trying to deploy (Janus WebRTC Gateway) uses a large range of ports, and unfortunately Docker doesn’t let me map port ranges in host mode; it only supports mapping single ports.
Moreover, I discovered that exposing a large range of ports is problematic altogether; for such scenarios (Asterisk, Janus, etc.) there are two viable options: using the host network, or using a macvlan network.
I feel like the best bet for me is to use the host network, keep track of my host-networked services in a database, and use the private IPs for service communication. What do you think?
Each mapped port will delay the container start. I can’t tell you exactly by how much, but a range of a couple of hundred ports will delay the start noticeably.
The macvlan network is also not really a viable solution, unless you create a macvlan with an IP range that contains a single IP. Swarm services do not support static IPv4 configuration using ipv4_address, so the containers would get random IPs within the macvlan IP range. The next obstacle is that there cannot be more than one macvlan gateway using the same gateway IP.
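If you wanted to experiment with it anyway, a single-ip macvlan declared in a compose file could look roughly like this (subnet, gateway and parent interface are example values; swarm-scoped macvlan additionally requires per-node config-only networks):

```yaml
networks:
  macvlan_single:
    driver: macvlan
    driver_opts:
      parent: eth0                    # host interface, example value
    ipam:
      config:
        - subnet: 192.168.10.0/24     # example subnet
          ip_range: 192.168.10.42/32  # range containing exactly one ip
          gateway: 192.168.10.1       # cannot be shared with another macvlan gateway
```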
For a service that requires larger port ranges, it appears to be a valid approach.
There is one more thing you can try: declare a top-level network element that uses the predefined host network (I doubt it will work, but it’s still worth trying).
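A sketch of what that could look like (service and image names are placeholders):

```yaml
version: "3.8"
services:
  gateway:
    image: example/janus:latest   # placeholder image
    networks:
      - hostnet
networks:
  hostnet:
    external: true
    name: host                    # reference the predefined host network
```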
Thanks for the information. There is a special case where I might need to use macvlan (when I need multiple containers on a single large host). Having random IPs doesn’t sound too bad, as I will be using Consul for service discovery, but as you have suggested, assigning a unique gateway IP for each host sounds like a delicate subject.
This is indeed my current solution: I now have multiple services running on dedicated hosts (one service per EC2 instance), and I use Route 53 for “service discovery” (soon to be replaced by Consul).
Thank you so much for your valuable advice @meyay, really appreciate it.
You can actually create a swarm-spanning macvlan: first you create the macvlan configuration on each node (if I am not mistaken, the physical device name, e.g. eth0, must be the same on all nodes). Then you can create the network using those configurations.
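Roughly like this (subnet, gateway, interface and names are example values; I haven’t tested this exact setup):

```yaml
# Step 1, on every node (plain docker CLI, not part of the stack file):
#   docker network create --config-only --subnet 192.168.10.0/24 \
#     --gateway 192.168.10.1 -o parent=eth0 macvlan_local
#
# Step 2, once on a manager node:
#   docker network create -d macvlan --scope swarm \
#     --config-from macvlan_local macvlan_swarm
#
# Step 3, reference the swarm-scoped network in the stack file:
version: "3.8"
services:
  gateway:
    image: example/janus:latest   # placeholder image
    networks:
      - macvlan_swarm
networks:
  macvlan_swarm:
    external: true
```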
Yes, macvlan is a more preferable option, and that’s how we initially deployed our WebRTC servers for testing; however, AWS doesn’t support macvlan on their EC2 instances. If I ever have to deploy this on-premises I will revisit macvlan.
Initially I started with Fargate, thinking it would be easier to implement, but it proved to be difficult for the same reason I mentioned in this post: our WebRTC server uses 10k+ ports, and AWS’s load balancers do not support publishing port ranges, so we had to publish each port individually. Some people said I could resort to AWS’s Network Load Balancers, but it wasn’t clear whether that would solve the problem. I think AWS Fargate was designed with common use cases and basic microservice architectures in mind, and WebRTC servers are not in that category.
Another reason I moved away from Fargate was that I did not want to rely on a specific cloud service provider’s solutions. I want to make it as infrastructure-independent as possible. Recently I deployed our swarm on Huawei Cloud, and it only took a few hours. If possible I want to keep it this way.
I am planning to use Kubernetes instead; when I have some time I will make the switch. I might come here more often for help. Thanks again for the valuable advice.
Fargate ECS supports awsvpc networking, which provides an IP from the VPC subnet the container is running in.
I don’t recall whether NLBs allow port ranges, but I doubt they do. You would indeed need to create a target group per port and then a listener per port. Nothing you would want to manage via ClickOps; this screams for automation.
But then again, I can understand that you want to stick to vanilla Docker, as ECS is indeed slightly different to use than Docker. Back in the day it took me some time to understand that I needed Cloud Map for service discovery.
Kubernetes is definitely the way to go. The last time I saw Swarm used in an enterprise context was roughly 5 years ago. Most of our clients seem to have agreed back then that Kubernetes is the way to go; all container strategies I have seen are based on Kubernetes.
Good to know. We’ve been playing around with Docker Compose for local testing, and when it was time to test our system in the cloud, Docker Swarm just felt like the easiest route. But our original intention was to leverage Kubernetes; we just didn’t have time to learn it.