Docker Swarm HA load balancing

I’m currently experimenting with HA on my Raspberry Pi cluster. What happens when I point the DNS A record of mycoolwebsite.com to node1 (192.168.1.100)? As long as all of the nodes in the picture are up, Docker Swarm’s ingress network will route the request to a node that runs the container. But what happens if node1 goes down? The DNS record no longer works, because nothing is listening on that IP.
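
For context, the service is published on the swarm’s routing mesh roughly like this (service name, image and port are just placeholders):

```
# Publish port 80 via the ingress routing mesh: every swarm node
# listens on :80 and forwards the request to one of the service's
# tasks, no matter which node that task actually runs on.
docker service create \
  --name web \
  --replicas 3 \
  --publish published=80,target=80 \
  nginx
```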

I couldn’t find good documentation on this issue. Or is the assumption that people running a Docker Swarm put a reverse proxy like nginx, HAProxy or Traefik in front of it? Traefik looks really cool, but it would run on a manager node, and when that manager node goes offline you’re still left with nothing, right?


Bump.
Still haven’t figured out a solution for this problem.

This one might help you:

Also: people usually expose their swarm containers through a reverse proxy. None of them solves the floating IP problem for you, though. Still, it is quite nice to have a single entrypoint like Traefik that takes care of TLS offloading and of forwarding domain names or context paths to the target containers.
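
A rough sketch of what such a single entrypoint can look like as a swarm stack file (the image tag, router rule and demo service are assumptions, not a hardened setup):

```yaml
version: "3.8"

services:
  traefik:
    image: traefik:v2.10
    command:
      - --providers.docker.swarmMode=true
      - --providers.docker.exposedByDefault=false
      - --entrypoints.web.address=:80
    ports:
      - "80:80"
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro
    deploy:
      placement:
        constraints:
          - node.role == manager   # needs a manager's docker socket

  whoami:
    image: traefik/whoami
    deploy:
      labels:                      # swarm labels live under `deploy`
        - traefik.enable=true
        - traefik.http.routers.whoami.rule=Host(`mycoolwebsite.com`)
        - traefik.http.services.whoami.loadbalancer.server.port=80
```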

Hi @meyay, thanks for your reply. Traefik is a nice reverse proxy to use; the only downside is the single point of failure. It is possible to cluster Traefik, but then my question still remains: where do I point the DNS record?

This answer might be a step in the right direction: https://serverfault.com/questions/919349/how-to-setup-traefik-for-ha-need-a-reverse-proxy-in-front-of-traefik

Especially option 3 seems like a good solution for a Docker Swarm: restrict the placement of the Cloudflare DDNS container to manager nodes, and the DNS record will always point to a working swarm manager. The downside is that this doesn’t work in a local development environment, and it doesn’t give 100% uptime either, since the record is only updated after a failure and DNS caching delays the switch.
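
A minimal stack-file sketch of that idea (the updater image and its environment variables are assumptions; any container that rewrites the A record to the current host’s public IP would do):

```yaml
version: "3.8"

services:
  ddns:
    image: example/cloudflare-ddns   # hypothetical DDNS updater image
    environment:
      - CF_API_TOKEN=...             # Cloudflare API token
      - DOMAIN=mycoolwebsite.com     # record to keep pointed at the swarm
    deploy:
      replicas: 1
      placement:
        constraints:
          - node.role == manager     # only ever runs on a live manager
```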

My hope is that someone with real-world experience will share a strategy, or example use cases that are running in production.

uhm, did you read the link and access the keepalived container documentation?

Oh yes, of course, sorry, I forgot to mention that. I actually tried that today, before bumping this post, both with local VMs and with a floating IP on Hetzner. It worked great; it just takes a bit of time for the actual switch of the floating IP to happen.

Still hoping to explore more possible solutions to the problem.

In production, people rely on cloud load balancers like AWS ELBs, or on hardware load balancers.
In a test environment, I don’t see what’s wrong with keepalived…

Yes, you’re right, keepalived is perfect for local development. My Raspberry Pis are now running with keepalived and it’s awesome! The floating IP just switches over when one manager (in a three-manager cluster) goes offline.
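
For reference, a minimal keepalived.conf sketch for one of the managers (the interface name, router ID, priorities and VIP are assumptions for my network):

```
# /etc/keepalived/keepalived.conf on the preferred manager
vrrp_instance VI_1 {
    state MASTER             # the other two managers use state BACKUP
    interface eth0           # NIC that should carry the floating IP
    virtual_router_id 51     # must match on all three managers
    priority 150             # give the BACKUP nodes lower priorities
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass secret     # same shared secret on every node
    }
    virtual_ipaddress {
        192.168.1.200/24     # the floating IP the DNS record points to
    }
}
```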

I know :slight_smile: Once you grow out of Swarm and move to Kubernetes (Rancher’s k3s should be a good fit for the Pi), the keepalived solution can be replaced with MetalLB (https://metallb.universe.tf/), directly in Kubernetes.
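
In case it helps anyone later, a minimal MetalLB layer-2 sketch (the address range is an assumption for the Pi LAN; recent MetalLB versions use these CRDs, older ones used a ConfigMap instead):

```yaml
# Assumes MetalLB is already installed in the metallb-system namespace.
# Services of type LoadBalancer then get an IP from this pool, which
# MetalLB announces on the LAN much like keepalived announces a VIP.
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: pi-pool
  namespace: metallb-system
spec:
  addresses:
    - 192.168.1.240-192.168.1.250
---
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
  name: pi-l2
  namespace: metallb-system
spec:
  ipAddressPools:
    - pi-pool
```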