Docker Swarm HA load balancing

I’m currently experimenting with HA on my Raspberry Pi cluster. What happens when I point the DNS A record of mycoolwebsite.com to node1 (192.168.1.100)? As long as all of the nodes in the picture are up, Docker Swarm’s ingress network will route the request to a node that runs the container. But what happens if node1 goes down? The DNS record no longer works, because nothing is listening on that IP.
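
For context, the service is published on the swarm’s routing mesh roughly like this (service name, image and port are just placeholders):

```
# Publish port 80 via the ingress routing mesh: every swarm node
# listens on :80 and forwards the request to one of the service's
# tasks, no matter which node that task actually runs on.
docker service create \
  --name web \
  --replicas 3 \
  --publish published=80,target=80 \
  nginx
```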

I couldn’t find good documentation on this issue. Or is the assumption that people running a Docker Swarm put a reverse proxy like nginx, HAProxy or Traefik in front of it? Traefik looks really cool, but it would run on a manager node, and when that manager node goes offline you’re still left with nothing, right?


Bump.
Still haven’t figured out a solution for this problem.

This one might help you:

Also: people usually expose their swarm containers through a reverse proxy. None of them solves the floating IP problem for you, though. Still, it is quite nice to have a single entrypoint like Traefik that takes care of TLS offloading and of forwarding domain names or context paths to the target containers.
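
A rough sketch of what such a single entrypoint can look like as a swarm stack file (the image tag, router rule and demo service are assumptions, not a hardened setup):

```yaml
version: "3.8"

services:
  traefik:
    image: traefik:v2.10
    command:
      - --providers.docker.swarmMode=true
      - --providers.docker.exposedByDefault=false
      - --entrypoints.web.address=:80
    ports:
      - "80:80"
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro
    deploy:
      placement:
        constraints:
          - node.role == manager   # needs a manager's docker socket

  whoami:
    image: traefik/whoami
    deploy:
      labels:                      # swarm labels live under `deploy`
        - traefik.enable=true
        - traefik.http.routers.whoami.rule=Host(`mycoolwebsite.com`)
        - traefik.http.services.whoami.loadbalancer.server.port=80
```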

Hi @meyay, thanks for your reply. Traefik is a nice reverse proxy to use; the only downside is the single point of failure. It is possible to cluster Traefik, but then my question still remains: where do I point the DNS record?

This answer might be a step in the right direction: https://serverfault.com/questions/919349/how-to-setup-traefik-for-ha-need-a-reverse-proxy-in-front-of-traefik

Especially option 3 seems like a good solution for a Docker Swarm: restrict the placement of the Cloudflare DDNS container to manager nodes, and the DNS record will always point to a working swarm manager. The downside is that this doesn’t work in a local development environment, and it doesn’t give 100% uptime either, since the record is only updated after a failure and DNS caching delays the switch.
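
A minimal stack-file sketch of that idea (the updater image and its environment variables are assumptions; any container that rewrites the A record to the current host’s public IP would do):

```yaml
version: "3.8"

services:
  ddns:
    image: example/cloudflare-ddns   # hypothetical DDNS updater image
    environment:
      - CF_API_TOKEN=...             # Cloudflare API token
      - DOMAIN=mycoolwebsite.com     # record to keep pointed at the swarm
    deploy:
      replicas: 1
      placement:
        constraints:
          - node.role == manager     # only ever runs on a live manager
```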

My hope is that someone with real-world experience will share a strategy, or example use cases that are running in production.

uhm, did you read the link and access the keepalived container documentation?

Oh yes, of course, sorry, I forgot to mention that. I actually tried that today, before bumping this post, both with local VMs and with a floating IP on Hetzner. It worked great; it just takes a bit of time for the actual switch of the floating IP to happen.

Still hoping to explore more possible solutions to the problem.

In production, people rely on cloud load balancers like AWS ELBs, or on hardware load balancers.
In a test environment, I don’t see what’s wrong with keepalived…

Yes, you’re right, keepalived is perfect for local development. My Raspberry Pis are now running with keepalived and it’s awesome! The floating IP just switches over when one manager (in a three-manager cluster) goes offline.
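
For reference, a minimal keepalived.conf sketch for one of the managers (the interface name, router ID, priorities and VIP are assumptions for my network):

```
# /etc/keepalived/keepalived.conf on the preferred manager
vrrp_instance VI_1 {
    state MASTER             # the other two managers use state BACKUP
    interface eth0           # NIC that should carry the floating IP
    virtual_router_id 51     # must match on all three managers
    priority 150             # give the BACKUP nodes lower priorities
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass secret     # same shared secret on every node
    }
    virtual_ipaddress {
        192.168.1.200/24     # the floating IP the DNS record points to
    }
}
```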

I know :slight_smile: Once you grow out of Swarm and move to Kubernetes (Rancher’s k3s should be a good fit for the Pi), the keepalived solution can be replaced with MetalLB (https://metallb.universe.tf/), directly in Kubernetes.
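
In case it helps anyone later, a minimal MetalLB layer-2 sketch (the address range is an assumption for the Pi LAN; recent MetalLB versions use these CRDs, older ones used a ConfigMap instead):

```yaml
# Assumes MetalLB is already installed in the metallb-system namespace.
# Services of type LoadBalancer then get an IP from this pool, which
# MetalLB announces on the LAN much like keepalived announces a VIP.
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: pi-pool
  namespace: metallb-system
spec:
  addresses:
    - 192.168.1.240-192.168.1.250
---
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
  name: pi-l2
  namespace: metallb-system
spec:
  ipAddressPools:
    - pi-pool
```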