I have a Docker Swarm with multiple managers and workers, because we want it to be as highly available as possible.
To reach our services, we have hostnames that point to one of our managers, and Traefik runs as a reverse proxy that takes care of the rest.
This works fine, but our issue is that we can only point our hostnames to one node. What if this node goes down? We could no longer reach the services through the hostname, even though they are still running.
I already asked in the Traefik forum but didn’t get an answer.
If you have a load balancer in your environment, just point it at the Traefik port of all nodes. It needs to be able to detect outages and remove unreachable targets.
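To illustrate what “detect outages and remove unreachable targets” means, here is a minimal sketch in Python. It is not how any particular load balancer implements health checks. The function names, the node names and port 443 are assumptions for the example, and a real load balancer would run such a check periodically rather than once.

```python
import socket
from typing import Callable, Iterable, List


def tcp_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


def healthy_targets(
    nodes: Iterable[str],
    port: int,
    probe: Callable[[str, int], bool] = tcp_port_open,
) -> List[str]:
    """Keep only the nodes whose Traefik port answers the probe."""
    return [node for node in nodes if probe(node, port)]


if __name__ == "__main__":
    # Hypothetical node names; replace them with your swarm nodes.
    print(healthy_targets(["manager1", "manager2", "worker1"], 443))
```

Traffic is then only sent to the nodes in the returned list. The `probe` parameter is there so the check can be swapped out, for example for an HTTP request against a health endpoint.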
If you don’t have a load balancer, use something that adds a failover IP to your nodes. I use keepalived for this. You can have a “multi active” keepalived configuration, where a single node has the active role and owns the failover IP. If the other keepalived nodes detect that the active node is unreachable, the failover IP switches to the next node, which then informs the switch that IP packets for the failover IP (more specifically, the MAC address behind it) are now reachable at this machine.
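The failover decision can be sketched roughly like this in Python. This is a simplified model for illustration only, not keepalived’s actual implementation. The node names and priority values are made up, and breaking ties by node name is an assumption of this sketch.

```python
from typing import Dict, Optional, Set


def elect_owner(priorities: Dict[str, int], reachable: Set[str]) -> Optional[str]:
    """Pick the reachable node with the highest priority as owner of the failover IP.

    Returns None if no node is reachable.
    """
    candidates = [node for node in priorities if node in reachable]
    if not candidates:
        return None
    # Highest priority wins; ties are broken by node name (assumption of this sketch).
    return max(candidates, key=lambda node: (priorities[node], node))


if __name__ == "__main__":
    # Hypothetical nodes and priorities.
    priorities = {"manager1": 150, "manager2": 100, "manager3": 50}

    # All nodes up: manager1 owns the failover IP.
    print(elect_owner(priorities, {"manager1", "manager2", "manager3"}))  # manager1

    # manager1 goes down: the failover IP moves to manager2.
    print(elect_owner(priorities, {"manager2", "manager3"}))  # manager2
```

In the real setup, the new owner additionally announces itself on the network so that the switch learns where the failover IP now lives, which this sketch does not model.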
I started with a containerized keepalived and quickly switched to an OS-level snap package. It has been more reliable since.