Hi, I have two data centers, and I want to set up a Docker Swarm cluster with fault tolerance and high availability. I’ve created three nodes, two in the first data center and the third in the second data center. There will be three manager nodes. However, I’m concerned that if the first data center goes down, the entire cluster might stop working. How can I set up high availability across two data centers with three nodes?
I don’t have experience in HA Swarm clusters, so everyone feel free to correct me if I’m wrong. My answer will be a general one.
Everything depends on the level of HA you want to achieve. For HA managers you need at least three manager nodes. Ideally those nodes would only be managers and not workers, so you would need more than three nodes to run workers too. Let’s say, though, that your nodes act as both managers and workers.
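As a rough sketch of what that looks like (the IP and node names below are placeholders, not from your setup), forming a three-manager swarm and keeping the managers free of workloads could be done like this:

```bash
# On the first node (placeholder IP):
docker swarm init --advertise-addr 10.0.1.11

# Print the join command for additional managers, then run it on node 2 and node 3:
docker swarm join-token manager

# Optional: drain the managers so they only manage the cluster; extra worker nodes
# (joined via "docker swarm join-token worker") would then run the actual containers.
docker node update --availability drain manager1
docker node update --availability drain manager2
docker node update --availability drain manager3
```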
You would have two nodes in one datacenter and one in the other. If the link between the two datacenters breaks, the containers in the single-node datacenter would probably keep running, but that node could no longer act as a manager. The two nodes in the other datacenter would still have quorum and work as managers, but could not change anything in the single-node datacenter. Containers that need to communicate across the datacenters would fail as well.
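To make that concrete (node names and labels here are made up): you can label each node with its datacenter and pin a service to one of them, so its containers never depend on the inter-datacenter link:

```bash
# Label the nodes with the datacenter they live in (placeholder names):
docker node update --label-add datacenter=dc1 node1
docker node update --label-add datacenter=dc1 node2
docker node update --label-add datacenter=dc2 node3

# Constrain a service to one datacenter so its replicas do not talk across the link:
docker service create --name app --replicas 2 \
  --constraint 'node.labels.datacenter==dc1' \
  nginx:alpine
```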
Let’s say something happens in one datacenter, for example a power loss or a fire, and the nodes cannot be restored. If the problem happens in the single-node datacenter, you lose that node and its containers, but you can add a new node to the remaining two managers and reschedule the missing containers.
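Roughly like this, assuming the lost node was one of the managers (node name, token, and IP are placeholders):

```bash
# On one of the two surviving managers:
docker node ls                      # the lost node shows up as Down / Unreachable
docker node demote lost-node        # demote it first, since it was a manager
docker node rm --force lost-node    # then remove it from the cluster

# Print the join command for the replacement node:
docker swarm join-token manager

# On the new node, run the printed command, e.g.:
docker swarm join --token <manager-token> 10.0.1.11:2377

# Replicated services are rescheduled automatically once capacity is available again.
```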
If something happens in the two-node datacenter that cannot be restored, you have lost your cluster.
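One caveat I am not sure applies in every case, so take it only as a pointer: if the surviving node in the other datacenter was itself a manager, Docker documents a last-resort recovery that rebuilds a one-manager cluster from that node’s local copy of the swarm state (other nodes have to rejoin afterwards):

```bash
# Disaster-recovery command documented by Docker; run it on the surviving manager only.
# It forms a new single-manager cluster from the local Raft data (placeholder IP).
docker swarm init --force-new-cluster --advertise-addr 10.0.2.21
```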
Of course, a cluster is not highly available if only the managers are HA but the applications, like a database, are not, and the requirements for those could be different.
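To illustrate the difference (service names, images, and the node name below are just examples): a stateless service can simply be replicated by Swarm, while a stateful one like a database typically needs its own replication on top:

```bash
# Stateless: Swarm can freely reschedule and scale these replicas across nodes.
docker service create --name web --replicas 3 \
  --publish published=80,target=80 nginx:alpine

# Stateful: a single replica pinned to one node with a local volume; if that node
# dies, Swarm cannot move the data, so HA has to come from database-level replication.
docker service create --name db --replicas 1 \
  --constraint 'node.hostname==node1' \
  --mount type=volume,source=db-data,target=/var/lib/postgresql/data \
  -e POSTGRES_PASSWORD=example \
  postgres:16
```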
So I guess you could practice HA mode across two datacenters using three nodes, and it would probably be better than a single datacenter, but it would not be truly highly available in every sense. Not to mention that if the network between the two datacenters is not fast enough, it could even slow your cluster down or make your applications unstable.
So whether it is good for you or not depends on what you want to use it for and what you want to run in the cluster.
The Raft consensus algorithm used by Docker Swarm (and Kubernetes) requires low-latency network connections in order to work reliably. For instance, it works reliably across availability zones in the same region of a cloud provider, but does not work reliably across regions of the same cloud provider due to the higher latency.
Let’s assume for a minute your datacenters have a low latency network connection, so that it doesn’t really matter if all nodes are in the same DC or not.
A single surviving node of a three-node cluster will always be headless. Raft requires at least floor(n/2)+1 healthy manager nodes for quorum on state changes within the cluster.
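A quick way to see what that formula means for different cluster sizes (plain shell arithmetic, nothing Swarm-specific):

```bash
# quorum = floor(n/2) + 1, tolerated manager failures = n - quorum
for n in 1 3 5 7; do
  quorum=$(( n / 2 + 1 ))
  echo "managers=$n  quorum=$quorum  tolerated failures=$(( n - quorum ))"
done
# With 3 managers the quorum is 2, so losing the two-node datacenter leaves the
# remaining single node headless.
```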
Like @rimelek wrote: in a headless cluster containers will keep running, but overlay traffic to the other nodes can’t succeed. And of course no state change can be applied to the cluster.
Are you sure trying to solve this challenge on the container orchestrator level is the right approach?
I appreciate your insights. Let me outline my current project: I have two datacenters operating at Layer 2, with a monitoring server that includes Prometheus, Grafana, etc. I’m aiming to Dockerize this setup and ensure high availability using Docker Swarm.
My plan involves setting up a cluster with two nodes and keeping one of them paused as a passive standby. However, I would prefer to establish a standard cluster with three nodes. My concern is that if one datacenter, housing two nodes, goes down, the third node might struggle to reach quorum, leading to a cluster outage.
I’m seeking advice on how to handle this scenario effectively. Any recommendations or insights would be greatly appreciated.
If you run two manager nodes, turning off one of them will render the cluster headless. There is a reason why I mentioned it in my previous response.
This is not a “might” situation → it is guaranteed.
Please define outage. We already shared what will happen.
There is not much I can recommend. You cannot split an odd number of manager nodes across an even number of datacenters and have HA in both datacenters independently. It is not designed for that. You could run a separate cluster in each datacenter and handle the problem with replication, if the application supports it.
It’s been a couple of years since I have seen anything other than Kubernetes clusters running across three availability zones in a cloud region, typically combined with autoscaling groups that allow the cluster to create/destroy nodes when necessary.
I still believe neither Swarm, nor Kubernetes alone are the answer to your challenge.
Thank you for your response. Okay, I will attempt to set up a two-node configuration, with one node as active and the other as passive.
I believe I might face challenges setting up a Kubernetes cluster in my particular case. Can you please provide guidance or correct me if I’m mistaken?
If you have two data centers and require a quorum, why not add a 3rd manager from an independent cloud VM?
Even if you do have the HA Docker Swarm setup, how do you ensure availability for external clients? Do you have a load balancer or reverse proxy in front of your workers?
The main question is what you want to achieve. You want to make your setup survive even a full datacenter outage, not only a server outage?
Then you need to have a third “datacenter”. Instead of a full datacenter you could just use a cloud VM (a virtual machine from a third provider) to host a third Docker manager node, so that the surviving datacenter together with that VM still has a majority.
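A minimal sketch of what adding that tie-breaker VM could look like (token, IP, and node name are placeholders). Note that it still needs a reasonably low-latency connection to both datacenters, as mentioned above regarding Raft:

```bash
# On any existing manager, print the manager join command:
docker swarm join-token manager

# On the cloud VM, join as the third manager (run the command printed above):
docker swarm join --token <manager-token> <existing-manager-ip>:2377

# Optionally keep the VM as a pure tie-breaker that never runs workloads:
docker node update --availability drain cloud-arbiter
```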
Recently I saw a nice diagram of how a startup created redundancy; it may have been in this forum. They used AWS and Google for a full setup each, plus Azure for services requiring a quorum.
I would say with 2 data centers you can’t really do HA across data centers.
It may seem possible with something like PostgreSQL, which can run in primary/standby (formerly master/slave) replication.
But if internet connectivity between the DCs fails, each DB may end up acting as primary. You might write data to the former standby. When connectivity is back, how do you sync between two primaries?
You need to ensure HA for all components (LB, proxy, app, DB), but some use Raft and need three instances. I would say if you want HA across DCs, you need a third DC (or at least an additional tiny node to act as the tie-breaker for quorum).