Docker container replication in different geographical regions

Heeey Y’all,

I own three bare-metal Proxmox servers in three different geographical locations: A, B and C. Each of these machines has a Docker VM installed on it. The goal is to run a high-availability application, Nextcloud for instance, on locations A and B. Location C will host a load balancer to distribute traffic between locations A and B. The question is: what would be the best and simplest way to get an active-active or active-passive setup between locations A and B?

  1. One Kubernetes cluster whose nodes are spread over locations A and B? (Latency issues?)
  2. A cluster of Kubernetes clusters? That is, a Kubernetes cluster at location A that is replicated to location B and vice versa. This could include, for instance, MariaDB replication and storage replication with Longhorn or Syncthing.
  3. One Docker container at location A, replicated to location B.
  4. A VM running the application, e.g. Nextcloud, replicated to location B instead of a container.

Please let me know what you guys think is best and simplest to set up: 1, 2, 3, 4, or something completely different?

Thank you!!!

How you can replicate an application depends on the application. A proxy server, a web app with uploaded files, and a database are completely different.

If you can replicate an app without Docker, and the app also supports running in a container, you can replicate it with Docker too, simply using Docker Compose. Is it the easiest? Probably, if you already know how to replicate it properly without Docker. Is it the best? Not sure.
Whether you need a VM depends on what the app needs. If you can run it in a container and Docker does not conflict with any other app on the host, feel free to use Docker. Otherwise you may need a VM, but that does not mean you can't use Docker inside it too, for easier deployment.
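To illustrate the Compose route, here is a minimal, hypothetical sketch of running multiple replicas of a containerized app on one Docker host. The image, port, and volume names are placeholders, not a tested Nextcloud setup; cross-location replication of the data would still have to happen at the application or storage level.

```yaml
# Hypothetical compose file: several replicas of a containerized web app
# behind whatever load balancer you put in front of the host.
services:
  app:
    image: nextcloud:latest
    deploy:
      replicas: 2        # honored by "docker compose up" in Compose v2
    ports:
      - "80"             # ephemeral host port per replica (no fixed port,
                         # otherwise the replicas would collide)
    volumes:
      - appdata:/var/www/html
volumes:
  appdata:
```

Alternatively, `docker compose up -d --scale app=2` achieves the same without the `deploy` section.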

On Kubernetes, you could use node labels and scheduling rules to place replicas in different regions. But if you already know how to replicate with Docker and you don't know Kubernetes well, don't create a Kubernetes cluster just for replicating some services. If you are good at it, or you have someone who is, then sure, why not. Especially if you already have Kubernetes clusters.
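As a rough sketch of what such scheduling rules look like (assuming nodes carry the well-known `topology.kubernetes.io/region` label; the app name and image are placeholders):

```yaml
# Hypothetical Deployment fragment: spread replicas evenly across regions.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nextcloud
spec:
  replicas: 2
  selector:
    matchLabels:
      app: nextcloud
  template:
    metadata:
      labels:
        app: nextcloud
    spec:
      topologySpreadConstraints:
        - maxSkew: 1
          topologyKey: topology.kubernetes.io/region
          whenUnsatisfiable: DoNotSchedule   # refuse to schedule unbalanced
          labelSelector:
            matchLabels:
              app: nextcloud
      containers:
        - name: nextcloud
          image: nextcloud:latest
```

Note this only spreads pods; it says nothing about keeping their data in sync.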

Or, since this is a Docker forum, there is also Swarm, which I don't use, so I can't say if it is good for what you want in practice.


Kubernetes and Swarm use the Raft consensus algorithm, which is designed to work in low-latency networks. Spanning clusters across regions means no low-latency network.

Individual installations per region, with replication at the application level, could work if done right.

Though, active <> active setups usually require the application to have built-in support for it. E.g., if data and file uploads were stored in something like an AWS DynamoDB global table, the data would be replicated to each region in < 1 sec (if I am not mistaken).


Somehow my brain converted “geographical regions” to “specific nodes” in my answer :slight_smile: But maybe some kind of multi-cluster application would work, if there are existing clusters already?


@rimelek @meyay Thank you both for your replies! So Docker Swarm and Kubernetes clusters both don't work across regions. However, I simply don't have enough knowledge to understand why that is a problem. I mean, if the latency is, let's say, 1 second, then it will just take one second for a node to synchronize with the other nodes. Please help me with my misunderstanding.

I also read that it is not recommended to containerise your database if you are replicating it. Any thoughts on this?

Also, what kind of setup would I need to replicate one cluster to another cluster? I have a hard time finding a solution for this on the internet. Is this different for each application, or is there a general setup that automatically replicates a change in one cluster to the other? In the case of Kubernetes (RKE2) for instance, would the following be enough?

  1. Persistent volumes, e.g. Longhorn, for data storage?
  2. Replicating the databases the applications use, e.g. a MariaDB Galera cluster for Nextcloud?
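For reference, the Galera option boils down to a handful of `wsrep` settings on each node. A minimal, hypothetical single-node fragment (host names and the password are placeholders; each location would run one such node):

```yaml
# Hypothetical fragment: one MariaDB Galera node as a container.
# wsrep_cluster_address must list the nodes in both locations.
services:
  mariadb:
    image: mariadb:11
    environment:
      MARIADB_ROOT_PASSWORD: changeme
    command:
      - --wsrep_on=ON
      - --wsrep_provider=/usr/lib/galera/libgalera_smm.so
      - --wsrep_cluster_name=nextcloud-cluster
      - --wsrep_cluster_address=gcomm://node-a.example,node-b.example
      - --binlog_format=ROW
      - --default_storage_engine=InnoDB
    ports:
      - "3306:3306"   # plus 4567/4568/4444 for Galera replication traffic
```

Keep in mind Galera is synchronous: every write waits for certification across regions, so write latency grows with the inter-region round trip.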

Finally, in the case of not using Kubernetes or Swarm, how do you guys feel about running all Docker applications on one big VM vs. running each Docker container on a separate VM? I ask because if something goes wrong on my big VM, all my services will drop, vs. separating them.

Sorry for all the questions; maybe this is out of scope for this particular post :wink:

We didn't mean just traffic between your pod replicas. The cluster itself requires low latency to operate: etcd, for example, defaults to a 100 ms heartbeat interval and a 1 s election timeout, so cross-region round trips of hundreds of milliseconds mean missed heartbeats and constant leader re-elections. If you search for “multi-region kubernetes” on the internet, you will find some thoughts about why it is not recommended. You will also find projects that are already archived, and recommendations for multi-cluster solutions, often recommending some kind of service mesh like Istio for multi-cluster networking and load balancing: Istio / Deployment Models

But again, since it is a Docker forum, not a Kubernetes forum, I recommend visiting the Kubernetes forum for discussing Kubernetes questions: https://discuss.kubernetes.io/

Do you normally create a separate VM for every single service? You could run many things in containers. Just because something runs in a container doesn't mean it needs very strong isolation. Have enough resources on the VM, and use separate VMs when, for example, different apps require different kernel optimizations, kernel modules, or anything that cannot be done in a container, which is just a process running on the host without knowing about the whole environment.

I feel your questions are general and we can't always give you a general answer. Using a single VM for many very different apps could be a bad idea, but using a separate VM for each could be a waste of time and, mainly, resources. You wouldn't run a single process on every Kubernetes node either, although sometimes it is required.

Depends on the database and the people who talk about it :slight_smile: There are container-based production solutions for databases, but if a database server was made for running on its own machine, and most guides describe that, running it in a container can have a big effect on how the database server works or replicates and how you operate it. And since databases have to handle a lot of data efficiently, it is extremely important to use appropriate storage for it. How databases handle storage can also differ. It would not be different without containers, but since a container means isolation and restricted capabilities, if you don't know what the database needs, there is a bigger chance of making mistakes. So it is not enough to know how to start a container; you need to know what the app inside the container requires. Which can sometimes require a database specialist. But it also depends on where you run it and for what purpose.
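The most common concrete advice amounts to keeping the data directory off the container's overlay filesystem and giving the database predictable resources. A minimal, hypothetical sketch (image, path, and limits are placeholders to adjust):

```yaml
# Hypothetical fragment: MariaDB in a container, with its data directory
# on a dedicated bind mount (fast local disk) instead of the overlay
# filesystem, and a memory limit matched to the buffer pool size.
services:
  db:
    image: mariadb:11
    environment:
      MARIADB_ROOT_PASSWORD: changeme
    volumes:
      - /mnt/fast-ssd/mariadb:/var/lib/mysql   # data dir on a real disk
    deploy:
      resources:
        limits:
          memory: 4g    # keep in line with innodb_buffer_pool_size
```

This avoids the worst storage pitfalls, but it does not replace knowing what the database itself needs for replication.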


Thank you again so much @rimelek,

I know it is a Docker forum indeed, but you guys seem to be the only ones who respond. I did ask the same questions on the Kubernetes forum, but nobody replied. Since Kubernetes is Docker-related, I tried it here, and I was lucky that you at least replied.

I agree with you that in IT it always depends haha on what is optimal. Maybe from a safety point of view, each application in a different VM is the best way, since it does not affect other applications if something happens to it. But as you said, it does take up a lot more unnecessary resources. I will try to figure out what the right balance is.

Cheers!!

Aren’t there any blueprints available from the NextCloud community?

One thing is for sure: even if you are in full control of the application code, designing and running a highly available application with active instances in multiple regions is not an easy task if persistence needs to be synced across the instances.

If the goal is just high availability, and you can live with a single k8s cluster, things would be much easier to operate reliably. The biggest drawback would be latency for end users in other regions, which could be mitigated by using a content delivery network that has a low-latency, high-bandwidth connection to the location of your k8s cluster.

I have no idea whether Longhorn allows cross-cluster replication.