Docker swarm with one Google Compute Engine VM and one home computer as nodes

I’m trying to set up the swarm described in the title to run a MongoDB replica set, keeping a local copy of the database automatically.

I created firewall rules on Google Compute Engine allowing traffic on ports 2376, 2377 and 7946 for TCP and 7946 and 4789 for UDP, in both directions (ingress and egress), and I managed to join the home computer to the swarm without any problem.
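The rules were roughly like these (a sketch from memory; the rule names and source range are placeholders, and I'm assuming the default VPC network):

gcloud compute firewall-rules create swarm-tcp \
  --network default --direction INGRESS \
  --allow tcp:2376,tcp:2377,tcp:7946 \
  --source-ranges <home-public-ip>/32

gcloud compute firewall-rules create swarm-udp \
  --network default --direction INGRESS \
  --allow udp:7946,udp:4789 \
  --source-ranges <home-public-ip>/32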

But after deploying a stack, service discovery didn’t work and the containers on different nodes couldn’t communicate, not even using the container IPs from the default overlay network. On one try I could ping a service on another node, but only for a few seconds, and then it stopped.

Is that supposed to happen? What could be wrong?

Docker Swarm is not a client/server type of setup. It uses the Raft protocol to synchronize state between nodes. While manager nodes can change the global state, worker nodes can only receive it. The Raft quorum requires n/2+1 healthy manager nodes to reach consensus about changes, and it usually requires low-latency network connections to work reliably. Most consensus algorithms are built this way.

So if both of your nodes are manager nodes, latency could result in frequent leader elections, rendering both managers useless until a leader is elected. If your remote node were a worker node, at least the manager node would still be able to manage itself when the remote node is considered unreachable. With both being managers, the quorum is n/2+1 = 2, so one of the two nodes becoming unreachable results in a cluster that won’t be able to change state.
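If you want to keep the remote node in the swarm anyway, something along these lines should keep only the cloud VM as a manager (node name, token and addresses are placeholders):

# on the manager (cloud VM): demote the home node if it was promoted
docker node demote <home-node>

# or re-join it as a worker from scratch: print the worker join command on the manager ...
docker swarm join-token worker

# ... and run the printed command on the home node
docker swarm join --token <worker-token> <manager-public-ip>:2377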

Creating overlay networks on high-latency connections also doesn’t seem like something that helps reliability. Though I am not really sure this is a problem: 20 years ago peer-to-peer networks (torrents and such) already worked, and still work, with logical overlay networks on top of physical networks over high-latency connections … so maybe this isn’t a problem at all. But Raft vs. high latency is a big one.

Overall, if you want the cluster to be reliable, put all nodes in a nearby network. Otherwise you might end up being challenged by unreliable behavior.

Thanks for the quick reply.

Yes… I tried to define the home node as a manager and it didn’t work. So… this is about latency? Not some complex firewall or network configuration? Right now the Google data center and my computer are in two different countries, but would it work if they were in the same city?

For now I’m just testing and playing… but I would deploy the app on a three-node swarm, with three MongoDB instances, each VM in a different data center for reliability. These data centers would be close, with low latency between them. I thought about this setup because: 1) I could scale if things start to grow; 2) it is reliable; 3) service discovery is great in that I don’t need to hard-code IPs; 4) I can use the overlay to connect to services, instead of installing MongoDB directly on the VMs and opening their ports; 5) I could deploy new versions of the app without stopping everything. Does this setup seem good or bad?

According to your OP, you opened the required ports.

Are you sure that you actually deployed an overlay network? Since you didn’t post any of the commands or the compose.yml files you used, there is plenty of room for assumptions. People tend to abstract what they think is sufficient to describe their case and usually end up hiding the relevant details.
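A quick way to verify would be something like this (the stack name is just a placeholder); the network your stack uses should show the overlay driver and swarm scope:

docker network ls --filter driver=overlay
docker network inspect <stack-name>_default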

What’s wrong with using multiple AZs from a cloud provider of choice in the same region? Usually the links between their AZs are fast, with low latency. Though, be sure your cloud provider has the understanding of AZs that you are looking for. Some declare a DC on the other side of the street as a separate AZ. From a marketing perspective this helps to show a high AZ count, though I would not really consider it reliable against disaster scenarios.

Having your DCs in the same city is no guarantee of low latency. If the DCs are owned by the same provider, it is highly likely they will have a fast, low-latency link between the sites. Even if the DCs are operated by different providers, using the same professional providers for failover internet connections will most likely result in lower latency (especially if all providers use the same backbone).

Multi-AZ clusters (for instance with AWS) are feasible. Swarm is simple and easy to learn; that’s what makes it so great! Your case is covered by Swarm’s comfort zone :slight_smile:

Thanks again for the tips. I hadn’t even considered the latency between the DCs.

It was an overlay, because I deployed with “docker stack deploy -c docker-compose.yml my_stack” without specifying any network, and the default network for this command is an overlay… I would consider creating a custom encrypted network for the deployment, but I haven’t yet. The compose file was this one:

version: '3'
services:

  #primary
  mongo1:
    image: mongo:4.0.10-xenial
    deploy:
      replicas: 1
      placement:
        constraints:
            - node.hostname == node1
      restart_policy:
        condition: on-failure
    volumes:
      - db:/data/db
    networks:
      default:
    command: mongod --replSet "rs0" --bind_ip 0.0.0.0

  #arbiter
  mongo2:
    image: mongo:4.0.10-xenial
    deploy:
      replicas: 1
      placement:
        constraints:
            - node.hostname == node1
      restart_policy:
        condition: on-failure
    volumes:
      - arb:/data/db
    command: mongod --replSet "rs0" --bind_ip 0.0.0.0

  #local secondary
  mongo3:
    image: mongo:4.0.10-xenial
    deploy:
      replicas: 1
      placement:
        constraints:
            - node.hostname == node2
      restart_policy:
        condition: on-failure
    volumes:
      - db_sec:/data/db
    command: mongod --replSet "rs0" --bind_ip 0.0.0.0

volumes:
  db:
  arb:
  db_sec:

Inside the mongo1 container I could ping mongo2, but not mongo3, even though the latter started normally on node2.

I would paste some “inspect” results, but I’ve already shut it down.

I’m more focused on development and I don’t know much about networks, so forgive me if I’m asking something too basic. What I understand about overlay networks is that Docker creates something like a tunnel between the nodes, so that I don’t need to use the VM’s public IP or the domain. Is that right?

The ping result was:

PING <vm-public-ip> (<vm-public-ip>): 56 data bytes
64 bytes from <vm-public-ip>: icmp_seq=0 ttl=56 time=141.472 ms
64 bytes from <vm-public-ip>: icmp_seq=1 ttl=56 time=141.061 ms
64 bytes from <vm-public-ip>: icmp_seq=2 ttl=56 time=140.958 ms
^C--- <vm-public-ip> ping statistics ---
3 packets transmitted, 3 packets received, 0% packet loss
round-trip min/avg/max/stddev = 140.958/141.164/141.472/0.222 ms

Is this too much? ~140ms?

Your compose.yml neither declares a network private to the stack, nor declares that an existing network should be used. As you wrote yourself, a default network private to the stack is used as a fallback. This is not really a problem.

If you really want it to be encrypted, you need to declare your network and also allow ESP traffic (IP protocol 50) through your firewall (the anchor of my last link seems broken, scroll to the bottom).
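A sketch of what that could look like (the network and rule names are just examples; the ESP rule assumes the GCE firewall setup from your first post):

# create an encrypted overlay network up front
docker network create --driver overlay --opt encrypted mongo_net

# allow ESP (IP protocol 50) between the nodes on GCE
gcloud compute firewall-rules create swarm-esp \
  --network default --direction INGRESS \
  --allow esp --source-ranges <home-public-ip>/32

The services in your stack would then have to reference that pre-created network as an external network instead of relying on the stack’s default one.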

An overlay network is another network spun up on top of the real networks. Services attached to an overlay network can communicate with other services in the same overlay network without having to know anything about the physical network. Each overlay network has its own DNS server and name resolution. As the service/container IP will change on every (re)start, you will want to use the service/container names, and never, never the service/container IP!
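Applied to your stack, that means initializing the replica set with the service names instead of IPs; roughly like this (a sketch only, run from inside one of the mongo containers, 27017 being the default port):

mongo --eval 'rs.initiate({
  _id: "rs0",
  members: [
    { _id: 0, host: "mongo1:27017" },
    { _id: 1, host: "mongo2:27017", arbiterOnly: true },
    { _id: 2, host: "mongo3:27017" }
  ]
})'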

I don’t know what thresholds Docker uses for heartbeat and consensus timeouts. According to Wikipedia, typical values for Raft implementations are around 150 to 300 ms. If you put blockchains aside, there are not many distributed consensus algorithms designed to work reliably across DCs (Egalitarian Paxos and Hashgraph are among the few). High latency IS a problem for the most widely used algorithms (Raft, Paxos) - 140 ms is far from low.

Being a developer does not prevent you from building up holistic knowledge about the needs of your application and what needs to be done to satisfy them. I consider myself JEOPS (just enough ops to get the stuff that my team and I develop up and running).

Let me repeat myself: high latency is a problem; your current setup will be unreliable at best.