Docker Swarm and 3 different compose files

I’m thinking about dockerizing an app on a 4 worker node swarm.
It would consist of:

  • PartnerA-compose-limited-to-single-active-node.yml
  • PartnerB-compose-to-run-on-all-4-nodes.yml
  • MultiPartner-compose-to-run-on-all-4-nodes.yml

The app stacks in these compose files are identical, but PartnerA has a limitation of only one active TCP connection to their host, and MultiPartner and PartnerB have different databases (I’d like to keep the stacks separate). Services in each of the three compose files should be connected to their own respective RabbitMQ, so Node1.PartnerB.service1 should not be allowed to connect to the Node2.PartnerB.rabbitmq service, and vice versa.
Each stack should be self-contained, with one instance of each service per stack, all connected to the stack’s own RabbitMQ service.

So basically I’m asking: is it possible to set up a stack per node, with its own RabbitMQ service as the pivot point?
The PartnerA stack should only run on one node at a time and fail over to an available node if its node goes down, while the PartnerB and MultiPartner stacks run on all 4 nodes, completely separated, each with its own RabbitMQ.

You can set up multiple stacks, give each its own project name, and even use separate env files. To limit a stack to one node, just use placement constraints with the hostname.
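
A minimal sketch of what the single-node part could look like; the image name and hostname below are placeholders, and each compose file gets deployed under its own stack name, e.g. docker stack deploy -c PartnerA-compose-limited-to-single-active-node.yml partner_a:

version: "3.9"
services:
  svc1:
    image: your_partner_a_image   # placeholder
    deploy:
      # a single replica means a single active instance at any time
      replicas: 1
      placement:
        constraints:
          # pin the whole PartnerA stack to one specific node;
          # drop this if the task should be free to fail over to any node
          - node.hostname == myhost1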

But if I set replicas: 4 for all services in the stack, how do I make each set of replicas (one replica of each service on a node) connect to the same RabbitMQ? I was thinking of using max_replicas_per_node, because I need one replica of each service to connect to one RabbitMQ. I was hoping I could somehow pass the RabbitMQ host name to the replica, but if nothing else works I’ll try mapping the node hostname {{.Node.Hostname}} to the name of the RabbitMQ vhost.

node1     node2     node3     node4
rabbit1   rabbit2   rabbit3   rabbit4
r1.svc1   r2.svc1   r3.svc1   r4.svc1
r1.svc2   r2.svc2   r3.svc2   r4.svc2
r1.svc3   r2.svc3   r3.svc3   r4.svc3
r1.svc4   r2.svc4   r3.svc4   r4.svc4

Each swarm stack will have its own node-spanning overlay network. Every container created by a service task attached to the same overlay network will be able to communicate with every other service in the same network using DNS-based service discovery, regardless of which node it runs on.
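
As a small illustration (the image and network names below are made up): with both services attached to the same overlay network, svc1 can reach the broker simply by its service name, no matter which nodes the tasks land on:

version: "3.9"
services:
  rabbitmq:
    image: rabbitmq:3
    networks:
      - mynet
  svc1:
    image: your_image   # placeholder
    environment:
      - rabbitmq_host=rabbitmq   # resolved via the overlay network's built-in DNS
    networks:
      - mynet

networks:
  mynet:
    driver: overlay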

If you want all services of a stack to be placed on a specific node, you will need to use placement constraints on each of the services to make sure they are pinned to a host. The service names inside the stacks do not have to be different (unless the stacks share a network, which could be mitigated by using an alias for the service in the shared network).
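
If two stacks with identical service names do end up on a shared network, an alias keeps them addressable; a hypothetical example:

version: "3.9"
services:
  svc1:
    image: your_image   # placeholder
    networks:
      shared_net:
        aliases:
          - partner_a_svc1   # unique name on the shared network

networks:
  shared_net:
    external: true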

If you share one of your stacks, we can take a look at the compose file and make recommendations based on it. It can’t get more precise/less ambiguous than that :slight_smile:

This is the stack for which I’d like to have one of each service running per node, with each “replica set” (by that I mean one set of the replicated services) on a separate node, so I can name/create a RabbitMQ vhost per node name. The RabbitMQ is running in a different stack inside the same network.
It’s a bit of a closed-circuit system: only one instance of each stack service should be running against one RabbitMQ vhost.

version: "3.9"
services:
  logger:
    image: private_image_1
    isolation: 'process'
    environment:
      - rabbitmq_host=rabbitmq-1
      - rabbitmq_port=5672
      - rabbitmq_username=******
      - rabbitmq_password=******
      - rabbitmq_vhost=/     
    deploy:
      endpoint_mode: dnsrr
      # mode: global
      replicas: 1      
      restart_policy:
        condition: any
      placement:
        max_replicas_per_node: 1
        constraints:
          - node.labels.os == windows
    networks:
      - net2

  svc1:
    image: private_image_2
    isolation: 'process'
    environment:
      - rabbitmq_host=rabbitmq-1
      - rabbitmq_port=5672
      - rabbitmq_username=******
      - rabbitmq_password=******
      - rabbitmq_vhost=/
    deploy:
      # mode: global
      replicas: 1      
      restart_policy:
        condition: any
      placement:
        max_replicas_per_node: 1
        constraints:
          - node.labels.os == windows
    networks:
      - net2

  svc2:
    image: private_image_3
    isolation: 'process'
    ports:
      - 5455:5454
      # - target: 5454
      #   published: 5455
      #   protocol: tcp
      #   mode: host
    environment:
      - rabbitmq_host=rabbitmq-1
      - rabbitmq_port=5672
      - rabbitmq_username=******
      - rabbitmq_password=******
      - rabbitmq_vhost=/
    deploy:
      #endpoint_mode: dnsrr ### error: port published with ingress mode can't be used with dnsrr mode
      # mode: global
      replicas: 1      
      restart_policy:
        condition: any
      placement:
        max_replicas_per_node: 1
        constraints:
          - node.labels.os == windows
    networks:
      - net2

  svc3:
    image: private_image_4
    isolation: 'process'
    environment:
      - rabbitmq_host=rabbitmq-1
      - rabbitmq_port=5672
      - rabbitmq_username=******
      - rabbitmq_password=******
      - rabbitmq_vhost=/
      - logger_uri=tcp://logger:21212
    deploy:
      # mode: global
      replicas: 1      
      restart_policy:
        condition: any
      placement:
        max_replicas_per_node: 1
        constraints:
          - node.labels.os == windows
    networks:
      - net2

  svc4:    
    image: private_image_5
    isolation: 'process'
    environment:
      - rabbitmq_host=rabbitmq-1
      - rabbitmq_port=5672
      - rabbitmq_username=******
      - rabbitmq_password=******
      - rabbitmq_vhost=/
      - logger_uri=tcp://logger:21212
    deploy:      
      # mode: global
      replicas: 1      
      restart_policy:
        condition: any
      placement:
        max_replicas_per_node: 1
        constraints:
          - node.labels.os == windows
    networks:
      - net2

  svc5:
    image: private_image_6
    isolation: 'process'
    environment:
      - rabbitmq_host=rabbitmq-1
      - rabbitmq_port=5672
      - rabbitmq_username=******
      - rabbitmq_password=******
      - rabbitmq_vhost=/
      - logger_uri=tcp://logger:21212
    deploy:
      # mode: global
      replicas: 1      
      restart_policy:
        condition: any
      placement:
        max_replicas_per_node: 1
        constraints:
          - node.labels.os == windows
    networks:
      - net2

  svc6:    
    image: private_image_7
    isolation: 'process'
    environment:
      - rabbitmq_host=rabbitmq-1
      - rabbitmq_port=5672
      - rabbitmq_username=******
      - rabbitmq_password=******
      - rabbitmq_vhost=/
      - logger_uri=tcp://logger:21212          
    deploy:            
      replicas: 1      
      restart_policy:
        condition: any
      placement:
        max_replicas_per_node: 1
        constraints:
          - node.labels.os == windows          
    networks:
      - net2  

networks:
  net2:
    external: true

Your placement constraint does not pin the services to a node. It only pins them to nodes that run Windows. So if all your nodes run Windows, the services would be spread across all nodes.

To pin the services to a node, you can add node labels and use them as placement constraints, or use the node hostnames as placement constraints.

If you want to use a node label:

Add a node label:

docker node update --label-add mycustomlabel=stack1 myhost1

Use it as a placement constraint:

svc:
  deploy:
    placement:
      constraints:
        - node.labels.mycustomlabel == stack1

If you want to use the node hostname:

svc:
  deploy:
    placement:
      constraints:
        - node.hostname == myhost1

If all services share the same deployment constraints, you could introduce YAML anchors to deduplicate configuration (and prevent inconsistencies):

---
x-deploy:
  &default-deploy
  mode: replicated
  replicas: 1
  restart_policy:
    condition: any
  placement:
    max_replicas_per_node: 1
    constraints:
      - node.labels.os == windows
      - node.hostname == myhost1

services:

  svc:
    ...
    deploy:
      <<: *default-deploy
    ...

This way you declare the settings once underneath the anchor &default-deploy and simply merge them in underneath the deploy element of each service.
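
If a single service needs to deviate, keys set after the merge override the anchor’s values. Note that YAML merge is shallow, so to change just the hostname you have to repeat the whole placement block; a hypothetical example:

services:

  svc1:
    image: private_image_2
    deploy:
      <<: *default-deploy
      placement:
        max_replicas_per_node: 1
        constraints:
          - node.labels.os == windows
          - node.hostname == myhost2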

But I’d still like to have 4 replicas of each service, with each replica on its own node. Ideally aligned vertically, so the replica index matches the node index:

Node1 = svc1.replica1
Node1 = svc2.replica1
Node1 = svc3.replica1
...
Node2 = svc1.replica2
Node2 = svc2.replica2
Node2 = svc3.replica2
...
Node3 = svc1.replica3
Node3 = svc2.replica3
Node3 = svc3.replica3
...

But as far as I’ve read the documentation, this won’t be an easy task, so maybe I can have replicas: 4 with max_replicas_per_node: 1 scattered over all nodes, but with

environment:
  - rabbitmq_vhost={{.Node.Hostname}}

That way, even though the replica index won’t be fixed to the same node index, there should be a single replica of each service on every node, and on any given node all of the stack’s services would share a single RabbitMQ vhost.
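
Assuming Swarm’s Go-template placeholders for service environment variables (like {{.Node.Hostname}}) are also expanded when the service is created through docker stack deploy, the idea per service would look roughly like this:

version: "3.9"
services:
  svc1:
    image: private_image_2
    environment:
      - rabbitmq_host=rabbitmq-1
      - rabbitmq_vhost={{.Node.Hostname}}   # expanded per task by the engine
    deploy:
      replicas: 4
      placement:
        max_replicas_per_node: 1
        constraints:
          - node.labels.os == windows
    networks:
      - net2

networks:
  net2:
    external: true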

Do you see a problem in that?

I missed that point.

Then forget what I wrote so far, as what you plan is not how replicas work. They are used to scale out replicas of the same application (as in: same state!). You, on the other hand, want to run separate applications with different states as replicas.

I strongly recommend not using that approach. It is going to be messy, if it works at all.
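
Coming back to what was suggested earlier: a less fragile route is to deploy the same compose file once per node as its own stack (e.g. docker stack deploy -c partnerb.yml partnerb_node1, where the file name and stack name are placeholders) and pass the per-node values in as environment variables from the shell that runs the deploy; a rough sketch:

version: "3.9"
services:
  svc1:
    image: private_image_2
    environment:
      - rabbitmq_host=${RABBITMQ_HOST}     # e.g. rabbitmq-1
      - rabbitmq_vhost=${RABBITMQ_VHOST}   # e.g. node1
    deploy:
      replicas: 1
      placement:
        constraints:
          - node.labels.os == windows
          - node.hostname == ${TARGET_NODE}   # e.g. node1
    networks:
      - net2

networks:
  net2:
    external: true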

Can you provide an example of how this setup would fail? I know it’s dirty, but a container is still a container. And the containers only communicate through the RabbitMQ broker, and RabbitMQ is in another stack.