Docker Swarm networking: containers on different nodes can't communicate

Two issues:

  1. Docker Compose .env files, which set up environment variables, do not work with Docker Swarm. Currently I am sourcing the file manually and exporting the variables to make it work, but I am looking for the proper way to move from Docker Compose to Docker Swarm running on multiple nodes.
  2. Networking between containers on different nodes isn't working.
cat .envSwarm 
COMPOSE_PROJECT_NAME=docker-trino
#DISCOVERY_URI=http://${COMPOSE_PROJECT_NAME}-trino-master:8080
DISCOVERY_URI=http://trinomaster:8080
#HIVE_METASTORE_URI=thrift://${COMPOSE_PROJECT_NAME}-hive-metastore:9083
HIVE_METASTORE_URI=thrift://hivemetastore:9083
INCLUDE_COORDINATOR=false
export COMPOSE_PROJECT_NAME DISCOVERY_URI HIVE_METASTORE_URI INCLUDE_COORDINATOR
cat swarm-compose.yml 
version: "3.8"
services:
  hive-metastore:
    build: 
      dockerfile: ../hive-metastore/Dockerfile
      context: ../hive-metastore
    image: 10.208.1.5:5000/hive-metastore:3.1.3
    hostname: hivemetastore
    networks: 
      trino-network-docker:
        aliases:
          - hivemetastore
    environment:
      - DATABASE_HOST=mysql-hivemeta.aws.com
      - PROJECT_NAME=${COMPOSE_PROJECT_NAME}
    ports:
      - "9083:9083"
    restart: unless-stopped
    extra_hosts:
      - "host.docker.internal:host-gateway"

  hiveservice2:
    build: 
      dockerfile: ../hive/Dockerfile
      context: ../hive
    image: 10.208.1.5:5000/myhive:latest
    networks:
      - trino-network-docker
    environment:
      - DATABASE_HOST=mysql-hivemeta.aws.com
      - PROJECT_NAME=${COMPOSE_PROJECT_NAME}
      - IS_RESUME="TRUE"
      - HIVE_CUSTOM_CONF_DIR=/hive_custom_conf
      - DB_DRIVER=mysql
      - HIVE_METASTORE_URI=${HIVE_METASTORE_URI}
      - SERVICE_OPTS=-Dhive.metastore.uris=${HIVE_METASTORE_URI}
      - VERBOSE="true"
      - SERVICE_NAME=hiveserver2
    volumes:
      - ../hive/conf/:/hive_custom_conf
    ports:
      - "10000:10000"
      - "10002:10002"
    restart: unless-stopped
    extra_hosts:
      - "host.docker.internal:host-gateway"
    depends_on:
       - hive-metastore  
  trino-master:
    build: 
      dockerfile: ../trino/Dockerfile
      context: ../trino
    image: 10.208.1.5:5000/trinodb/trino:latest 
    hostname: trinomaster
    networks: 
      trino-network-docker:
        aliases:
          - trinomaster
    volumes:
      - ../trino/master/conf/:/etc/trino/
      - ../trino/data:/data/trino/
     #- /.aws/:/root/.aws/
     #- /.aws/:/home/trino/.aws/
    ports:
      - "8080:8080"
    restart: unless-stopped
    extra_hosts:
      - "host.docker.internal:host-gateway"
    environment:
      - PROJECT_NAME=${COMPOSE_PROJECT_NAME}
      - DISCOVERY_URI=${DISCOVERY_URI}
      - HIVE_METASTORE_URI=${HIVE_METASTORE_URI}
      - INCLUDE_COORDINATOR=${INCLUDE_COORDINATOR}
    command: sh -c 'sed -i "/coordinator=/d" /etc/trino/config.properties && 
                    echo coordinator=true >> /etc/trino/config.properties && 
                    sed -i "/discovery.uri/d" /etc/trino/config.properties && 
                    echo discovery.uri=${DISCOVERY_URI} >> /etc/trino/config.properties && 
                    sed -i "/hive.metastore.uri/d" /etc/trino/catalog/hive.properties && 
                    echo hive.metastore.uri=${HIVE_METASTORE_URI} >> /etc/trino/catalog/hive.properties && 
                    sed -i "/node-scheduler.include-coordinator/d" /etc/trino/config.properties && 
                    echo node-scheduler.include-coordinator=${INCLUDE_COORDINATOR} >> /etc/trino/config.properties && 
                    sed -i "/discovery-server.enabled/d" /etc/trino/config.properties && 
                    echo discovery-server.enabled=true >> /etc/trino/config.properties && 
                    /usr/lib/trino/bin/run-trino'
    healthcheck:
      test: /usr/lib/trino/bin/health-check 
      interval: 60s
      retries: 5
      start_period: 30s
      timeout: 30s

    depends_on:
       - hive-metastore  
  trino-worker:
    build: 
      dockerfile: ../trino/Dockerfile
      context: ../trino
    image: 10.208.1.5:5000/trinodb/trino:latest 
    hostname: "Trino-Worker-{{.Node.Hostname}}"
    networks:
      - trino-network-docker
    volumes:
      - ../trino/worker/conf/:/etc/trino/
      - ../trino/master/conf/catalog/:/etc/trino/catalog
      - ../trino/data:/data/trino/
     #- /.aws/:/root/.aws/
     #- /.aws/:/home/trino/.aws/
    ports:
      - "8081-8099:8080"
    restart: unless-stopped
    extra_hosts:
      - "host.docker.internal:host-gateway"
    environment:
      - PROJECT_NAME=${COMPOSE_PROJECT_NAME}
      - DISCOVERY_URI=${DISCOVERY_URI}
      - HIVE_METASTORE_URI=${HIVE_METASTORE_URI}
      - INCLUDE_COORDINATOR=${INCLUDE_COORDINATOR}
    command: sh -c 'sed -i "/coordinator=/d" /etc/trino/config.properties && 
                    echo coordinator=false >> /etc/trino/config.properties && 
                    sed -i "/discovery.uri/d" /etc/trino/config.properties && 
                    echo discovery.uri=${DISCOVERY_URI} >> /etc/trino/config.properties && 
                    sed -i "/hive.metastore.uri/d" /etc/trino/catalog/hive.properties && 
                    echo hive.metastore.uri=${HIVE_METASTORE_URI} >> /etc/trino/catalog/hive.properties && 
                    sed -i "/node-scheduler.include-coordinator/d" /etc/trino/config.properties && 
                    echo node-scheduler.include-coordinator=${INCLUDE_COORDINATOR} >> /etc/trino/config.properties && 
                    /usr/lib/trino/bin/run-trino'
    healthcheck:
      test: /usr/lib/trino/bin/health-check 
      interval: 60s
      retries: 5
      start_period: 30s
      timeout: 30s

    depends_on:
       - hive-metastore  
       - trino-master
networks:
  trino-network-docker:
    name: trino-network-docker
    driver: overlay
    attachable: true

The Docker Swarm consists of 3 nodes.

First I source .envSwarm:

. .envSwarm

Then I run docker stack deploy:

docker stack deploy --compose-file swarm-compose.yml $COMPOSE_PROJECT_NAME

The issue is that the containers are unable to communicate with each other, either by hostname (hivemetastore and trinomaster) or by IP.

Inbound Traffic for Swarm Management

  • TCP port 2377 for cluster management & raft sync communications
  • TCP and UDP port 7946 for “control plane” gossip discovery communication between all nodes
  • UDP port 4789 for “data plane” VXLAN overlay network traffic
  • IP Protocol 50 (ESP) if you plan on using overlay network with the encryption option

It looks like UDP port 7946 was missing from my EC2 security group, which was causing the issue.
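For anyone hitting the same problem, here is a minimal sketch of the ingress rules from the list above, written as AWS CLI calls. It assumes all Swarm nodes share one security group and that the AWS CLI is installed and configured. The helper name swarm_sg_rules and the group ID are placeholders, not something from the original setup. The function only prints the commands so they can be reviewed first. The ESP rule is left out because it is only needed for encrypted overlay networks.

```shell
#!/bin/sh
# swarm_sg_rules SG_ID: print the AWS CLI calls that would allow Swarm
# traffic between instances that share the security group SG_ID.
# Hypothetical helper: nothing is executed, the commands are only printed.
swarm_sg_rules() {
    sg="$1"
    # protocol:port pairs taken from the port list above
    for rule in tcp:2377 tcp:7946 udp:7946 udp:4789; do
        proto=${rule%%:*}   # part before the colon
        port=${rule##*:}    # part after the colon
        echo "aws ec2 authorize-security-group-ingress --group-id $sg --protocol $proto --port $port --source-group $sg"
    done
}
```

Calling swarm_sg_rules sg-0123456789abcdef0 (a made-up ID) prints four commands. Once they look right, the output can be piped to sh to apply them.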

For Docker Swarm you usually use docker stack deploy.

You should set mode: global to run one container of a service on every node, or replicas: x to run a fixed number overall. Both can be further restricted with placement constraints.

Note that Docker Swarm does not build images from a Dockerfile referenced in the compose file, so you do need to pull the images from a registry.

Note when using docker stack deploy

The build option is ignored when deploying a stack in swarm mode. The docker stack command does not build images before deploying.

Source: docs
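Since the stack deploy step skips build, one possible sequence is to build and push with Docker Compose first and deploy afterwards. This is a rough sketch, assuming the Docker Compose v2 plugin is available, the image names in the compose file already point at a registry every node can reach, and the variables used in the file are exported in the current shell. The function name build_push_deploy is made up for illustration.

```shell
#!/bin/sh
# build_push_deploy COMPOSE_FILE STACK_NAME
# Hypothetical helper: build the images, push them to the registry named
# in each service's image field, then deploy the stack from the same file.
# The chain stops at the first step that fails.
build_push_deploy() {
    compose_file="$1"
    stack="$2"
    docker compose -f "$compose_file" build &&
    docker compose -f "$compose_file" push &&
    docker stack deploy --compose-file "$compose_file" "$stack"
}

# Usage on a manager node (left commented out, it needs a running swarm):
#   build_push_deploy swarm-compose.yml "$COMPOSE_PROJECT_NAME"
```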

Swarm also does not support depends_on, as a Swarm is more loosely coupled. It seems you want to make a local compose file work on Swarm, but some concepts are a little bit different. You should have a deploy section.

It’s also not best practice to publish ports on all services/containers, but rather use an internal Docker overlay network for the services to interact. That way you can run multiple instances on the same node without port conflicts. Inside the Docker network, services can reach each other's ports without publishing them.

AFAIK the env file should be .env or you need to specify a different filename on the command line.

I recommend reading or watching some Swarm tutorials to get more used to the concepts.

Thanks @bluepuma77
I did have a .env file as well, but somehow it was not read by Docker Swarm (docker stack deploy …), so that’s the reason I created another one and added the export command at the end.
Are you sure the .env file is read by docker stack deploy?

Seems docker stack deploy does not support using a .env file.

One discussion, some workarounds.
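One workaround in the spirit of what the original poster already does is to let the shell export everything while sourcing the file. The explicit export line at the end of the env file is then not needed. This is a minimal sketch, assuming the file contains only comments and plain KEY=value lines that are valid shell syntax. The helper name load_env is hypothetical and not part of Docker.

```shell
#!/bin/sh
# load_env FILE: export every variable assigned in FILE.
# Hypothetical helper. FILE must be valid shell (KEY=value lines, # comments).
# Pass a path containing a slash (e.g. ./.env) so the shell does not
# search PATH for the file.
load_env() {
    set -a      # auto-export every variable assigned from here on
    . "$1"      # source the env file in the current shell
    set +a      # switch auto-export off again
}

# Usage on a manager node (left commented out, it needs a running swarm):
#   load_env ./.env
#   docker stack deploy --compose-file swarm-compose.yml "$COMPOSE_PROJECT_NAME"
```

With the variables exported, docker stack deploy can substitute ${DISCOVERY_URI} and the other references in the compose file from the shell environment.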