I’m setting up a new Docker Swarm mode cluster on three Linux machines, all brand-new installations. All three machines have their firewalls and security modules (AppArmor) disabled, and I have confirmed that they can reach each other over the ports Swarm needs, including 7946 (TCP and UDP).
For example, one of the stacks I’m bringing up:
    ...
      wikijs_db:
        image: postgres:11-alpine
        deploy:
          replicas: 1
        environment:
          POSTGRES_DB: ***
          POSTGRES_PASSWORD: ***
          POSTGRES_USER: ***
          HA_ACTIVE: 1
        restart: unless-stopped
        volumes:
          - db_data:/var/lib/postgresql/data
          - db_assets:/root/assets
      wikijs_wiki:
        image: requarks/wiki:2.5
        deploy:
          replicas: 1
        depends_on:
          - wikijs_db
        environment:
          DB_TYPE: postgres
          DB_HOST: wikijs_db
          DB_PORT: 5432
          DB_USER: ***
          DB_PASS: ***
          DB_NAME: ****
          HA_ACTIVE: 1
        volumes:
          - "/etc/localtime:/etc/localtime:ro"
          - data:/backup/data
          - assets:/root/assets
    ...
Once the cluster is created, the containers are spread across the nodes. They can reach each other with ping and resolve each other's hostnames:
    /wiki $ ping wikijs_db
    PING wikijs_db (10.0.1.4): 56 data bytes
    64 bytes from 10.0.1.4: seq=0 ttl=42 time=0.113 ms
    64 bytes from 10.0.1.4: seq=1 ttl=42 time=0.134 ms
However, the containers cannot communicate over TCP:
    /wiki $ nc -vz wikijs_db 5432
    nc: wikijs_db (10.0.1.4:5432): Operation timed out
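For anyone who wants to reproduce this check where nc isn't available, here is a minimal sketch of the same TCP probe in Python. The helper name `probe_tcp` and the 3-second default timeout are my own assumptions, not part of the stack. It separates a timeout (packets silently dropped somewhere on the path) from a refused connection (the host is reachable but nothing is listening), which seems like a useful distinction here.

```python
import socket


def probe_tcp(host: str, port: int, timeout: float = 3.0) -> str:
    """Try one TCP connection and classify the outcome.

    Returns "open", "refused", "timeout", or "error: <details>".
    """
    try:
        # create_connection resolves the name and completes the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except (socket.timeout, TimeoutError):
        # No reply within the timeout: packets are probably being dropped.
        return "timeout"
    except ConnectionRefusedError:
        # The peer answered with a reset: reachable, but nothing is listening.
        return "refused"
    except OSError as exc:
        # Anything else, e.g. a name resolution failure or an unreachable network.
        return f"error: {exc}"
```

Calling `probe_tcp("wikijs_db", 5432)` from a container on another node should match what nc reports above, so I would expect "timeout" rather than "refused" in my case.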
This only happens between containers running on different nodes in the swarm.
I’ve tried recreating the swarm, reinstalling the operating systems, running different containers as well as assigning the default network a predefined subnet, all with no success. I can’t see any errors in the Docker daemon’s log on any of the nodes, either.
I’d greatly appreciate help in solving this issue.