Hi all,
I’m setting up a new Docker Swarm mode cluster on three Linux machines, all brand new installations. All three machines have their firewalls and security modules (AppArmor) disabled, and I have confirmed that they can communicate over ports 7946 (TCP and UDP) and 4789 (UDP).
For example, one of the stacks I’m bringing up:
...
  wikijs_db:
    image: postgres:11-alpine
    deploy:
      replicas: 1
    environment:
      POSTGRES_DB: ***
      POSTGRES_PASSWORD: ***
      POSTGRES_USER: ***
      HA_ACTIVE: 1
    restart: unless-stopped
    volumes:
      - db_data:/var/lib/postgresql/data
      - db_assets:/root/assets
  wikijs_wiki:
    image: requarks/wiki:2.5
    deploy:
      replicas: 1
    depends_on:
      - wikijs_db
    environment:
      DB_TYPE: postgres
      DB_HOST: wikijs_db
      DB_PORT: 5432
      DB_USER: ***
      DB_PASS: ***
      DB_NAME: ****
      HA_ACTIVE: 1
    volumes:
      - "/etc/localtime:/etc/localtime:ro"
      - data:/backup/data
      - assets:/root/assets
...
Once the stack is deployed, the containers are spread across the nodes. They can reach each other with ping and resolve each other by hostname:
/wiki $ ping wikijs_db
PING wikijs_db (10.0.1.4): 56 data bytes
64 bytes from 10.0.1.4: seq=0 ttl=42 time=0.113 ms
64 bytes from 10.0.1.4: seq=1 ttl=42 time=0.134 ms
However, the containers cannot communicate over TCP:
/wiki $ nc -vz wikijs_db 5432
nc: wikijs_db (10.0.1.4:5432): Operation timed out
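For anyone who wants to repeat this check where nc is not available, here is a minimal sketch of the same TCP test in Python. It assumes Python 3 is installed wherever it is run, and `tcp_check` is just a name I made up for illustration; it only tells you whether a TCP connection could be opened, nothing more.

```python
import socket


def tcp_check(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        # create_connection resolves the name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections, and name resolution failures.
        return False
```

Calling `tcp_check("wikijs_db", 5432)` from inside the wiki container would correspond to the nc command above.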
This only happens when the two containers are running on different nodes in the swarm.
I’ve tried recreating the swarm, reinstalling the operating systems, running different containers as well as assigning the default network a predefined subnet, all with no success. I can’t see any errors in the Docker daemon’s log on any of the nodes, either.
I’d greatly appreciate help in solving this issue.
Thank you!