I’ve run into an issue that seems similar to this one: Can't access service in swarm. My setup is a little different, though, and I haven’t found a solution to my problem yet.
The minimal, reproducible example
- Build a swarm cluster with at least 3 Ubuntu 20.04 Docker swarm managers.
- Deploy a service:
docker service create --name test_web --replicas 3 --publish published=8080,target=80 nginxdemos/hello
- Check that the containers and services were created properly, and observe that connecting to the service fails:
demi-ubu01:~/stacks$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
d4a12a3c5448 nginxdemos/hello:latest "nginx -g 'daemon of…" About a minute ago Up About a minute 80/tcp test_web.2.yul33wdycarig3qoxnehgrjrz
demi-ubu01:~/stacks$ docker service ls
ID NAME MODE REPLICAS IMAGE PORTS
0yqd7gvggwuh test_web replicated 3/3 nginxdemos/hello:latest *:8080->80/tcp
# External test:
demi-ubu01:~/stacks$ curl -I 10.100.4.5:8080
curl: (7) Failed to connect to 10.100.4.5 port 8080: Connection refused
# Inside container to published service port:
demi-ubu01:~/stacks$ docker exec -it d4a12a3c5448 wget http://test_web:8080
Connecting to test_web:8080 (10.0.4.2:8080)
wget: can't connect to remote host (10.0.4.2): Host is unreachable
# Inside container to app's exposed port:
demi-ubu01:~/stacks$ docker exec -it d4a12a3c5448 wget http://localhost:80
Connecting to localhost:80 (127.0.0.1:80)
index.html 100% |****************************| 7217 0:00:00 ETA
The first curl command is expected to return HTTP status 200 OK.
The detailed report
My setup is 4 nodes in total. They are identical Ubuntu 20.04 KVM virtual machines, all on the same network. There are no firewalls between them. I have 3 managers and 1 worker (which I’ve only added as a troubleshooting step).
:~/stacks$ docker node ls
ID HOSTNAME STATUS AVAILABILITY MANAGER STATUS ENGINE VERSION
kcm5v64psntjxngnqkfdj1jzh * demi-ubu01 Ready Active Reachable 20.10.1
uo3rljg6ax5qkjm898pyym9t1 demi-ubu02 Ready Active Leader 20.10.1
pysnl8sohdp4fv67gui156z4k demi-ubu03 Ready Active Reachable 20.10.1
rp2otsqpnxkgbmxbpkv21yjs6 demi-ubu04 Ready Active 20.10.1
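The "no firewalls" statement above can be double-checked from each node. This is a minimal sketch, not part of my original troubleshooting: it assumes bash (for /dev/tcp) and the coreutils `timeout` command are available, and it uses the swarm ports listed in the Docker documentation (2377/tcp, 7946/tcp and udp, 4789/udp). The function names are hypothetical, and only the TCP ports are probed; the UDP ports would need a different tool.

```shell
#!/usr/bin/env bash
# Ports that swarm mode needs open between nodes (per the Docker docs).
swarm_ports() {
  printf '%s\n' 2377/tcp 7946/tcp 7946/udp 4789/udp
}

# Probe one TCP port; returns 0 if a connection could be opened.
# Assumption: bash and timeout are installed on the node.
tcp_open() {
  local host=$1 port=$2
  timeout 2 bash -c "exec 3<>/dev/tcp/$host/$port" 2>/dev/null
}

# Report the reachability of every swarm TCP port on the given host.
# UDP entries are skipped because /dev/tcp cannot test them.
check_swarm_tcp() {
  local host=$1 entry port
  for entry in $(swarm_ports); do
    [ "${entry#*/}" = tcp ] || continue
    port=${entry%/*}
    if tcp_open "$host" "$port"; then
      echo "$host $port/tcp open"
    else
      echo "$host $port/tcp closed"
    fi
  done
}
```

For example, `check_swarm_tcp 10.100.4.9` run on demi-ubu01 would probe another node. Port 2377 only listens on managers, so a "closed" result for it on the worker is expected.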
I can run a container normally and reach it on the local host fine.
demi-ubu01:~/stacks$ docker run -p 8080:80 -d nginxdemos/hello
de4d0a937710acb1d6d8ae3b7eb9175860b6614dfd9ce92bc972efe619ae095f
demi-ubu01:~/stacks$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
de4d0a937710 nginxdemos/hello "nginx -g 'daemon of…" 4 seconds ago Up 2 seconds 0.0.0.0:8080->80/tcp pedantic_wiles
demi-ubu01:~/stacks$ curl -I 10.100.4.5:8080
HTTP/1.1 200 OK
Server: nginx/1.13.8
Date: Sat, 19 Dec 2020 17:59:23 GMT
Content-Type: text/html
Connection: keep-alive
Expires: Sat, 19 Dec 2020 17:59:22 GMT
Cache-Control: no-cache
However, the same app does not work when deployed as a service using the following compose file:
demi-ubu01:~/stacks$ cat test.yml
version: "3.6"
services:
  web:
    image: nginxdemos/hello:latest
    deploy:
      replicas: 3
      resources:
        limits:
          cpus: "0.1"
          memory: 50M
      restart_policy:
        condition: on-failure
    ports:
      - target: 80
        published: 8080
        protocol: tcp
        mode: ingress
    networks:
      - webnet
networks:
  webnet:
    driver: overlay
The service is not reachable from any of the hosts:
demi-ubu01:~/stacks$ docker stack deploy -c test.yml test
Creating network test_webnet
Creating service test_web
demi-ubu01:~/stacks$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
05030ef897a1 nginxdemos/hello:latest "nginx -g 'daemon of…" 10 seconds ago Up 7 seconds 80/tcp test_web.1.kobrpkp68f2qbs4jhd6o8aebg
# Trying on all of the hosts in the cluster. No firewalls here.
demi-ubu01:~/stacks$ curl -I 10.100.4.5:8080
curl: (7) Failed to connect to 10.100.4.5 port 8080: Connection refused
demi-ubu01:~/stacks$ curl -I 10.100.4.9:8080
curl: (7) Failed to connect to 10.100.4.9 port 8080: Connection refused
demi-ubu01:~/stacks$ curl -I 10.100.4.10:8080
curl: (7) Failed to connect to 10.100.4.10 port 8080: Connection refused
demi-ubu01:~/stacks$ curl -I 10.100.4.11:8080
curl: (7) Failed to connect to 10.100.4.11 port 8080: Connection refused
demi-ubu01:~/stacks$ docker service ls
ID NAME MODE REPLICAS IMAGE PORTS
elvfm7o4v4zo test_web replicated 3/3 nginxdemos/hello:latest *:8080->80/tcp
I also don’t see any port bindings being made on those hosts at all, so it doesn’t look like any ports are being published.
INeed2Poo@demi-ubu01:~/stacks$ docker service inspect test_web
[
## https://pastebin.com/WqqyDnVS ##
]
demi-ubu01:~/stacks$ netstat -na | grep LISTEN
tcp 0 0 0.0.0.0:22 0.0.0.0:* LISTEN
tcp 0 0 0.0.0.0:49152 0.0.0.0:* LISTEN
tcp 0 0 0.0.0.0:24007 0.0.0.0:* LISTEN
tcp 0 0 0.0.0.0:111 0.0.0.0:* LISTEN
tcp 0 0 127.0.0.53:53 0.0.0.0:* LISTEN
demi-ubu01:~/stacks$ docker network ls
NETWORK ID NAME DRIVER SCOPE
6e5f7e7cebc3 bridge bridge local
7a1155f87a62 docker_gwbridge bridge local
ab32da8ac1ec host host local
46id8wzw4ayf ingress overlay swarm
a24a40ef78f4 none null local
d9l7msysdx8m test_webnet overlay swarm
INeed2Poo@demi-ubu01:~/stacks$ docker network inspect 46id8wzw4ayf
[
https://pastebin.com/JPA0ZBjE
]
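The netstat check above can be narrowed to the published port alone. A small sketch, assuming the `netstat -na` column layout shown above (local address in the fourth column, state in the last); `port_listening` is a hypothetical helper name. To my understanding, with ingress publishing a node would normally show a listener on the published port, so a non-zero exit here matches the symptom.

```shell
# Reads `netstat -na` output on stdin and succeeds (exit 0) if any
# LISTEN socket is bound to the given port, on any address.
port_listening() {
  awk -v port="$1" '
    $NF == "LISTEN" {
      # The local address is column 4; the port follows the last colon,
      # which also covers IPv6 forms such as ":::8080".
      n = split($4, parts, ":")
      if (parts[n] == port) found = 1
    }
    END { exit (found ? 0 : 1) }
  '
}
```

Usage on a node: `netstat -na | port_listening 8080 && echo bound || echo "not bound"`.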
I also can’t reach the service while exec’ed into one of its containers. From inside the container I’m able to hit the local app port, but I cannot reach the service by name, even though the container can resolve the service name.
## Testing the app's service from the local container fails:
demi-ubu01:~/stacks$ docker exec -it 05030ef897a1 wget http://test_web:8080
Connecting to test_web:8080 (10.0.4.2:8080)
wget: can't connect to remote host (10.0.4.2): Host is unreachable
## Testing the app's local port from the local container is successful:
demi-ubu01:~/stacks$ docker exec -it 05030ef897a1 wget http://localhost:80
Connecting to localhost:80 (127.0.0.1:80)
index.html 100% |****************************| 7217 0:00:00 ETA
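One caveat on the in-container test above: as far as I understand swarm networking, the published port (8080) only applies to the routing mesh on the nodes, while the service name on the overlay network answers on the target port (80). The sketch below splits the PORTS entry from `docker service ls` into those two halves; the helper names are hypothetical, and this does not explain the failing external curl.

```shell
# Split a PORTS entry such as "*:8080->80/tcp" into its two halves.
published_port() {
  printf '%s\n' "$1" | sed -E 's/^.*:([0-9]+)->.*$/\1/'
}

target_port() {
  printf '%s\n' "$1" | sed -E 's/^.*->([0-9]+)\/.*$/\1/'
}

ports='*:8080->80/tcp'   # value taken from the `docker service ls` output
echo "from outside the swarm:          http://<node-ip>:$(published_port "$ports")"
echo "from a container on the overlay: http://test_web:$(target_port "$ports")"
```

Under that assumption, the in-container check to try would be `docker exec -it 05030ef897a1 wget http://test_web:80` rather than port 8080.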
demi-ubu01:~/stacks$ docker --version
Docker version 20.10.1, build 831ebea
I’ve made sure that I’m not using any overlapping networks that might be causing this, and I have gone so far as to completely redeploy the cluster. I’ve just about exhausted all of my troubleshooting ideas. Any ideas?