Hi,
I am seeing strange behavior with my swarm services.
I’m using a swarm of 2 nodes, deploying some services on one node (n1) and the rest on the other node (n2). I chose to do so because I know that some services will require more resources than others, so pinning them to specific nodes lets me balance the load. Note that the nodes are virtual machines.
I deploy a stack using the following compose file:
version: '3.7'
networks:
  test_nodes_intercomm_net:
    driver: overlay
services:
  serviceN1_ng:
    image: nginx
    ports:
      - 8889:80
    deploy:
      placement:
        constraints:
          - node.labels.cloud == true
    networks:
      - test_nodes_intercomm_net
    logging:
      driver: json-file
    labels:
      - "stack=nodes_intercomm"
    restart: always
  serviceN2_ng:
    image: nginx
    expose:
      - "80"
    deploy:
      placement:
        constraints:
          - node.labels.stream == true
    networks:
      - test_nodes_intercomm_net
    logging:
      driver: json-file
    labels:
      - "stack=nodes_intercomm"
    restart: always
The file above is a reduced version of the real file, which I cannot share, but the issue is the same.
I can confirm that the services are created on the intended node with the command: docker service ps <serviceID>
The problem is that services on different nodes cannot seem to exchange requests and data.
I can ping between them:
# From n2 to n1:
root@aa53baafe938:/# ping serviceN1_ng
PING serviceN1_ng (10.0.4.2): 56 data bytes
64 bytes from 10.0.4.2: icmp_seq=0 ttl=64 time=0.724 ms
64 bytes from 10.0.4.2: icmp_seq=1 ttl=64 time=0.175 ms
64 bytes from 10.0.4.2: icmp_seq=2 ttl=64 time=0.187 ms
# From n1 to n2
root@844afa392516:/# ping serviceN2_ng
PING serviceN2_ng (10.0.4.5): 56 data bytes
64 bytes from 10.0.4.5: icmp_seq=0 ttl=64 time=1.078 ms
64 bytes from 10.0.4.5: icmp_seq=1 ttl=64 time=0.187 ms
64 bytes from 10.0.4.5: icmp_seq=2 ttl=64 time=0.138 ms
64 bytes from 10.0.4.5: icmp_seq=3 ttl=64 time=0.157 ms
And nslookup works between them:
# From n2 to n1
root@aa53baafe938:/# nslookup serviceN1_ng
Server: 127.0.0.11
Address: 127.0.0.11#53
Non-authoritative answer:
Name: serviceN1_ng
Address: 10.0.4.2
# From n1 to n2
root@844afa392516:/# nslookup serviceN2_ng
Server: 127.0.0.11
Address: 127.0.0.11#53
Non-authoritative answer:
Name: serviceN2_ng
Address: 10.0.4.5
But when I curl, the request times out:
# From n2 to n1
curl http://serviceN1_ng:80/
However, when I curl from the container using the real IP of the VM, with the proper port, I get a response.
I don’t understand why.
All the suggested ports are open, and the containers can evidently reach each other (ping and nslookup both work).
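To make "open" concrete, here is a minimal sketch of a check that can be run from one node against the other. The port list is the one the Docker documentation gives for swarm mode, not something taken from my setup, and the helper names `swarm_ports` and `check_tcp` are made up for illustration. It only covers TCP; the UDP ports (7946/udp and 4789/udp, the latter carrying the overlay traffic) cannot be verified with a simple connect and would need something like a packet capture.

```shell
#!/usr/bin/env bash
# Ports the Docker documentation lists for swarm mode.
# 2377/tcp is only expected to listen on manager nodes.
swarm_ports() {
  printf '%s\n' 2377/tcp 7946/tcp 7946/udp 4789/udp
}

# check_tcp HOST PORT [TIMEOUT_SECONDS]   (hypothetical helper)
# Returns 0 if a TCP connection can be opened, non-zero otherwise.
# Relies on bash's /dev/tcp, so it says nothing about the UDP entries.
check_tcp() {
  local host=$1 port=$2 secs=${3:-3}
  timeout "$secs" bash -c "exec 3<>/dev/tcp/$host/$port" 2>/dev/null
}
```

Usage would look like `check_tcp <other-node-ip> 7946 && echo open`, where `<other-node-ip>` is a placeholder for the VM address.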
However, a curl request using either the service name or the VIP does not work.
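One way to narrow this down, as a sketch only: swarm's embedded DNS also answers for `tasks.<service>`, which returns the task IPs directly instead of the VIP, so comparing the two shows whether the VIP load balancing or the overlay data path itself is failing. The function names below are hypothetical, and I am assuming that the `tasks.` prefix also resolves for the short service name used inside the stack.

```shell
#!/usr/bin/env bash
# candidate_urls SERVICE PORT   (hypothetical helper)
# Prints the URL that goes through the service VIP, then the one that
# uses tasks.<service>, which resolves to the task IPs directly.
candidate_urls() {
  local service=$1 port=$2
  printf 'http://%s:%s/\n' "$service" "$port"
  printf 'http://tasks.%s:%s/\n' "$service" "$port"
}

# probe_all SERVICE PORT   (hypothetical helper)
# Curls each candidate URL with a 5-second limit and reports the result.
probe_all() {
  local url
  for url in $(candidate_urls "$1" "$2"); do
    if curl --silent --show-error --max-time 5 --output /dev/null "$url"; then
      echo "OK      $url"
    else
      echo "FAILED  $url"
    fi
  done
}
```

Run inside the n2 container as `probe_all serviceN1_ng 80`. If both URLs time out while ping still works, that would point at TCP traffic on the overlay between the two VMs rather than at name resolution.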
In my real setup, which uses nginx with upstreams and reverse proxies, I have the same issue: requests from my reverse proxy never arrive and time out.
One difference is that my own images are built and manually loaded on the second node. But even with images pulled from a public registry, I can’t get the services to interconnect.
Any idea?