Hello,
I’m using Docker Swarm (Docker 17.05.0) on Linux.
I have a docker-compose file in which I defined several services. Each service has an alias on a common network, orangephones_private. They are defined as follows:
services:
  peer0suppliers:
    ...
    networks:
      orangephones_private:
        aliases:
          - peer0.suppliers.orangephones.net
  app:
    ...
    networks:
      orangephones_private:
The various applications in the containers use these aliases to communicate with each other.
I deploy the stack successfully. When I inspect the network I get something like the example below:
"Name": "orangephones_peer0suppliers.1.h4z593xmzuagnxk4aygntnvcr",
"IPv4Address": "10.0.4.23/24",
However, if I execute
host peer0.suppliers.orangephones.net
in any container of my stack I’m getting the address 10.0.4.22 (and not 10.0.4.23 as listed by “network inspect”). 10.0.4.22 does not appear in the output of “network inspect”.
A client application in a container of the stack (the app service) connects to a server application running on peer0.suppliers.orangephones.net. If I run
netstat -tan64
in the client container (app) then I see an established connection between the client and 10.0.4.22:
tcp 0 0 10.0.4.30:42787 10.0.4.22:7053 ESTABLISHED
If I run the same command in the server container (peer0.suppliers), then I see an established connection between the client and 10.0.4.23:
tcp6 0 0 10.0.4.23:7053 10.0.4.30:42787 ESTABLISHED
The client endpoint (10.0.4.30:42787) is the same in both listings so it is the same connection. Note that the client reports a TCP connection while the server sees a TCP6 connection.
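To make that comparison explicit, here is a minimal Python sketch that parses the two netstat rows quoted above and checks which fields agree. The helper name parse_netstat_line is purely illustrative and not part of any tool; the sketch assumes the six-column layout that netstat -tan printed here.

```python
def parse_netstat_line(line):
    """Split one `netstat -tan` row into named fields (assumes six columns)."""
    proto, _recv_q, _send_q, local, remote, state = line.split()
    return {"proto": proto, "local": local, "remote": remote, "state": state}


# Rows copied from the client (app) and server (peer0.suppliers) containers.
client_row = parse_netstat_line("tcp 0 0 10.0.4.30:42787 10.0.4.22:7053 ESTABLISHED")
server_row = parse_netstat_line("tcp6 0 0 10.0.4.23:7053 10.0.4.30:42787 ESTABLISHED")

# Same connection if the client's local endpoint is the server's remote endpoint.
same_client_endpoint = client_row["local"] == server_row["remote"]

# Address the client believes it is talking to vs. the address the server listens on.
client_view_of_server = client_row["remote"]
server_view_of_itself = server_row["local"]

print("same client endpoint:", same_client_endpoint)
print("client sees server as:", client_view_of_server)
print("server sees itself as:", server_view_of_itself)
```

The client endpoint matches on both sides, yet the server address differs (10.0.4.22 on the client side, 10.0.4.23 on the server side).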
This puzzles me but let us say that I’m not bothered as long as it works.
However, after some days of idleness on the machines, I noticed that the client still sees the connection to 10.0.4.22 as established, while the server no longer sees any connection from the client. That is, netstat run in the client container (app) still gives the output
tcp 0 0 10.0.4.30:42787 10.0.4.22:7053 ESTABLISHED
but when run in the server container (peer0.suppliers) it shows nothing. So the client thinks the connection is OK and does not re-establish it. The server sees no connection and thus does not send any application notifications to the client (from its point of view there is no client).
I have sniffed the network communication both in the client (app) container and on the server container (peer0.suppliers). I’ve noticed two things:
First, what I found strange was that in both pcap files the IP address of the peer0.suppliers endpoint was 10.0.4.23. So I don’t understand why netstat reports 10.0.4.22 in the app container.
Second, after some hours of idleness, the server program (running in the peer0.suppliers container) sends a message to the client. I can see it in both captures. In response, the server application gets a TCP RST packet from 10.0.4.30. I see this packet in both captures as well. This would mean that the connection is no longer open. But why? No FIN packet was sent during all these hours.
What happened? How can it be fixed? Why does the client still see the connection as open when the server lost it? Why does the container have two IP addresses, 10.0.4.22 and 10.0.4.23?
I’ve attached the docker-compose file and the output of network inspect.
Thank you in advance,
Sorn