Help debugging an issue with docker-proxy

Hi folks,

I’m observing a weird issue on my Docker installation that started recently, but I’m not yet confident enough to submit a bug report until I have more information about what’s going wrong. However, I’m a bit stumped as to how to get that information: all the logs on my host and in the container report no issues whatsoever.

What I’m seeing is that a single instance of docker-proxy silently exits, without warning, after its port is used. None of the other ports have this problem, even after they are used (to a far greater extent than the trouble port). The port having issues is always the same one: the port of a Redis server listening on 6379. I can verify that the port is still open in the container, and I have no problem connecting to the service from inside the container or from the host via the container’s private IP address; the only issue is trying to go through the proxy.

How can I go about figuring out what caused docker-proxy to exit? Are there events in the container that would cause the proxy to close? My understanding is that the proxy is backed by a simple iptables NAT rule, but are there events other than the container shutting down that could cause this?
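In case it’s useful context, these are the kinds of checks I can run to compare the NAT rules against the surviving proxies (the commands assume the default bridge network and need root; adjust for your setup):

```shell
# Dump the DNAT rules Docker installed; there should be one
# per published port in the DOCKER chain of the nat table.
iptables -t nat -S DOCKER

# See which docker-proxy processes still hold their listening sockets.
ss -ltnp | grep docker-proxy
```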

To demonstrate, when the container is started I can see all the proxy processes:

   CGroup: /system.slice/docker.service
           ├─4629 /usr/bin/docker daemon --exec-opt native.cgroupdriver=systemd --selinux-enabled=false --log-driver=journald
           ├─4794 docker-proxy -proto tcp -host-ip 0.0.0.0 -host-port 26379 -container-ip 172.17.0.2 -container-port 26379
           ├─4804 docker-proxy -proto tcp -host-ip 0.0.0.0 -host-port 16380 -container-ip 172.17.0.2 -container-port 16380
           ├─4814 docker-proxy -proto tcp -host-ip 0.0.0.0 -host-port 16379 -container-ip 172.17.0.2 -container-port 16379
           ├─4824 docker-proxy -proto tcp -host-ip 0.0.0.0 -host-port 9042 -container-ip 172.17.0.2 -container-port 9042
           ├─4834 docker-proxy -proto tcp -host-ip 0.0.0.0 -host-port 6380 -container-ip 172.17.0.2 -container-port 6380
           └─4844 docker-proxy -proto tcp -host-ip 0.0.0.0 -host-port 6379 -container-ip 172.17.0.2 -container-port 6379

I can connect, as expected, to the trouble port:

[bcheng@zanzibar-iv ~]$ redis-cli -p 6379
127.0.0.1:6379> info
# Server
redis_version:2.8.17
redis_git_sha1:00000000
redis_git_dirty:0
[snipped]

I run a few jobs against this container, making extensive use of all the forwarded ports. Partway through these jobs, port 6379 starts refusing connections:

[bcheng@zanzibar-iv test]$ redis-cli
Could not connect to Redis at 127.0.0.1:6379: Connection refused

And the docker proxy for that port, but only that port, has exited:

   CGroup: /system.slice/docker.service
           ├─4629 /usr/bin/docker daemon --exec-opt native.cgroupdriver=systemd --selinux-enabled=false --log-driver=journald
           ├─4794 docker-proxy -proto tcp -host-ip 0.0.0.0 -host-port 26379 -container-ip 172.17.0.2 -container-port 26379
           ├─4804 docker-proxy -proto tcp -host-ip 0.0.0.0 -host-port 16380 -container-ip 172.17.0.2 -container-port 16380
           ├─4814 docker-proxy -proto tcp -host-ip 0.0.0.0 -host-port 16379 -container-ip 172.17.0.2 -container-port 16379
           ├─4824 docker-proxy -proto tcp -host-ip 0.0.0.0 -host-port 9042 -container-ip 172.17.0.2 -container-port 9042
           └─4834 docker-proxy -proto tcp -host-ip 0.0.0.0 -host-port 6380 -container-ip 172.17.0.2 -container-port 6380

However, the service is still listening fine and the container still has connectivity, as I can target the private IP directly:

[bcheng@zanzibar-iv test]$ redis-cli -h 172.17.0.2
172.17.0.2:6379> info
# Server
redis_version:2.8.17
redis_git_sha1:00000000
[snipped]
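For what it’s worth, the manual comparison above can be scripted. This is a rough sketch (the addresses 127.0.0.1:6379 and 172.17.0.2:6379 are from my setup; adjust for yours) that distinguishes “proxy gone” from “service gone”:

```python
import socket

def tcp_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

if __name__ == "__main__":
    # Addresses below are specific to my setup.
    via_proxy = tcp_open("127.0.0.1", 6379)   # through docker-proxy
    direct = tcp_open("172.17.0.2", 6379)     # container's private IP
    if direct and not via_proxy:
        print("service is up, but the proxy path is dead")
    elif not direct:
        print("the service itself is unreachable")
    else:
        print("both paths work")
```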

So far I’ve checked the nofile ulimit on both the container and the host, but it’s large enough that it shouldn’t be an issue. None of the logs in the container or on the host show anything abnormal whatsoever.
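(For the ulimit check, rather than trusting the shell’s own limits, I read them from /proc for the specific PID. A quick Linux-only sketch, where the PID would be the docker-proxy process from the cgroup listing:)

```python
def nofile_limit(pid="self"):
    """Parse the 'Max open files' soft/hard limits from /proc/<pid>/limits (Linux only)."""
    with open(f"/proc/{pid}/limits") as f:
        for line in f:
            if line.startswith("Max open files"):
                # Columns: limit name (3 words), soft limit, hard limit, units
                parts = line.split()
                return parts[3], parts[4]
    return None

# e.g. nofile_limit(4844) for the docker-proxy on 6379
```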

Any ideas or next troubleshooting steps to try would be greatly appreciated!