Question On Docker Networking

I have a Docker (version 17.05.0-ce-rc3) container “logger” running on a host in a VPC (for simplicity, call the underlying host “worker”) that is unable to communicate with another host in the same VPC. The target host is running Kafka, so I’ll refer to it as “kafka”.

A few points:

  1. worker can connect to kafka on port 9092.

  2. logger can connect to worker if I open a port on worker.

  3. When logger attempts to connect to kafka, the traffic appears to be routed outside the VPC, to the external internet, which is obviously wrong:

traceroute 172.31.24.83
traceroute to 172.31.24.83 (172.31.24.83), 30 hops max, 46 byte packets
1 ip-172-18-0-1.us-west-2.compute.internal (172.18.0.1) 0.005 ms 0.004 ms 0.003 ms
2 * * *
3 *…

  4. Executing route on logger yields:

Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
default         ip-172-18-0-1.u 0.0.0.0         UG    0      0        0 eth1
172.18.0.0      *               255.255.0.0     U     0      0        0 eth1
172.31.115.0    *               255.255.255.0   U     0      0        0 eth0

  5. logger is able to resolve kafka via hostname.

  6. I don’t believe it’s a security group setting; if it were, worker wouldn’t be able to communicate with kafka either.

  7. Executing docker network ls on worker yields:

NETWORK ID          NAME                DRIVER              SCOPE
ca66dc7f7326        bridge              bridge              local
f4e66ecf1f5e        docker_gwbridge     bridge              local
29946d1fe867        host                host                local
npm3fm1gkipp        ingress             overlay             swarm
8d369d123025        none                null                local
bdprje9fq1nq        ops_default         overlay             swarm

logger is running within the context of the ops_default network above.

  8. I’m running all of this in AWS, with Swarm as my orchestration engine.

  9. Both worker and kafka are in the same VPC.

Can anyone tell me what I need to do to ensure that logger can connect to kafka? Has anyone else seen something like this? I’ve been searching and haven’t turned up anything.

Thanks!

Looks like the default route on logger goes through eth1 (172.18.x.x), while the rest of the network is reached through eth0 (172.31.x.x).
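To make the routing observation concrete, here is a rough sketch that replays the three routes from the question against kafka's address using longest-prefix matching. It only models the table shown above; the `ROUTES` list and the `pick_route` helper are names made up for this illustration, and it ignores metrics and any policy routing the real kernel might apply.

```python
import ipaddress

# Routes copied from the `route` output on logger: (destination network, interface).
ROUTES = [
    (ipaddress.ip_network("0.0.0.0/0"), "eth1"),        # default, via 172.18.0.1
    (ipaddress.ip_network("172.18.0.0/16"), "eth1"),
    (ipaddress.ip_network("172.31.115.0/24"), "eth0"),
]


def pick_route(destination):
    """Return the (network, interface) pair with the longest matching prefix."""
    addr = ipaddress.ip_address(destination)
    matches = [(net, iface) for net, iface in ROUTES if addr in net]
    # The default route matches everything, so `matches` is never empty.
    return max(matches, key=lambda match: match[0].prefixlen)


net, iface = pick_route("172.31.24.83")
print(net, iface)  # 0.0.0.0/0 eth1
```

Since 172.31.24.83 falls outside 172.31.115.0/24, only the default route matches it, which is consistent with the first traceroute hop being 172.18.0.1.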

I was wrong about this. Connectivity wasn’t the issue; it was a red herring.

Any short summary of the cause, for others who may fall into the same ditch?

Telnet on Alpine Linux doesn’t always report that a connection has been established when in fact it has.
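One way to avoid depending on what the telnet client prints is to test the TCP handshake directly. The following is a minimal sketch, assuming Python is available inside the container; `can_connect` is a hypothetical helper name, not something from the original setup.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port completes within the timeout."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

Calling `can_connect("172.31.24.83", 9092)` from inside logger would then give an explicit True or False for the kafka address and port mentioned in the question.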