HTTP requests from within containers failing when hosted in OpenStack

I’ve currently got Docker running on a couple of OpenStack-hosted VMs.

When I attempt to make HTTP requests from within containers on those machines, the request invariably hangs while waiting for a response. HTTPS requests invariably time out because the TLS handshake never completes. In either case, the requests appear to fail after receiving only a couple of hundred bytes of response data (if any). The same requests complete without any trouble from the host machine itself (i.e. it isn’t a host-level firewall that’s the problem).
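As a concrete illustration, this is roughly the comparison I’ve been making (the URL is just an arbitrary large test file, and the image is only an example):

# On the host VM: completes normally
wget -O /dev/null http://ipv4.download.thinkbroadband.com/5MB.zip

# Inside a container on the default bridge network: stalls after a few hundred bytes
docker run --rm alpine:3.3 wget -O /dev/null http://ipv4.download.thinkbroadband.com/5MB.zip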

I’ve come across this (and some related blog posts) suggesting it’s due to the Docker network bridge having a different MTU from the host’s network interface, but I’ve verified that both MTUs are the same in my case. I’ve also tried adjusting the Docker daemon’s --mtu switch to see whether that would help, with no success.
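For reference, this is roughly how I’ve been comparing the MTUs and overriding the daemon’s default; the interface names and the 1450 value below are placeholders, not my actual settings:

# Compare the MTU of the host interface and the Docker bridge
ip link show eth0
ip link show docker0

# On Ubuntu 14.04 the daemon options live in /etc/default/docker, e.g.:
#   DOCKER_OPTS="--mtu=1450"
# then restart the daemon to apply them:
service docker restart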

I’ve also come across a few similar cases suggesting that this may be due to TCP checksum and/or segmentation offloading (e.g. here), but no amount of experimenting with ethtool -K {interface} tx off rx off or similar produces any positive result.
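For completeness, these are the kinds of commands I’ve been experimenting with (interface names are placeholders, and I’ve tried various combinations of offloads):

# Show the current offload settings
ethtool -k eth0

# Disable checksum and segmentation offloads on the host interface and the bridge
ethtool -K eth0 tx off rx off tso off gso off gro off
ethtool -K docker0 tx off rx off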

The behaviour seems to be specific to the network bridge - using --net=host when running containers solves the problem. However, for security reasons I want to avoid having to use this workaround in our production system.
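For example, the host-networking variant of the test completes, while the bridged equivalent hangs:

# Shares the host’s network stack: the download completes
docker run --rm --net=host alpine:3.3 wget -O /dev/null http://ipv4.download.thinkbroadband.com/5MB.zip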

Note also that the exact same setup (same Docker version, same configuration parameters) works fine in my development environment and when running on an instance hosted in AWS - whatever the problem is, it seems to manifest only when running under OpenStack.

For reference, I’m using the following Dockerfile for testing:

# Minimal test image: on start, attempt to download a 5 MB test file
FROM alpine:3.3
ENTRYPOINT wget http://ipv4.download.thinkbroadband.com/5MB.zip

This should (in theory) download a 5MB test file when run. However, the wget command just hangs. It’s also probably worth noting that it’s not the guest OS that’s the issue - using an Ubuntu image (for example) still exhibits the same behaviour.
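For what it’s worth, I build and run it along these lines (the image tag is arbitrary):

docker build -t wget-test .
docker run --rm wget-test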

Also, in case it’s relevant, here’s the output of docker version on the system in question:

Client:
 Version:      1.12.1
 API version:  1.24
 Go version:   go1.6.3
 Git commit:   23cf638
 Built:        Thu Aug 18 05:22:43 2016
 OS/Arch:      linux/amd64

Server:
 Version:      1.12.1
 API version:  1.24
 Go version:   go1.6.3
 Git commit:   23cf638
 Built:        Thu Aug 18 05:22:43 2016
 OS/Arch:      linux/amd64

The host VM is running Ubuntu 14.04, 64-bit.

Can anyone shed some light on what might be happening?

UPDATE 2016-09-30

It turns out that everything works as expected if Docker is hosted under an Ubuntu 16.04 image.

At this point I can move ahead by rebuilding all my VMs to use an Ubuntu 16.04 image, but I still don’t know what the original issue was. I’m still interested in any suggestions as to its cause, and how to fix it.

UPDATE 2016-11-04

This behaviour has now reappeared on a system hosted under Ubuntu 16.04. That system had previously worked as expected, but has regressed without warning to the incorrect behaviour described above. It’s not clear exactly when the regression occurred, or why.

Given that my previous workaround has now proven to be ineffective, I’m more eager than ever for any suggestions on how to fix the issue.
