Can't transmit more than 1348 bytes between two nodes

I’ve got Docker Swarm working on some cloud VPS instances on Vultr. Containers on the same node can communicate with each other without any issue, but containers on one node can only make “short” network requests to containers on another worker node; if the response is more than 1348 bytes, the request just hangs and never completes.

The following response is what I get when using HTTPie to call nginx on a different node, where nginx is serving a static file. However, if I add just one more character to the file, all that is received is the headers; the connection isn’t closed, but nothing more arrives. I get exactly the same result if I use Python’s simple HTTP server: the headers are shorter, but it can still only receive a content length of 1348 bytes.

HTTP/1.1 200 OK
Accept-Ranges: bytes
Connection: keep-alive
Content-Length: 1348
Content-Type: application/octet-stream
Date: Fri, 24 Jun 2022 02:15:09 GMT
ETag: "62b51e2a-544"
Last-Modified: Fri, 24 Jun 2022 02:15:06 GMT
Server: nginx/1.21.6

Lorem ipsum dolor sit amet, consectetur adipiscing elit. Vestibulum ornare justo lectus, sit amet sollicitudin arcu sodales vitae. Donec commodo purus id nunc aliquet ultrices. Aenean felis elit, commodo id purus et, viverra congue velit. Vivamus sem tortor, aliquam vitae ullamcorper nec, egestas id sem. Nulla lobortis vel ligula sed condimentum. Nullam a bibendum augue, quis ullamcorper enim. Morbi tempus felis ante, in vestibulum nibh porta eget. Duis euismod finibus nunc quis facilisis. Nullam sollicitudin augue vulputate ante luctus, a tristique lacus consequat.
Sed vel justo auctor, commodo nulla vitae, sodales neque. Curabitur ornare volutpat pellentesque. Vestibulum eu laoreet ligula, id cursus neque. Ut egestas, mi ut semper elementum, nisl nunc viverra justo, elementum lobortis augue diam non eros. Class aptent taciti sociosqu ad litora torquent per conubia nostra, per inceptos himenaeos. Aenean laoreet ante vel metus consectetur, et elementum mauris tincidunt. Morbi dictum diam ipsum, a viverra nisi fringilla id. Integer ornare erat ante, a tempor tortor gravida sit amet. Mauris faucibus viverra sapien, eget bibendum massa suscipit ac.
Ut pharetra magna faucibus commodo sagittis. Sed aliquam ornare nisi sed lobortis. Morbi at condimentum urna, vel faucibus libero. Lorem ipsum dolor sit amet, consectetur adipiscing..
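
To be concrete, the Python test was roughly the following (the file name and service name are only placeholders):

# on a container on one node: serve the current directory over HTTP on port 8000
python3 -m http.server 8000

# from a container on the other node, over the overlay network:
http GET http://web:8000/lorem.txt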

More info:

  • I’ve disabled the firewall of the base VPS instances (UFW) entirely.

  • This behaviour isn’t restricted to one container type; I’ve replicated it in quite distinct containers.

  • This happens both between two worker nodes and also between a worker node and a manager node.

  • If I make requests between two containers on the same node I don’t have any issues. A long request returns fine.

  • If I replicate the test outside of the docker containers (between the base VPS instances), I don’t have any issues. A long request returns fine.

  • I tried a version where I added a delay to the response (inserted a sleep), but even a response delayed by 10 seconds came back fine if it was a small number of bytes.

What might it be? I’m totally clueless.


This is a good example of how people should describe their problem and show what they have tried to solve it. Thanks for that :slight_smile: I almost asked something and then realized the answer was already in your post.

When it comes to networking I am always uncertain. You could compare the parameters of the two networks. A difference in MTU could cause problems, though I don’t know if it could cause this exact behaviour. The MTU is usually 1500; I checked the value on my machine, and it is indeed 1500. On Linux I use ip link to see the MTU. In a container you need to install the “iproute2” package, if I remember well, but you can just use the nicolaka/netshoot image like this to test it:

docker service create nicolaka/netshoot sleep inf

Then go to the host on which the service is running and use docker exec to check the network.
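
For example, something along these lines (the container ID is a placeholder; docker ps will show the real one):

# on the node where the task landed, find the netshoot container
docker ps --filter ancestor=nicolaka/netshoot

# print the interfaces and their MTU values from inside the container
docker exec <container-id> ip link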

Unfortunately I don’t have other ideas for now, so I hope this helps, or that you can figure it out along the way, or that someone else can help.

For HTTP 1.x, headers are really just part of the response. A blank line simply indicates the end of the headers and the start of the content. This means that on the network level the HTTP headers are not special at all.
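
You can even see that on the wire yourself by sending a bare request and printing the raw byte stream, headers and body together (the IP and path below are just placeholders):

# raw HTTP/1.1 request over a plain TCP connection; the blank line (\r\n\r\n) ends the headers
printf 'GET /lorem.txt HTTP/1.1\r\nHost: test\r\nConnection: close\r\n\r\n' | nc 10.0.1.5 80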

So: receiving only the headers but not (part of) the content makes me think the issue is not at the network level, even though you have no problems when everything is on a single node. :thinking: Of course, maybe HTTPie simply does not show whether partial content was already received, but you also say that the same maximum applies when the headers are shorter. That again makes me think this is not a pure networking problem. Or could you be wrong there, and do shorter headers actually allow for slightly larger content? You could even try adding many more fake headers to increase the total size.

If the maximum content length does change (slightly) with different headers then all of the following is moot.

If it does not change, then I’d think the content is not sent (unlikely, as you tested using Nginx, Python and your own code), the response is somehow not interpreted correctly (also unlikely, as your tests are not doing anything special), or something that knows about HTTP is interfering with the transfer (also unlikely…).

And anyhow, as you’re saying that both Nginx and Python’s http server have the same effect, my earlier hints probably will not give any unexpected results. Leaving them here just in case.

Earlier HTTP debugging hints (slightly extended)

Are the headers of the long responses (without content) also returning HTTP/1.1 200 OK (same 200 code and exact same version 1.1), and similar other headers too? (The cache ETag and dates may be different. It does not add some Content-Encoding: gzip, right?)

What is the value for Content-Length for the failing longer responses? (It’s not changed into Transfer-Encoding: chunked, right?) For shorter responses, is Content-Length actually correct? I wonder if your example should not be 1347 but maybe you’re missing a trailing newline in your post.

And probably very much unrelated, but who knows: why is Content-Type: application/octet-stream for what seems to be plain text? (Maybe HTTPie simply does not specify a more specific Accept header or your test file does not have a .txt extension.)
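
A quick way to capture just the headers of both a short and a long response, without waiting forever on the hanging body (the IP and file names are only examples):

# -D - dumps the response headers to stdout, -o /dev/null discards the body,
# and --max-time 5 gives up after 5 seconds so a hanging transfer doesn't block the comparison
curl -s -D - -o /dev/null --max-time 5 http://10.0.1.5/short.txt
curl -s -D - -o /dev/null --max-time 5 http://10.0.1.5/long.txt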

If indeed the supported maximum content length is slightly larger when the HTTP headers are slightly shorter (I’ve edited my response above), then you may be on to something.

Not my cup of tea either, but maybe related: Docker lowers interface MTU with 50 less than connected network, why? And as for a possible fix, Setting Container MTU in Swarm Mode may help. (Again, I’ve no idea.)

Thanks @rimelek and @avbentem for your help!

On a friend’s recommendation, I used termshark (from the nicolaka/netshoot container) to inspect the network connections being made (termshark -i eth0). Termshark showed that the headers are transferred as separate packets, explaining why header length didn’t seem to matter. @avbentem I understand that HTTP 1.x headers are part of the response, but I guess that at the networking layer they must be transferred separately.

Termshark showed a different maximum length for an HTTP packet that would transfer successfully (1414 bytes), but it seems that’s because that length includes headers. When the content was too long, termshark showed repeated ICMP “Destination unreachable (Fragmentation needed)” packets.
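
For anyone else chasing a similar issue: you can provoke the same “Fragmentation needed” error directly with ping, by setting the don’t-fragment flag and growing the payload (the target IP here is just an example):

# 1422 bytes of ICMP payload + 28 bytes of IP/ICMP headers = a 1450-byte packet, which should fit
ping -M do -s 1422 10.0.1.5

# anything that pushes the packet past the path MTU should fail with "Frag needed"
ping -M do -s 1472 10.0.1.5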

But your suggestion @rimelek that the MTU could be to blame was correct! Using the netshoot container (with ip link or ifconfig) I could see that interface eth0 had an MTU of 1450. It turns out this came from the MTU of the private network that my host (Vultr) configures: their help page mentions 1450 as the ‘optimal value’, which they set as the default. I forced Docker to use the same MTU value by setting com.docker.network.driver.mtu:

networks:
  backend:
    driver: overlay
    driver_opts:
      com.docker.network.driver.mtu: 1450

And that completely solved my problem!
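
If you want to double-check that the option took effect, something like this should do (the container ID and stack name are placeholders; docker stack deploy prefixes the network name with the stack name):

# the overlay-attached interface inside a container should now report mtu 1450
docker exec <container-id> ip link show eth0

# the MTU option should also show up on the network itself
docker network inspect mystack_backend --format '{{.Options}}'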

Gotta say, it was really quite an amazing feeling to see my website working perfectly after spending so many days agonising over what was going wrong!!


Wow!

I really don’t think they must be transferred separately, but implementations may be smart about it. And I could be wrong, of course.

Good find!