Slow ssh/scp download speeds with published ports in mesh network

Hi,

The scp download (but not upload) speed is very slow from a container to an external PC over the mesh network. Is there a way to configure the mesh network to be faster?

Background:

We have a Docker Swarm with multiple nodes where we publish a unique port for each user of our system so they can ssh into their own container.
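
For reference, a minimal sketch of how such a per-user service could be published (the service name, image, and port numbers below are just placeholders, not our actual setup):

# hypothetical per-user SSH service, each user gets their own published port
docker service create --name ssh-user42 \
  --publish published=2242,target=22 \
  our-ssh-image:latest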

However, when downloading files with scp or rsync (over ssh) the download speed is very low (100 KB/s on a 10 Gbit/s link).

Uploading over scp/rsync works fine (it maxes out the network connection), so it is only the downlink that is an issue.

I have ruled out:

From a Docker Swarm perspective the setup is pretty standard, but we have increased the ingress network from a /24 to a /16 to overcome the maximum number of published ports (in our case it restricted the number of concurrent users too much). Currently we have 27 active users.
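
Roughly what we did to enlarge the ingress network (the subnet below is just an example; note that services publishing ports have to be removed before the ingress network can be recreated):

# recreate the ingress network with a larger subnet
docker network rm ingress
docker network create --driver overlay --ingress \
  --subnet 10.0.0.0/16 --gateway 10.0.0.1 ingress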

I have tried searching both the Docker issue tracker and this forum, but only found very old issues (2014) that already had fixes merged (Incredibly unreliable/slow networking using port forwarding · Issue #4827 · moby/moby · GitHub and related).

Best Regards
Jonas

I cannot claim to be a network expert, but I would expect the speed to be the same in both directions, which is clearly not the case.

Have you also checked the disk on the client side to which you download the files? I don’t think that is the problem, but I have to ask to really rule it out. I also assume you tried to download files to multiple machines and the download speed was always slow, right?
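
For example, a crude sequential write test on the client (assuming a Linux client; the file name is just a placeholder):

# write 1 GB while bypassing the page cache, then clean up
dd if=/dev/zero of=./dd-testfile bs=1M count=1024 oflag=direct
rm ./dd-testfile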

Can you try to run a simple Docker container without Swarm? Or is that what magic-wormhole did? I would test scp with different Docker networks: the default Docker bridge, a custom Docker bridge like the one that docker compose creates, and maybe also the host network, if that is possible in your environment for testing purposes.
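
Something along these lines, assuming an image with sshd is available (image name, container names, and ports are placeholders):

# default bridge network with a published port
docker run -d --name ssh-test -p 2222:22 our-ssh-image
# host network, i.e. no NAT or port forwarding involved
docker run -d --name ssh-test-host --network host our-ssh-image
# then download from each and compare
scp -P 2222 user@dockerhost:/tmp/bigfile .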

You can also check your networks on the hosts and make sure there is no big difference between the MTU settings.

And my last idea for now is using tcpdump or tshark to see if there are any rejected or retransmitted packets in the network traffic that could slow it down.
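
For Swarm overlay traffic specifically, the VXLAN tunnel between the nodes runs over UDP port 4789 by default, so a capture on the host could look like this (the interface name is just an example):

# capture the overlay (VXLAN) traffic for later analysis in Wireshark/tshark
tcpdump -i eth0 -n -w overlay.pcap 'udp port 4789'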

Hi,

Yes, the disk on the client side is also “fast” (tested on different clients as well), and since I tested the workaround (wormhole) on the same client and got good speed, I am pretty sure it is not a client limitation.

Wormhole is software that (somewhat simplified) creates a “p2p tunnel” (much like WebRTC) between two computers to e.g. send files. Wormhole is not aware that it runs in Docker and does not use any Docker-specific “tricks”; I only used it to see if I could get around the download limitation of scp.

We will do some tests in the coming days and see what tcpdump can show. However, unless there is some other way to publish a port in a Docker Swarm (with no direct client access to the swarm workers) than to route it over the ingress network, we need to use the mesh network.

If it’s WebRTC, it might use TURN (“Traversal Using Relays around NAT”) servers if direct communication fails. That would explain the slowness.

Hi,

To avoid misunderstandings:

  • wormhole is fast in both uplink and downlink
  • scp/rsync is fast in uplink but slow in downlink

Wormhole was only used as a workaround to avoid the mesh network; the final goal is to get the speed of scp to work as expected.

I wouldn’t recommend alternative networks just because one is slower than expected, so the network tests are only for figuring out where the real issue is. Then we can point you in the right direction to report the issue if we can’t help.

Hi All,

I have now done some more investigation and I thought I should share the results but first some background.

We have two Docker Swarms, let’s call them prod and dev, running on the same HW (i.e. same links and switches, but different VLANs and on similar yet different servers).
All network links are 10 Gbit (SFP+, MikroTik CRS326-24S+2Q+); the network cards vary from Intel X710 to some cheap ASUS AQC100 cards.
There are however three main differences between these environments: in the prod environment we have 12 workers while in dev we have 2; dev is running Docker Engine 28.1.1 (also tested with 28.3.3) on kernel 5.15.0-152-generic (Ubuntu jammy), while prod is running 28.2.2 on kernel 6.8.0-60-generic (Ubuntu noble). “Some Swarm services are not discoverable over DNS · Issue #50129 · moby/moby · GitHub” affects 28.2.2, but it is not exactly the issue we have.

In the dev environment everything works more or less as expected (same results with both 28.1.1 and 28.3.3): with iperf3 I get around 6 Gbit/s node to node (i.e. without the mesh) and around 4 Gbit/s over the mesh (i.e. container to container). There seems to be a cap on the mesh network at around 5-6 Gbit/s (tested with both Docker nodes on the same VM host, where the raw capacity between the Docker nodes is around 18 Gbit/s). The speed is the same in both directions.
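
For reference, these numbers come from tests of roughly this kind (the IP is a placeholder); iperf3’s -R flag reverses the direction so both ways can be measured from the same side:

# on one node/container
iperf3 -s
# on the other: forward direction, then reverse
iperf3 -c 10.0.1.5
iperf3 -c 10.0.1.5 -R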

However, in the prod environment I still get around 6 Gbit/s node to node (in both directions), but container to container I get 4 Gbit/s in one direction and around 8 Mbit/s in the other.

Our next test will be to upgrade the dev environment to the same kernel version and OS as prod and see if we can replicate the issue.

Can the mesh network route the traffic over different paths for uplink and downlink traffic? And how can I debug this?

Regards

Short update;

Updated dev and prod to the latest versions, so they are now on the same Docker, kernel, and OS versions.
This did not help: dev works well (as before), while prod has good performance (4 Gbit/s) in one direction but not in the other (10 Mbit/s) over the overlay network. Node-to-node communication is still around 6-7 Gbit/s.

I forgot to mention that 3 nodes in the prod swarm and 1 node in dev have 1 Gbit links. I have not tested from any of them, but they are part of the swarm, in case that matters.

Regards

I can only recommend the same as before.

MTU would be shown in the output of

ip link

I rarely have to solve network issues myself and I don’t use Swarm, so I don’t have a lot of ideas, but I would definitely try tshark to see what happens with the packets. It is like Wireshark, just from the terminal.

Hi,

All nodes have the same MTU setting (1500) so that should be ok.

I have confirmed that the issue only occurs between containers running on the manager and a worker node (between containers running on two worker nodes there is no issue).

Furthermore, if I run more than one stream, i.e. iperf3 -P2, the throughput is 2-3 Gbit/s per stream, while if I run only one stream the performance is 10 Mbit/s.
I have also tested running two containers with 1 stream from each, and both get around 10 Mbit/s.
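
I.e. roughly (the IP is a placeholder):

# single stream: ~10 Mbit/s
iperf3 -c 10.0.1.5 -t 30
# two parallel streams: 2-3 Gbit/s per stream
iperf3 -c 10.0.1.5 -t 30 -P 2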

I have captured dumps on the dev and prod system and something is odd:

On the dev system I see a lot of UDP (VXLAN) packets, while on the prod system it is mostly TCP and a lot of dup ACKs.
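
One way to quantify that from the captures (the file names are placeholders):

# count retransmissions and duplicate ACKs in each dump
tshark -r prod.pcap -Y "tcp.analysis.retransmission || tcp.analysis.duplicate_ack" | wc -l
tshark -r dev.pcap -Y "tcp.analysis.retransmission || tcp.analysis.duplicate_ack" | wc -l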

I am starting to suspect something is fishy with our manager node, but what speaks against that is that the managers of both dev and prod are virtual machines running on the same physical host.

I am confused.

Regards

Then it is above my networking skills. I don’t know if it is a bug or some kind of configuration issue, so I hope someone with more Swarm overlay and networking experience finds this topic.

Have you checked and compared the native and the Docker network settings?

We see some issues in the forum where people use VLANs but the MTU is not set to 1400, so they have issues with larger packets getting corrupted.

A docker network create does not always correctly recognize the need for a smaller MTU, so it may need to be set manually.
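
For an overlay network the MTU can be set explicitly at creation time, for example (the network name and value are just examples):

docker network create -d overlay \
  --opt com.docker.network.driver.mtu=1400 my_overlay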

Also note that a default Docker Swarm network only allows 254 service containers because of the default /24 subnet.

Also, native network drivers might have jumbo frame settings. Maybe they are not the same on all machines.
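
A quick way to compare that across machines (run on each host):

# shows the MTU of every interface at a glance
ip -br link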

  • MTU on all hosts 1500
  • MTU inside containers 1450
  • Switches have jumbo frames enabled, so the link-layer PDU is >= 9000 bytes.

Swarm network created as a /16 network.

I could try to increase the MTU on the hosts, but unfortunately I will not be able to test modifications in the prod system for some time, as students are coming back to classes this week. I will try to replicate the issue in the dev system.

Has anyone had issues with network card offloading or similar Proxmox network issues? (We recently migrated from VMware to Proxmox.)
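
In case it helps, the current offload settings inside a VM can be listed like this (eth0 is just an example interface name):

# list all offload features of the vNIC
ethtool -k eth0
# or just the checksum/segmentation related ones
ethtool -k eth0 | grep -E 'checksum|segmentation'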

I migrated my homelab from VMware to Proxmox 6 years ago and ran a Swarm cluster there. I use 10G NICs in my homelab, but never bothered to switch to jumbo frames. I do remember that I started with lift-and-shift using the VMware vmxnet3 vNICs, but eventually replaced the VMs with Ubuntu cloud images ([distro]-server-cloudimg-amd64.img) to leverage cloud-init. I also switched the vNICs to virtio. Never had any performance issues since.

Hi, for those interested: I think we have reached the end of the road on this one and will (when time allows) reinstall the Docker environment (at least the manager).

Anyway, I will share some final results of the analysis.
The networks are configured in the same way:
(Screenshots: on the left the network that works fine in our dev environment, on the right the prod environment.)

The speed between the Docker hosts is similar (no screenshot of that, unfortunately). Looking at the TCP traces, we can see in the prod environment a long delay after a build-up of large packet drops.

This does not happen in the dev environment or in the other direction.

This could be caused by faulty checksum offloading.

Try disabling it in the VM: ethtool -K eth0 tx-checksum-ip-generic off (of course you need to replace the interface with the affected one).
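
If that turns out to be the culprit, the current state can be verified and related offloads toggled the same way (eth0 is again a placeholder):

# check the current offload state
ethtool -k eth0 | grep checksum
# generic receive/segmentation offloads can be switched off the same way if needed
ethtool -K eth0 gro off gso off tso off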
