Swarm mode not load balancing

cbo3 · November 14, 2016, 1:44pm

I’m trying to get a swarm running on AWS instances but can’t get even basic load balancing to work, so I must be doing something wrong. My end goal is to use an external ELB, ideally registering only the public IP addresses of the drained manager nodes and having them spread the load across the worker nodes. I can’t get that to work either, so I tried a very simple test with no ELB involved. I’ll describe what I have, so please let me know what I have wrong:

A 5-node swarm set up on t2.micro instances: one manager node that has been drained, and 4 worker nodes. A test nginx service has been created to run on 2 nodes; it just returns the hostname of the server it is running on. Using docker service ps, I can see which nodes the service is actually running on and which nodes don’t have anything running. All of that seems to be working fine. But when I use curl to hit the service, I have to use the public IP address of one of the 2 nodes that actually have the service running to get a response. In that case it works and the hostname of the node is returned. If I hit either of the other 2 worker nodes or the manager node, I get a connection refused error.

I’ve seen from other examples posted online that this type of test should demonstrate round-robin load balancing, but it clearly doesn’t for me. Thanks for any help!

cbo3 · November 14, 2016, 3:08pm

Small update. I had been running Docker version 1.12.1 but just upgraded to 1.12.3 and see the same behavior. I have to go directly to a node running the service to get a response. Any other node, including the manager, returns “Connection refused”.

cbo3 · November 14, 2016, 3:48pm

Here are the commands I have run:
start swarm on manager node:
docker swarm init

join swarm from 4 worker nodes:
docker swarm join --token XXX IP:port

drain manager:
docker node update --availability drain MGR_NAME

create service from manager:
docker service create \
  --name test \
  --replicas 2 \
  --mount type=bind,src=/etc/hostname,dst=/usr/share/nginx/html/index.html,readonly \
  --publish 80:80 \
  nginx

That is it. Thanks again.
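
For anyone reproducing this test, the curl check described earlier can be sketched roughly as follows. This is only an illustration, not the original poster's script: the NODES list, the FETCH variable, and the probe_nodes function are hypothetical names, and the node addresses have to be filled in by hand. With a working routing mesh, every node should answer, and the replies should alternate between the hostnames of the two nodes running the service.

```shell
#!/bin/sh
# Sketch only: request the published port on each swarm node a few times
# and print the reply. NODES is a placeholder; set it to a space-separated
# list of the nodes' public addresses.
NODES="${NODES:-}"
# FETCH is a hypothetical override hook; the default is plain curl.
FETCH="${FETCH:-curl -s --max-time 3}"

probe_nodes() {
    for node in "$@"; do
        i=1
        while [ "$i" -le 4 ]; do
            # The test service serves the host's /etc/hostname as index.html,
            # so each reply names the node that handled the request.
            reply=$($FETCH "http://$node/" 2>/dev/null) || reply="NO RESPONSE"
            echo "$node -> $reply"
            i=$((i + 1))
        done
    done
}

# Intentionally unquoted so the list splits into one argument per node.
probe_nodes $NODES
```

A node that prints NO RESPONSE on every attempt, while the nodes hosting the tasks answer normally, matches the symptom described in this thread.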

cbo3 · November 14, 2016, 7:24pm

Solved my own problem!

The issue was the ports that need to be open between all nodes. I misread the documentation and only had ports 7946 and 4789 open for UDP traffic, but they need to be open for both UDP and TCP. If they are only open for UDP, you get the behavior I was seeing: your service only works by going directly to the nodes that are running it. With TCP enabled too, you can get to your service from any node, even a drained manager.
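
As a rough sketch of the fix on AWS, the script below prints (it does not run) AWS CLI commands that open the swarm ports between nodes sharing one security group. It assumes all nodes sit in a single security group; SG_ID and swarm_rules are hypothetical names and the group ID is a placeholder. The port list follows the Docker documentation, which names 2377/tcp for cluster management, 7946 for both TCP and UDP, and 4789 for UDP only, so the missing piece in this thread was most likely 7946/tcp. Opening 4789/tcp as well, as described above, should be harmless.

```shell
#!/bin/sh
# Sketch only: print the AWS CLI commands instead of executing them.
# SG_ID is a placeholder for the security group shared by all swarm nodes.
SG_ID="${SG_ID:-sg-PLACEHOLDER}"

swarm_rules() {
    # 2377/tcp: cluster management
    # 7946/tcp and 7946/udp: communication among nodes
    # 4789/udp: overlay network traffic
    for rule in tcp:2377 tcp:7946 udp:7946 udp:4789; do
        proto=${rule%%:*}
        port=${rule##*:}
        echo "aws ec2 authorize-security-group-ingress" \
             "--group-id $SG_ID --protocol $proto --port $port" \
             "--source-group $SG_ID"
    done
}

swarm_rules
```

Review the printed commands, then run them by hand (or pipe the output to sh) once the real group ID is set.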
