Swarm mode not load balancing

I’m trying to get a swarm running on AWS instances, but I can’t get even basic load balancing to work, so I must be doing something wrong. My end goal is to use an external ELB, register only the public IP addresses of the drained manager nodes with it, and have those managers spread the load across the worker nodes. I can’t get that to work either, so I tried a very simple test with no ELB involved. I’ll describe what I have, so please let me know what I have wrong:

The setup is a 5-node swarm on t2.micro instances: one manager node, which has been drained, and 4 worker nodes. A test nginx service is created with 2 replicas and just returns the hostname of the server it is running on. With docker service ps I can see which nodes actually have a task running and which don’t, and all of that looks fine. But when I use curl to hit the service, I only get a response if I use the public IP address of one of the 2 nodes that actually have a task running; in that case the node’s hostname is returned. If I hit either of the other 2 workers or the manager node, I get a connection refused error.
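For reference, this is roughly what the test looks like from outside (the addresses are placeholders for each node’s public IP):

curl http://WORKER1_PUBLIC_IP/   # runs a task: returns its hostname
curl http://WORKER2_PUBLIC_IP/   # runs a task: returns its hostname
curl http://WORKER3_PUBLIC_IP/   # no task: connection refused
curl http://WORKER4_PUBLIC_IP/   # no task: connection refused
curl http://MANAGER_PUBLIC_IP/   # drained manager: connection refused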

I’ve seen from other examples posted online that this type of test should demonstrate round-robin load balancing, but it clearly isn’t doing that for me. Thanks for any help!

Small update: I had been running Docker version 1.12.1 but just upgraded to 1.12.3 and see the same behavior. I still have to go directly to a node running the service to get a response. Any other node, including the manager, returns “Connection refused”.

Here are the commands I have run:
start swarm on manager node:
docker swarm init
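(If an instance has more than one IP address, docker swarm init will ask for the advertise address to be set explicitly; the address below is just a placeholder for the manager’s private IP:)

docker swarm init --advertise-addr MANAGER_PRIVATE_IP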

join swarm from 4 worker nodes:
docker swarm join --token XXX IP:port
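(The full join command with the token is printed by docker swarm init; it can also be regenerated on the manager at any time with:)

docker swarm join-token worker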

drain manager:
docker node update --availability drain MGR_NAME
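(To confirm the drain took effect, docker node ls on the manager should show the manager with availability Drain and the 4 workers as Active:)

docker node ls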

create service from manager:
docker service create \
  --name test \
  --replicas 2 \
  --mount type=bind,src=/etc/hostname,dst=/usr/share/nginx/html/index.html,readonly \
  --publish 80:80 \
  nginx
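(Afterwards I check where the tasks landed; only the 2 nodes listed here actually have a container running:)

docker service ps test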

That is it. Thanks again.

Solved my own problem!

The issue was the ports that need to be open between all the nodes. I misread the documentation and had ports 7946 and 4789 open only for UDP traffic, but 7946 needs to be open for both TCP and UDP (4789 is UDP only, for the overlay network). If 7946 is only open for UDP you get exactly the behavior I was seeing: the service only responds when you go directly to the nodes that are running it. With TCP enabled as well, the routing mesh works and you can reach the service from any node, even the drained manager.
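For anyone who hits the same thing on AWS, this is roughly what the fixed security group rules look like (2377/tcp for swarm management, 7946/tcp+udp for node communication, 4789/udp for the overlay network). The group ID is a placeholder; all swarm nodes share the one group, which references itself as the source so only node-to-node traffic is opened:

aws ec2 authorize-security-group-ingress --group-id sg-XXXX --protocol tcp --port 2377 --source-group sg-XXXX
aws ec2 authorize-security-group-ingress --group-id sg-XXXX --protocol tcp --port 7946 --source-group sg-XXXX
aws ec2 authorize-security-group-ingress --group-id sg-XXXX --protocol udp --port 7946 --source-group sg-XXXX
aws ec2 authorize-security-group-ingress --group-id sg-XXXX --protocol udp --port 4789 --source-group sg-XXXX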