I set up a test Docker Swarm cluster (2 manager nodes and 2 worker nodes).
I then created a service based on a simple Node.js REST service listening on port 8080: it's a custom image that simply replies to a GET request on /ping with an HTTP 200 response.
I scaled this service to 5 replicas and then tested Swarm's internal load balancing (the routing mesh) with:
curl http://[node IP]:8080/ping
I tried all the node IPs (managers and workers), several times for each node.
What I found is: worker nodes reply correctly every single time, in a strict round robin, while manager nodes fail every now and then.
The managers start by round-robining requests across every service task, but sooner or later (usually sooner…) a request hangs until it times out (and I am quite sure the specific task is up and running, because it replies without any problem when reached through a worker).
If I abort the request (Ctrl+C) and issue another curl command, the next task in the round-robin sequence usually replies correctly.
Usually a manager fails about once every 5-7 consecutive requests (too frequently IMHO).
Has anyone experienced the same?