Swarm is not round robin routing requests

I’m having some difficulty seeing the round robin behavior I saw demonstrated during the 1.12 Docker Swarm demos.

I have an example project I’m working with where you can replicate the behavior.

git clone https://github.com/AndrewBell/docker-swarm-demo.git
cd docker-swarm-demo
docker-compose build
docker swarm init
docker network create -d overlay mynet
docker service create --replicas 3 --name adjective --publish 8010:8010/tcp --network mynet recursivechaos/adjective
docker service create --replicas 3 --name noun --publish 8020:8020/tcp --network mynet recursivechaos/noun
docker service create --replicas 1 --name gateway --publish 8000:8000/tcp --network mynet recursivechaos/gateway
curl localhost:8000/adjectives/random

Just a quick overview of what’s happening here. We have a REST resource service (adjectives) and a Zuul API Gateway (gateway) that routes to the adjective service based on its DNS entry (adjective). The gateway is able to route, and we receive a response containing the container/host name. However, I expect subsequent requests to be routed to different containers. Instead, every request is routed to the same container.

What configuration am I missing to get round robin DNS routing to work? What can I do to help troubleshoot this?

INFO:

I’m using Docker for Mac: Docker version 1.12.0-rc4, build e4a0dbc, experimental

$ docker network inspect mynet
[
    {
        "Name": "mynet",
        "Id": "9adci90466aesxms5wo4tcrfx",
        "Scope": "swarm",
        "Driver": "overlay",
        "EnableIPv6": false,
        "IPAM": {
            "Driver": "default",
            "Options": null,
            "Config": [
                {
                    "Subnet": "10.0.0.0/24",
                    "Gateway": "10.0.0.1"
                }
            ]
        },
        "Internal": false,
        "Containers": {
            "050ae4c7a3f84779cb3df1ab9e74233450dbd8c7eea6e87760885294c4169810": {
                "Name": "gateway.1.2d7e7tweh71fcin61oabptj8g",
                "EndpointID": "bc009b36c671b6e240f84f45d6cbf5303bd9f828f513224278a969c91aa1df9c",
                "MacAddress": "02:42:0a:00:00:0f",
                "IPv4Address": "10.0.0.15/24",
                "IPv6Address": ""
            },
            "2d979ff29c230b5277a60fa7b99f9e346bfdccf9ef4d9e776236be53b37fce61": {
                "Name": "adjective.1.e0aeuqd93xc6w55bowjbyim22",
                "EndpointID": "502cf438dc7e6f0d67a9c1705b7e2411d4dfd7a5f1387ed6596f2ae69af9759f",
                "MacAddress": "02:42:0a:00:00:11",
                "IPv4Address": "10.0.0.17/24",
                "IPv6Address": ""
            },
            "439b86e68f4ed83b6b40f4e9f58d611cb52bb94e05955f7c0d6a39658915f014": {
                "Name": "adjective.3.d4at1wm3xubd21uzngzvkm63d",
                "EndpointID": "cc17dac39e1d89f19ad4d66069c061b22e72455eceecf2253124d770857ba113",
                "MacAddress": "02:42:0a:00:00:10",
                "IPv4Address": "10.0.0.16/24",
                "IPv6Address": ""
            },
            "8832eed295424d7d700e6c42d359ec760978ec9e5414ca69ad15d8245ff2885f": {
                "Name": "noun.2.6m3z8yl7ez1ywuxlm4wiw4z9j",
                "EndpointID": "63f5e36dae6e6dc6a27f6c41552200ab8c070ef560862dfbf334437c4585c73c",
                "MacAddress": "02:42:0a:00:00:03",
                "IPv4Address": "10.0.0.3/24",
                "IPv6Address": ""
            },
            "b3a8e95971f9873ef4dd0a5ed37f9d20cb5d58764f2c2901655cf2ea4fc12864": {
                "Name": "adjective.2.8jc7hb2hwobnwwz6ueqz37qye",
                "EndpointID": "240e0031bd4b5695700c41ddb4be8db99800d847684f7488e6e96777540da134",
                "MacAddress": "02:42:0a:00:00:0d",
                "IPv4Address": "10.0.0.13/24",
                "IPv6Address": ""
            },
            "cf310057a3d39c2556474e77133b66427c268bf1065f866c4477427ae63d2d1b": {
                "Name": "noun.1.4mi6rt0baocazj28ry7fpg5km",
                "EndpointID": "9e53785091e39a58aded539f0dcca3798cc46ce67ee58564dee1b8b625382b68",
                "MacAddress": "02:42:0a:00:00:0c",
                "IPv4Address": "10.0.0.12/24",
                "IPv6Address": ""
            },
            "fde5ae87b4920518d6ccde639894d30ea2e8edc3e96dbf2862b23b5df2447b10": {
                "Name": "noun.3.01xxhi8gfajeos9lad1fw0qth",
                "EndpointID": "75a5e1a851c40578b080e9843d8914a095d964007c000d5ebdb0d3b48ac65492",
                "MacAddress": "02:42:0a:00:00:0e",
                "IPv4Address": "10.0.0.14/24",
                "IPv6Address": ""
            }
        },
        "Options": {
            "com.docker.network.driver.overlay.vxlanid_list": "257"
        },
        "Labels": {}
    }
]

$ docker service inspect adjective --pretty
ID:             03xvb5oc692eczgey4qhc6aev
Name:           adjective
Mode:           Replicated
 Replicas:      3
Placement:
 Strategy:      Spread
UpdateConfig:
 Parallelism:   0
ContainerSpec:
 Image:         recursivechaos/adjective
Resources:
 Reservations:
 Limits:
Networks: 9adci90466aesxms5wo4tcrfx
Ports:
 Name =
 Protocol = tcp
 TargetPort = 8010
 PublishedPort = 8010

cc @mavenugo please take a look at this – it may be a known issue

There was a suggestion that you might be mistaking the virtual IP for the service (which does not change per-request) with the actual container IP that requests end up getting routed to. Have you tried returning some sort of tell-tale information such as hostname in your responses to verify that it is the same container responding every time? And/or what do access logs for each container say?

(I don’t see how you’re verifying that the response is returned from the same container above – sorry if I overlooked something).
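One way to check the virtual IP versus container IP distinction described above is to compare what the service name resolves to with what the per-task name resolves to, from inside a container attached to the same overlay network. The sketch below assumes the swarm-mode DNS convention that `adjective` resolves to the service's single virtual IP while `tasks.adjective` resolves to the individual task IPs, and that `getent` is available in the gateway image (it may not be, and its exact output varies by libc). The helper names `resolve_in_container` and `count_distinct_ips` are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: resolve NAME from inside CONTAINER.
# Assumes `getent` exists in the image; swap in `nslookup` if it does not.
resolve_in_container() {
  docker exec "$1" getent hosts "$2"
}

# Count the distinct IP addresses in `getent hosts`-style output on stdin
# (one "IP  name" pair per line; blank lines are ignored).
count_distinct_ips() {
  awk 'NF { print $1 }' | sort -u | wc -l | tr -d ' '
}

# Example (not executed here; the container ID is a placeholder):
#   resolve_in_container <gateway-container-id> adjective       | count_distinct_ips
#   resolve_in_container <gateway-container-id> tasks.adjective | count_distinct_ips
```

If the first count is 1 and the second matches the replica count, the gateway is talking to a virtual IP, so an unchanging address on the client side would not by itself mean that a single container is serving every request.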

Thanks for checking up on this. I have a Zuul API Gateway running in the gateway docker container. It routes requests from my local machine, via its address localhost:8000, to the individual services (defined as adjective:8010). In the response I’m returning the hostname (container ID in this case). Whenever I curl the gateway multiple times, I am always returned the same container, leading me to believe that requests to the adjective service are always being routed to the same container.

As far as I know, Zuul is only routing to the service name, and doesn’t necessarily look up an IP and cache it. Although, honestly, I’m not sure. I’m starting to see a few Docker Swarm service routing tutorials; I’ll give one of them a shot and see if it’s just an implementation mistake on my part.

I am not able to see round-robin behavior either. I am using the example-voting-app demoed at DockerCon. I created 4 replicas of the voting app and am accessing it on port 5000. When I keep refreshing the page, the container names at the bottom of the page appear in random order, NOT round-robin.

vagrant@smgr:~/example-voting-app/vote$ docker service ls
ID            NAME      REPLICAS  IMAGE               COMMAND
4x9gbh0vm9iw  postgres  1/1       postgres:9.4
az5azqgu4799  redis     1/1       redis:3.2.1-alpine
bn2yuzk627e7  vote      4/4       vote:latest         python app.py
vagrant@smgr:~/example-voting-app/vote$ docker service create --replicas 4 --name vote -p 5000:80/tcp vote:latest python app.py

@machzqcq If you’re accessing the Swarm port directly through a browser there was one “gotcha” I recall that had something to do with the browser caching / re-using existing TCP connections (a bit fuzzy on the exact details, sorry). So you’d be routed to the same task each time even with a non-cache refresh. It’d be worth trying with curl or other clients to see if you get the same result.
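To take connection reuse out of the picture, a rough sketch like the following sends each request from a separate `curl` process (so each one opens a new TCP connection) and tallies which container answered. It assumes the response body contains the container hostname as a 12-character hex container ID, which may not match the real response format; the function names `fetch_many` and `tally_by_container` are hypothetical.

```shell
#!/bin/sh
# Send COUNT requests to URL, one curl process (and so one new TCP
# connection) per request, printing each response body on its own line.
fetch_many() {
  url=$1
  count=$2
  i=0
  while [ "$i" -lt "$count" ]; do
    curl -s "$url"
    echo
    i=$((i + 1))
  done
}

# Tally responses by container hostname, most frequent first.
# Assumption: the hostname appears in the body as a 12-character hex
# container ID; adjust the pattern to match the real response format.
tally_by_container() {
  grep -oE '[0-9a-f]{12}' | sort | uniq -c | sort -rn
}

# Example (not executed here):
#   fetch_many localhost:8000/adjectives/random 20 | tally_by_container
```

A single line of output would suggest that one container is handling everything; several lines with similar counts would suggest that requests are being spread across tasks, even if not in strict round-robin order.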
