Consul health checks fail in swarm mode. Containers are not accessible from host when running in swarm mode

I have set up docker swarm with one manager and two worker nodes. I have brought up a new node for the consul server (installed on the host). On each worker node I am running a consul agent (running on the host, not in docker). I have opened all necessary ports for consul and docker swarm to work correctly.

All nodes are reachable from the consul server. If I start containers directly via docker run or through docker-compose then the consul service health checks work. Containers are running on the default bridge network. But in swarm mode, containers are not accessible from the host, so the consul service check fails. I can’t use host network mode because I want to run multiple containers of the same service on a node.

I have created a custom overlay network and all service containers are attached to the custom network. I haven’t published any ports (no ingress network).

Given: you run consul on a dedicated server and multiple agents on the host OS of your worker nodes.

Can you elaborate on why a container not being accessible would fail the consul service check? Are you using consul for service discovery?

Yes, I am using consul for service discovery.

When I start containers using docker run or by using docker-compose, containers are by default attached to the default bridge network and are accessible from the host via the container IP.

But in swarm mode, containers are not accessible via IP from the host. Since the consul agent runs on the host, it is not able to ping the container, and the health checks fail.

Oh, I see now how this causes a problem with health checks. From the agent’s perspective there seems to be no route to the interfaces of the local containers attached to an overlay network.

I actually never used consul for service discovery. Standalone swarm (!= swarm mode) depended on an external k/v store for service discovery. Since docker 17.05 (might have been 1.13, I don’t recall the exact version) a DNS-based service discovery is built in and available in all user-defined networks. Of course this does not apply to the default bridge network, as it is not user defined.

I will need to leave this one to someone who actually has experience with your use case.

Thanks for the prompt response.

YES, Docker already provides DNS-based service discovery. Docker internally sets up its own DNS server and responds to DNS queries. So no consul k/v is required to store docker swarm state, nor for DNS-based service discovery.

My previous setup: running HAPROXY as a load balancer (in a container) and bypassing the swarm service mesh completely by setting endpoint mode to DNSRR. I pointed the HAPROXY configuration at the docker local DNS server and everything works fine. HAPROXY is running in a container, and its health checks work completely fine.

The current setup is a little bit different. I am using envoy with consul connect as a sidecar (running the envoy process in the container (not a good practice to start/run multiple processes in a container)). I am communicating with other services through the sidecar.

The issue is that, since consul marks the service as unreachable/unhealthy, everything breaks. If I run everything with docker compose or docker run then everything works fine.
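The earlier HAProxy/DNSRR setup described in this post, written out as an added stack file for illustration (not the poster's own files); the service names, image tags, port and config path are assumptions.

```yaml
# Added illustration of the HAProxy + DNSRR setup described above.
# Service names, images and ports are assumed, not the poster's real values.
version: "3.8"

services:
  api:
    image: example/api:1.0          # assumed application image
    deploy:
      replicas: 3
      endpoint_mode: dnsrr          # no VIP: Docker's DNS returns one A record per task
    networks:
      - backend

  haproxy:
    image: haproxy:2.8
    deploy:
      replicas: 1
      endpoint_mode: dnsrr          # bypasses the swarm routing mesh, as in the post
    networks:
      - backend                     # no published ports, so no ingress network
    configs:
      - source: haproxy_cfg
        target: /usr/local/etc/haproxy/haproxy.cfg

configs:
  haproxy_cfg:
    file: ./haproxy.cfg             # assumed path on the manager node

networks:
  backend:
    driver: overlay
```

In haproxy.cfg, a `resolvers` section whose nameserver is `127.0.0.11:53` (Docker's embedded DNS) together with `server-template api 5 api:8080 check resolvers docker init-addr none` lets HAProxy expand the `api` name into the task IPs that DNSRR publishes and health-check each one from inside the overlay network.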