Customized FQDN resolution for services deployed w/ docker swarm mode

hafan · November 18, 2023, 3:25am

Suppose I deploy 3 services in Docker Swarm mode, all attached to the same overlay network, so they can reach each other by service name:

services:
  a:
    image: alpine:latest
    command: sleep infinity
    networks:
      - overlay-net
  b:
    image: alpine:latest
    command: sleep infinity
    networks:
      - overlay-net
  c:
    image: alpine:latest
    command: sleep infinity
    networks:
      - overlay-net
networks:
  overlay-net:
    driver: overlay

Now suppose services a, b and c together form a cluster at the application level. Any of them could be the leader, and the leadership might change over time. I want to achieve the following: create another hostname, let’s call it “d”, that is always mapped to the current leader, which could be a, b or c. Later, if I add other services to the same overlay network, I want them to talk to the leader by referring to the hostname “d” instead of “a”, “b” or “c”.

How can I achieve this using Docker-provided utilities? If there are no suitable built-in tools, can you please suggest a third-party tool for this?

Thanks!

bluepuma77 · November 18, 2023, 7:13am

Another approach is to use a single service:

You should be able to reach every instance by its virtual hostname, which can be set for each service instance with a template string (see the simple Traefik Swarm example; more template variables are available).

You can also set additional hostnames for each service, not sure if templating works there.

If you change the endpoint mode of the service to dnsrr, a DNS lookup of the service name returns the IPs of all running tasks of the service, instead of the default single virtual IP, behind which Docker does round-robin internally.
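To see what a client actually gets back in dnsrr mode, a minimal Python sketch along these lines could be run inside a container on the same overlay network. The service name `cluster` used in the note below is hypothetical, and the `resolver` parameter exists only so the function can be exercised without a real Swarm DNS.

```python
import socket


def resolve_all(hostname, port=80, resolver=socket.getaddrinfo):
    """Return the sorted, de-duplicated IPv4 addresses that `hostname` resolves to.

    With endpoint mode dnsrr, the service name should resolve to one address
    per running task; with the default VIP mode it resolves to a single
    virtual IP.
    """
    infos = resolver(hostname, port, socket.AF_INET, socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})
```

Calling `resolve_all("cluster")` from another container would list the task IPs. Note that nothing in the result says which of those tasks is the leader, so this alone does not solve the original problem.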

But that way you will probably never get a nice a, b, c.

What service do you want to deploy?

rimelek · November 18, 2023, 8:34am

Isn’t there a way to get the current leader before sending the actual request to it? If you always want to communicate with the leader, you can’t rely on an external service to tell you which one is the leader unless that service receives the request, determines which one is the leader, forwards the request and sends the response back to you. If you use something that periodically determines which one is the leader and somehow sets the hostname to point to that leader, there will always be some delay and sometimes you will send the request to another container.
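As a rough illustration of asking for the current leader right before sending a request, here is a minimal Python sketch. It assumes that every node answers `GET /role` on port 8080 with the text `leader` or `follower`; that endpoint, the port, and the function names are hypothetical and would have to be replaced with whatever the application really exposes.

```python
import urllib.request


def http_probe(node, timeout=1.0):
    """Hypothetical probe: assumes GET http://<node>:8080/role returns 'leader' or 'follower'."""
    url = f"http://{node}:8080/role"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode().strip() == "leader"


def find_leader(nodes, probe=http_probe):
    """Return the first node that reports itself as leader, or None if none does."""
    for node in nodes:
        try:
            if probe(node):
                return node
        except OSError:
            # Covers connection errors, timeouts and urllib's URLError:
            # treat the node as unavailable and try the next one.
            continue
    return None
```

A client would call `find_leader(["a", "b", "c"])` and then send its request to the returned name. The answer can already be stale when the request goes out, so the client still needs to retry if the node it picked turns out not to be the leader anymore.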

You could also accept the request in any instance and that instance could forward the request to the actual leader if it wasn’t the leader. As far as I know, this is how some (or all?) databases work when only the leader can write data and replicas or workers can be used only to read the data.

So I think this functionality should be supported by the application that wants to form a cluster, or by something made specifically for it. That is why @bluepuma77 asked a good question.

meyay · November 18, 2023, 9:08am

How would this be solved without containers? How would you determine who is the leader, and how would you reassign the fixed name to the new leader after a change?

Since you seem to be using something that uses a consensus algorithm, you should take a closer look at how the solution you use actually handles writes. Many solutions internally forward write operations sent to follower nodes to the leader node. For instance, that’s how the Raft implementation in Swarm itself works.

Is it the most efficient way? No. But it’s still a very good mix of reliability and efficiency, as quorum is only required for cluster membership and leader election. If the consensus allowed writes on all nodes, it would be less efficient, as a quorum would be required not only for cluster membership and leader election, but also for every(!) written change in the cluster.

For those implementations that don’t forward write operations from a follower to the leader node, you would need to implement something on your own. In this case I would assume that you should be able to find plenty of discussions in the community of the solution, as it would be a fundamental problem everyone would need to solve. @rimelek made suggestions on what this might look like.

meyay · November 18, 2023, 9:22am

Some additions:

In case the consensus implementation forwards writes to the leader: you can assign a network alias to all standalone services that make up the cluster, or preferably use a single service, like @bluepuma77 suggested.

Though, make sure you understand how the solution handles DNS caching: if it caches the IP for resolved DNS names, then dnsrr might not be a good idea, as queries will end up on the same service task (= container) as long as the IP is cached, instead of being distributed among all service tasks. Some applications even cache the IP for a resolved DNS name indefinitely.

hafan · November 18, 2023, 7:03pm

rimelek wrote: “You could also accept the request in any instance and that instance could forward the request to the actual leader if it wasn’t the leader. As far as I know, this is how some (or all?) databases work when only the leader can write data and replicas or workers can be used only to read the data.”

My application is not a database but is very similar to one. Out of the 3 services (nodes), one is the leader, which can handle reads and writes, whereas the other two nodes (followers) just sit in standby mode. Later, when I add more services to the same overlay network, I would like them to always talk to the leader instead of either of the two followers.

rimelek wrote: “You could also accept the request in any instance and that instance could forward the request to the actual leader if it wasn’t the leader.”

I evaluated this option, but it is not straightforward to do at the application level. I want to explore doing it at the Docker level.

hafan · November 18, 2023, 7:09pm

Unfortunately the application-level clustering is very primitive and does not offer the capability to internally forward the request to the leader node.

Does Docker provide any hook that the new leader can invoke from inside its container whenever the leadership changes? Once the hook is invoked, Docker would update the mapping of hostname “d” to the new leader. I have seen people use the trick of mounting /var/run/docker.sock into the container so that operations performed inside the container take effect outside it.

meyay · November 18, 2023, 7:24pm

Word of advice: do not try to implement your own consensus algorithm. Try sticking to an existing implementation of Raft or Paxos for your programming language.

People write doctoral theses about this kind of stuff

Docker does not provide such a hook, and to my knowledge it does not allow managing dns entries at all.

hafan · November 18, 2023, 7:53pm

Got it. This is good advice.

So in this particular situation Docker can’t help, and it looks like I have to bring in an extra component such as a load balancer or reverse proxy to do this task?

meyay · November 18, 2023, 9:04pm

Unfortunately, docker does not provide an out-of-the-box solution for what you want to do.

You would need to write your own reverse proxy implementation that accepts the incoming request, checks your backends for the leader, then proxies the request to the leader and returns the response to the client.

Something like this: “Making a Custom Reverse Proxy with Golang” (by Ramazan Demir, Trendyol Tech, Medium), plus the logic to determine the leader.
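The linked article uses Go; the same idea can be sketched in Python using only the standard library. This is a rough sketch, not production code: the backend port 8080 and the `GET /role` endpoint that answers `leader` on the leader are assumptions about the application, and only a couple of headers are passed through.

```python
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# Assumption: service names from the compose file above, on a hypothetical port.
BACKENDS = ["a:8080", "b:8080", "c:8080"]


def is_leader(backend, timeout=1.0):
    """Hypothetical check: assumes GET /role returns the text 'leader' on the leader."""
    try:
        with urllib.request.urlopen(f"http://{backend}/role", timeout=timeout) as resp:
            return resp.read().decode().strip() == "leader"
    except OSError:
        return False  # unreachable or failing nodes are never the leader


def pick_leader(backends, check=is_leader):
    """Return the first backend that passes the leader check, or None."""
    return next((b for b in backends if check(b)), None)


class LeaderProxy(BaseHTTPRequestHandler):
    def _forward(self):
        # Determine the leader per request, so a leadership change is picked up immediately.
        leader = pick_leader(BACKENDS)
        if leader is None:
            self.send_error(503, "no leader available")
            return
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length) if length else None
        req = urllib.request.Request(
            f"http://{leader}{self.path}", data=body, method=self.command
        )
        for name in ("Content-Type", "Accept"):
            if name in self.headers:
                req.add_header(name, self.headers[name])
        try:
            with urllib.request.urlopen(req, timeout=5.0) as resp:
                status, payload = resp.status, resp.read()
                ctype = resp.headers.get("Content-Type", "application/octet-stream")
        except urllib.error.HTTPError as err:
            # Pass error responses from the leader through to the client.
            status, payload = err.code, err.read()
            ctype = err.headers.get("Content-Type", "text/plain")
        except OSError:
            self.send_error(502, "leader unreachable")
            return
        self.send_response(status)
        self.send_header("Content-Type", ctype)
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)

    do_GET = do_POST = do_PUT = do_DELETE = _forward


def serve(port=8080):
    """Run the proxy until interrupted."""
    ThreadingHTTPServer(("", port), LeaderProxy).serve_forever()
```

If a container running `serve()` were deployed as a service named d on the same overlay network, other services could address the leader as d. Probing every backend on each request adds latency; caching the result for a short time would trade that for occasionally hitting a stale leader.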

bluepuma77 · November 21, 2023, 8:11am

The Traefik reverse proxy works well with Docker and Docker Swarm, and it supports automatic Configuration Discovery. See the simple Traefik example.

rimelek · November 21, 2023, 9:19am

It is not about service discovery anymore, but about automatically determining which service is the leader when someone tries to access a domain name. Can Traefik help with that somehow?

bluepuma77 · November 21, 2023, 10:50am

My understanding was that d should be set manually to point to the primary. That can be done with a Traefik router and service in a dynamic config file.
