Docker Swarm DNS resolution (WAS: multiple networks)

sstt · September 1, 2023, 8:35am

I have set up an overlay network “public”. This functions as I would expect.

I have a stack file with two services, a and b. a is an auxiliary service for b. These also work fine, as I would expect.

I now desire b to be a part of the “public” network.

However, once I attach b to the public overlay network (which is then used by traefik), using,

networks:
  - default
  - public

within b, b is no longer able to access a.

What is the best way to set this up? I know I could simply also put a on ‘public’, but I don’t know if there is a better solution.

bluepuma77 · September 1, 2023, 9:40pm

You have a and b on the same Docker network x, then you add a to network y and a can’t talk to b anymore?

How should a talk to b? Via name or via IP? Do the two Docker networks have overlapping IP ranges?
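For anyone following along, the overlap question can be checked mechanically. Below is a minimal sketch using Python's standard `ipaddress` module. The subnets are made-up examples, not values from this thread; the real ones would have to be read from `docker network inspect` for each network.

```python
import ipaddress


def subnets_overlap(a: str, b: str) -> bool:
    """Return True if the two CIDR ranges share any address.

    strict=False lets us pass a host address with a prefix
    (e.g. "10.0.1.5/24") and have it treated as its enclosing network.
    """
    net_a = ipaddress.ip_network(a, strict=False)
    net_b = ipaddress.ip_network(b, strict=False)
    return net_a.overlaps(net_b)


if __name__ == "__main__":
    # Hypothetical subnets, for illustration only.
    print(subnets_overlap("10.0.1.0/24", "10.0.2.0/24"))  # False
    print(subnets_overlap("10.0.0.0/16", "10.0.1.5/24"))  # True
```

If this returns True for the subnets of the two networks a service is attached to, routing inside the container can become ambiguous, which is why the question is worth ruling out early.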

sstt · September 1, 2023, 10:57pm

Hmm, my apologies, I have missed a small thing in my testing - it appears DNS resolution between Swarm containers is the thing that isn’t working correctly.

The containers are meant to talk to each other via name.

No, they don’t have overlapping IP ranges (as docker service inspect shows).

As an MVP, I’m running this stack file;

version: "3"
services:
  redis:
    image: docker.io/library/redis:7
    deploy:
      replicas: 1

  paperless:
    image: ghcr.io/paperless-ngx/paperless-ngx:latest
    depends_on:
      - redis
    environment:
      - PAPERLESS_REDIS=redis://redis:6379
    deploy:
      replicas: 1
    networks:
      - default
      - public

networks:
  public:
    external: true

Looking at the logs of paperless-mvp_paperless, I get this error: “Error -2 connecting to redis:6379. Name or service not known…”, whereas I would expect it to work correctly.

public was created by docker network create -d overlay public.
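The “Error -2 … Name or service not known” message is what a failed `getaddrinfo` lookup looks like on Linux, so the failure can be reproduced in isolation. Here is a minimal sketch, assuming a Python interpreter is available inside the container that does the lookup; the helper name `resolve` is hypothetical and the port default simply mirrors the stack file above.

```python
import socket


def resolve(name: str, port: int = 6379) -> list[str]:
    """Return the sorted, de-duplicated IPs that `name` resolves to.

    An empty list means the lookup failed, which is the same condition
    the application reports as "Name or service not known".
    """
    try:
        infos = socket.getaddrinfo(name, port, proto=socket.IPPROTO_TCP)
    except socket.gaierror:
        return []
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the IP address as a string.
    return sorted({info[4][0] for info in infos})
```

Run inside the paperless container, `resolve("redis")` should return the service's virtual IP when service discovery works. In Swarm, `resolve("tasks.redis")` should return the individual task IPs instead. An empty list for both points at name resolution rather than at the application.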

meyay · September 2, 2023, 8:11am

You can always check the effective compose configuration with docker compose config.
I created the compose file in a folder called ab-network-test, that’s why the (project) name is ab-network-test:

name: ab-network-test
services:
  paperless:
    depends_on:
      redis:
        condition: service_started
    deploy:
      replicas: 1
    environment:
      PAPERLESS_REDIS: redis://redis:6379
    image: ghcr.io/paperless-ngx/paperless-ngx:latest
    networks:
      default: null
      public: null
  redis:
    deploy:
      replicas: 1
    image: docker.io/library/redis:7
    networks:
      default: null
networks:
  default:
    name: ab-network-test_default
  public:
    name: public
    external: true

Note: the null setting for the networks just means there is no further customization (like setting an alias name). It is the default value.

So both services share a common network: the default network. Therefore, the DNS-based service discovery should be able to resolve the containers by the service name, and the containers should be able to communicate using the default network.

Make sure there is no other redis service attached to the public network, otherwise the paperless service might pick randomly between the service from the public and default network.

Is it possible that the redis container is crashing? You have no restart policies configured for your services, thus if they die, they stay dead.

bluepuma77 · September 2, 2023, 12:43pm

It seems redis is missing the network

And I personally wouldn’t use depends_on when going into a larger Swarm cluster. This is great for a single host, but I wouldn’t use it in a distributed environment. But that’s just my personal opinion.

meyay · September 2, 2023, 5:56pm

Is it? The default network is added to a service by default, unless the service’s network is specifically configured → then it must be explicitly added, like seen with the paperless service.

I just deployed the stack and tested it: docker run -it --net container:$(docker ps -q --filter=name=paperless) nicolaka/netshoot ping redis
Of course, it’s working

If this doesn’t work for you, then there must be something wrong with the overlay communication.
This could be easily tested by pinning the services to the same node using deployment constraints.

As far as I remember, depends_on was not implemented for Swarm services. Did they implement it recently?

sstt · September 3, 2023, 2:19am

Thank you both for your help. Upon further investigation, I noticed that there is another underlying problem with my Docker Swarm setup that I believe is causing this issue (as you correctly surmised haha). On every worker node, dockerd is failing to communicate with other nodes over port 7946, which would explain why services cannot reach each other when they are deployed on different nodes.

This is happening independently of Linux distribution.

I only see in netstat -tulpn:

tcp6       0      0 :::7946                 :::*                    LISTEN      22249/dockerd

And no IPv4 equivalent. ~~Changing --listen-addr and --advertise-addr on each node does not help.~~
Not sure where to go from here.

EDIT: so, doing --listen-addr 192.168.x.y instead of --listen-addr 0.0.0.0, where 192.168.x.y is the IP address of the node, does seem to work ?!
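To confirm whether another node can actually be reached on port 7946, a direct connection test is more telling than the netstat listing alone. On Linux, a tcp6 listener on :::7946 often accepts IPv4 connections as well, so the missing IPv4 line may not by itself prove the port is closed. Below is a minimal sketch in Python; the function name is hypothetical, and it only covers TCP, while Swarm also uses 7946/udp for node discovery and 4789/udp for overlay traffic.

```python
import socket


def tcp_port_open(host: str, port: int = 7946, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and tries each address in turn.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
```

Calling `tcp_port_open("192.168.x.y")` from each node against every other node's address (with the placeholder replaced by the real IP) gives a quick matrix of which pairs can talk. A False result between two nodes would be consistent with the cross-node failures described above.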
