Docker Community Forums

Failed resolving host

swarm
docker
dns

(Worp) #1

When I spin up my etcd (don’t worry, this is not going to focus on etcd; it will become a general question) as a Docker stack service, it fails during bootstrap with the following log entries:

percona_etcd.1.75c31t5jilmq@test-010    | 2018-04-24 20:56:05.954663 C | etcdmain: failed to resolve http://galera_etcd:2380 to match --initial-cluster=etcd0=http://galera_etcd:2380 (failed to resolve "http://galera_etcd:2380" (lookup galera_etcd on 127.0.0.11:53: no such host))

percona_etcd.1.82rqzgq3tjo3@test-011    | 2018-04-24 20:56:55.359039 I | pkg/flags: recognized and used environment variable ETCD_ADVERTISE_CLIENT_URLS=http://galera_etcd:2379,http://galera_etcd:4001
percona_etcd.1.82rqzgq3tjo3@test-011    | 2018-04-24 20:56:55.359094 I | pkg/flags: recognized and used environment variable ETCD_DATA_DIR=/opt/etcd/data
percona_etcd.1.82rqzgq3tjo3@test-011    | 2018-04-24 20:56:55.359115 I | pkg/flags: recognized and used environment variable ETCD_INITIAL_ADVERTISE_PEER_URLS=http://galera_etcd:2380
percona_etcd.1.82rqzgq3tjo3@test-011    | 2018-04-24 20:56:55.359122 I | pkg/flags: recognized and used environment variable ETCD_INITIAL_CLUSTER=etcd0=http://galera_etcd:2380
percona_etcd.1.82rqzgq3tjo3@test-011    | 2018-04-24 20:56:55.359126 I | pkg/flags: recognized and used environment variable ETCD_INITIAL_CLUSTER_STATE=new
percona_etcd.1.82rqzgq3tjo3@test-011    | 2018-04-24 20:56:55.359132 I | pkg/flags: recognized and used environment variable ETCD_INITIAL_CLUSTER_TOKEN=etcd-cluster-1
percona_etcd.1.82rqzgq3tjo3@test-011    | 2018-04-24 20:56:55.359159 I | pkg/flags: recognized and used environment variable ETCD_LISTEN_CLIENT_URLS=http://0.0.0.0:2379,http://0.0.0.0:4001
percona_etcd.1.82rqzgq3tjo3@test-011    | 2018-04-24 20:56:55.359173 I | pkg/flags: recognized and used environment variable ETCD_LISTEN_PEER_URLS=http://0.0.0.0:2380
percona_etcd.1.82rqzgq3tjo3@test-011    | 2018-04-24 20:56:55.359192 I | pkg/flags: recognized and used environment variable ETCD_NAME=etcd0
percona_etcd.1.82rqzgq3tjo3@test-011    | 2018-04-24 20:56:55.359291 I | etcdmain: etcd Version: 3.3.4
percona_etcd.1.82rqzgq3tjo3@test-011    | 2018-04-24 20:56:55.359302 I | etcdmain: Git SHA: fdde8705f
percona_etcd.1.82rqzgq3tjo3@test-011    | 2018-04-24 20:56:55.359305 I | etcdmain: Go Version: go1.9.5
percona_etcd.1.82rqzgq3tjo3@test-011    | 2018-04-24 20:56:55.359308 I | etcdmain: Go OS/Arch: linux/amd64
percona_etcd.1.82rqzgq3tjo3@test-011    | 2018-04-24 20:56:55.359313 I | etcdmain: setting maximum number of CPUs to 4, total number of available CPUs is 4
percona_etcd.1.82rqzgq3tjo3@test-011    | 2018-04-24 20:56:55.359385 I | embed: listening for peers on http://0.0.0.0:2380
percona_etcd.1.82rqzgq3tjo3@test-011    | 2018-04-24 20:56:55.359413 I | embed: listening for client requests on 0.0.0.0:2379
percona_etcd.1.82rqzgq3tjo3@test-011    | 2018-04-24 20:56:55.359432 I | embed: listening for client requests on 0.0.0.0:4001
percona_etcd.1.82rqzgq3tjo3@test-011    | 2018-04-24 20:56:55.663728 W | pkg/netutil: failed resolving host galera_etcd:2380 (lookup galera_etcd on 127.0.0.11:53: no such host); retrying in 1s
percona_etcd.1.82rqzgq3tjo3@test-011    | 2018-04-24 20:56:56.664949 W | pkg/netutil: failed resolving host galera_etcd:2380 (lookup galera_etcd on 127.0.0.11:53: no such host); retrying in 1s

....

percona_etcd.1.82rqzgq3tjo3@test-011    | 2018-04-24 20:57:25.663175 C | etcdmain: failed to resolve http://galera_etcd:2380 to match --initial-cluster=etcd0=http://galera_etcd:2380 (failed to resolve "http://galera_etcd:2380" (lookup galera_etcd on 127.0.0.11:53: no such host))
percona_etcd.1.kgs14ewb1z22@test-011    | 2018-04-24 20:58:11.355570 W | pkg/netutil: failed resolving host galera_etcd:2380 (lookup galera_etcd on 127.0.0.11:53: no such host); retrying in 1s
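One thing that may help when reading this log: the task prefix and the failed lookup carry two separate names. The following is a minimal shell sketch, assuming task names follow the <service>.<slot>.<task-id>@<node> pattern seen in the lines above. It pulls both names out of one sample line so they can be compared side by side. The variable names are made up for illustration.

```shell
# One line copied from the log above.
log='percona_etcd.1.82rqzgq3tjo3@test-011    | 2018-04-24 20:56:55.663728 W | pkg/netutil: failed resolving host galera_etcd:2380 (lookup galera_etcd on 127.0.0.11:53: no such host); retrying in 1s'

# Assumption: the task prefix is <service>.<slot>.<task-id>@<node>,
# so everything before the first dot is the service name.
service_name=$(printf '%s\n' "$log" | sed -n 's/^\([^.]*\)\..*/\1/p' | sort -u)

# The host being resolved sits between "lookup " and " on <resolver>".
lookup_host=$(printf '%s\n' "$log" | sed -n 's/.*lookup \([^ ]*\) on .*/\1/p' | sort -u)

echo "service=$service_name lookup=$lookup_host"
# prints: service=percona_etcd lookup=galera_etcd
```

If the two names differ, it may be worth checking whether the name being looked up actually exists as a service or alias on the network the task is attached to. As far as I know, a service created by docker stack deploy is named <stack>_<service>, so the resolvable name depends on the stack name used at deploy time. The same two sed expressions could be fed from docker service logs instead of a pasted line.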

This is a swarm with two manager nodes, both running Photon OS.

For reference, these are the environment variables that etcd gets:

ETCD_DATA_DIR=/opt/etcd/data
ETCD_NAME=etcd0
ETCD_LISTEN_CLIENT_URLS=http://0.0.0.0:2379,http://0.0.0.0:4001
ETCD_ADVERTISE_CLIENT_URLS=http://galera_etcd:2379,http://galera_etcd:4001
ETCD_LISTEN_PEER_URLS=http://0.0.0.0:2380
ETCD_INITIAL_ADVERTISE_PEER_URLS=http://galera_etcd:2380
ETCD_INITIAL_CLUSTER=etcd0=http://galera_etcd:2380
ETCD_INITIAL_CLUSTER_STATE=new
ETCD_INITIAL_CLUSTER_TOKEN=etcd-cluster-1
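Every hostname in these URLs has to resolve from inside the container before etcd can bootstrap. Below is a rough shell sketch of that check, meant to be run in a shell inside the task's container. It assumes getent is available there, which may not hold for every etcd image (nslookup or ping could stand in). The env_vars sample is copied from the list above, and the variable names are hypothetical.

```shell
# URL-bearing variables copied from the list above.
env_vars='ETCD_LISTEN_CLIENT_URLS=http://0.0.0.0:2379,http://0.0.0.0:4001
ETCD_ADVERTISE_CLIENT_URLS=http://galera_etcd:2379,http://galera_etcd:4001
ETCD_LISTEN_PEER_URLS=http://0.0.0.0:2380
ETCD_INITIAL_ADVERTISE_PEER_URLS=http://galera_etcd:2380
ETCD_INITIAL_CLUSTER=etcd0=http://galera_etcd:2380'

# Collect each distinct host that appears in a URL,
# skipping the 0.0.0.0 wildcard bind address.
hosts=$(printf '%s\n' "$env_vars" \
  | grep -o 'http://[^:/,]*' \
  | sed 's|^http://||' \
  | grep -v '^0\.0\.0\.0$' \
  | sort -u)

# Try to resolve each host with the container's resolver.
for h in $hosts; do
  if getent hosts "$h" >/dev/null 2>&1; then
    echo "$h resolves"
  else
    echo "$h does NOT resolve"
  fi
done
```

With the values above, the only name collected is galera_etcd, so that is the single name the embedded DNS server at 127.0.0.11 has to answer for.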

I have already put an etcd-specific question to the CoreOS guys (coreos/etcd, issue 7798), but I have a gut feeling that something else is involved here.

I am not very familiar with how Docker Swarm manages DNS, or with which entry should go in which file for it to work correctly. My Photon OS hosts might also be misconfigured.

Port-wise, the Docker swarm tutorial (https://docs.docker.com/engine/swarm/swarm-tutorial/) says:

The following ports must be available. On some systems, these ports are open by default.

TCP port 2377 for cluster management communications
TCP and UDP port 7946 for communication among nodes
UDP port 4789 for overlay network traffic

I have opened all of them:

# iptables -L
ACCEPT     tcp  --  anywhere             anywhere             tcp dpt:2376
ACCEPT     tcp  --  anywhere             anywhere             tcp dpt:2377
ACCEPT     tcp  --  anywhere             anywhere             tcp dpt:7946
ACCEPT     udp  --  anywhere             anywhere             udp dpt:7946
ACCEPT     udp  --  anywhere             anywhere             udp dpt:4789
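To double-check the rules against the list of required ports without eyeballing them, something like the following shell sketch could be used. It assumes the rules are printed with numeric ports, as in the output above, and the rules sample is pasted from that output. The loop variable names are made up for illustration.

```shell
# Rules copied from the iptables output above.
rules='ACCEPT     tcp  --  anywhere             anywhere             tcp dpt:2376
ACCEPT     tcp  --  anywhere             anywhere             tcp dpt:2377
ACCEPT     tcp  --  anywhere             anywhere             tcp dpt:7946
ACCEPT     udp  --  anywhere             anywhere             udp dpt:7946
ACCEPT     udp  --  anywhere             anywhere             udp dpt:4789'

# Protocol/port pairs the swarm tutorial lists as required.
missing=""
for want in tcp:2377 tcp:7946 udp:7946 udp:4789; do
  proto=${want%%:*}
  port=${want##*:}
  # Look for an ACCEPT rule for this protocol that ends in dpt:<port>.
  if ! printf '%s\n' "$rules" | grep -Eq "^ACCEPT +$proto .*dpt:$port\$"; then
    missing="$missing $want"
  fi
done

echo "missing:${missing:- none}"
# prints: missing: none
```

On a live host the sample could be replaced with the output of iptables -L -n, where -n keeps ports numeric so the dpt: match does not depend on service names.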

I figure this must be some DNS configuration issue on my side, as etcd is trying to look itself up to establish the initial cluster, as per “ETCD_INITIAL_CLUSTER=etcd0=http://galera_etcd:2380” (env var from above).

The docker-compose file from which I am spinning this service up reads:

etcd:
  image: quay.io/coreos/etcd
  command: etcd
  volumes:
    - etcd_data:/etc/ssl/certs
  ports:
    - "2379:2379"
    - "2380:2380"
  env_file: etcd.env
  networks:
    - cluster
  deploy:
    mode: replicated
    replicas: 1
    placement:
      constraints: [node.role == manager]

Any help is greatly appreciated. If you need more info, let me know.
Thanks a lot in advance!
Worp