Cannot get zookeeper to work running in docker using swarm mode

Hi all,
I have been trying to deploy a ZooKeeper cluster in Docker swarm mode.

I have three machines joined to a Docker swarm. My requirement is to run three ZooKeeper instances, one on each node, which together form an ensemble.
I went through this thread and got a few insights on how to deploy ZooKeeper in Docker swarm.

As @junius suggested, I created a docker compose file.
I removed the constraints because Docker swarm ignores them; see "Docker swarm constraints being ignored".

My ZooKeeper docker compose file looks like this:

version: '3.3'

services:
    zoo1:
        image: zookeeper:3.4.12
        hostname: zoo1
        ports:
            - target: 2181
              published: 2181
              protocol: tcp
              mode: host
            - target: 2888
              published: 2888
              protocol: tcp
              mode: host
            - target: 3888
              published: 3888
              protocol: tcp
              mode: host
        networks:
            - net
        deploy:
            restart_policy:
                condition: on-failure
        environment:
            ZOO_MY_ID: 1
            ZOO_SERVERS: server.1=0.0.0.0:2888:3888 server.2=zoo2:2888:3888 server.3=zoo3:2888:3888
        volumes:
            - /home/zk/data:/data
            - /home/zk/datalog:/datalog
            - /etc/localtime:/etc/localtime:ro
    zoo2:
        image: zookeeper:3.4.12
        hostname: zoo2
        ports:
            - target: 2181
              published: 2181
              protocol: tcp
              mode: host
            - target: 2888
              published: 2888
              protocol: tcp
              mode: host
            - target: 3888
              published: 3888
              protocol: tcp
              mode: host
        networks:
            - net
        deploy:
            restart_policy:
                condition: on-failure
        environment:
            ZOO_MY_ID: 2
            ZOO_SERVERS: server.1=zoo1:2888:3888 server.2=0.0.0.0:2888:3888 server.3=zoo3:2888:3888
        volumes:
            - /home/zk/data:/data
            - /home/zk/datalog:/datalog
            - /etc/localtime:/etc/localtime:ro
    zoo3:
        image: zookeeper:3.4.12
        hostname: zoo3
        ports:
            - target: 2181
              published: 2181
              protocol: tcp
              mode: host
            - target: 2888
              published: 2888
              protocol: tcp
              mode: host
            - target: 3888
              published: 3888
              protocol: tcp
              mode: host
        networks:
            - net
        deploy:
            restart_policy:
                condition: on-failure
        environment:
            ZOO_MY_ID: 3
            ZOO_SERVERS: server.1=zoo1:2888:3888 server.2=zoo2:2888:3888 server.3=0.0.0.0:2888:3888
        volumes:
            - /home/zk/data:/data
            - /home/zk/datalog:/datalog
            - /etc/localtime:/etc/localtime:ro
networks:
    net:

I deployed it using the docker stack command:

docker stack deploy -c zoo3.yml zk
Creating network zk_net
Creating service zk_zoo3
Creating service zk_zoo1
Creating service zk_zoo2

The ZooKeeper services come up fine, one on each node, without any issues.

docker stack services zk
ID NAME MODE REPLICAS IMAGE PORTS
rn7t5f3tu0r4 zk_zoo1 replicated 1/1 zookeeper:3.4.12 0.0.0.0:2181->2181/tcp, 0.0.0.0:2888->2888/tcp, 0.0.0.0:3888->3888/tcp
u51r7bjwwm03 zk_zoo2 replicated 1/1 zookeeper:3.4.12 0.0.0.0:2181->2181/tcp, 0.0.0.0:2888->2888/tcp, 0.0.0.0:3888->3888/tcp
zlbcocid57xz zk_zoo3 replicated 1/1 zookeeper:3.4.12 0.0.0.0:2181->2181/tcp, 0.0.0.0:2888->2888/tcp, 0.0.0.0:3888->3888/tcp

I was able to reproduce the issue discussed here when I stopped and started the ZooKeeper stack again.

docker stack rm zk
docker stack deploy -c zoo3.yml zk

This time the ZooKeeper cluster doesn't form. The container logged the following:

ZooKeeper JMX enabled by default
Using config: /conf/zoo.cfg
2018-11-02 15:24:41,531 [myid:2] - WARN  [WorkerSender[myid=2]:QuorumCnxManager@584] - Cannot open channel to 1 at election address zoo1/10.0.0.4:3888
java.net.ConnectException: Connection refused (Connection refused)
        at java.net.PlainSocketImpl.socketConnect(Native Method)
        at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
        at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
        at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
        at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
        at java.net.Socket.connect(Socket.java:589)
        at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:558)
        at org.apache.zookeeper.server.quorum.QuorumCnxManager.toSend(QuorumCnxManager.java:534)
        at org.apache.zookeeper.server.quorum.FastLeaderElection$Messenger$WorkerSender.process(FastLeaderElection.java:454)
        at org.apache.zookeeper.server.quorum.FastLeaderElection$Messenger$WorkerSender.run(FastLeaderElection.java:435)
        at java.lang.Thread.run(Thread.java:748)
2018-11-02 15:24:41,538 [myid:2] - WARN  [WorkerSender[myid=2]:QuorumCnxManager@584] - Cannot open channel to 3 at election address zoo3/10.0.0.2:3888
java.net.ConnectException: Connection refused (Connection refused)
        at java.net.PlainSocketImpl.socketConnect(Native Method)
        at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
        at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
        at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
        at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
        at java.net.Socket.connect(Socket.java:589)
        at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:558)
        at org.apache.zookeeper.server.quorum.QuorumCnxManager.toSend(QuorumCnxManager.java:534)
        at org.apache.zookeeper.server.quorum.FastLeaderElection$Messenger$WorkerSender.process(FastLeaderElection.java:454)
        at org.apache.zookeeper.server.quorum.FastLeaderElection$Messenger$WorkerSender.run(FastLeaderElection.java:435)
        at java.lang.Thread.run(Thread.java:748)
2018-11-02 15:38:19,146 [myid:2] - WARN  [QuorumPeer[myid=2]/0.0.0.0:2181:Learner@237] - Unexpected exception, tries=1, connecting to /0.0.0.0:2888
java.net.ConnectException: Connection refused (Connection refused)
        at java.net.PlainSocketImpl.socketConnect(Native Method)
        at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
        at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:204)
        at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
        at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
        at java.net.Socket.connect(Socket.java:589)
        at org.apache.zookeeper.server.quorum.Learner.connectToLeader(Learner.java:229)
        at org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:72)
        at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:981)
2018-11-02 15:38:20,147 [myid:2] - WARN  [QuorumPeer[myid=2]/0.0.0.0:2181:Learner@237] - Unexpected exception, tries=2, connecting to /0.0.0.0:2888
java.net.ConnectException: Connection refused (Connection refused)
        at java.net.PlainSocketImpl.socketConnect(Native Method)
        at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
        at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:204)
        at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
        at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
        at java.net.Socket.connect(Socket.java:589)
        at org.apache.zookeeper.server.quorum.Learner.connectToLeader(Learner.java:229)
        at org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:72)
        at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:981)

On closer inspection I found that when I first deployed this stack, the ZooKeeper instance with id 2 ran on node 1. This created a myid file with the value 2 in the host directory /home/zk/data, which is mounted into the container as /data.

cat /home/zk/data/myid
2

When I stopped and started the stack again, I found that this time the ZooKeeper instance with id 3 was running on node 1.

docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
566b68c11c8b zookeeper:3.4.12 "/docker-entrypoin…" 6 minutes ago Up 6 minutes 0.0.0.0:2181->2181/tcp, 0.0.0.0:2888->2888/tcp, 0.0.0.0:3888->3888/tcp zk_zoo3.1.7m0hq684pkmyrm09zmictc5bm
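
To see which node each task landed on after a redeploy without logging in to every machine, something like the following may help. It is only a sketch: `docker stack ps` and its `--format` placeholders `.Name` and `.Node` exist, but the helper name `tasks_on_node` and the node name `node1` are placeholders of mine, not something from the setup above.

```shell
#!/bin/bash
# Hypothetical helper: read "task-name node-name" lines from stdin and
# print only the tasks that are scheduled on the given node.
tasks_on_node() {
    local node="$1"
    awk -v node="$node" '$2 == node { print $1 }'
}

# Example usage against the stack named "zk" (node1 is an assumed node name):
#   docker stack ps zk --filter desired-state=running \
#       --format '{{.Name}} {{.Node}}' | tasks_on_node node1
```

Running this before and after `docker stack rm` / `docker stack deploy` should show whether a different service was scheduled on the same node.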

But the myid file still has the value 2, which was set by the earlier instance.

As a result, the log shows [myid:2], and the instance tries to connect to instances 1 and 3 and fails.
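
A quick way to confirm this kind of mismatch on a node is to compare the myid file on disk with the id the container was configured with. The function below is a minimal sketch; the name `check_myid` and its return codes are my own choices, and it assumes the data directory is the host path mounted as /data.

```shell
#!/bin/bash
# Hypothetical helper: compare the myid file in a data directory with the
# expected id (the ZOO_MY_ID of the service currently running on this node).
# Prints "match", "mismatch: ..." or "missing".
check_myid() {
    local data_dir="$1" expected="$2" actual
    if [[ ! -f "$data_dir/myid" ]]; then
        echo "missing"
        return 2
    fi
    # Strip the trailing newline and any stray whitespace before comparing.
    actual="$(tr -d '[:space:]' < "$data_dir/myid")"
    if [[ "$actual" == "$expected" ]]; then
        echo "match"
    else
        echo "mismatch: file has $actual, expected $expected"
        return 1
    fi
}
```

In the situation described above, `check_myid /home/zk/data 3` on node 1 would report a mismatch, because the file still holds 2.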

On further debugging, I found that docker-entrypoint.sh contains the following code:

# Write myid only if it doesn't exist
if [[ ! -f "$ZOO_DATA_DIR/myid" ]]; then
    echo "${ZOO_MY_ID:-1}" > "$ZOO_DATA_DIR/myid"
fi

This is what causes the issue for me. I edited docker-entrypoint.sh as follows:

if [[ -f "$ZOO_DATA_DIR/myid" ]]; then
    rm "$ZOO_DATA_DIR/myid"
fi

echo "${ZOO_MY_ID:-1}" > "$ZOO_DATA_DIR/myid"
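
The difference between the original check and the edited version can be reproduced without Docker. The sketch below simulates two deployments onto the same directory; the function names and the temporary directory are illustrative only and are not part of the image's entrypoint.

```shell
#!/bin/bash
# Original behaviour: leave an existing myid file untouched.
write_myid_if_absent() {
    local data_dir="$1" id="$2"
    if [[ ! -f "$data_dir/myid" ]]; then
        echo "$id" > "$data_dir/myid"
    fi
}

# Edited behaviour: always write the id. The ">" redirection truncates the
# file, so the result is the same as removing it first.
write_myid_always() {
    local data_dir="$1" id="$2"
    echo "$id" > "$data_dir/myid"
}

# Simulate two deployments that reuse one host directory.
demo_dir="$(mktemp -d)"
write_myid_if_absent "$demo_dir" 2   # first deploy: instance 2 lands here
write_myid_if_absent "$demo_dir" 3   # redeploy: instance 3 lands here
stale_id="$(cat "$demo_dir/myid")"   # the file still holds the old id

write_myid_always "$demo_dir" 3
fresh_id="$(cat "$demo_dir/myid")"   # the file now holds the new id
rm -rf "$demo_dir"

echo "if-absent kept: $stale_id, always wrote: $fresh_id"
# prints "if-absent kept: 2, always wrote: 3"
```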

I then mounted the modified docker-entrypoint.sh in my compose file.

With this fix, I can stop and start my stack multiple times, and every time the ZooKeeper cluster forms an ensemble without hitting the connection issue.
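
One way to check that the ensemble really formed is to ask each server for its mode with ZooKeeper's `srvr` four-letter command. This is a rough sketch, assuming `nc` is available, port 2181 is reachable, and the `srvr` command is permitted on your servers; the helper name `zk_mode` and the host names in the usage comment are placeholders.

```shell
#!/bin/bash
# Hypothetical helper: extract the value of the "Mode:" line (leader,
# follower or standalone) from the text returned by the "srvr" command.
zk_mode() {
    awk -F': ' '$1 == "Mode" { print $2 }'
}

# Example usage (node1..node3 are assumed host names):
#   for host in node1 node2 node3; do
#       echo srvr | nc "$host" 2181 | zk_mode
#   done
```

A healthy three-node ensemble should report one leader and two followers. A server that has not joined a quorum typically answers that it is not currently serving requests, in which case the helper prints nothing.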

My docker-entrypoint.sh file is as follows:

#!/bin/bash

set -e

# Allow the container to be started with `--user`
if [[ "$1" = 'zkServer.sh' && "$(id -u)" = '0' ]]; then
    chown -R "$ZOO_USER" "$ZOO_DATA_DIR" "$ZOO_DATA_LOG_DIR"
    exec su-exec "$ZOO_USER" "$0" "$@"
fi

# Generate the config only if it doesn't exist
if [[ ! -f "$ZOO_CONF_DIR/zoo.cfg" ]]; then
    CONFIG="$ZOO_CONF_DIR/zoo.cfg"

    echo "clientPort=$ZOO_PORT" >> "$CONFIG"
    echo "dataDir=$ZOO_DATA_DIR" >> "$CONFIG"
    echo "dataLogDir=$ZOO_DATA_LOG_DIR" >> "$CONFIG"

    echo "tickTime=$ZOO_TICK_TIME" >> "$CONFIG"
    echo "initLimit=$ZOO_INIT_LIMIT" >> "$CONFIG"
    echo "syncLimit=$ZOO_SYNC_LIMIT" >> "$CONFIG"

    echo "maxClientCnxns=$ZOO_MAX_CLIENT_CNXNS" >> "$CONFIG"

    for server in $ZOO_SERVERS; do
        echo "$server" >> "$CONFIG"
    done
fi

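# Always rewrite myid so that it matches the ZOO_MY_ID of the instance
# scheduled on this node, even if an earlier instance left a file behind
if [[ -f "$ZOO_DATA_DIR/myid" ]]; then
    rm "$ZOO_DATA_DIR/myid"
fi

echo "${ZOO_MY_ID:-1}" > "$ZOO_DATA_DIR/myid"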

exec "$@"

My docker compose file is as follows:

version: '3.3'

services:
    zoo1:
        image: zookeeper:3.4.12
        hostname: zoo1
        ports:
            - target: 2181
              published: 2181
              protocol: tcp
              mode: host
            - target: 2888
              published: 2888
              protocol: tcp
              mode: host
            - target: 3888
              published: 3888
              protocol: tcp
              mode: host
        networks:
            - net
        deploy:
            restart_policy:
                condition: on-failure
        environment:
            ZOO_MY_ID: 1
            ZOO_SERVERS: server.1=0.0.0.0:2888:3888 server.2=zoo2:2888:3888 server.3=zoo3:2888:3888
        volumes:
            - /home/zk/data:/data
            - /home/zk/datalog:/datalog
            - /home/zk/docker-entrypoint.sh:/docker-entrypoint.sh
            - /etc/localtime:/etc/localtime:ro
    zoo2:
        image: zookeeper:3.4.12
        hostname: zoo2
        ports:
            - target: 2181
              published: 2181
              protocol: tcp
              mode: host
            - target: 2888
              published: 2888
              protocol: tcp
              mode: host
            - target: 3888
              published: 3888
              protocol: tcp
              mode: host
        networks:
            - net
        deploy:
            restart_policy:
                condition: on-failure
        environment:
            ZOO_MY_ID: 2
            ZOO_SERVERS: server.1=zoo1:2888:3888 server.2=0.0.0.0:2888:3888 server.3=zoo3:2888:3888
        volumes:
            - /home/zk/data:/data
            - /home/zk/datalog:/datalog
            - /home/zk/docker-entrypoint.sh:/docker-entrypoint.sh
            - /etc/localtime:/etc/localtime:ro
    zoo3:
        image: zookeeper:3.4.12
        hostname: zoo3
        ports:
            - target: 2181
              published: 2181
              protocol: tcp
              mode: host
            - target: 2888
              published: 2888
              protocol: tcp
              mode: host
            - target: 3888
              published: 3888
              protocol: tcp
              mode: host
        networks:
            - net
        deploy:
            restart_policy:
                condition: on-failure
        environment:
            ZOO_MY_ID: 3
            ZOO_SERVERS: server.1=zoo1:2888:3888 server.2=zoo2:2888:3888 server.3=0.0.0.0:2888:3888
        volumes:
            - /home/zk/data:/data
            - /home/zk/datalog:/datalog
            - /home/zk/docker-entrypoint.sh:/docker-entrypoint.sh
            - /etc/localtime:/etc/localtime:ro
networks:
    net:

With this I am able to get the ZooKeeper instances up and running in Docker swarm mode without hard-coding any hostname in the compose file. If one of my nodes goes down, services are started on any available node in the swarm without any issues.

Thanks