New Docker swarm 1.12, automation and single point of failure

Hi

I was doing some POCs on the new built-in Docker swarm mode feature. I was able to set up 6 nodes as follows:

  1. docker swarm init on node1
  2. docker swarm join on nodes 2 & 3 as managers
  3. docker swarm join on nodes 4, 5 & 6 as workers (rough commands sketched below)
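
For reference, these are roughly the commands involved, assuming the 1.12 GA syntax (the IP is a placeholder for node1's address):

    # on node1 (first manager)
    docker swarm init --advertise-addr 10.0.0.1

    # on nodes 2 & 3, using the manager join command printed by init
    docker swarm join --token <manager-token> 10.0.0.1:2377

    # on nodes 4, 5 & 6, using the worker join command
    docker swarm join --token <worker-token> 10.0.0.1:2377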
All worked fine. But once I started to automate it, I got stuck on a couple of doubts that I could not find answers to anywhere. If anyone can help me with this I would be really grateful:
First: I get the join command from my first manager (node1) during init, and I use it to join the other worker and manager nodes. But if the main manager (node1) gets killed, the join command I saved is no longer going to work. What should I do about this?
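
From the CLI help it looks like any reachable manager can reprint the current join command, which is what I am experimenting with; a sketch (here node2 is assumed to be a surviving manager):

    # on any surviving manager (e.g. node2): reprint the worker join command
    docker swarm join-token worker

    # or print just the raw token, which is handier for scripting
    docker swarm join-token -q worker

But I am not sure how this is supposed to fit into an automated bootstrap.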
Second: since I am trying to automate this, is there an external discovery service I can use to publish the worker nodes' status, so that any newly joining node can refer to that instead of me providing it manually?
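
To illustrate the kind of status I mean, today I can only get it from a manager, out of swarm mode's built-in raft store:

    # on any manager: list all nodes with their role, status and availability
    docker node ls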
Third: in Docker 1.12-RC2, I saw something like auto-accept for workers/managers and a predefined secret key for workers and managers to join. Is that still available? The docker swarm init reference on the Docker site doesn't show those arguments any more.
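
The closest thing I can find in the current CLI reference is token rotation, though I am not sure it replaces the RC2 secret flags one-for-one:

    # invalidate the old worker token and print a new one;
    # already-joined workers are unaffected
    docker swarm join-token --rotate worker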

Any help is much appreciated.

regards
Achuth

Hi

In the process of searching for an answer to exactly this, I found a GitHub project that helps with the automation part: https://github.com/srikalyan/aws-swarm-init

However, the initial setup step still has a single point of failure, and I'm still searching for details on how to avoid it.
I'm currently working on a Terraform setup to automate this for AWS; a rough sketch of the bootstrap logic I have in mind is below.
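
A minimal sketch, assuming an external store (S3 here) for the join token and an instance variable marking the seed manager; the bucket name, SEED and MANAGER_ADDR are all placeholders, not a tested setup:

    #!/bin/sh
    # hypothetical user-data script run on each instance
    if [ "$SEED" = "true" ]; then
        # first node bootstraps the swarm
        docker swarm init --advertise-addr "$(hostname -i)"
        # push the worker token to a shared store so later nodes can fetch it
        docker swarm join-token -q worker | aws s3 cp - s3://my-swarm-bucket/worker.token
    else
        # every other node pulls the token and joins
        TOKEN=$(aws s3 cp s3://my-swarm-bucket/worker.token -)
        docker swarm join --token "$TOKEN" "$MANAGER_ADDR:2377"
    fi

Of course, the token store then replaces node1 as the thing everyone depends on, so this only moves the single point of failure rather than removing it.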


Thanks for the link, it looks fine. But the one fascinating thing is that the Docker team brought in swarm mode to remove the dependency on external cluster managers, yet automating it pulls in a lot more dependency on other services and adds complexity. :slight_smile: