Getting pretty frustrated trying to build a swarm

Hi there. I must be missing something, because I have really struggled to get anywhere with setting up a development swarm.

I have followed the guides using Docker Toolbox, and I can get to the end of them without trouble. However, I want every developer on my team to have a consistent swarm on their machine, and our company uses Hyper-V rather than VirtualBox.

I want to be able to pull my service repo and have a compose file that brings up all the required dependencies for that service (DB, message queue, etc.), and I want to run that compose file against a local swarm.
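To be concrete, the sort of compose file I have in mind is a sketch like this (the service names, images, and ports are just placeholders, not my actual setup):

```yaml
# hypothetical docker-compose.yml -- images and ports are placeholders
version: "2"
services:
  db:
    image: postgres:9.4
    ports:
      - "5432:5432"
  queue:
    image: rabbitmq:3-management
    ports:
      - "5672:5672"
```

The idea is that a developer runs `docker-compose up` against the local swarm and gets the whole dependency stack.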

So I have built three Ubuntu 14.04 machines, installed Docker, etc., and at first everything was going well, until I tried to point the Docker client on my Windows host at the swarm running in my VMs. While I could do that with the -H switch, I was struggling to run Compose. In the end I decided I needed TLS.

So I went through the TLS guide, and I could access the swarm simply by typing docker info on my local box. However, all the swarm nodes were reporting their status as pending, with no errors.

After digging into the containers, I thought the problem was Consul not using TLS, so I worked to enable TLS for Consul. Eventually, though, I realised that nothing could talk to Consul because its port was not being mapped correctly, so I resolved that.

So… here is the command I am using to start consul

docker run -d -v /var/data/macbeth/node_discovery:/data -v /home/administrator/.certs:/certs:ro -v /home/administrator/.config:/config -p 8080:8080 --name=Discovery --restart=always progrium/consul -server -bootstrap -config-dir=/config -data-dir=/data

here is a node

docker run -d -v /home/administrator/.certs:/certs:ro --name=SwarmNode --restart=always swarm:1.1.0 join --ttl "180s" --discovery-opt kv.cacertfile=/certs/ca.pem --discovery-opt kv.certfile=/certs/cert.pem --discovery-opt kv.keyfile=/certs/key.pem --advertise 192.168.137.3:2376 consul://192.168.137.2:8080

and here is the manager

docker run -d -p 3376:3376 -v /home/administrator/.certs:/certs:ro --name=SwarmManager --restart=always swarm:1.1.0 manage --tlsverify --tlscacert=/certs/ca.pem --tlscert=/certs/cert.pem --tlskey=/certs/key.pem --discovery-opt kv.cacertfile=/certs/ca.pem --discovery-opt kv.certfile=/certs/cert.pem --discovery-opt kv.keyfile=/certs/key.pem --host=0.0.0.0:3376 consul://192.168.137.2:8080

and my consul config file

{
  "ca_file": "/certs/ca.pem",
  "cert_file": "/certs/cert.pem",
  "key_file": "/certs/key.pem",
  "verify_incoming": true,
  "verify_outgoing": true,
  "client_addr": "0.0.0.0",
  "addresses": {
    "https": "0.0.0.0"
  },
  "ports": {
    "https": 8080
  }
}

This got me to the point where Consul was responding to requests and everything seemed to be configured to use TLS, but I was still seeing Nodes: 0 in docker info when connected to the swarm. So I started digging around again, and when I attached to my manager container I saw this error:

ERRO[1020] Discovery error: Put https://192.168.137.2:8080/v1/kv/docker/swarm/nodes: x509: cannot validate certificate for 192.168.137.2 because it doesn't contain any IP SANs
ERRO[1020] Discovery error: Unexpected watch error
ERRO[1080] Discovery error: Get https://192.168.137.2:8080/v1/kv/docker/swarm/nodes?consistent=: x509: cannot validate certificate for 192.168.137.2 because it doesn't contain any IP SANs

After a little googling, I added subjectAltName = IP:192.168.137.2 to /etc/ssl/openssl.cnf on my swarm master/CA host, and I just spent an hour re-generating and installing my certs again. But when I attach to the manager container, I am still seeing the same error (for the record, I removed all the containers and created them again too).
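For reference, here is roughly the openssl sequence I was attempting, except passing the SAN via -extfile at signing time instead of editing /etc/ssl/openssl.cnf (the IP matches my setup above; the filenames are just what I happened to use):

```shell
# Sketch: generate a CA and a server cert that carries an IP SAN.
# The IP matches my swarm master; substitute your own.

# 1. CA key and self-signed CA cert
openssl genrsa -out ca-key.pem 2048
openssl req -new -x509 -days 365 -key ca-key.pem -sha256 \
  -subj "/CN=swarm-ca" -out ca.pem

# 2. Server key and certificate signing request
openssl genrsa -out key.pem 2048
openssl req -new -key key.pem -subj "/CN=192.168.137.2" -out server.csr

# 3. Sign the CSR, injecting the IP SAN at signing time
echo "subjectAltName = IP:192.168.137.2" > extfile.cnf
openssl x509 -req -days 365 -sha256 -in server.csr -CA ca.pem \
  -CAkey ca-key.pem -CAcreateserial -extfile extfile.cnf -out cert.pem

# 4. Confirm the SAN actually made it into the cert
openssl x509 -in cert.pem -noout -text | grep "IP Address"
```

Step 4 is the important sanity check: if grep prints nothing, the cert will still fail with the "doesn't contain any IP SANs" error no matter how many times you reinstall it.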

I am at a loss now, and pretty frustrated with it. There seem to be so many missing pieces of the puzzle that are not well documented.

What am I failing to understand here? All I want is a three-node swarm that uses TLS, so that developers who are not familiar with Docker can simply docker-compose up an environment from Git Bash on their machines.

If setting it up manually, I'd try to get everything working without TLS first, just to make sure those pieces fit together properly. There are enough moving parts that are tricky to get right without throwing TLS in. For an all-local setup on a trusted network, TLS is not really mandatory (although it is nice). For production, you should absolutely use TLS everywhere, though.
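For example, the no-TLS version of the setup is roughly this sketch (plain TCP, so only for a trusted network; I'm reusing the IPs from earlier in the thread and assuming Consul on its default HTTP port, 8500, and the Docker daemons on 2375):

```shell
# On each node: join the swarm via the Consul store, no certs
docker run -d --restart=always swarm:1.1.0 join \
  --advertise 192.168.137.3:2375 consul://192.168.137.2:8500

# On the manager host: plain TCP manage endpoint
docker run -d -p 3376:3376 --restart=always swarm:1.1.0 manage \
  --host=0.0.0.0:3376 consul://192.168.137.2:8500

# From the Windows host: confirm the nodes show up before adding TLS
docker -H tcp://192.168.137.2:3376 info
```

Once docker info reports the expected node count here, you can layer the TLS flags back on one piece at a time.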

Have you looked into the docker-machine driver for Hyper-V? It might be slightly easier.
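Something along these lines (the switch name is a placeholder for whatever external virtual switch you've created in Hyper-V Manager, and the machine name is arbitrary):

```shell
# Sketch: create a Docker host VM with the Hyper-V driver.
# "ExternalSwitch" must match an existing Hyper-V virtual switch.
docker-machine create -d hyperv \
  --hyperv-virtual-switch "ExternalSwitch" \
  swarm-node-1
```

docker-machine handles generating and installing the TLS certs for the engine on each VM, which sidesteps a lot of the manual openssl work.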

I have actually resolved this now. I found that following the TLS instructions given here: https://docs.docker.com/engine/security/https/ worked, whereas I continually generated "bad certificates" following the Swarm TLS guide found here: https://docs.docker.com/swarm/configure-tls/


If I'm not on a trusted network but my swarm is using TLS, will the traffic between nodes (docker overlay network) be encrypted, or should I individually set up some kind of security in each container/application between hosts?

Assuming that you have set everything up correctly, yes: with TLS enabled, traffic proxied from the manager to the joined nodes should be encrypted, and identities verified, via TLS.

To be clear about the above: this is encrypted and authenticated communication between the manager and worker Swarm daemons. Encrypting communication between services on an overlay network requires setting the --opt secure option for docker network create -d overlay.
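That is, something like the following (the network name is arbitrary):

```shell
# Sketch: create an overlay network whose container-to-container
# traffic is secured, per the --opt secure note above.
docker network create -d overlay --opt secure my-overlay
```

Containers attached to that network then get the encryption transparently, without per-application changes.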