Hi there, I must be missing something, but I have really really struggled to get anywhere with setting up a development swarm.
I have followed the guides and used Docker Toolbox, and I can get to the end of them without any problems. However, I want every developer on my team to have a consistent swarm on their machine, and our company uses Hyper-V rather than VirtualBox.
I want to be able to pull my service repo and have a compose file that brings up all the required dependencies for that service (DB, message queue, etc.), and I want to run that compose file against a local swarm.
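Concretely, each service repo would carry something along these lines (the images and service names here are just examples, not my actual stack):

db:
  image: postgres:9.4
queue:
  image: rabbitmq:3
service:
  build: .
  links:
    - db
    - queue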
So I have built three Ubuntu 14.04 machines, installed Docker, etc., and at first everything was going well, until I tried to point the Docker client on my Windows host at the swarm running in my VMs. While I could do that with the -H switch, I was struggling to run Compose. In the end I decided I needed TLS.
So I went through the TLS guide, and I could access the swarm simply by typing docker info on my local box. However, all the swarm nodes were reporting their status as Pending, with no errors.
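For context, by "simply typing docker info" I mean I have the usual environment variables set in Git Bash, roughly like this (the manager IP and cert path here are illustrative rather than exactly what is on my machine):

export DOCKER_HOST=tcp://192.168.137.2:3376
export DOCKER_TLS_VERIFY=1
export DOCKER_CERT_PATH=/c/Users/me/.docker
docker info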
After digging into the containers, I thought it was Consul not using TLS, so I worked to enable TLS for Consul. Eventually, though, I realised that nothing could talk to Consul, as its port was not being mapped correctly, so I resolved that.
So… here is the command I am using to start Consul:
docker run -d -v /var/data/macbeth/node_discovery:/data -v /home/administrator/.certs:/certs:ro -v /home/administrator/.config:/config -p 8080:8080 --name=Discovery --restart=always progrium/consul -server -bootstrap -config-dir=/config -data-dir=/data
Here is a node:
docker run -d -v /home/administrator/.certs:/certs:ro --name=SwarmNode --restart=always swarm:1.1.0 join --ttl "180s" --discovery-opt kv.cacertfile=/certs/ca.pem --discovery-opt kv.certfile=/certs/cert.pem --discovery-opt kv.keyfile=/certs/key.pem --advertise 192.168.137.3:2376 consul://192.168.137.2:8080
And here is the manager:
docker run -d -p 3376:3376 -v /home/administrator/.certs:/certs:ro --name=SwarmManager --restart=always swarm:1.1.0 manage --tlsverify --tlscacert=/certs/ca.pem --tlscert=/certs/cert.pem --tlskey=/certs/key.pem --discovery-opt kv.cacertfile=/certs/ca.pem --discovery-opt kv.certfile=/certs/cert.pem --discovery-opt kv.keyfile=/certs/key.pem --host=0.0.0.0:3376 consul://192.168.137.2:8080
And here is my Consul config file:
{
  "ca_file": "/certs/ca.pem",
  "cert_file": "/certs/cert.pem",
  "key_file": "/certs/key.pem",
  "verify_incoming": true,
  "verify_outgoing": true,
  "client_addr": "0.0.0.0",
  "addresses": {
    "https": "0.0.0.0"
  },
  "ports": {
    "https": 8080
  }
}
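With that config, my assumption is that I should be able to hit Consul's HTTPS API directly from one of the VMs using the same certs, with something along these lines (the key path is just the swarm discovery default):

curl --cacert /home/administrator/.certs/ca.pem --cert /home/administrator/.certs/cert.pem --key /home/administrator/.certs/key.pem https://192.168.137.2:8080/v1/kv/docker/swarm/nodes?recurse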
This got me to the point where Consul was responding to requests, and everything seemed to be configured to use TLS, but I was seeing Nodes: 0 in docker info when connected to the swarm. So I started digging around again, and when I attached to my manager container I saw this error:
ERRO[1020] Discovery error: Put https://192.168.137.2:8080/v1/kv/docker/swarm/nodes: x509: cannot validate certificate for 192.168.137.2 because it doesn't contain any IP SANs
ERRO[1020] Discovery error: Unexpected watch error
ERRO[1080] Discovery error: Get https://192.168.137.2:8080/v1/kv/docker/swarm/nodes?consistent=: x509: cannot validate certificate for 192.168.137.2 because it doesn't contain any IP SANs
After a little googling, I added subjectAltName = IP:192.168.137.2 to /etc/ssl/openssl.cnf on my swarmMaster/CA host, and I just spent an hour regenerating and installing my certs again. But when I attach to the manager container, I am still seeing the same error (for the record, I removed all the containers and created them again too).
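Is this roughly the right way to get the IP SAN into the server cert? This is my understanding of the procedure (file names are placeholders for my actual paths, and I may well be getting the extension step wrong):

openssl genrsa -out key.pem 4096
openssl req -subj "/CN=192.168.137.2" -sha256 -new -key key.pem -out server.csr
echo "subjectAltName = IP:192.168.137.2,IP:127.0.0.1" > extfile.cnf
openssl x509 -req -days 365 -sha256 -in server.csr -CA ca.pem -CAkey ca-key.pem -CAcreateserial -out cert.pem -extfile extfile.cnf
# check that the SAN actually made it into the new cert
openssl x509 -in cert.pem -noout -text | grep -A1 "Subject Alternative Name"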
I am at a loss now, and pretty frustrated with it. There seem to be so many missing pieces of the puzzle that are not well documented.
What am I failing to understand here? All I want is a three-node swarm that uses TLS so that developers who are not familiar with Docker can simply docker-compose up an environment from Git Bash on their machines.
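In other words, the entire workflow I want to hand to a developer is just this (the repo URL is a placeholder), with the DOCKER_HOST / DOCKER_TLS_VERIFY / DOCKER_CERT_PATH variables from earlier already set in their Git Bash profile:

git clone git@ourserver:my-service.git
cd my-service
docker-compose up -d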