Docker Community Forums


UCP cluster degraded, ucp-auth-api error

ucp

(Wonderlang) #1

Hi everyone,

I am having a weird problem with my test UCP cluster. In the web interface, one node is shown as degraded. Here's what I see in the CLI:

docker logs ucp-controller

{"level":"warning","msg":"Auth API provider status check error: Auth API Provider error: Get https://10.161.16.21:12385/enzi/_ping: read tcp 172.17.0.11:59371-\u003e10.161.16.21:12385: read: connection reset by peer","time":"2016-06-07T19:32:22Z"}

and as port 12385 is ucp-auth-api:

docker logs ucp-auth-api

{"level":"info","msg":"connecting to db …","time":"2016-06-07T19:33:53Z"}
{"level":"info","msg":"generating private key …","time":"2016-06-07T19:33:54Z"}
{"level":"info","msg":"initializing services …","time":"2016-06-07T19:33:54Z"}
{"level":"fatal","msg":"unable to initialize OpenID Connect Provider service: unable to save JWT signing key: unable to save signing key: gorethink: Cannot perform write: primary replica for shard ["", +inf) not available in: \nr.DB("enzi").Table("signing_keys").Insert({id="ce692a38e0ee09edf3fdc5bc3859a98bf39a87dd656e4423f704f365a0bf661b", keyType="RSA", modulus="uBNRVVnkDm-vbb5Ive-fsE3m2VtSVW2frLAOyFZPPG4_SioGEN84HaTdDq14iTTvAki9wLKb9U-H8bJrlVPApgsJtLr6BFKSOQqvSsN13rLow_V_sREpLyyCF_OvPgr3lKZ4OpjREoVwubqARDUaQr2TkCXZFaUQPqDcgsgPPKSKW7q0sw2atoB8JQKXKUExKCEKaQ5BBjxMtTnmr6m_dg6v8LlMu1uLqluR4VDQGW6wVO2v0Lcwzft4FdF7Ne-Y5_9m685bxxJGJzSLhsRYjuk2ck4gMVssaAYNKrEgvaIHIt249nH3DoQUZ-9CRvN_iW6kj8UMVqfMTLVbuuOVxw", exponent="AQAB", expiration={$reql_type$="TIME", epoch_time=1.4653712347795677e+09, timezone="+00:00"}})","time":"2016-06-07T19:33:54Z"}

So, apparently there's something wrong with the DB, but I don't have the slightest idea how to fix it. Please help.
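For reference, this is roughly how I've been poking at the auth containers. The fatal log points at RethinkDB, which UCP runs in the ucp-auth-store container; container names here are the UCP 1.x defaults on the affected node:

```shell
# Are the auth containers running, or stuck restarting?
docker ps -a --filter name=ucp-auth
# The gorethink error blames the datastore, so check its logs for
# replica/election errors from RethinkDB:
docker logs --tail 50 ucp-auth-store
```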


(Ehlers320) #2

I am seeing the same thing. The RethinkDB database appears to be up and I can netcat the port.
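In case it helps anyone else, this is roughly the check I mean (substitute your controller's IP; 12385 is the ucp-auth-api port from the logs above):

```shell
# TCP-level reachability check of the auth API port:
nc -zv 10.161.16.21 12385
# Note a successful TCP connect only proves the port is open, not that the
# service is healthy; hit the ping endpoint from the controller log too
# (-k because UCP uses its own CA):
curl -k https://10.161.16.21:12385/enzi/_ping
```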


(Ehlers320) #3

I ended up re-installing the entire cluster to fix it. Sorry I can't be more helpful.


(Vivek Saraswat) #4

The database may not have been ready when the API server started. Have you tried simply restarting the auth api container on the malfunctioning node?

e.g. docker restart ucp-auth-api

You may need to restart the ucp-auth-worker container as well.
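On the degraded node, something like the following sketch (restarting the store first is my own addition, for the case where it came up before the API could reach it):

```shell
# Restart the auth containers on the degraded node so they reconnect
# once the datastore is actually ready.
docker restart ucp-auth-store
docker restart ucp-auth-api
docker restart ucp-auth-worker
# Then confirm they stay up rather than flapping:
docker ps --filter name=ucp-auth
```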


(Alex Wellock) #5

I ran into the same problem after restarting our cluster. I have tried restarting both the API and the worker, as well as the store. I have also tried rebooting the host. So far I have not had any luck getting the API and worker out of the restarting state.


(Neoresin) #6

Ran into this problem and was able to solve it by syncing the clocks of the machines in the cluster and restarting the Docker daemon on each (not sure whether that last step was necessary, but hey, it worked). So on each of the RHEL 7 boxes we're using:

  1. ntpdate -u domain_controller_1.domain.com (there was significant drift on the primary replica)
  2. systemctl restart docker (which restarted all the containers in the process, in the proper order)
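As a script, the two steps above on each RHEL 7 node look like this (the NTP server name is of course site-specific):

```shell
#!/bin/sh
# Step 1: one-shot clock sync against the site NTP server
# (hostname below is our environment's, substitute your own).
ntpdate -u domain_controller_1.domain.com
# Step 2: restart the Docker daemon, which restarts the UCP
# containers in the proper order as a side effect.
systemctl restart docker
# Sanity check: print the node's clock in UTC to compare across nodes.
date -u
```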