Docker Community Forums


Unable to recover unhealthy cluster (RETHINKDB recover doesn't work)

(Javier Ramirez) #1

After trying to recover a RethinkDB cluster by reconfiguring its replicas, I get an error about one table and can't recover from the cluster misconfiguration.

docker container run --rm -v ucp-auth-store-certs:/tls docker/ucp-auth:${VERSION} --db-addr=${NODE_ADDRESS}:12383 --debug reconfigure-db --num-replicas 1 --emergency-repair

level=fatal msg="Unable to reconfigure database replication: unable to emergency repair table "collections": unable to reconfigure database replication: gorethink: The server(s) hosting table ucp.collections are currently unreachable. The table was not reconfigured. If you do not expect the server(s) to recover, you can use emergency_repair to restore availability of the table. in:\nr.DB("ucp").Table("collections").Reconfigure(replicas=1, shards=1)

Checking the status shows that some databases are not fully functional or replicated, but the UCP console doesn't show any error while running with only one manager. When trying to add new managers, we get:

{"level":"error","msg":"Unable to promote node to controller: unsuccessful node promote request: node promotion failed: {"message":"etcd and rethinkdb cluster health check failed: rethinkdb cluster unhealthy: 0 of 1 replicas are healthy"}\n","time":"2017-10-08T11:27:02Z"}
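To watch for this condition from a script, the replica counts can be pulled out of the health-check message. This is only a minimal sketch, assuming the message keeps the "N of M replicas are healthy" wording shown above; `replica_health` is a hypothetical helper name, not part of UCP.

```shell
#!/bin/sh
# Hypothetical helper (not a UCP command): print "<healthy> <total>" taken from
# a message containing "<healthy> of <total> replicas are healthy".
# Prints nothing if the message does not contain that phrase.
replica_health() {
    printf '%s\n' "$1" |
        grep -oE '[0-9]+ of [0-9]+ replicas are healthy' |
        awk '{ print $1, $3 }'
}

# Message text copied from the promotion error above.
msg='etcd and rethinkdb cluster health check failed: rethinkdb cluster unhealthy: 0 of 1 replicas are healthy'

# Split the two numbers into $1 (healthy) and $2 (total).
set -- $(replica_health "$msg")
if [ "$#" -eq 2 ] && [ "$1" -lt "$2" ]; then
    echo "rethinkdb unhealthy: $1 of $2 replicas"
fi
```

The helper only reads text, so it can be run against saved logs without touching the cluster.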

The etcd cluster is healthy and shows no problems.

Thanks for your help,
Javier R.