I have a strange problem on our system, which runs:
- Docker version 18.06.1 (also tested 18.03.01 and 18.09)
- docker-compose v1.23.2
- CoreOS 1576.5.0 (also tested on CoreOS 1911.4.0 with same behavior)
Our installation previously used a storage container to hold the data.
We are migrating to named volumes, and the easiest way was to:
- Stop and remove the containers.
- Reconfigure them to use named volumes.
- Build and start the containers.
- Go to the remote server, stop the Docker service, and copy the data in /var/lib/docker from the old storage to the new named volume.
- Restart everything.
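For comparison, the copy step can also be done through a throwaway container instead of working inside /var/lib/docker directly. This is only a sketch of that alternative, not what we actually ran: the volume names, the `alpine` image and the helper function `volume_copy_command` are placeholders made up for illustration.

```python
import subprocess


def volume_copy_command(source, target, image="alpine"):
    """Build a `docker run` command that copies the contents of `source`
    into the named volume `target` using a temporary container.

    `source` may be a volume name or an absolute host path.
    All names passed in here are placeholders, not our real setup.
    """
    return [
        "docker", "run", "--rm",
        "-v", f"{source}:/from:ro",   # old data, mounted read-only
        "-v", f"{target}:/to",        # new named volume
        image,
        "cp", "-a", "/from/.", "/to/",  # -a keeps owners, modes and timestamps
    ]


if __name__ == "__main__":
    cmd = volume_copy_command("old_storage_data", "prod_exp_data")
    print(" ".join(cmd))
    # Uncomment to actually run the copy against the Docker host:
    # subprocess.run(cmd, check=True)
```

With this approach Docker itself creates the target volume, so nothing under /var/lib/docker is touched while the service is stopped.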
It works to some degree, except that when we rebuild the containers now, one of them just hangs on docker-compose up until it times out:
Creating prod-new_exp_1 ... ERROR: for prod-new_exp_1 HTTPSConnectionPool(host='xxx.xx.xxx.xxx', port=2376): Read timed out. (read timeout=60)
I have tried to increase the timeout value, but then it fails with a different error instead:
ERROR: for ded738f9c4c7_prod_exp_1 ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response',))
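For reference, the 60 seconds in the first error matches the default of docker-compose's `COMPOSE_HTTP_TIMEOUT` environment variable. A minimal sketch of raising it for a single run follows; the value 300 and the helper names `compose_env` and `run_compose_up` are arbitrary choices for illustration.

```python
import os
import subprocess


def compose_env(timeout_seconds, base_env=None):
    """Return a copy of the environment with COMPOSE_HTTP_TIMEOUT set."""
    env = dict(os.environ if base_env is None else base_env)
    env["COMPOSE_HTTP_TIMEOUT"] = str(timeout_seconds)
    return env


def run_compose_up(timeout_seconds=300):
    """Run `docker-compose up -d` with the raised client timeout."""
    return subprocess.run(
        ["docker-compose", "up", "-d"],
        env=compose_env(timeout_seconds),
        check=False,
    )
```

Raising the value only makes the client wait longer; it does not change whatever the daemon is doing while the container is being created.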
The Docker container appears in the list as “created”, but it's not possible to start it, inspect it, or do anything else with it (I get “No such container”).
If I restart the Docker service, the container is then up and running, but I've seen issues, e.g. with DNS lookups between the containers (such as the apache and the web container), when it is done like this.
Now, if I move the volume data away, rebuild and start the containers, shut down the Docker service, move the data back and restart the service, everything works as long as I don't rebuild the containers again, so it seems to be related to the volume.
Is the “hack” of moving data from one volume to another what causes the issues? The volume contains a couple of million files and 15 GB of data.
Is there any way of finding out what is really happening? Neither debug logging nor journalctl shows anything about what is going on.
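In case the way debug logging is switched on matters, here is a minimal sketch of enabling it through the daemon configuration file. It assumes the standard /etc/docker/daemon.json location (on CoreOS the daemon may be configured through a systemd drop-in instead) and that the unit is called docker.service; `with_debug_enabled` is a helper name invented for this example.

```python
import json


def with_debug_enabled(daemon_json_text):
    """Return daemon.json content with "debug": true, keeping other keys.

    An empty or whitespace-only input is treated as an empty config.
    """
    config = json.loads(daemon_json_text) if daemon_json_text.strip() else {}
    config["debug"] = True
    return json.dumps(config, indent=2, sort_keys=True)


if __name__ == "__main__":
    # Example with a hypothetical existing config; the path to write to
    # (assumed: /etc/docker/daemon.json) is left to the reader.
    print(with_debug_enabled('{"log-driver": "journald"}'))
    # After the daemon has picked up the change, its output can be followed
    # with: journalctl -u docker.service -f
```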