Docker Community Forums

Service/Containers not automatically starting on node restart


(Chris Jones) #1

Docker version 1.11.1-cs1, build bfd1f99

Expected behavior: Services defined in the stackfile with restart: always should start automatically when the node is rebooted.

Actual behavior: Services do not start automatically. I have to start them manually using the CLI or the Docker Cloud web interface.

This is new, unexpected & inconvenient behavior. What’s the point of the restart: parameter if it’s ignored?


(Chris Jones) #2

This is similar behavior to https://github.com/weaveworks/weave/issues/2222


(Chris Jones) #3

I just deployed a fresh node on Digital Ocean and it does the same thing.
So basically none of my services automatically restart when dockercloud-agent is restarted.


(Chris Jones) #4

Here are some things of interest in docker.log:
time="2016-08-31T15:01:54.615903784Z" level=warning msg="container 0dc07c7bec9c7224cea2081f72fbaf424dc6d37dfbf13222417ab8ce3738697a restart canceled"
time="2016-08-31T15:02:15.105417216Z" level=info msg="Removing stale sandbox 8c0bac3e691e2f058cf1962adf1cbc965d2a89b339156aeefc5f6220475f75e5 (0dc07c7bec9c7224cea2081f72fbaf424dc6d37dfbf13222417ab8ce3738697a)"
time="2016-08-31T15:02:15.281440798Z" level=warning msg="failed to cleanup ipc mounts:\nfailed to umount /var/lib/docker/containers/0dc07c7bec9c7224cea2081f72fbaf424dc6d37dfbf13222417ab8ce3738697a/shm: invalid argument"
time="2016-08-31T15:02:30.483253619Z" level=warning msg="failed to cleanup ipc mounts:\nfailed to umount /var/lib/docker/containers/0dc07c7bec9c7224cea2081f72fbaf424dc6d37dfbf13222417ab8ce3738697a/shm: invalid argument"
time="2016-08-31T15:02:30.501606159Z" level=error msg="Failed to start container 0dc07c7bec9c7224cea2081f72fbaf424dc6d37dfbf13222417ab8ce3738697a: failed to add endpoint: plugin not found"
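
For anyone who wants to check whether their node hits the same error, here is a minimal sketch that pulls the failed-start entries out of a Docker daemon log. It assumes the log uses the same "Failed to start container <id>: <reason>" message format as the lines above; the function name `failed_starts` is made up for this example.

```python
import re

# Matches the daemon message format seen in the log excerpt above.
# Assumption: the reason runs to the end of the line, optionally followed by a closing quote.
FAILED_START = re.compile(r'Failed to start container ([0-9a-f]+): (.+?)["”]?\s*$')


def failed_starts(lines):
    """Return (container_id, reason) pairs for every failed-start log line."""
    results = []
    for line in lines:
        match = FAILED_START.search(line)
        if match:
            results.append((match.group(1), match.group(2)))
    return results
```

Passing it an open file object, for example `failed_starts(open("docker.log"))`, should list each container ID together with the reason the daemon reported.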


(Chris Jones) #5

$ cat config.v2.json | json_reformat
{
  "State": {
    "Running": false,
    "Paused": false,
    "Restarting": false,
    "OOMKilled": false,
    "RemovalInProgress": false,
    "Dead": false,
    "Pid": 0,
    "ExitCode": 2,
    "Error": "failed to add endpoint: plugin not found",
    "StartedAt": "2016-08-31T15:13:21.494841044Z",
    "FinishedAt": "2016-08-31T15:22:59.93692141Z"
  },
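
The same check can be scripted. This is a rough sketch, assuming only the "State" fields shown in the output above ("Running" and "Error"); the helper name `start_failure` is hypothetical and the rest of config.v2.json is ignored.

```python
import json


def start_failure(config_text):
    """Return State.Error if the container is stopped with an error, else None."""
    state = json.loads(config_text).get("State", {})
    if not state.get("Running") and state.get("Error"):
        return state["Error"]
    return None
```

Feeding it the full contents of a container's config.v2.json should return the error string for containers that failed to start, and None for running or cleanly stopped ones.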


(Chris Jones) #6

@fermayo Could you please help me get some attention for this? Thanks kindly!


(Seals) #7

I’ve been getting this problem since upgrading to “Docker version 1.12.1, build 23cf638” on Ubuntu 16.04.

I have my /var/lib/docker pointing to a different drive/mount point. (btrfs)

I can’t “down” containers; I need to reboot to allow the docker service to restart.

Sep 01 11:18:50 sealsubuntu dockerd[4681]: time="2016-09-01T11:18:50.990631269-04:00" level=warning msg="failed to cleanup ipc mounts:\nfailed to umount /mnt/Drive3/var/lib/docker/containers/75e9a1d38a9ca239df864f0b25f6ab4316097ca68fba4e3a489f9e46e68a8807/shm: invalid argument"
Sep 01 11:18:51 sealsubuntu dockerd[4681]: time="2016-09-01T11:18:51.136496548-04:00" level=warning msg="failed to cleanup ipc mounts:\nfailed to umount /mnt/Drive3/var/lib/docker/containers/d43a479b337caa15c28b74aa1699c46238a239bb376449392c43c8ea7e0c3411/shm: invalid argument"
Sep 01 11:18:51 sealsubuntu dockerd[4681]: time="2016-09-01T11:18:51.271906528-04:00" level=warning msg="failed to cleanup ipc mounts:\nfailed to umount /mnt/Drive3/var/lib/docker/containers/09f14d385b83ea5057a1aa55162aea4614118e169c42d82be717720826bea46b/shm: invalid argument"
Sep 01 11:18:51 sealsubuntu dockerd[4681]: time="2016-09-01T11:18:51.407317470-04:00" level=warning msg="failed to cleanup ipc mounts:\nfailed to umount /mnt/Drive3/var/lib/docker/containers/9a860a4e052aa11eb5d912a01c5bc6021f721c68e7b047213cbc292ad2616e56/shm: invalid argument"
Sep 01 11:18:51 sealsubuntu dockerd[4681]: time="2016-09-01T11:18:51.592480420-04:00" level=warning msg="failed to cleanup ipc mounts:\nfailed to umount /mnt/Drive3/var/lib/docker/containers/e24c46c6014ebc613d6531ba2582e0703f1c9ce737a23107ec556231f53d0c77/shm: invalid argument"
Sep 01 11:18:51 sealsubuntu dockerd[4681]: time="2016-09-01T11:18:51.647421373-04:00" level=error msg="Force shutdown daemon"
Sep 01 11:18:51 sealsubuntu dockerd[4681]: time="2016-09-01T11:18:51.648118683-04:00" level=info msg="stopping containerd after receiving terminated"
Sep 01 11:18:51 sealsubuntu dockerd[4681]: time="2016-09-01T11:18:51.740589546-04:00" level=warning msg="failed to cleanup ipc mounts:\nfailed to umount /mnt/Drive3/var/lib/docker/containers/58b075b0be73090bd4ffff7675f43f46a7146cd3db8ff428f1275fbef8a76772/shm: invalid argument"

(Chris Jones) #8

Seems related: https://github.com/docker/docker/issues/26195


(Akinaru) #9

Restarting dockercloud-agent or rebooting breaks the docker restart policy. Is there any news about that weave issue on the docker-cloud side? Is there any fix or workaround to actually restart all services that have restart: always on reboot or when dockercloud-agent is restarted?


(Chris Jones) #10

I still don’t have a solution.

All of my Docker-Cloud nodes seem to be running 1.11.1-cs1.

I did notice that a very similar-sounding issue is resolved in CS Engine 1.11.1-cs2 (17 May 2016):
https://docs.docker.com/cs-engine/release-notes/release-notes/
"This release fixes the following issue which prevented DTR containers to be automatically restarted on a docker daemon restart: https://github.com/docker/docker/issues/22486"


(Akinaru) #11

Indeed, thank you Chris. I’ll be waiting for docker-cloud to update to 1.11.1-cs2.


(Floranger) #12

I have the same problem here… I have set up a test environment for this particular issue. I have two nodes, one in AWS and the other a BYON node. I run a simple CouchDB container on each of them. Whenever I reboot a node, the container doesn’t come back up. I have to manually redeploy them.


(Chris Jones) #13

This issue persists, even after updating all nodes to 1.11.1-cs5.


(Bernardo) #14

So apparently it is a Docker Engine race condition between the weave plugin container starting and your containers starting. Your containers start before the plugin container does, so they fail. You will see something like this in the docker logs:
level=error msg="Failed to start container XXX: failed to add endpoint: plugin not found"

We are going to implement a workaround for DockerCloud soon to fix this behaviour. Stay tuned.
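
Until then, the race described above can be worked around by hand. The following is only a sketch of one possible approach, not the Docker Cloud workaround mentioned here: wait until the plugin container is running, then start every exited container whose restart policy is "always". The container name `weave` is a placeholder assumption, as are the helper names `docker` and `restart_failed_containers`; adjust them to whatever `docker ps` shows on your node.

```python
import subprocess
import time


def docker(*args):
    """Run a docker CLI command and return its whitespace-separated output."""
    result = subprocess.run(["docker", *args], check=True,
                            capture_output=True, text=True)
    return result.stdout.split()


def restart_failed_containers(plugin_name="weave", run=docker,
                              sleep=time.sleep, attempts=30):
    """Start exited restart=always containers once the plugin container is up."""
    # Poll until a running container matching the (assumed) plugin name appears.
    for _ in range(attempts):
        if run("ps", "--quiet", "--filter", f"name={plugin_name}",
               "--filter", "status=running"):
            break
        sleep(2)
    else:
        return []  # plugin never came up, so starting containers would fail again

    started = []
    for container_id in run("ps", "--all", "--quiet", "--filter", "status=exited"):
        policy = run("inspect", "--format",
                     "{{.HostConfig.RestartPolicy.Name}}", container_id)
        if policy == ["always"]:
            run("start", container_id)
            started.append(container_id)
    return started
```

Calling `restart_failed_containers()` after a reboot, for instance from a boot-time script, would return the IDs it started. It assumes the failed containers show up with status "exited", which may not hold on every engine version.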

Some information:


[Resolved] Docker 1.11.2-cs5 Node restart weave issues