Hi,
I am new to Docker and am doing a POC to learn it.
I have a Docker swarm consisting of 3 servers. Each server runs several containers, which were deployed with the docker stack deploy command.
The stack yml file sets stop_grace_period: 60s and stop_signal: SIGTERM for the container.
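For context, the relevant part of the stack file looks roughly like the sketch below. Only the stop_signal and stop_grace_period values are the real ones; the compose version, service name and image are placeholders assumed for illustration.

```yaml
version: "3.7"                      # assumed compose file version
services:
  my-service:                       # hypothetical service name
    image: example/my-image:latest  # hypothetical image
    stop_signal: SIGTERM            # signal sent to the container on stop
    stop_grace_period: 60s          # time to wait before the container is killed
```

Both keys sit at the service level, not under the deploy section.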
For some reason, one of the servers (a worker node) rebooted.
After that, the container did not come back up. The following errors appear in the Docker logs:
level=info msg="Container failed to exit within 10s of signal 15 - using the force" container=c12da60adc52e4bd25807f1de7f78b6e1b1618a1a056d75b5131e490
level=error msg="fatal task error" error="task: non-zero exit (137)" module=node/agent/taskmanager node.id=l4rymukg1kael881r0ix2o374 service.id=behq372dbhx6e9jryrfnmp task.id=ltk40zckkp6lm2fj6352m
level=warning msg="ShouldRestart failed, container will not be restarted" container=de8e710fe332b55c41bb52245e77d595eb455868f0b618c6e566068dd daemonShuttingDown=true error="restart canceled"
level=warning msg="Failed to disconnect container lb-test5 from swarm network ml5 on cluster leave: endpoint lb-test5 not found"
level=warning msg="Error (Unable to complete atomic operation, key modified) deleting object [endpoint vpggexha1kl3adh8jdmnrc9fac 516239abeb331286eae450319b90ea6503e39a0a87fc4a48248e868a3], retrying…
level=warning msg="Failed to remove swarm network ml5 on cluster leave: error while removing network: unknown network ml5 id pggexha1kl3adh8jdmnrc9fac"
We suspect this happened because the container was not stopped gracefully within 10 seconds. However, we do have stop_grace_period and stop_signal set, so we expected a 60-second grace period rather than the 10 seconds shown in the log.
Could you please let me know where I am going wrong and how to resolve the above errors?
Many Thanks,
Abhishek