Hello,
We have a docker swarm stack with 36 services deployed on Azure Cloud with Docker 20.10.12 version, 7 nodes and 3 managers. Every time a deploy is done in the cluster, 3 of our services get restarted even if there are no updates for them.
The issue is not happening for the other services, so we want to understand why these 3 are restarted.
For all services including these we are using a explicitly tagged images(not latest) and 2 of the restarted services have a healthcheck, one of them doesn’t.
Inspecting the shutdown containers doesn’t provide the cause, container health looks ok until it seems to be shutdown by the swarm:
"State": {
"Status": "exited",
"Running": false,
"Paused": false,
"Restarting": false,
"OOMKilled": false,
"Dead": false,
"Pid": 0,
"ExitCode": 0,
"Error": "",
"StartedAt": "2022-06-28T11:07:50.253278648Z",
"FinishedAt": "2022-06-28T12:09:38.028324252Z",
"Health": {
"Status": "unhealthy",
"FailingStreak": 0,
"Log": [
{
"Start": "2022-06-28T12:07:31.823395345Z",
"End": "2022-06-28T12:07:31.877107802Z",
"ExitCode": 0,
"Output": ""
}...
Docker stack ps for one of the services:
“CurrentState”:“Shutdown 2 hours ago”,“DesiredState”:“Shutdown”,“Error”:“”,“ID”:“986fpoefleoj” …
We’ve set docker logs to debug and I can see the running task desired state is set to shutdown and a new task is created but no reason why this happens:
new task ---> time="2022-06-28T12:08:26.336889162Z" level=debug msg="task kcu8p5kk29s7a3bqxaaifvqnm was marked pending allocation" module=node node.id=g9to4w40wx0met9q8h64h29b7
...
existing task---> time="2022-06-28T12:08:26.533482977Z" level=debug msg=assigned module=node/agent node.id=g9to4w40wx0met9q8h64h29b7 task.desiredstate=SHUTDOWN task.id=986fpoefleoj8mpnhu0eaop62
time="2022-06-28T12:08:26.533530078Z" level=debug msg=assigned module=node/agent node.id=g9to4w40wx0met9q8h64h29b7 task.desiredstate=READY task.id=kcu8p5kk29s7a3bqxaaifvqnm
Existing container is disabled and e new one is created:
DisableService 134bcd810301dad5ce535324960e66de085d3b5db07e4ac80cdef4fb4b2e6d69 START
...
time="2022-06-28T12:08:45.247316912Z" level=debug msg="EnableService b00270822ac16280d40d162d34eed1b01a05af06debe6fd10e7bf90fd0cd1c7e START"
...
I cannot attach a more detailed log because I’m a new user, but will provide more information if needed.
Found these posts that are somewhat similar but they don’t seem to have gotten an answer:
Any idea how we could find the cause why these services are restarted?