Automatic graceful Docker container shutdown on system restart/shutdown (system OS)

Hi,
I noticed that when I shut down or restart a VM (Debian + Docker) from the hypervisor host, the OS (VM) restarts or shuts down, but the Docker containers are NOT gracefully shut down first.
Is there a way to configure the OS to send some command and WAIT for Docker to shut down its containers FIRST, and ONLY then continue to shut down or restart the OS?

It is very strange that this does not work out of the box …
I am running 3 VMs, each with Docker inside, and I need a solution so that ANY developer without deep IT skills can turn off or restart the VM from the hypervisor host and be sure the VM terminates all Docker services/containers gracefully, without needing SSH and some crazy commands.
Is there a way?

I would think that Docker can receive a SIGTERM and also pass it to the app, before the container is killed 10 seconds later.

I read that when you use the shell form of a Dockerfile command (CMD or ENTRYPOINT), the shell might not pass the signal on to the process it starts. So check your Dockerfile setup.
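
For illustration, here is a minimal sketch of the difference, assuming a hypothetical binary at /usr/local/bin/app and a Debian base image (neither comes from the original poster's setup). With the shell form, the shell is started as PID 1 and may not pass SIGTERM on to the app; with the exec form, the app itself is PID 1 and receives the stop signal directly.

```dockerfile
# Hypothetical base image and binary, for illustration only.
FROM debian:bookworm-slim
COPY app /usr/local/bin/app

# Shell form (commented out): Docker wraps this in /bin/sh -c,
# so the shell is PID 1 and the app may never see the stop signal.
# CMD /usr/local/bin/app

# Exec form: the app is started directly as PID 1 and gets the stop signal itself.
CMD ["/usr/local/bin/app"]
```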

To extend on @bluepuma77’s response: you need to configure the container to send the signal that the main process actually requires. Keep in mind that only PID 1 receives this signal.

You can configure the grace period and the stop signal for a container:
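
A minimal sketch of what that can look like in a compose file. The service name, image, and the 60s value are placeholders for illustration, not recommendations:

```yaml
services:
  db:                          # hypothetical service name
    image: example/db:latest   # hypothetical image
    stop_signal: SIGTERM       # signal sent to PID 1 on stop (SIGTERM is the default)
    stop_grace_period: 60s     # how long Docker waits before killing the container (default 10s)
```

For a standalone container, the docker run flags --stop-signal and --stop-timeout cover the same two settings.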

Does this behavior (only PID 1 receiving the shutdown signal) also apply to Windows containers, if you happen to know? I’ve heard both, according to this issue: "Unable to react to graceful shutdown of (Windows) container" (Issue #25982, moby/moby on GitHub).

Tangential, but it seems that neither the --time flag (on a run command) nor stop_grace_period (in the compose file) has any effect for Windows containers. The Windows SIGKILL equivalent is sent 10 seconds after the SIGTERM equivalent, no matter what.

As far as I know, Windows does not have PID 1 at all. You can run

Get-Process

from PowerShell, and the process called “System” has PID 4. There is also a process called “Idle” which has PID 0. On Linux, PID 0 does not show up in the process list.

You can also try

Get-Process -Id 0

and

Get-Process -Id 4

Although Docker made it possible to use Windows containers through its command line interface, Windows containers are totally different from Linux containers.

But the fact that there is a PID 1 on Linux means nothing by itself. The point is that (on Linux) there is a clear process tree with a main process that can start other processes. This is usually systemd on a physical or virtual machine, but it can be anything in a Linux container. That first process has to handle signals and forward them to its child processes. That is why sending the stop signal to PID 1 should be enough on Linux, but I can’t really talk about Windows containers.

I have no experience with Windows containers.

Maybe @vrapolinario has some insights about how this is handled by Windows containers.

I appreciate the insight and the reply. Still learning how all of this stuff works.

Once I figure out how to get my DB container to shut down gracefully I’ll update this thread with what I tried.

Update:

Unfortunately, I couldn’t figure out how to gracefully shut down the process from the entrypoint script. I tried to trap the shutdown signals in my PowerShell script, but the trap wouldn’t trigger. I’m not sure how else I could ‘intercept’ the signal and run my own shutdown commands without directly modifying the app’s source and recompiling.

I ended up just using an IIS image to pull ServiceMonitor.exe, and installed the app as a service instead of just running the process. Looking at the logs, it’s finally shutting down gracefully. Dockerfile syntax, if anyone is wondering:

FROM mcr.microsoft.com/windows/servercore/iis:windowsservercore-ltsc2022 AS iis
FROM mcr.microsoft.com/windows/servercore:ltsc2022
COPY --from=iis /ServiceMonitor.exe /ServiceMonitor.exe

Whether the container will have >5 seconds to terminate is a different story though. Going to keep testing. Hopefully someone finds this info useful.

Update 2

I’m not sure whether it was installing the app as a service or explicitly running it with ServiceMonitor that got the process to receive the graceful shutdown signal, but it’s working for me, so I am not going to touch it.

Making this change finally got the process to start its graceful shutdown as evidenced in the logs. To give the container the time it needs to shut down:

  1. (In addition to the ServiceMonitor method above) these registry keys need to be updated with large values:

RUN reg add hklm\system\currentcontrolset\services\cexecsvc /v ProcessShutdownTimeoutSeconds /t REG_DWORD /d 7200
RUN reg add hklm\system\currentcontrolset\control /v WaitToKillServiceTimeout /t REG_SZ /d 7200000 /f

  2. The container must be run with STDIN open to actually receive the stop_grace_period! This means that if it is standalone, run it with the -i flag. If running from a compose file (I use Swarm), you must define the config option stdin_open: true.

So in summary, if using compose, you need both

 stdin_open: true
 stop_grace_period: 300s #or other defined value

defined for the service, as well as the registry key updates above, while also running the app with ServiceMonitor.exe, to achieve a graceful shutdown. Hope this helps someone. Either way, it seems STDIN has to be open, so that’s the one glaring drawback here.
