Healthcheck - different interval for startup check vs recurring checks. Maybe make a separate `STARTUPCHECK`

As far as I understand it, the interval directive for health checks does double duty: it is the time Docker waits before first checking a newly created container, and it is also the recurring interval at which Docker keeps checking the container’s health status.

It would be nice to be able to specify an initial interval that is used just for the very first time when a container starts, and then an interval used for the recurring health checks.

An example: I have a container that takes roughly 1 second to start up and run. I would like to run health checks on 5 minute intervals. The problem is that right now, with my interval set to 5 minutes, when I start this container, it waits a full 5 minutes after creating the container before checking that the container is healthy!
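To make that concrete, here is a minimal sketch of the kind of setup I mean. The base image, the `wget` probe, and the URL are placeholders assumed for illustration, not my actual container:

```dockerfile
# Illustrative only: the base image, probe command, and URL are assumptions.
FROM nginx:alpine

# With --interval=5m the first probe runs one full interval (5 minutes)
# after the container starts, then again 5 minutes after each check completes.
# Until that first probe passes, the container stays in the "starting" state.
HEALTHCHECK --interval=5m \
  CMD wget -q --spider http://localhost/ || exit 1
```

The container itself is serving within about a second, but nothing marks it healthy until that first 5 minute wait is over.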

Ideally, I would tell Docker to wait 5 seconds after starting the container to run the first health check, then if that passes, move to a 5 minute interval for continued monitoring. Some extended cases might even warrant an entirely new directive similar to HEALTHCHECK, something like STARTUPCHECK, with its own interval, timeout, and retries.

Conversely, someone might have a container that takes several minutes to run some kind of build process, but they’d like to continually monitor it on a shorter interval. Right now the only way to do that is to specify a shorter interval with a higher number of retries. This is bad because, although it accommodates the initial start of the container, you might not want that many retries for the ongoing health checks.
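A rough sketch of that workaround, with made-up numbers (the base image, probe, interval, and retry count are all assumptions for illustration):

```dockerfile
# Illustrative only: the base image, probe command, and numbers are assumptions.
FROM nginx:alpine

# 30 consecutive failures at a 10 second interval tolerates roughly
# 5 minutes of failing probes while the container is still starting.
# The same retry budget applies for the container's whole lifetime, so a
# real outage later also takes roughly 5 minutes to be reported as unhealthy.
HEALTHCHECK --interval=10s --retries=30 \
  CMD wget -q --spider http://localhost/ || exit 1
```

The retry count here is sized for startup, which is exactly why it is too lenient for ongoing monitoring.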

All of this gets worse when your container is part of a stack and other containers depend on it through the depends_on directive. The delays compound: you have to set really high health check intervals or retries, and your full stack takes way too long to deploy.

This also affects when you do a service update. For example, I might want to regularly monitor a container using HEALTHCHECK on 5 minute intervals. Even if this container’s startup is near instant, when I update the service to use a new image it still takes a full 5 minutes before the health check passes and the container is put back into the rotation. That is 5 minutes of unnecessary downtime.

Fully agree. Containers should become available as soon as they pass their first healthcheck, even if the interval for periodic checking is long.

Agree. When you use docker-compose and have dependencies on healthy services, it would be good to do “you up? you up? you up?” every 10s, but once it’s up you could drop the ping to every 10m…

This is super important. Tying the wait time before the first probe to the interval just wasn’t the right thing to do. We need a separate wait-before-first-probe setting.