I need some clarification regarding the use of HEALTHCHECK on a Docker service.
We are experimenting with a multi-node MariaDB cluster, and by using HEALTHCHECK we would like the bootstrapping containers to remain unhealthy until bootstrapping is complete. We want this so that front-end users don’t reach that particular container in the service until it is fully online and synced with the cluster. The issue is that bootstrapping relies on the network between containers to do a state transfer, and it won’t work when a container isn’t accessible on the network.
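To make the idea concrete, here is a minimal sketch of a check that stays unhealthy until the node reports that it has synced. It rests on several assumptions that are not from this thread: the cluster is Galera-based and exposes the wsrep_local_state_comment status variable, the image contains Python and a command-line client named mysql, and credentials come from an option file. The function names are hypothetical.

```python
import subprocess

# Assumption: on a Galera node this status variable reads "Synced" once the
# state transfer has finished.
STATUS_QUERY = "SHOW STATUS LIKE 'wsrep_local_state_comment'"


def is_synced(status_output: str) -> bool:
    """Return True if the tab-separated client output reports a synced node."""
    for line in status_output.splitlines():
        fields = line.split("\t")
        if len(fields) == 2 and fields[0] == "wsrep_local_state_comment":
            return fields[1].strip() == "Synced"
    return False  # variable missing: treat the node as not ready


def query_status() -> str:
    """Run the status query with the command-line client (hypothetical setup:
    the client is called `mysql` and reads credentials from an option file)."""
    result = subprocess.run(
        ["mysql", "-N", "-B", "-e", STATUS_QUERY],
        capture_output=True, text=True, timeout=5,
    )
    return result.stdout if result.returncode == 0 else ""


def healthcheck_exit_code(run_query=query_status) -> int:
    """Docker treats exit code 0 as healthy and 1 as unhealthy."""
    try:
        return 0 if is_synced(run_query()) else 1
    except (OSError, subprocess.TimeoutExpired):
        return 1  # client missing or server not answering yet
```

A script built this way would finish with raise SystemExit(healthcheck_exit_code()) and be referenced from the HEALTHCHECK CMD instruction. The check only reports a status through its exit code; it does not touch the container's network itself.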
When a container’s status is either starting or unhealthy, does HEALTHCHECK completely kill network access to and from the container?
As an example, when a container is healthy I can run the command getent hosts tasks.<service_name> inside the container, which returns the IP addresses of the other containers in the service. However, when the same container is unhealthy, that command does not return anything. Hence my suspicion that HEALTHCHECK kills the network at the container level (as opposed to at the service/load balancer level) if the container isn’t healthy.
Thanks in advance
Maybe I am wrong, because I don’t use it too often, but why would a health “CHECK” change anything? If a healthcheck checked the network status but broke the network because the container was unhealthy, the container would never become healthy.
Healthcheck doesn’t only check the network status. It can check any condition (for example, the status of the mysqld service). And if my observation is correct, when the condition isn’t met it puts the container in the “unhealthy” state, effectively removing it from the network even though it is running (as unhealthy).
I know that. It was just an example of why it should not affect the network or anything else unless you use a command in the healthcheck that does.
I see… I get your point now and it’s a good point. So maybe there is another reason why the ‘getent’ command doesn’t work while the container is starting.
I ran some more tests and found my own answer. Basically, Docker does not kill container networking when the container is in either the starting or the unhealthy phase. The reason the getent hosts tasks.<service_name> command does not work during those phases is that the command looks up the container IP addresses through the service, which does not have the unhealthy container(s) assigned to it.
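The same lookup can be sketched in Python to show why an empty answer is not evidence of a dead network. This is only an illustration: resolve_tasks and the service name galera are hypothetical, and the resolver parameter exists so the function can be tried without a swarm.

```python
import socket


def resolve_tasks(service_name: str, resolver=socket.gethostbyname_ex) -> list:
    """Return the IPs behind tasks.<service_name>, or [] if nothing resolves."""
    try:
        _name, _aliases, addresses = resolver(f"tasks.{service_name}")
    except (socket.gaierror, socket.herror):
        return []  # no container is currently listed under the service name
    return sorted(addresses)
```

Calling resolve_tasks("galera") inside a container performs the same name lookup as the getent command. An empty list only means that no containers are listed for the service at that moment; a container that is still starting or unhealthy can be up and reachable without appearing in the result.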