I have some containers that run long jobs which take more than 60s to complete, and which can interfere with each other if another instance of the container starts up on another node. If I set stop_grace_period:, the original container stays up until either the job finishes or the grace period expires, but no matter what setting I try, the new container gets started after 60s. I thought it might respect one of the delay: options under deploy, but nothing I put there seemed to change that timing, and nothing else in the docs seems to suggest a way to do this.
I could keep a lock file (like a .pid file) and remove it after the job is finished, but that just kicks the can down the road. If the container gets killed or the host fails before the job is finished, the new container could be waiting indefinitely. This seems kludgy to begin with, since unlike processes that run in the same environment, I can’t check whether the pidfile still points to the ‘right’ process.
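One way to keep a lock file from blocking forever is to store an expiry time in it and have the running job renew it periodically, so that a lock left behind by a killed container or failed host eventually lapses. The following is only a minimal sketch of that idea. It assumes the lock file lives on storage that every node can see, and the names try_acquire, release and LEASE_SECONDS are made up for illustration. The check and the write are not atomic with respect to each other, so two starters could still race in a narrow window.

```python
import json
import os
import time

LEASE_SECONDS = 30  # hypothetical lease length; renew well before it runs out


def try_acquire(path, owner, now=None, lease=LEASE_SECONDS):
    """Take or renew the lock. Returns True if `owner` holds it afterwards."""
    now = time.time() if now is None else now
    try:
        with open(path) as f:
            current = json.load(f)
    except (FileNotFoundError, ValueError):
        current = None  # no lock yet, or an unreadable one
    # Someone else holds a lease that has not expired yet: back off.
    if current and current["owner"] != owner and current["expires"] > now:
        return False
    # Write to a temp file and rename, so readers never see a partial lock.
    tmp = f"{path}.{owner}.tmp"
    with open(tmp, "w") as f:
        json.dump({"owner": owner, "expires": now + lease}, f)
    os.replace(tmp, path)
    return True


def release(path, owner):
    """Remove the lock, but only if `owner` is the one holding it."""
    try:
        with open(path) as f:
            if json.load(f)["owner"] != owner:
                return
        os.remove(path)
    except (FileNotFoundError, ValueError):
        pass
```

The job would call try_acquire with its own identifier on startup and again at intervals shorter than the lease, and call release when it finishes. A new container that gets False back waits and retries, and the wait is bounded by the lease length even if the previous holder never cleans up.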
My google-fu is failing me here, or is this just something not supported by swarm?
I must be missing something, as I am not quite sure if I understood your expected outcome.
If stop_grace_period: 60s is used and the process inside the container does not act on SIGTERM, it will take up to 60 seconds until the process receives a SIGKILL. As a result the container should be killed, which ends the evicted service task; the scheduler should then detect a drift between current state and desired state and schedule a new service task to remedy the drift.
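For reference, "acting on SIGTERM" inside the container could look roughly like the sketch below: the handler only sets a flag, the job in progress is allowed to finish, and no further job is started. This is an illustration under assumptions, not the poster's actual code; GracefulWorker and install_sigterm_handler are hypothetical names, and the process must be the one that actually receives the signal in the container.

```python
import signal


class GracefulWorker:
    """Runs jobs one at a time and stops picking up new ones after SIGTERM."""

    def __init__(self):
        self.stop_requested = False

    def request_stop(self, signum=None, frame=None):
        # Signal handler: only record the request, the current job keeps going.
        self.stop_requested = True

    def run(self, jobs):
        results = []
        for job in jobs:
            if self.stop_requested:
                break  # do not start another job once a stop was requested
            results.append(job())
        return results


def install_sigterm_handler(worker):
    # Must be called from the main thread of the process.
    signal.signal(signal.SIGTERM, worker.request_stop)
```

With a handler like this, the time the old task needs after SIGTERM is the remainder of the job in progress, which is what stop_grace_period has to cover.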
Regardless, have you tried to tweak the restart policy? Depending on how you look at it, a node drain will stop the service task in a controlled way, so that it might be considered as a restart. I doubt that any setting underneath update_config will influence the behavior, as it only applies when the configuration of a service task gets updated, which is not the case on a node drain.
Task is running on node1.
stop_grace_period is 15m
all of the settings under deploy (update_config, rollback_config, etc.) are set to order: stop-first
docker node update node1 --availability drain is run
task running on node1 correctly gets SIGTERM and begins shutting down gracefully.
a new task is prepared on node<n> and is in “Ready State”
60s after it is in Ready State, the task on node<n> starts even though the task on node1 is still shutting down and has not exceeded the stop_grace_period yet. This time never changes: it is always 60s after the new task has finished preparation and is in ready state, whether or not the original task has finished its stop, and no matter what values I specify under deploy:<update_config|rollback_config|restart_policy>:delay.
My expectation is that, when told to stop-first, the new task should actually wait for the first task to stop before starting.
Yes, this is what I meant by “I thought it might respect one of the delay: options under deploy, but nothing I put there seemed to change that timing, and nothing else in the docs seems to suggest a way to do this.”
It makes sense that neither update_config nor rollback_config applies, as a node drain does not update the service configuration.
Apparently it is not considered a restart either, as I had thought, since the new service task is already scheduled before the old service task has exited.
I doubt there is any configuration in the compose file specs that supports your use case.