Docker stack or service kill to send signal to remote containers

chrishecker · December 27, 2017, 11:09am

I’d like to be able to use the “docker kill -s SIG” command to send signals to containers running on other nodes in a swarm, so, something like:

docker stack kill -s HUP myswarm

It seems like there’s currently no way to send a signal to remote nodes, except for the TERM sent during a stack rm?

Thanks,
Chris

Edit: after doing some more research, it looks like I can probably do a crazy hack: deploy a small alpine container on all the nodes, with /var/run/docker.sock mapped in and with curl (sadly the nc in alpine doesn’t support unix sockets) and jq installed. A script in that container can then use the REST API to find the container ID of the target by its com.docker.swarm.service.name label and send it a signal with the kill endpoint. This is obviously totally nutballs and heavyweight compared to just having the swarm be able to send signals like above, but I think it’ll work.
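For readers who want to try the socket approach described in the edit above, here is a minimal sketch of what such a script might look like. It is not the original poster's script. It assumes curl and jq are installed, that the Docker socket is mounted at /var/run/docker.sock, and that the function names and the DOCKER_SOCK variable (all hypothetical, chosen for illustration) suit your setup. It only reaches containers on the node where it runs, so it would have to run on every node that hosts the service.

```shell
#!/bin/sh
# Sketch only: signal every local container of one swarm service through the
# Docker Engine API socket. Assumes curl and jq are available and that the
# socket is mounted into the container running this script.
# Function and variable names are hypothetical.

DOCKER_SOCK="${DOCKER_SOCK:-/var/run/docker.sock}"

# Build the JSON filter that selects containers by their swarm service label.
service_filter() {
  printf '{"label":["com.docker.swarm.service.name=%s"]}' "$1"
}

# Print the IDs of the local containers that belong to the given service.
service_container_ids() {
  curl -s --unix-socket "$DOCKER_SOCK" -G \
    --data-urlencode "filters=$(service_filter "$1")" \
    http://localhost/containers/json | jq -r '.[].Id'
}

# Send a signal (for example HUP) to each of those containers.
signal_service() {
  service_name="$1"
  signal="$2"
  for id in $(service_container_ids "$service_name"); do
    curl -s --unix-socket "$DOCKER_SOCK" -X POST \
      "http://localhost/containers/$id/kill?signal=$signal"
  done
}

# Example call (service name is hypothetical):
#   signal_service mystack_web HUP
```

Note that services deployed with docker stack are named with the stack name as a prefix, so the value to match is the full service name as shown by docker service ls.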

bryceryan · December 27, 2017, 4:52pm

One might also consider use of docker exec to execute a command in a running container.

chrishecker · December 27, 2017, 9:05pm

How would that work, exactly? exec can’t execute on remote containers as far as I know (and tested). In other words, if I can get on the remote machine to run exec, I can just run kill anyway.

I need a command that will automatically run kill on all the swarm nodes that have a given service running using a single command on a manager without requiring me to ssh into all the nodes. The only way I can figure out to do that with the current set of features is to make another service that can then launch and do the kill, and the only way for a service to kill another service that I’ve found is for it to talk to the API socket.

If there’s a better way I’d love to hear it!

Chris

sdetweil · December 27, 2017, 10:12pm

the doc says to remove a service from a swarm, use

docker service rm service_name

on the manager node

see https://docs.docker.com/engine/swarm/swarm-tutorial/delete-service/

chrishecker · December 27, 2017, 10:20pm

Hi, I mentioned stack/service rm in the OP…I don’t want to send a TERM (and then KILL), I want to be able to send arbitrary signals, like docker kill -s can do for containers on the same machine. Basically I want exactly what docker kill can do for local containers, just for services. Seems like a reasonable feature request, no?

Chris

sdetweil · December 28, 2017, 12:22am

why do you need to signal the processes?
isn’t that what an API is for?

chrishecker · December 28, 2017, 4:47am

This is a strange question. Are you saying they should remove docker kill? The ability to send arbitrary signals to applications is a pretty standard and useful unix thing, there’s a lot of infrastructure around it, and it’s supported for local containers, so it seems like a bit of an omission from the services API. This seems pretty obvious to me?

In my specific case, I want to send signals to change the state of the app, like if it’s going into a certain mode, like maintenance mode, or if it should send a message to logged in users about an impending restart, or whatever. Another example is sending SIGHUP to an app to get it to reread config files without restarting. There are a zillion uses for sending signals to applications besides just TERM and KILL and an entire set of functions and tools to do this for apps, so it is strange to me that it’s not totally obvious why I’d want to send a signal to apps on remote machines in a swarm? Maybe you’ve never written a stateful app or something that doesn’t just handle a web request or whatever and exit immediately?

No idea what you’re talking about with the API thing…are you saying there’s an API entrypoint that will send signals to all the containers in a swarm service? If so, I missed it, and I’d love a pointer because I’d totally use that to solve my problem. If not, you’ll have to explain what you’re talking about and how that actually solves the problem I’m pointing out and requesting the feature for.

Chris

sdetweil · December 28, 2017, 12:12pm

just because signals worked for local applications on unix, does NOT make that the right way to do it in a distributed swarm environment.
i suggested that YOU could add some code to your application, creating an API that YOU could call to provide the information needed, rather than Docker trying to implement this older mechanism.

"Maybe you’ve never written a stateful app or something that doesn’t just handle a web request or whatever and exit immediately?"

that ‘web request’ is an api

chrishecker · December 28, 2017, 9:21pm

Of course I could modify my app. I could also rewrite all of docker, or go become a farmer, or run for president, but this is a features request forum for docker, so it seems pretty reasonable to request features here, especially ones that already exist in the project (docker kill -s) and would just be relatively small extensions to the swarm API, and are obviously useful and time-proven ways of doing simple interprocess communications. On edit: and, in fact, docker uses signals on the swarm already with rm and so your point about them not scaling seems suspect.

In this case, I do control my app, but that is not always true; I’m not sure what you’d tell somebody who didn’t have the option or ability to modify an app running in a container. This is why platforms add features, so every app doesn’t have to be modified to implement them. This is presumably why they have a features request forum…

I’m not sure why you’re so determined to try to talk me out of making this totally reasonable feature request; it feels like the “you don’t need that” mentality that is so common (and poisonous) in OSS communities. Instead of saying, “hmm, let me try to understand this person’s problem, it might be something I haven’t thought of, and it’s a big world out there”, there’s this immediate, “you don’t need that, the current thing is fine, you could just do this other thing”…it was clear from your very first post that you didn’t bother to even understand what I was asking since you suggested something I’d already mentioned in the OP as not what I need, and now you’re clearly psychologically dug in on defense and so it feels like a waste of time to continue.

Anyway, I’m done discussing it with you, but feel free to have the last word if you’d like.

Hopefully somebody who works on docker will stop by and check out the request. If anybody else has questions or comments let me know. If I find some time after this deadline I’ll look into adding the feature myself and do a PR if it’s something they’d be interested in integrating. My first experience with containers has been pretty positive and it’s a cool project and I’d be happy to contribute back.

Thanks,
Chris

sdetweil · December 28, 2017, 9:22pm

i never suggested you NOT submit your request. ever…

you did clarify your value in the request, which is good info.

looking at all the things the docker team has to do, and the number of users asking for it, i would expect a very long wait.

aselvan · February 8, 2018, 12:47am

@chrishecker
Have you managed to find a solution to this problem? I have the very same need and I am sure there are others like us. I have a legacy Java app, now dockerized, that does some cleanup to leave everything in a consistent state before shutting down on SIGTERM or SIGINT. Unfortunately, "docker stack rm" kills it abruptly, leaving it in an inconsistent state. I'd appreciate any help.
Thanks

chrishecker · February 8, 2018, 12:54am

No, I haven’t, I figured I’d just do something like use ansible, which is kind of lame but I looked at the code to make the change and it’s an ocean of RPC calls so it’d be a pretty huge diff, so I think somebody from the core team would have to make it. It’s unfortunate, since they’ve got the non-swarm version of the command right there, it just needs to be extended to the swarm.

Chris

PS. There are a few other puzzling decisions like not being able to have the worker token set manually anymore (there are a couple of threads about this here) that basically require you to use another tool like ansible in addition to docker it seems…I haven’t worked through all the issues yet with running my app in a swarm but it seems like there is a list of them. Once you’re forced to use a config tool you might as well have it send the signals too I guess. Sigh, more packages.

aselvan · February 8, 2018, 6:54pm

@chrishecker
Thanks for the quick response. After experimenting a little bit with “docker stack rm” I did find something very interesting. Actually, docker does indeed send SIGTERM but unfortunately, does not give much time before sending SIGKILL which follows almost immediately. See below.

I deployed a stack with a simple image that runs the following script on startup in a swarm …

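#!/bin/sh
# Minimal test script: log SIGTERM/SIGINT and keep looping.

# The handlers only print a message; they do not exit.
term_signal_handler() {
  echo "############  Caught SIGTERM #############"
}
int_signal_handler() {
  echo "############  Caught SIGINT #############"
}

# POSIX sh names signals without the SIG prefix (TERM, INT).
trap 'term_signal_handler' TERM
trap 'int_signal_handler' INT

while true ; do
  echo "Waiting for signal ...."
  sleep 10
done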

When I do “docker stack rm…” I do see SIGTERM getting caught by the script as shown below

[root@docker]# docker logs -f 9dc75fa80f21
Waiting for signal ....
Waiting for signal ....
Waiting for signal ....
Waiting for signal ....
Waiting for signal ....
############  Caught SIGTERM #############
Waiting for signal ....

At this point, the script is killed.

Update:
Finally found a way to make it work for me. I just need ~30 secs or so to do cleanup after receiving SIGTERM. I found an option, stop_grace_period, that works perfectly for me: I was able to do cleanup before getting killed.
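As a rough illustration of the update above: stop_grace_period is a per-service key in the Compose file, and the same setting is exposed on the command line as the --stop-grace-period flag of docker service create and docker service update. The sketch below is a variation of the test script in which the handler runs a cleanup step and then exits, so the container stops as soon as cleanup is done instead of waiting for SIGKILL. The cleanup body, the CLEANUP_DONE variable, the RUN_MAIN guard, and the service name in the comment are hypothetical placeholders.

```shell
#!/bin/sh
# Sketch: finish cleanup and exit on SIGTERM, within the stop grace period.
# To lengthen the grace period for an existing service (name hypothetical):
#   docker service update --stop-grace-period 45s mystack_web

CLEANUP_DONE=0

# Hypothetical cleanup step; replace with the application's real shutdown work.
cleanup() {
  echo "cleaning up"
  CLEANUP_DONE=1
}

# Run cleanup, then exit so Docker does not need to follow up with SIGKILL.
on_term() {
  cleanup
  exit 0
}

trap 'on_term' TERM

# Main loop. Sleeping in the background and waiting on it lets the shell run
# the trap as soon as the signal arrives rather than after sleep returns.
main() {
  while true; do
    echo "Waiting for signal ...."
    sleep 10 &
    wait $!
  done
}

# RUN_MAIN is a hypothetical switch so the functions can be sourced and
# tested without starting the loop.
if [ "${RUN_MAIN:-0}" = "1" ]; then
  main
fi
```

Run it with RUN_MAIN=1 as the container's main process. The grace period only needs to be as long as the slowest expected cleanup, since the script exits on its own once cleanup completes.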

flyingchipmunk · April 30, 2018, 12:09am

my googling brought me here, +1 for what I agree would be a very useful feature.

rooholam · December 29, 2019, 6:49am

I was trying to configure logrotate for the Traefik API gateway running on Docker Swarm, and I needed to send the “USR1” signal to the container in postrotate. If this feature were available, I could do it very cleanly.

schnappi · June 2, 2020, 1:25pm

+1 from my side. Has there been any progress on this? I haven’t found a nice and clean solution for it.

julianklock · April 7, 2021, 8:59am

I would also like to add my +1 to this feature request. My use case is the exact same as described by rooholam: We’re running Traefik in Docker Swarm, and the ability to easily send a USR1 to the container(s) in a swarm would simplify the use of logrotate.
