Can't stop or kill a container

We have the containers below and believe the ones with Elasticsearch 6.5.4 are the older ones that need to be removed/deleted. Stopping or killing these containers with the stop/kill commands doesn't change their status to Exited. Please share how to completely remove these images/containers.
I tried the commands below as well. The containers with 7.10.1 exited, but the other ones did not.
The containers with Elasticsearch 7.10.1 are the newer ones and we need to retain them.

sudo systemctl daemon-reload
sudo systemctl restart docker

Moreover, can we find out whether these 2 different versions are running in different clusters and, if so, delete the unneeded cluster?

@rimelek @meyay

You don't need to mention us to get help. @meyay and I are very active, but we are not the only people who can help. If you directly mention someone, you risk that other people will feel the question is not for them, so they just move on.

When you run a container without Docker Swarm, Kubernetes or another orchestrator, you can stop it this way:

docker stop containername

It sends a signal to the process inside the container. The signal can be different for every app, but it is usually the TERM signal. The process in the container must catch it, and depending on how you created the container, the signal may not even reach the process so it can be caught. You can read more about it in my tutorial.

If you want to remove the containers, not just stop them, you can stop and remove with one command:

docker rm -f containername

When docker stop can't stop the container normally, it should force it to stop with the KILL signal after a timeout, which is 10 seconds by default. If that doesn't work either, then maybe your Docker daemon does not respond or the container is recreated by another process. That can be Systemd, Docker Swarm, Kubernetes and so on…
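If you want to see which signal docker stop will send and give the app more time before the KILL fallback, you can check the container's configured stop signal and raise the timeout. A minimal sketch, where containername is just a placeholder for one of your containers:

# show the configured stop signal (an empty result means the default TERM)
docker inspect --format '{{.Config.StopSignal}}' containername
# wait 60 seconds instead of the default 10 before Docker falls back to KILL
docker stop -t 60 containername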

So without knowing how you created the containers, we can't tell how you should properly stop and remove them. If the containers were started by Swarm, then you can't just remove them with docker rm; you need to remove the service or the whole swarm stack.
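If it turns out the containers are Swarm service tasks, the removal would look something like this. The service and stack names are placeholders, not something taken from your output:

# list services and stacks to find what keeps the containers running
docker service ls
docker stack ls
# remove the responsible service, or the whole stack it belongs to
docker service rm <servicename>
docker stack rm <stackname>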

What do you mean by different clusters? If you run docker ps or docker container ls to get the list you shared, then the containers are running on your machine.
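To see at a glance which image each running container was created from (so you can tell the 6.5.4 ones from the 7.10.1 ones), you can format the docker ps output; this only changes the presentation, the default output contains the same information:

docker ps --format 'table {{.Names}}\t{{.Image}}\t{{.Status}}'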


The containers don't stop or get removed with "docker stop containername" or "docker rm -f containername". On executing these, the status is still Up and a new container ID shows up each time.

Executing "sudo docker node ls" gives the error "This node is not a swarm manager…".
"sudo docker history" gives the response below.

If there is a way to find out now how these containers were created and then remove them, please do share.

It is something that you should know. docker history does not help here, since that just shows how you created the image, not the container.

If you somehow don't remember how you created these containers, for example because it is a server which you did not maintain at the beginning, you can try to use the ps command to see the process tree and look for any relation between the containers and another process, or use grep to search for docker in the systemd configuration or any other folder you think might be relevant. If you run the following on Linux to search the /etc folder, there is a good chance you will find something, but it's not 100 percent certain.

sudo grep -r "docker" /etc

If you get too many results because of SELinux or other features that mention docker a lot, you can try to search in specific directories:

sudo grep -r "docker" /etc/systemd

It doesn't show me anything on my CentOS machine even though I am sure it contains Docker, so you may need to use another command or find the configuration by browsing the files manually.

update:

Okay it was because of symbolic links, so this one worked:

sudo grep -r "docker" /usr/lib/systemd

It was just an idea. The output you shared (it looks like it is just one part of the full output) only shows that systemd runs Docker, which is the case on almost every machine. It does not say anything about how the container was started.

Could you explain why you don't know how the container was created? It might help us to come up with another idea to try.

To extend on what @rimelek wrote: can you run docker inspect --format "{{json .ContainerConfig.Labels}}" {container name or id}? (Of course you need to replace {container name or id} with the name or id of the container you want to check.) If a container was started by docker compose or docker stack deploy, it will show labels that indicate it.

Update: if the above is not working, try: docker inspect --format "{{json .Config.Labels}}" {container name or id}

Update 2: If the output of ctr ns ls shows more than default and moby, it could be evidence that another orchestrator is present as well.
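If you want to check the labels of every container in one go instead of one by one, something like this should work on Linux; it only reads data and changes nothing. The ctr binary ships with containerd, so it may not be available on older installations:

# print each container name followed by its labels
docker ps -aq | xargs -r docker inspect --format '{{.Name}} {{json .Config.Labels}}'
# list containerd namespaces; more than "default" and "moby" hints at another orchestrator
sudo ctr ns ls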


It was set up by a 3rd party vendor many years ago. We are not using it anymore, hence the need to remove them. Furthermore, the engineer who managed this server is not in contact anymore, so I'm supposed to support it going forward. The output attached before is the full output after running the command sudo grep -r "docker" /usr/lib/systemd.

Both of the commands give the error "No such Object".

It’s either .Config or .ContainerConfig and the “{ }” was just to indicate that it’s a placeholder. Please repeat without the curly braces.

Please find the output below.

You can safely rule out docker-compose and docker swarm, as they would have labels starting with com.docker.compose. for docker-compose and com.docker.swarm. for docker stack deploy and docker service create.

So it must be systemd, k8s or another orchestrator.
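A quick, non-destructive way to check whether a Kubernetes component is even installed on the host; the names below are just the common ones, not anything specific to this setup:

# see if typical Kubernetes binaries exist on this host
command -v kubectl kubelet k3s minikube microk8s
# and whether any related unit is known to systemd
systemctl list-units --all | grep -iE 'kube|k3s'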

Can I try for each? How do I remove it if it's systemd?

@meyay's post gave me an idea. We usually think of snap packages when we see Docker not working properly, but in this case I didn't think of it. However, I can imagine that you have a snap package that runs a cluster, like minikube or microk8s. In that case I would check the snap packages, if snap is installed on the server:

snap list

and docker context:

docker context ls

You can also run these commands:

ps aux | grep docker

or

ps aux | grep kube

Or just run

ps auxf

and browse the running processes until you see something suspicious. The “f” option does not always work but if it works, it shows the process tree so it will be easier to see what runs what.
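If the f option doesn't work for you, pstree from the psmisc package (assuming it is installed) gives a similar tree view:

# show the process tree with PIDs so you can see which parent process started what
pstree -p | less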

If systemd is what restarts the containers, you can list systemd services:

# it shows everything I think
systemctl
# usually it is enough
systemctl list-units

and try to recognize services. The problem is that it is also possible that you just have a cron job that restarts containers when it sees a container is not running. Or it can even be a Python service or anything. So it's not an easy task if you don't have any information about how the previous maintainer ran the containers, but I hope the commands above will help.
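For the cron possibility, these read-only checks might help; the paths are the usual ones on CentOS, adjust them if yours differ:

# system-wide and per-user cron entries
sudo cat /etc/crontab
sudo ls /etc/cron.d /etc/cron.hourly /etc/cron.daily
sudo crontab -l
# search cron files for anything mentioning docker
sudo grep -r "docker" /etc/cron* /var/spool/cron 2>/dev/null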

'snap list' and 'docker context ls' both give a "no command" error.
‘ps aux | grep kube’ gives output
“elastic 11576 0.0 0.0 114288 2316 pts/0 S+ 08:52 0:00 grep --color=auto kube”

Executing the other commands gives a lot of information in the output, which I don't know how to best utilize.

Other details that I collected from documentation shared by the vendor, in case it's helpful:

"- Docker engine is installed on the host. ES Docker containers are run as system enabled services, configured to start automatically on system startup"

If we assume it is Kubernetes which is recreating the containers, then how can the Kubernetes configuration file responsible for recreating them be found? (Maybe we need to comment out the replication controller or load balancer in the Kubernetes config file that is recreating the containers.) Then we could run the docker commands again to stop the containers.

I am pretty sure the documentation would have mentioned Kubernetes and how the kubeconfig needs to be configured to establish a connection to the cluster and control it. Once connected, you can remove the responsible Deployment (or StatefulSet or DaemonSet).

I can't imagine that your documentation just mentions "system enabled services" without further details. Still, it looks most likely that systemd services are responsible for controlling the lifecycle of the containers.

I am afraid you have to identify the corresponding systemd service yourself, as I have no idea how to identify it.
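If you want a starting point for that search, a sketch could look like the following; the docker/elastic keywords and the unit name are assumptions based on this thread, not anything confirmed by your documentation:

# list enabled services and look for anything related to docker or elasticsearch
systemctl list-unit-files --type=service --state=enabled | grep -iE 'docker|elastic'
# once you find a suspect unit, show what it actually runs
systemctl cat <unitname>.service
# and if it is indeed the one recreating the containers, stop and disable it
sudo systemctl stop <unitname>.service
sudo systemctl disable <unitname>.service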

I worked it out by following the steps below:

sudo systemctl stop docker  -- this stopped Docker

Then starting Docker properly:
sudo systemctl enable docker
sudo systemctl start docker

Once up, the unneeded containers were gone.

Next, the unneeded images could be removed with:
sudo docker image rm -- it worked this time

You just started this topic with this

systemctl restart is basically systemctl stop and systemctl start. It stops the daemon, which also means stopping the containers. When you start the daemon again, only containers with a proper restart policy will start automatically. Containers run with --rm will be deleted when they stop. If you don't have any running container from an image, the image can be removed without --force. So I really don't understand why it helped, because restart should have had the same result. It is as if stopping and starting the Docker daemon with two commands instead of just restarting gave a service enough time to recognize the daemon was not running, so it stopped running too and therefore could not recreate the containers again. I am glad it worked for you, but keep your eyes open, because I wouldn't be surprised if those removed containers were recreated again sometime later, for example when you restart the machine.
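One related thing you can check is the restart policy of the containers that keep coming back; a restart policy alone would explain containers surviving a daemon restart, but not new container IDs appearing each time, so treat this as a partial check only:

# print the restart policy of every container, running or not
docker ps -aq | xargs -r docker inspect --format '{{.Name}} {{.HostConfig.RestartPolicy.Name}}'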

By the way systemctl enable docker only enables docker to be started when you restart the machine. You don’t need that command if docker is already enabled. Of course, it doesn’t hurt either.

That's correct. I tried "sudo systemctl restart docker" on another instance and it didn't remove them there either. And you are right, on restarting the machine, the containers & images got recreated :frowning:
Since there is better clarity now, can any more steps please be suggested?

I don't see how anything would be clearer. You just assumed the way you tried would help, but it didn't, exactly as I expected. I think we have shared all of the ideas we would try in a similar situation, including using the ps command to list processes and using find and grep to search for files. Since I would do the same, and this is how you CAN find what you want, unfortunately we can't give you more ideas. Of course I can't speak on behalf of everyone, but I am pretty sure this is the case. Maybe there are some tools to make that process listing easier, but this should be enough. List the processes and check them one by one. If you don't recognize a name, search for it on Google. It is not a one-minute process, but it should help you solve the issue eventually.

Don't forget to check the system logs as well. Maybe you can find an error message or status message that tells you what is restarting the containers.
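For example, you can follow the Docker daemon logs (or the system log) in one terminal while you stop a container in another; containername is a placeholder again:

# follow the daemon logs; alternatively: sudo tail -f /var/log/messages
sudo journalctl -u docker -f
# in a second terminal, stop one of the containers and watch what gets logged
docker stop containername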

I removed the solution flag from your post since it was clearly not the solution.


I attempted the steps below and am trying my luck again, in case these could help provide any clarity toward resolving the problem :frowning:

Update 1:
Below is part of the output from running the 'docker inspect <ContainerID to be deleted>' command.
Output:
"HostConfig": {
            "Binds": [
                "<some-location>:/usr/share/elasticsearch/config",
                "<some-location>:/usr/share/elasticsearch/data",
                "<some-location>:/usr/share/elasticsearch/logs"],
"GraphDriver": {
            "Data": {
                "LowerDir": "/var/lib/docker/overlay2/<somepath>/diff:/var/lib/docker/overlay2/<somepat>

Update 2:
find . -name docker-compose.yml -- none of the docker-compose.yml files mentions the containers to be deleted
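Since find . only searches the current directory, a wider, read-only search might still turn something up; the elasticsearch keyword is an assumption based on the image names in this thread:

# search the whole filesystem for compose files
sudo find / -type f -name 'docker-compose*' 2>/dev/null
# and for unit or cron files that mention elasticsearch
sudo grep -rl "elasticsearch" /etc/systemd /usr/lib/systemd /etc/cron* 2>/dev/null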

Snapshot of /var/log/messages here when I try to docker stop the container: