Every docker command hangs

Every docker command hangs. I first encountered this yesterday (there was also a known incident on one of our remote servers two days earlier, on Feb 16 2022).

  • All docker compose commands I tried hung (docker-compose up, down, ps)
  • So I tried plain docker commands, which also hung (e.g. docker -v, docker --help); a quick check for whether the daemon itself is unresponsive is sketched just below this list
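
One quick way to tell whether it is the daemon itself that is unresponsive (rather than the CLI or compose) is to poke the API socket directly and bound the wait with a timeout. A rough check, assuming the default socket path /var/run/docker.sock:

$ timeout 5 docker version || echo "CLI timed out talking to the daemon"
$ curl --max-time 5 --unix-socket /var/run/docker.sock http://localhost/_ping ; echo

If the _ping request prints OK, the daemon is answering and the problem is somewhere else; if it hangs as well, the daemon (or something it depends on) is stuck.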

I followed What to do when all docker commands hang? - #4 by samsgates, but to no avail. First I tried stopping the docker service:

$ sudo systemctl stop docker # hangs…
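
In hindsight, and as the socket discussion further down the thread explains, docker.service is socket-activated, so stopping or killing only the service is usually not enough: docker.socket starts it right back up the next time anything talks to the socket. A sketch of a more complete stop, using standard systemd commands:

$ sudo systemctl stop docker.socket docker.service
$ systemctl status docker.socket docker.service   # both should now report inactive

If even the stop command hangs, the daemon is most likely blocked and systemd is simply waiting for its stop timeout to expire.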

I then tried killing the related processes:

$ ps -A | grep docker
12381 ? 00:00:00 dockerd
3147 ? 00:00:00 docker-proxy
3189 ? 00:00:00 docker-proxy
3222 ? 00:00:00 docker-proxy
3239 ? 00:00:00 docker-proxy
3321 ? 00:00:00 docker-proxy

$ sudo pkill -x docker-proxy
$ sudo pkill -x dockerd
$ ps -A | grep docker
13190 ? 00:00:02 dockerd
14285 ? 00:00:00 docker-proxy
14295 ? 00:00:00 docker-proxy
14347 ? 00:00:00 docker-proxy
14354 ? 00:00:00 docker-proxy
14424 ? 00:00:00 docker-proxy
14435 ? 00:00:00 docker-proxy
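
The fresh PIDs suggest something respawns dockerd as soon as it is killed. Two quick, read-only checks to confirm that it is systemd doing the respawning (purely illustrative, standard ps/systemctl usage):

$ ps -o ppid=,comm= -p "$(pidof dockerd)"    # a parent PID of 1 means systemd (re)started it
$ systemctl show -p Restart docker.service   # prints the Restart= policy of the unit

If systemd is the one restarting it, pkill alone will never stick; the service (and docker.socket) have to be stopped through systemctl instead.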

The problem happened once on a remote server and twice on my local machine. Yesterday I was finally able to overcome it: on my local machine by wiping all files and restarting, and on the remote machine by reinitializing it to a clean Ubuntu install. But the problem keeps happening.

How should I proceed? I can manage to get it working again, but I don’t want it to fail every morning.

OS: Pop!_OS 20.04 LTS (local machine), Ubuntu 20.04 LTS (remote server)
Docker version (20.10.12 on the remote server as well):
Client: Docker Engine - Community
 Version:           20.10.12
 API version:       1.41
 Go version:        go1.16.12
 Git commit:        e91ed57
 Built:             Mon Dec 13 11:45:33 2021
 OS/Arch:           linux/amd64
 Context:           default
 Experimental:      true

Server: Docker Engine - Community
 Engine:
  Version:          20.10.12
  API version:      1.41 (minimum version 1.12)
  Go version:       go1.16.12
  Git commit:       459d0df
  Built:            Mon Dec 13 11:43:41 2021
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.4.12
  GitCommit:        7b11cfaabd73bb80907dd23182b9347b4245eb5d
 runc:
  Version:          1.0.2
  GitCommit:        v1.0.2-0-g52b36a2
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0


I figured out that docker hung because dockerd was stuck in the activating state:

me@pop-os ~ % sudo systemctl status docker       
● docker.service - Docker Application Container Engine
     Loaded: loaded (/lib/systemd/system/docker.service; disabled; vendor preset: enabled)
     Active: activating (start) since Wed 2022-02-23 15:15:39 CET; 26min ago
TriggeredBy: ● docker.socket
       Docs: https://docs.docker.com
   Main PID: 5936 (dockerd)
      Tasks: 55
     Memory: 145.5M
     CGroup: /system.slice/docker.service
             ├─5936 /usr/bin/dockerd -D -H fd:// --containerd=/run/containerd/containerd.sock
             ├─6907 /usr/bin/docker-proxy -proto tcp -host-ip 0.0.0.0 -host-port 9113 -container-ip 192.168.208.2 -container-port 9113
             ├─6965 /usr/bin/docker-proxy -proto tcp -host-ip :: -host-port 9113 -container-ip 192.168.208.2 -container-port 9113
             ├─7015 /usr/bin/docker-proxy -proto tcp -host-ip 0.0.0.0 -host-port 9726 -container-ip 192.168.208.3 -container-port 9726
             └─7026 /usr/bin/docker-proxy -proto tcp -host-ip :: -host-port 9726 -container-ip 192.168.208.3 -container-port 9726

I’m still trying to figure out what went wrong.
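
For anyone else stuck at this point: while docker.service sits in "activating", the daemon's own log lines usually say what it is waiting for. Plain journalctl is enough to see them; nothing here is Docker-specific:

$ sudo journalctl -u docker.service -b --no-pager | tail -n 50    # last messages from the current boot
$ sudo journalctl -u docker.service -f                            # follow the log live while it tries to start
$ sudo journalctl -u containerd.service -b --no-pager | tail -n 20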

Have you fixed it? I encountered the same problem.


I encountered this problem as well after following the instructions at Install Docker Engine on Debian | Docker Docs.

Eventually, I noticed that repeating the systemctl status command showed a new PID every time – so the docker service was starting, failing, and restarting in a loop. During that time, docker commands failed.

The constant restarting was, in my case, caused by an old, out-of-date systemd unit file that I’d installed years ago. The bad one was at /etc/systemd/system/docker.service – which overrides the newer, working one that the proper installation method puts at /lib/systemd/system/docker.service. The old file’s command for starting the docker daemon no longer worked; systemd would run it, the daemon would fall over immediately, and the cycle would start again.

Once I realized what was going on, I removed the old, bad file, ran systemctl daemon-reload, and restarted the docker service. Presto – everything works now.

This won’t work in every situation, but it’s worth checking to see if the way the docker daemon is being started is incorrect, or if you have multiple systemd unit files.
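
If you want to check for the same situation without hunting through directories by hand, systemd can tell you which unit file it actually loaded and whether a local copy shadows the packaged one. A few read-only checks, followed by the reload/restart step described above (paths are the standard Debian/Ubuntu ones):

$ systemctl cat docker.service               # the first comment line is the path of the unit file in use, plus any drop-ins
$ systemd-delta --type=overridden            # lists units where a file in /etc shadows the one in /lib
$ ls -l /etc/systemd/system/docker.service   # only exists if a local override/copy was installed

After removing a stale override:

$ sudo systemctl daemon-reload
$ sudo systemctl restart docker.service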


Thank you for sharing your solution. I just want to add some notes about the systemd unit files.

The service unit is enabled via symlinks under /etc/systemd, so just having a unit file under /lib/systemd should not start the service by itself. The installer will create the service unit file at

/lib/systemd/system/docker.service

but a symbolic link will point to it from

/etc/systemd/system/multi-user.target.wants/docker.service

When you run systemctl disable docker, the docker.service symlink is removed, but the original file isn’t.
Since there is a symlink to a socket file at

/etc/systemd/system/sockets.target.wants/docker.socket

pointing to

/lib/systemd/system/docker.socket

when you run a docker command, the docker service is started by the socket unit, unless you disable that too:

systemctl disable docker.socket

If you had a file at

/etc/systemd/system/docker.service

that could indeed cause problems, but the enablement symlink is probably still where it should be, and the unit is loaded through it. When you run systemctl status docker, the Loaded: line shows the target of the symlink, i.e. the unit file that was actually read.
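
To make the layout above concrete, this is how the symlinks and the socket activation can be inspected, and how both units can be disabled if you really want Docker to stay down. Standard systemctl usage, shown as a sketch rather than a required procedure:

$ systemctl is-enabled docker.service docker.socket
$ ls -l /etc/systemd/system/multi-user.target.wants/docker.service \
        /etc/systemd/system/sockets.target.wants/docker.socket      # the enablement symlinks
$ systemctl status docker | head -n 3                               # the Loaded: line shows which file was read

To stop the daemon and keep socket activation from bringing it back:

$ sudo systemctl disable --now docker.service docker.socket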