Docker exec gets stuck on exit

Hi,
I’m running Docker 19.03.12-ce on Manjaro ARM (aarch64) on a Raspberry Pi 4. I’ve been using this setup for a while, running several containers: pihole, unbound, nextcloud, letsencrypt and airsonic.

However, I’ve been experiencing a weird issue for the past few days. I noticed two things:
– if I try to stop some containers, they keep showing as running
– when I use exec (docker exec -it container sh), it gets stuck on exit.

The containers themselves seem to be working fine, because I can access the applications running inside (pihole via https and unbound via the dig command).

I tried deleting all the containers and recreating them, in case the problem was with the images. I only kept the volumes for the data. The problem still persists after that.

I checked systemctl status docker and saw some errors:

Sep 01 19:15:49 picloud dockerd[383]: time="2020-09-01T19:15:49.786217800+02:00" level=warning msg="Health check for container 0cdd14ad024d820e89a6b73d8aef68ea697d950c246ee90d49cb63b6aa657e22 error: context deadline exceeded"
Sep 01 19:15:52 picloud dockerd[383]: time="2020-09-01T19:15:52.489964718+02:00" level=warning msg="Health check for container cc28ad563148fc3c31b778c7819e801005ce98f242e6f769d61a959e251643c2 error: context deadline exceeded"

I checked the container IDs; they belong to pihole and unbound respectively.
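For anyone who wants to repeat that lookup, here is a minimal sketch that pulls the container IDs out of the health check warnings. The function name extract_unhealthy_ids is made up for this example, and it assumes the log lines have the same "Health check for container <64-hex-id>" shape as the two lines quoted above.

```shell
# extract_unhealthy_ids (hypothetical helper name): read dockerd log lines on
# stdin and print each unique 64-character container ID that appears in a
# "Health check for container ..." message.
extract_unhealthy_ids() {
    grep -o 'Health check for container [0-9a-f]\{64\}' \
        | awk '{print $5}' \
        | sort -u
}

# Possible use on the host (needs journald, GNU xargs and a running daemon):
#   journalctl -u docker --since today | extract_unhealthy_ids |
#       xargs -r docker inspect --format '{{.Name}} {{.State.Health.Status}}'
```

The commented pipeline maps each ID to its container name and current health status. It only gives useful output for containers that define a health check.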

In both containers I can get a shell with docker exec -it container sh and run commands. But as soon as I try to exit, it gets stuck and I have to kill the docker exec process. When this happens, both containers show as unhealthy.

Last night I forced a restart of both the pihole and unbound containers. By forced I mean I had to stop them by killing the containerd processes. At first I was able to use docker exec -it to execute commands and exit without any problem, and the containers were showing as healthy too. However, this morning both containers were showing as unhealthy again and docker exec -it gets stuck on exit.

Any suggestions about what I should check next to investigate? Thanks.

System information

$ docker version
Client:
 Version:           19.03.12-ce
 API version:       1.40
 Go version:        go1.14.5
 Git commit:        48a66213fe
 Built:             Sat Jul 18 02:40:17 2020
 OS/Arch:           linux/arm64
 Experimental:      false

Server:
 Engine:
  Version:          19.03.12-ce
  API version:      1.40 (minimum version 1.12)
  Go version:       go1.14.5
  Git commit:       48a66213fe
  Built:            Sat Jul 18 02:39:40 2020
  OS/Arch:          linux/arm64
  Experimental:     false
 containerd:
  Version:          v1.4.0.m
  GitCommit:        09814d48d50816305a8e6c1a4ae3e2bcc4ba725a.m
 runc:
  Version:          1.0.0-rc92
  GitCommit:        ff819c7e9184c13b7c2607fe6c30ae19403a7aff
 docker-init:
  Version:          0.18.0
  GitCommit:        fec3683
$
$ docker info
Client:
 Debug Mode: false

Server:
 Containers: 6
  Running: 5
  Paused: 0
  Stopped: 1
 Images: 30
 Server Version: 19.03.12-ce
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Native Overlay Diff: true
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 09814d48d50816305a8e6c1a4ae3e2bcc4ba725a.m
 runc version: ff819c7e9184c13b7c2607fe6c30ae19403a7aff
 init version: fec3683
 Security Options:
  seccomp
   Profile: default
 Kernel Version: 5.4.59-1-MANJARO-ARM
 Operating System: Manjaro ARM
 OSType: linux
 Architecture: aarch64
 CPUs: 4
 Total Memory: 3.758GiB
 Name: picloud
 ID: 7VJ7:5I7W:7D5N:OWTP:I4ZH:AFDM:TYNF:E4IP:MWJC:E2L6:LVZQ:RTCE
 Docker Root Dir: /data/docker
 Debug Mode: false
 Registry: https://index.docker.io/v1/
 Labels:
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false

WARNING: No swap limit support

Hi, I also encountered the same problem. One of my containers (gitlab) becomes unhealthy after 4 hours, and systemctl status docker gives me exactly the same logs. Also, I can’t quit the prompt after running docker exec -it ....

I also noticed some surprising behavior: when I try to run a command with sudo in another terminal while docker is in the error state, the password is always rejected, although I’m sure it’s the right one. However, I was unable to reproduce that afterwards.

I tried downgrading docker as far back as 18.09.8, with the same result. I’m on Manjaro too, on x86_64.

I set up two VMs to see if I could reproduce the problem with the same containers:

– CentOS 7 x86_64, kernel 3.10.0, docker-ce 19.03.12
– Manjaro x86_64, kernel 5.4.60, docker-ce 19.03.12

Like @kiwhacks, I also have the same problem with Manjaro x86_64. After about 4-5 hours my pihole and unbound containers became unhealthy. At that point running docker exec -it ... showed the same problem.

On the other hand, docker on CentOS is working fine. The containers have been up for about 10 hours now with no sign of a problem. They are still healthy and docker exec exits without trouble, both with and without the -it flag.
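To pin down when a host tips over, something like the following could be left running on each VM. It is only a sketch: probe and PROBE_TIMEOUT are names invented here, it relies on GNU timeout, the container name pihole in the usage comment is a guess at what the container is called, and it assumes a non-interactive docker exec hangs the same way the -it one does.

```shell
# probe CMD [ARGS...] (hypothetical helper): run CMD with a time limit and
# print "ok", "hung" or "failed(N)". PROBE_TIMEOUT is the limit in seconds
# (default 10); a process that ignores SIGTERM is SIGKILLed 5 seconds later.
probe() {
    timeout -k 5 "${PROBE_TIMEOUT:-10}" "$@" >/dev/null 2>&1 </dev/null
    status=$?
    case "$status" in
        0)       echo ok ;;
        124|137) echo hung ;;   # 124: timed out; 137: killed with SIGKILL
        *)       echo "failed($status)" ;;
    esac
}

# Possible use, one timestamped line per minute:
#   while sleep 60; do
#       echo "$(date '+%F %T') $(probe docker exec pihole true)"
#   done
```

The first "hung" line in the output gives a rough timestamp to compare against the health check warnings in the docker logs. Note that 137 can also mean the command was killed by something other than the timeout, so treat it as a hint rather than proof.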

At this point I think it might be something specific to Manjaro. The first thing that comes to mind is the kernel version. Both Manjaro installs have almost the same kernel: 5.4.60 for x86_64 and 5.4.59 for aarch64.

I will try switching to an older kernel on Manjaro, like 4.9.

Manjaro x86_64 with kernel 4.9.233-2-MANJARO and docker 19.03.12-ce has the same problem. I don’t think I’ll be able to test with that kernel on ARM.

Meanwhile, the CentOS VM is still OK. The containers that become unhealthy on Manjaro are still healthy on CentOS, and there is no problem with docker exec.

I have also been experiencing this issue for the last few weeks, running Linux 5.8.6-1-MANJARO. When a container has been running long enough and I use docker exec, the container fails to exit after the process exits. Interrupts are ignored, and docker stop and docker kill do not work as they should, but systemctl restart docker does work (eventually) and restores normal functionality.