Containers with a healthcheck become unhealthy and hang on exit

Issue type: Healthcheck, container operation
OS Version/build: Arch Linux: 5.4.52-1-lts #1 SMP Thu, 16 Jul 2020 19:35:06 +0000 x86_64 GNU/Linux
Docker version 19.03.12-ce, build 48a66213fe
AMD Ryzen 5 3600 6-Core Processor
32 GB RAM

Key Issue:
After my 23 containers have been up for ~5-8 hours, ALL health checks begin to fail, but the containers continue to operate as usual. CPU utilization is consistently below 10%. A snapshot of the stats is below.

When I dig into the reason for the healthcheck failure, I find: Health check exceeded timeout (30s)
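
For anyone trying to reproduce this: the failure reason is recorded under State.Health in the output of docker inspect. Below is a minimal sketch of pulling it out, assuming the JSON has already been captured from docker inspect. The function name summarize_health and the sample data (including the container name example-without-healthcheck) are made up for illustration and are not taken from my system.

```python
import json


def summarize_health(inspect_json):
    """Summarize healthcheck state from `docker inspect` JSON (a list of containers)."""
    summaries = []
    for container in json.loads(inspect_json):
        health = container.get("State", {}).get("Health")
        if not health:
            continue  # this container has no healthcheck configured
        log = health.get("Log") or []
        # The last log entry holds the output of the most recent probe.
        last_output = log[-1].get("Output", "").strip() if log else ""
        summaries.append({
            "name": container.get("Name", "").lstrip("/"),
            "status": health.get("Status"),
            "failing_streak": health.get("FailingStreak", 0),
            "last_output": last_output,
        })
    return summaries


# Made-up sample in the shape of `docker inspect` output, not real data.
SAMPLE_HEALTH = json.dumps([
    {"Name": "/pihole", "State": {"Health": {
        "Status": "unhealthy", "FailingStreak": 4,
        "Log": [{"ExitCode": -1,
                 "Output": "Health check exceeded timeout (30s)"}]}}},
    {"Name": "/example-without-healthcheck", "State": {"Status": "running"}},
])

for row in summarize_health(SAMPLE_HEALTH):
    print(row)
```

On a real host, the input could come from something like docker inspect $(docker ps -q) redirected to a file and read back in.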

If I enter a container (e.g., docker exec -it container /bin/bash), everything looks fine, but when I try to EXIT, it HANGS. The same happens on all containers that have a healthcheck (e.g., pihole). Containers without a healthcheck exit just fine. I am stuck. Does anybody have any ideas? I've been running essentially these same containers for years, on an even less capable machine. I've also tried deleting all of Docker's stores and recreating all containers. Same problem. Thank you so much.

NAME                   CPU % - MEM % - PIDS   MEM USAGE / LIMIT     NET I/O             BLOCK I/O
nginx                  0.00% - 0.09% - 13     13.97MiB / 15.65GiB   965kB / 0B          16.4kB / 0B
sftp                   0.00% - 0.04% - 1      6.73MiB / 15.65GiB    1.06MB / 0B         4.1kB / 168kB
sonarr                 0.00% - 1.13% - 17     181MiB / 15.65GiB     36.8MB / 72.2MB     18.2MB / 34.4MB
jackett                0.03% - 1.23% - 17     197.8MiB / 15.65GiB   17.4MB / 41.1MB     7.53MB / 352kB
plex                   0.21% - 2.16% - 78     346.7MiB / 15.65GiB   0B / 0B             213MB / 2.77MB
tautulli               0.05% - 0.50% - 27     79.68MiB / 15.65GiB   11.8MB / 11.1MB     60.4MB / 69.6kB
xyz                    0.12% - 0.95% - 16     151.8MiB / 15.65GiB   20.1MB / 13.8MB     63.6MB / 22.7MB
client_openvpn         0.01% - 0.09% - 5      14.66MiB / 15.65GiB   7.62MB / 4.9MB      34.5MB / 28.7kB
traefik                0.05% - 0.15% - 18     23.33MiB / 15.65GiB   259MB / 212MB       51.6MB / 0B
nextcloud              0.01% - 0.10% - 6      15.76MiB / 15.65GiB   2.32MB / 0B         50MB / 0B
zigbee2mqtt            0.16% - 0.40% - 22     64.74MiB / 15.65GiB   2.66MB / 404kB      51.8MB / 582kB
Portainer              0.01% - 0.12% - 18     18.63MiB / 15.65GiB   1.35MB / 1.49MB     35.1MB / 12.9MB
cloudflare-ddns        0.00% - 0.05% - 2      7.492MiB / 15.65GiB   2.39MB / 73kB       5.91MB / 4.1kB
pihole                 0.03% - 0.25% - 20     39.92MiB / 15.65GiB   12.8MB / 9.78MB     99.9MB / 138MB
mariadb                0.17% - 0.69% - 16     110.6MiB / 15.65GiB   15.8MB / 3.05MB     93.8MB / 2.33GB
Jupyter                0.00% - 0.36% - 2      57.36MiB / 15.65GiB   2.34MB / 23.8kB     93.3MB / 0B
xyzx                   0.06% - 0.66% - 18     105.5MiB / 15.65GiB   107MB / 8.97MB      95.1MB / 26.6MB
node-red               1.30% - 1.19% - 21     190.1MiB / 15.65GiB   0B / 0B             72.8MB / 61.4kB
home-assistant         1.59% - 1.49% - 78     239.3MiB / 15.65GiB   0B / 0B             122MB / 1.44MB
mosquitto              0.02% - 0.12% - 1      19.33MiB / 15.65GiB   4.05MB / 1.26MB     1.49MB / 0B
Open_VPN_Server        0.00% - 0.03% - 1      4.648MiB / 15.65GiB   1.41MB / 473kB      4.34MB / 0B
backuppc               0.00% - 0.23% - 5      36.29MiB / 15.65GiB   8.47MB / 512kB      46.8MB / 0B
watchtower             0.00% - 0.06% - 18     9.613MiB / 15.65GiB   692kB / 0B          7.7MB / 0B

I see the same issue. Did you find any reason for this behaviour?

Yes, amazingly, I did. I could not live with this, so I tried everything, including completely recreating my Docker installation.

It turned out to be the zigbee2mqtt container. Something was wrong with it, which was fixed in a new version. So the lesson is that the culprit could be any one of your running containers. This one may have been in host mode, or privileged, so you might start by taking a look at those containers.
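
If you want to narrow the list down quickly, here is a rough sketch that flags containers running with host networking or in privileged mode, based on the HostConfig.NetworkMode and HostConfig.Privileged fields of docker inspect output. The function name host_or_privileged and the sample containers are hypothetical, and this only produces a list of suspects; it does not prove that any of them is the cause.

```python
import json


def host_or_privileged(inspect_json):
    """Return (name, reasons) for containers using host networking or privileged mode."""
    flagged = []
    for container in json.loads(inspect_json):
        host_config = container.get("HostConfig", {})
        reasons = []
        if host_config.get("NetworkMode") == "host":
            reasons.append("host network")
        if host_config.get("Privileged"):
            reasons.append("privileged")
        if reasons:
            flagged.append((container.get("Name", "").lstrip("/"), reasons))
    return flagged


# Hypothetical sample in the shape of `docker inspect` output, not real data.
SAMPLE_HOSTCONFIG = json.dumps([
    {"Name": "/container-a",
     "HostConfig": {"NetworkMode": "host", "Privileged": False}},
    {"Name": "/container-b",
     "HostConfig": {"NetworkMode": "bridge", "Privileged": True}},
    {"Name": "/container-c",
     "HostConfig": {"NetworkMode": "bridge", "Privileged": False}},
])

for name, reasons in host_or_privileged(SAMPLE_HOSTCONFIG):
    print(name, ", ".join(reasons))
```

Stopping the flagged containers one at a time and watching whether the health checks recover is one way to test each suspect.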

Ok, thanks for your reply. I have a fix under test now, based on the information in https://github.com/moby/moby/issues/40817 and so far, so good.