Docker Daemon Hangs Due to Network Controller Errors


I am running around 10 microservices in Docker, written in Java and C++. Once a month, the Docker daemon hangs completely and the API (socket) stops responding, although containerd and runc appear to keep running. Once this happens, the Docker daemon cannot be restarted.

In the Docker logs, I see errors related to the network controller when I try to restart the Docker daemon:

Host dockerd[6628]: panic: runtime error: index out of range [0] with length 0
Host dockerd[6628]: goroutine 1 [running]:
Host dockerd[6628]:*controller).reservePools(0x55f2d82822e0?)
Host dockerd[6628]:         /go/src/ +0xb65

As a quick fix, I have found that removing the /var/lib/docker/network/files/local-kv.db file and then restarting the daemon resolves the issue.
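
For reference, the workaround can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not an official procedure: it assumes Docker is managed by systemd under the service name `docker`, and it moves the file aside with a timestamp suffix instead of deleting it, so the broken state file stays available for later inspection. The function names `move_db_aside` and `reset_network_state` are hypothetical.

```python
import shutil
import subprocess
import time
from pathlib import Path
from typing import Optional

# Path taken from the post above.
NETWORK_DB = Path("/var/lib/docker/network/files/local-kv.db")


def move_db_aside(db_path: Path) -> Optional[Path]:
    """Rename the network state file with a timestamp suffix instead of deleting it.

    Returns the backup path, or None if the file does not exist.
    """
    if not db_path.exists():
        return None
    stamp = time.strftime("%Y%m%d-%H%M%S")
    backup = db_path.with_name(f"{db_path.name}.{stamp}.bak")
    shutil.move(str(db_path), str(backup))
    return backup


def reset_network_state(db_path: Path = NETWORK_DB,
                        service: str = "docker") -> Optional[Path]:
    """Stop the daemon, move the state file aside, then start the daemon.

    Assumption: systemd manages the daemon under the given service name.
    Must be run as root.
    """
    # The daemon may already be dead or hung, so a failing stop is tolerated.
    subprocess.run(["systemctl", "stop", service], check=False)
    backup = move_db_aside(db_path)
    # A failing start should be visible, so this call raises on error.
    subprocess.run(["systemctl", "start", service], check=True)
    return backup
```

Calling `reset_network_state()` as root performs the whole sequence and returns the path of the saved copy, which can then be compared against a healthy local-kv.db when looking for the root cause.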

However, I am unable to determine the root cause of the problem or its relation to other factors. Could you provide any insight or help me understand this issue better?


Issue type: Docker daemon hang
OS Version/build: Ubuntu 20
App version: Docker 20.10