Dockerd crashing in libnetwork on launch

I’ve recently started seeing a weird dockerd crash in libnetwork on startup. Downgrading didn’t help, and neither did uninstalling the NVIDIA container tools. The only thing that fixed the crash was completely wiping /var/lib/docker, but the problem comes back on every system reboot. The only container I run on this daemon is Jellyfin.
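For reference, this is roughly how I wiped the state (from memory; note that it removes all images, containers, and volumes):

$ sudo systemctl stop docker docker.socket containerd
$ sudo rm -rf /var/lib/docker
$ sudo systemctl start containerd docker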

Below is the log up to the crash when I start dockerd manually; the exact same crash occurs when the systemd service starts it. I’ve also included some info about my system at the end. Let me know if you need anything else.

Thanks.

$ sudo dockerd
INFO[2023-02-19T17:29:03.455357713-05:00] Starting up
INFO[2023-02-19T17:29:03.456995192-05:00] [core] [Channel #1] Channel created           module=grpc
INFO[2023-02-19T17:29:03.457036179-05:00] [core] [Channel #1] original dial target is: "unix:///run/containerd/containerd.sock"  module=grpc
INFO[2023-02-19T17:29:03.457075584-05:00] [core] [Channel #1] parsed dial target is: {Scheme:unix Authority: Endpoint:run/containerd/containerd.sock URL:{Scheme:unix Opaque: User: Host: Path:/run/containerd/containerd.sock RawPath: OmitHost:false ForceQuery:false RawQuery: Fragment: RawFragment:}}  module=grpc
INFO[2023-02-19T17:29:03.457099058-05:00] [core] [Channel #1] Channel authority set to "localhost"  module=grpc
INFO[2023-02-19T17:29:03.457237128-05:00] [core] [Channel #1] Resolver state updated: {
  "Addresses": [
    {
      "Addr": "/run/containerd/containerd.sock",
      "ServerName": "",
      "Attributes": {},
      "BalancerAttributes": null,
      "Type": 0,
      "Metadata": null
    }
  ],
  "ServiceConfig": null,
  "Attributes": null
} (resolver returned new addresses)  module=grpc
INFO[2023-02-19T17:29:03.457335134-05:00] [core] [Channel #1] Channel switches to new LB policy "pick_first"  module=grpc
INFO[2023-02-19T17:29:03.457397451-05:00] [core] [Channel #1 SubChannel #2] Subchannel created  module=grpc
INFO[2023-02-19T17:29:03.457475959-05:00] [core] [Channel #1 SubChannel #2] Subchannel Connectivity change to CONNECTING  module=grpc
INFO[2023-02-19T17:29:03.457505585-05:00] [core] [Channel #1 SubChannel #2] Subchannel picks a new address "/run/containerd/containerd.sock" to connect  module=grpc
INFO[2023-02-19T17:29:03.457674493-05:00] [core] [Channel #1] Channel Connectivity change to CONNECTING  module=grpc
INFO[2023-02-19T17:29:03.457919766-05:00] [core] [Channel #1 SubChannel #2] Subchannel Connectivity change to READY  module=grpc
INFO[2023-02-19T17:29:03.457944513-05:00] [core] [Channel #1] Channel Connectivity change to READY  module=grpc
INFO[2023-02-19T17:29:03.458479222-05:00] [core] [Channel #4] Channel created           module=grpc
INFO[2023-02-19T17:29:03.458491565-05:00] [core] [Channel #4] original dial target is: "unix:///run/containerd/containerd.sock"  module=grpc
INFO[2023-02-19T17:29:03.458502737-05:00] [core] [Channel #4] parsed dial target is: {Scheme:unix Authority: Endpoint:run/containerd/containerd.sock URL:{Scheme:unix Opaque: User: Host: Path:/run/containerd/containerd.sock RawPath: OmitHost:false ForceQuery:false RawQuery: Fragment: RawFragment:}}  module=grpc
INFO[2023-02-19T17:29:03.458510812-05:00] [core] [Channel #4] Channel authority set to "localhost"  module=grpc
INFO[2023-02-19T17:29:03.458532703-05:00] [core] [Channel #4] Resolver state updated: {
  "Addresses": [
    {
      "Addr": "/run/containerd/containerd.sock",
      "ServerName": "",
      "Attributes": {},
      "BalancerAttributes": null,
      "Type": 0,
      "Metadata": null
    }
  ],
  "ServiceConfig": null,
  "Attributes": null
} (resolver returned new addresses)  module=grpc
INFO[2023-02-19T17:29:03.458550045-05:00] [core] [Channel #4] Channel switches to new LB policy "pick_first"  module=grpc
INFO[2023-02-19T17:29:03.458564683-05:00] [core] [Channel #4 SubChannel #5] Subchannel created  module=grpc
INFO[2023-02-19T17:29:03.458578569-05:00] [core] [Channel #4 SubChannel #5] Subchannel Connectivity change to CONNECTING  module=grpc
INFO[2023-02-19T17:29:03.458598016-05:00] [core] [Channel #4 SubChannel #5] Subchannel picks a new address "/run/containerd/containerd.sock" to connect  module=grpc
INFO[2023-02-19T17:29:03.458637110-05:00] [core] [Channel #4] Channel Connectivity change to CONNECTING  module=grpc
INFO[2023-02-19T17:29:03.458767345-05:00] [core] [Channel #4 SubChannel #5] Subchannel Connectivity change to READY  module=grpc
INFO[2023-02-19T17:29:03.458800679-05:00] [core] [Channel #4] Channel Connectivity change to READY  module=grpc
INFO[2023-02-19T17:29:03.474131360-05:00] [graphdriver] using prior storage driver: overlay2
INFO[2023-02-19T17:29:03.479738078-05:00] Loading containers: start.
INFO[2023-02-19T17:29:03.672352245-05:00] failed to read ipv6 net.ipv6.conf.<bridge>.accept_ra  bridge=docker0 syspath=/proc/sys/net/ipv6/conf/docker0/accept_ra
panic: runtime error: index out of range [0] with length 0

goroutine 1 [running]:
github.com/docker/docker/libnetwork.(*controller).reservePools(0x5626885d0300?)
	/go/src/github.com/docker/docker/libnetwork/controller.go:862 +0xb65
github.com/docker/docker/libnetwork.New({0xc000628230, 0x8, 0xe})
	/go/src/github.com/docker/docker/libnetwork/controller.go:233 +0x6bc
github.com/docker/docker/daemon.(*Daemon).initNetworkController(0xc000b3c000, 0xc000ccd590)
	/go/src/github.com/docker/docker/daemon/daemon_unix.go:851 +0x4e
github.com/docker/docker/daemon.(*Daemon).restore(0xc000b3c000)
	/go/src/github.com/docker/docker/daemon/daemon.go:478 +0x51d
github.com/docker/docker/daemon.NewDaemon({0x5626885e4b38?, 0xc000ea6640}, 0xc0008f5b80, 0xc0005d2090)
	/go/src/github.com/docker/docker/daemon/daemon.go:1085 +0x2b79
main.(*DaemonCli).start(0xc000cb8420, 0xc000b39490)
	/go/src/github.com/docker/docker/cmd/dockerd/daemon.go:200 +0x9f6
main.runDaemon(...)
	/go/src/github.com/docker/docker/cmd/dockerd/docker_unix.go:14
main.newDaemonCommand.func1(0xc000960c00?, {0x562689965730?, 0x0?, 0x0?})
	/go/src/github.com/docker/docker/cmd/dockerd/docker.go:38 +0x5e
github.com/docker/docker/vendor/github.com/spf13/cobra.(*Command).execute(0xc000960c00, {0xc000052240, 0x0, 0x0})
	/go/src/github.com/docker/docker/vendor/github.com/spf13/cobra/command.go:916 +0x862
github.com/docker/docker/vendor/github.com/spf13/cobra.(*Command).ExecuteC(0xc000960c00)
	/go/src/github.com/docker/docker/vendor/github.com/spf13/cobra/command.go:1044 +0x3bd
github.com/docker/docker/vendor/github.com/spf13/cobra.(*Command).Execute(...)
	/go/src/github.com/docker/docker/vendor/github.com/spf13/cobra/command.go:968
main.main()
	/go/src/github.com/docker/docker/cmd/dockerd/docker.go:102 +0x15d

System info:

$ docker version
Client: Docker Engine - Community
 Version:           23.0.1
 API version:       1.42
 Go version:        go1.19.5
 Git commit:        a5ee5b1
 Built:             Thu Feb  9 19:47:01 2023
 OS/Arch:           linux/amd64
 Context:           default

Server: Docker Engine - Community
 Engine:
  Version:          23.0.1
  API version:      1.42 (minimum version 1.12)
  Go version:       go1.19.5
  Git commit:       bc3805a
  Built:            Thu Feb  9 19:47:01 2023
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.6.18
  GitCommit:        2456e983eb9e37e47538f59ea18f2043c9a73640
 runc:
  Version:          1.1.4
  GitCommit:        v1.1.4-0-g5fd4c4d
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0
$ sudo apt list --installed | grep -E "docker|container"

WARNING: apt does not have a stable CLI interface. Use with caution in scripts.

containerd.io/jammy,now 1.6.18-1 amd64 [installed]
docker-buildx-plugin/jammy,now 0.10.2-1~ubuntu.22.04~jammy amd64 [installed]
docker-ce-cli/jammy,now 5:23.0.1-1~ubuntu.22.04~jammy amd64 [installed]
docker-ce-rootless-extras/jammy,now 5:23.0.1-1~ubuntu.22.04~jammy amd64 [installed,automatic]
docker-ce/jammy,now 5:23.0.1-1~ubuntu.22.04~jammy amd64 [installed]
docker-compose-plugin/jammy,now 2.16.0-1~ubuntu.22.04~jammy amd64 [installed]
docker-scan-plugin/jammy,now 0.23.0~ubuntu-jammy amd64 [installed,automatic]
libnvidia-container-tools/bionic,now 1.12.0-1 amd64 [installed,automatic]
libnvidia-container1/bionic,now 1.12.0-1 amd64 [installed,automatic]
nvidia-container-toolkit-base/bionic,now 1.12.0-1 amd64 [installed]
nvidia-container-toolkit/bionic,now 1.12.0-1 amd64 [installed]
$ cat /etc/lsb-release
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=22.04
DISTRIB_CODENAME=jammy
DISTRIB_DESCRIPTION="Ubuntu 22.04.2 LTS"
$ uname -a
Linux carthage.lan 5.15.0-60-generic #66-Ubuntu SMP Fri Jan 20 14:29:49 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux

Did you enable IPv6 for Docker containers? Do you need it?

It looks like a bug in libnetwork’s detection of IPv6 kernel parameters. Docker 23.0.0 introduced a new libnetwork feature that adds the DOCKER-USER chain to the IPv6 iptables rules; that change and related ones could be behind your issue.
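You could also check whether the sysctl path from your log exists on your host at all; if /proc/sys/net/ipv6 is missing entirely, the kernel has IPv6 disabled, which might be the state libnetwork is tripping over (just a guess on my part, not verified):

$ ls /proc/sys/net/ipv6/conf/    # one entry per interface; the directory is absent if the kernel has IPv6 disabled
$ cat /proc/sys/net/ipv6/conf/docker0/accept_ra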

Hmm, I don’t think so. How would I check if IPv6 is enabled? And how do I disable it?

Thanks for the full stack trace! This does look like a new regression in 23.0.1, though I wonder why you’re the first to see it. You /might/ be able to set ipv6.disable=1 on the kernel command line to work around it, but I’m not sure what preconditions we have on reading this file.
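To answer your questions above: on the host side, something like the following should tell you whether the kernel has IPv6 enabled, and on the Docker side container IPv6 is off unless you explicitly set "ipv6": true in /etc/docker/daemon.json. The kernel command line change would look roughly like this on Ubuntu (untested as a workaround for this particular panic):

$ sysctl net.ipv6.conf.all.disable_ipv6    # 0 = IPv6 enabled, 1 = disabled via sysctl
$ ip -6 addr show                          # lists IPv6 addresses, if any are configured
$ cat /etc/docker/daemon.json              # if it exists, look for "ipv6": true
$ sudo nano /etc/default/grub              # add ipv6.disable=1 to GRUB_CMDLINE_LINUX
$ sudo update-grub
$ sudo reboot
$ cat /proc/cmdline                        # after reboot, confirm ipv6.disable=1 is present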

Anyway, I’ll open a ticket upstream to investigate. If you can figure out what triggers this on your system, or manage to reproduce it in a clean VM, please let us know here!

The upstream issue is at Panic in libnetwork during daemon start (failed to read /proc/sys/net/ipv6/conf/docker0/accept_ra) · Issue #45057 · moby/moby · GitHub – please feel free to add more details there as well, such as your docker info output.
