After upgrading to Docker 23.0.0, the Docker service crashes every night

We are running Hudu, which uses Docker containers, on Ubuntu 20.04.
We do have resolvconf set up, so resolv.conf does list DNS servers.
Since upgrading to Docker 23, the Docker service crashes every night at almost exactly the same time with the following errors:

Feb 12 22:00:00 hostname dockerd[984]: panic: runtime error: invalid memory address or nil pointer dereference
Feb 12 22:00:00 hostname dockerd[984]: [signal SIGSEGV: segmentation violation code=0x1 addr=0x20 pc=0x558a19bb87ff]
Feb 12 22:00:00 hostname dockerd[984]: goroutine 3330 [running]:
Feb 12 22:00:00 hostname dockerd[984]: github.com/docker/docker/libnetwork.(*resolver).ServeDNS(0xc000e5eee0, {0x558a1b14f0c0, 0xc000f27a00}, 0xc0004c94d0)
Feb 12 22:00:00 hostname dockerd[984]:         /go/src/github.com/docker/docker/libnetwork/resolver.go:399 +0x6ff
Feb 12 22:00:00 hostname dockerd[984]: github.com/docker/docker/vendor/github.com/miekg/dns.(*Server).serveDNS(0xc000fcd440, {0xc000840400, 0x1c, 0x200}, 0xc000f27a00)
Feb 12 22:00:00 hostname dockerd[984]:         /go/src/github.com/docker/docker/vendor/github.com/miekg/dns/server.go:651 +0x4e2
Feb 12 22:00:00 hostname dockerd[984]: github.com/docker/docker/vendor/github.com/miekg/dns.(*Server).serveUDPPacket(0xc000fcd440, 0x0?, {0xc000840400, 0x1c, 0x200}, {0x558a1b14bad0?, 0xc000014ee0}, 0xc000a34f40, {0x0, 0x0})
Feb 12 22:00:00 hostname dockerd[984]:         /go/src/github.com/docker/docker/vendor/github.com/miekg/dns/server.go:591 +0x185
Feb 12 22:00:00 hostname dockerd[984]: created by github.com/docker/docker/vendor/github.com/miekg/dns.(*Server).serveUDP
Feb 12 22:00:00 hostname dockerd[984]:         /go/src/github.com/docker/docker/vendor/github.com/miekg/dns/server.go:521 +0x485
Feb 12 22:00:00 hostname systemd[1]: docker.service: Main process exited, code=exited, status=2/INVALIDARGUMENT
Feb 12 22:00:00 hostname systemd[1]: docker.service: Failed with result 'exit-code'.
Feb 12 22:00:02 hostname systemd[1]: docker.service: Scheduled restart job, restart counter is at 1.
Feb 12 22:00:02 hostname systemd[1]: Stopped Docker Application Container Engine.
Feb 12 22:00:02 hostname systemd[1]: Starting Docker Application Container Engine...
Feb 12 22:00:02 hostname dockerd[434619]: time="2023-02-12T22:00:02.637668789-05:00" level=info msg="Starting up"
Feb 12 22:00:02 hostname dockerd[434619]: time="2023-02-12T22:00:02.641338075-05:00" level=info msg="[core] [Channel #1] Channel created" module=grpc
Feb 12 22:00:02 hostname dockerd[434619]: time="2023-02-12T22:00:02.641549580-05:00" level=info msg="[core] [Channel #1] original dial target is: \"unix:///run/containerd/containerd.sock\"" module=grpc
Feb 12 22:00:02 hostname dockerd[434619]: time="2023-02-12T22:00:02.641588081-05:00" level=info msg="[core] [Channel #1] parsed dial target is: {Scheme:unix Authority: Endpoint:run/containerd/containerd.sock URL:{Scheme:unix Opaque: User: Host: Path:/run/containerd/containerd.sock RawPath: OmitHost:false ForceQuery:false RawQuery: Fragment: RawFragment:}}" module=grpc
Feb 12 22:00:02 hostname dockerd[434619]: time="2023-02-12T22:00:02.641758085-05:00" level=info msg="[core] [Channel #1] Channel authority set to \"localhost\"" module=grpc
Feb 12 22:00:02 hostname dockerd[434619]: time="2023-02-12T22:00:02.642009591-05:00" level=info msg="[core] [Channel #1] Resolver state updated: {\n  \"Addresses\": [\n    {\n      \"Addr\": \"/run/containerd/containerd.sock\",\n      \"ServerName\": \"\",\n      \"Attributes\": {},\n      \"BalancerAttributes\": null,\n      \"Type\": 0,\n      \"Metadata\": null\n    }\n  ],\n  \"ServiceConfig\": null,\n  \"Attributes\": null\n} (resolver returned new addresses)" module=grpc
Feb 12 22:00:02 hostname dockerd[434619]: time="2023-02-12T22:00:02.642320498-05:00" level=info msg="[core] [Channel #1] Channel switches to new LB policy \"pick_first\"" module=grpc
Feb 12 22:00:02 hostname dockerd[434619]: time="2023-02-12T22:00:02.642410400-05:00" level=info msg="[core] [Channel #1 SubChannel #2] Subchannel created" module=grpc
Feb 12 22:00:02 hostname dockerd[434619]: time="2023-02-12T22:00:02.642486402-05:00" level=info msg="[core] [Channel #1 SubChannel #2] Subchannel Connectivity change to CONNECTING" module=grpc
Feb 12 22:00:02 hostname dockerd[434619]: time="2023-02-12T22:00:02.642520303-05:00" level=info msg="[core] [Channel #1 SubChannel #2] Subchannel picks a new address \"/run/containerd/containerd.sock\" to connect" module=grpc
Feb 12 22:00:02 hostname dockerd[434619]: time="2023-02-12T22:00:02.642630505-05:00" level=info msg="[core] [Channel #1] Channel Connectivity change to CONNECTING" module=grpc
Feb 12 22:00:02 hostname dockerd[434619]: time="2023-02-12T22:00:02.643244020-05:00" level=info msg="[core] [Channel #1 SubChannel #2] Subchannel Connectivity change to READY" module=grpc
Feb 12 22:00:02 hostname dockerd[434619]: time="2023-02-12T22:00:02.643286221-05:00" level=info msg="[core] [Channel #1] Channel Connectivity change to READY" module=grpc
Feb 12 22:00:02 hostname dockerd[434619]: time="2023-02-12T22:00:02.644112440-05:00" level=info msg="[core] [Channel #4] Channel created" module=grpc
Feb 12 22:00:02 hostname dockerd[434619]: time="2023-02-12T22:00:02.644137441-05:00" level=info msg="[core] [Channel #4] original dial target is: \"unix:///run/containerd/containerd.sock\"" module=grpc
Feb 12 22:00:02 hostname dockerd[434619]: time="2023-02-12T22:00:02.644166641-05:00" level=info msg="[core] [Channel #4] parsed dial target is: {Scheme:unix Authority: Endpoint:run/containerd/containerd.sock URL:{Scheme:unix Opaque: User: Host: Path:/run/containerd/containerd.sock RawPath: OmitHost:false ForceQuery:false RawQuery: Fragment: RawFragment:}}" module=grpc
Feb 12 22:00:02 hostname dockerd[434619]: time="2023-02-12T22:00:02.644189442-05:00" level=info msg="[core] [Channel #4] Channel authority set to \"localhost\"" module=grpc
Feb 12 22:00:02 hostname dockerd[434619]: time="2023-02-12T22:00:02.644270444-05:00" level=info msg="[core] [Channel #4] Resolver state updated: {\n  \"Addresses\": [\n    {\n      \"Addr\": \"/run/containerd/containerd.sock\",\n      \"ServerName\": \"\",\n      \"Attributes\": {},\n      \"BalancerAttributes\": null,\n      \"Type\": 0,\n      \"Metadata\": null\n    }\n  ],\n  \"ServiceConfig\": null,\n  \"Attributes\": null\n} (resolver returned new addresses)" module=grpc
Feb 12 22:00:02 hostname dockerd[434619]: time="2023-02-12T22:00:02.644307045-05:00" level=info msg="[core] [Channel #4] Channel switches to new LB policy \"pick_first\"" module=grpc
Feb 12 22:00:02 hostname dockerd[434619]: time="2023-02-12T22:00:02.644348446-05:00" level=info msg="[core] [Channel #4 SubChannel #5] Subchannel created" module=grpc
Feb 12 22:00:02 hostname dockerd[434619]: time="2023-02-12T22:00:02.644377846-05:00" level=info msg="[core] [Channel #4 SubChannel #5] Subchannel Connectivity change to CONNECTING" module=grpc
Feb 12 22:00:02 hostname dockerd[434619]: time="2023-02-12T22:00:02.644411947-05:00" level=info msg="[core] [Channel #4 SubChannel #5] Subchannel picks a new address \"/run/containerd/containerd.sock\" to connect" module=grpc
Feb 12 22:00:02 hostname dockerd[434619]: time="2023-02-12T22:00:02.644441248-05:00" level=info msg="[core] [Channel #4] Channel Connectivity change to CONNECTING" module=grpc
Feb 12 22:00:02 hostname dockerd[434619]: time="2023-02-12T22:00:02.644715054-05:00" level=info msg="[core] [Channel #4 SubChannel #5] Subchannel Connectivity change to READY" module=grpc
Feb 12 22:00:02 hostname dockerd[434619]: time="2023-02-12T22:00:02.644746055-05:00" level=info msg="[core] [Channel #4] Channel Connectivity change to READY" module=grpc
Feb 12 22:00:02 hostname dockerd[434619]: time="2023-02-12T22:00:02.731826797-05:00" level=info msg="[graphdriver] using prior storage driver: overlay2"
Feb 12 22:00:02 hostname dockerd[434619]: time="2023-02-12T22:00:02.762034905-05:00" level=info msg="Loading containers: start."

After the restart, the Hudu containers cannot be started anymore, so we have to run:
cd ~/hudu2 && sudo docker compose down && sudo docker compose pull && sudo docker compose up -d

Feb 12 22:00:03 hostname dockerd[434619]: time="2023-02-12T22:00:03.090214901-05:00" level=info msg="ignoring event" container=7008c9991726ea6202af9748cc51f8d47b46fee082b5f4fb11b11a788bbeb3be module=libcontainerd namespace=moby topic=/tasks/delete type="*events.TaskDelete"
Feb 12 22:00:03 hostname dockerd[434619]: time="2023-02-12T22:00:03.103182105-05:00" level=warning msg="ShouldRestart failed, container will not be restarted" container=7008c9991726ea6202af9748cc51f8d47b46fee082b5f4fb11b11a788bbeb3be daemonShuttingDown=false error="restart canceled" execDuration=12h55m35.317064705s exitStatus="{0 false 2023-02-13 03:00:03.076715585 +0000 UTC}" hasBeenManuallyStopped=true restartCount=0
Feb 12 22:00:06 hostname dockerd[434619]: time="2023-02-12T22:00:06.355353913-05:00" level=info msg="ignoring event" container=96e66941ce521f892e48a3b057957ccfa26a6052793b9369fdcec14e08b89f23 module=libcontainerd namespace=moby topic=/tasks/delete type="*events.TaskDelete"
Feb 12 22:00:06 hostname dockerd[434619]: time="2023-02-12T22:00:06.365663454-05:00" level=warning msg="ShouldRestart failed, container will not be restarted" container=96e66941ce521f892e48a3b057957ccfa26a6052793b9369fdcec14e08b89f23 daemonShuttingDown=false error="restart canceled" execDuration=12h55m38.477236854s exitStatus="{0 false 2023-02-13 03:00:06.344592561 +0000 UTC}" hasBeenManuallyStopped=true restartCount=0
Feb 12 22:00:06 hostname dockerd[434619]: time="2023-02-12T22:00:06.399934957-05:00" level=info msg="ignoring event" container=1ced9073ea6591fa64849e5e318a55d235bec3cbbaad1ccb03aa64a724fae26a module=libcontainerd namespace=moby topic=/tasks/delete type="*events.TaskDelete"
Feb 12 22:00:06 hostname dockerd[434619]: time="2023-02-12T22:00:06.427548303-05:00" level=warning msg="ShouldRestart failed, container will not be restarted" container=1ced9073ea6591fa64849e5e318a55d235bec3cbbaad1ccb03aa64a724fae26a daemonShuttingDown=false error="restart canceled" execDuration=12h55m38.437140103s exitStatus="{0 false 2023-02-13 03:00:06.368834828 +0000 UTC}" hasBeenManuallyStopped=true restartCount=0
Feb 12 22:00:06 hostname dockerd[434619]: time="2023-02-12T22:00:06.743937513-05:00" level=info msg="ignoring event" container=5d639c52b6491790446f24bf798617632608efa8864ee9295a830b71ad565ef6 module=libcontainerd namespace=moby topic=/tasks/delete type="*events.TaskDelete"
Feb 12 22:00:06 hostname dockerd[434619]: time="2023-02-12T22:00:06.753757042-05:00" level=warning msg="ShouldRestart failed, container will not be restarted" container=5d639c52b6491790446f24bf798617632608efa8864ee9295a830b71ad565ef6 daemonShuttingDown=false error="restart canceled" execDuration=12h55m38.798670342s exitStatus="{0 false 2023-02-13 03:00:06.723972545 +0000 UTC}" hasBeenManuallyStopped=true restartCount=0
Feb 12 22:00:12 hostname dockerd[434619]: time="2023-02-12T22:00:12.984632382-05:00" level=info msg="Container failed to exit within 10s of signal 15 - using the force" container=297f6d7823811484923b79d0888e922fe949a51eff800f411dfade2f9077301d
Feb 12 22:00:13 hostname dockerd[434619]: time="2023-02-12T22:00:13.066621097-05:00" level=info msg="ignoring event" container=297f6d7823811484923b79d0888e922fe949a51eff800f411dfade2f9077301d module=libcontainerd namespace=moby topic=/tasks/delete type="*events.TaskDelete"
Feb 12 22:00:13 hostname dockerd[434619]: time="2023-02-12T22:00:13.076592430-05:00" level=warning msg="ShouldRestart failed, container will not be restarted" container=297f6d7823811484923b79d0888e922fe949a51eff800f411dfade2f9077301d daemonShuttingDown=false error="restart canceled" execDuration=12h55m45.068485129s exitStatus="{137 false 2023-02-13 03:00:13.056640064 +0000 UTC}" hasBeenManuallyStopped=true restartCount=0
Feb 12 22:00:13 hostname dockerd[434619]: time="2023-02-12T22:00:13.366050992-05:00" level=info msg="Removing stale sandbox 002f92584605c03c08059445eb7e7fdb9ecec4da79f7949acc06e6101fe77b01 (7008c9991726ea6202af9748cc51f8d47b46fee082b5f4fb11b11a788bbeb3be)"
Feb 12 22:00:13 hostname dockerd[434619]: time="2023-02-12T22:00:13.375574514-05:00" level=warning msg="Error (Unable to complete atomic operation, key modified) deleting object [endpoint 0906a58a168f88cbd50a3954a2b1f93e6dde8c5f04e105ed88df621e39444459 68f7b942574d45f4d9e67856eb826901f60764f3b8f8b808ff16cfc90605dfc2], retrying...."
Feb 12 22:00:13 hostname dockerd[434619]: time="2023-02-12T22:00:13.483620638-05:00" level=info msg="Removing stale sandbox 21f45f4ecdbbb985107570dbe6a7b4d06932800348640cdacb67893cc4c6e675 (5d639c52b6491790446f24bf798617632608efa8864ee9295a830b71ad565ef6)"
Feb 12 22:00:13 hostname dockerd[434619]: time="2023-02-12T22:00:13.530785540-05:00" level=warning msg="Error (Unable to complete atomic operation, key modified) deleting object [endpoint 0906a58a168f88cbd50a3954a2b1f93e6dde8c5f04e105ed88df621e39444459 c74e2d065e628db1668832d7eca3a3334ce1affc99b89a4d526ea4f8d3847569], retrying...."
Feb 12 22:00:13 hostname dockerd[434619]: time="2023-02-12T22:00:13.644773503-05:00" level=info msg="Removing stale sandbox 2f31854075f09fbb0200926c72b7100fc1befb49ea5e4d0f1e0b5e779e024695 (297f6d7823811484923b79d0888e922fe949a51eff800f411dfade2f9077301d)"
Feb 12 22:00:13 hostname dockerd[434619]: time="2023-02-12T22:00:13.657204193-05:00" level=warning msg="Error (Unable to complete atomic operation, key modified) deleting object [endpoint 0906a58a168f88cbd50a3954a2b1f93e6dde8c5f04e105ed88df621e39444459 f48609718aa65e68e21ab3fba9bfcb07b1ce529a3b0a793257301682e6066fda], retrying...."
Feb 12 22:00:13 hostname dockerd[434619]: time="2023-02-12T22:00:13.794109291-05:00" level=info msg="Removing stale sandbox 6ea53cf6162dacca7efc847b2d767f483f7ba30470409687b2a638d22eef8a4e (96e66941ce521f892e48a3b057957ccfa26a6052793b9369fdcec14e08b89f23)"
Feb 12 22:00:13 hostname dockerd[434619]: time="2023-02-12T22:00:13.802997499-05:00" level=warning msg="Error (Unable to complete atomic operation, key modified) deleting object [endpoint 0906a58a168f88cbd50a3954a2b1f93e6dde8c5f04e105ed88df621e39444459 1e8a6760aa95afa7cbce0c0d5b1da2747b4f3b5421b053a2415a814e760e815e], retrying...."
Feb 12 22:00:13 hostname dockerd[434619]: time="2023-02-12T22:00:13.916699055-05:00" level=info msg="Removing stale sandbox 950e0986dfe74c694436c9853275209e83bb37d0ee1b1b5fef5235367945e745 (1ced9073ea6591fa64849e5e318a55d235bec3cbbaad1ccb03aa64a724fae26a)"
Feb 12 22:00:13 hostname dockerd[434619]: time="2023-02-12T22:00:13.926124475-05:00" level=warning msg="Error (Unable to complete atomic operation, key modified) deleting object [endpoint 0906a58a168f88cbd50a3954a2b1f93e6dde8c5f04e105ed88df621e39444459 9f7544fdb5cb8b49164fcd31f18933c6939dced999b5f69ba57522364bf0c590], retrying...."
Feb 12 22:00:13 hostname dockerd[434619]: time="2023-02-12T22:00:13.991452601-05:00" level=info msg="Default bridge (docker0) is assigned with an IP address 172.17.0.0/16. Daemon option --bip can be used to set a preferred IP address"
Feb 12 22:00:14 hostname dockerd[434619]: time="2023-02-12T22:00:14.051646207-05:00" level=info msg="Loading containers: done."
Feb 12 22:00:14 hostname dockerd[434619]: time="2023-02-12T22:00:14.089519791-05:00" level=warning msg="WARNING: No swap limit support"
Feb 12 22:00:14 hostname dockerd[434619]: time="2023-02-12T22:00:14.089871599-05:00" level=info msg="Docker daemon" commit=bc3805a graphdriver=overlay2 version=23.0.1
Feb 12 22:00:14 hostname dockerd[434619]: time="2023-02-12T22:00:14.090103705-05:00" level=info msg="Daemon has completed initialization"
Feb 12 22:00:14 hostname dockerd[434619]: time="2023-02-12T22:00:14.108699239-05:00" level=info msg="[core] [Server #7] Server created" module=grpc
Feb 12 22:00:14 hostname systemd[1]: Started Docker Application Container Engine.
Feb 12 22:00:14 hostname dockerd[434619]: time="2023-02-12T22:00:14.113167143-05:00" level=info msg="API listen on /run/docker.sock"

As a note, we opened a support ticket with Hudu.
They said they have fixed it; the fix is out in beta and will be in GA soon.
No details yet on what the fix is and why Hudu was crashing the Docker service.

I don’t know about Hudu, but there is already a newer Docker version, 23.0.1, with some bugfixes. It is recommended to upgrade to that version too.

@rimelek yes, we already applied that, with no success. Hudu has some tasks that run at a certain interval. It seems one of them crashed the Docker service. Maybe they will tell us more when they release the new update.

This is moby/moby issue #44979, "Invalid PTR query causes resolver panic", which will be fixed in 23.0.2 (expect it early next month or so). Technically Hudu is sending an invalid query, so a fix on their end is more proper, but obviously the Docker daemon should not exit just because someone pushed malformed data over the network.
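
To illustrate the general idea of not trusting a query that arrives over the network, here is a minimal Python sketch of defensive PTR name handling. It is not the moby resolver code and not the actual fix; the function name `ptr_name_to_ip` is hypothetical, and I am not claiming this is the exact malformed query Hudu sent. It only shows a parser that returns None for a name it cannot interpret instead of failing later on a missing value.

```python
import ipaddress
from typing import Optional, Union

IPAddress = Union[ipaddress.IPv4Address, ipaddress.IPv6Address]


def ptr_name_to_ip(qname: str) -> Optional[IPAddress]:
    """Return the address encoded in a reverse-lookup name, or None if malformed.

    Hypothetical helper for illustration only; not taken from moby/moby.
    """
    name = qname.rstrip(".").lower()
    if name.endswith(".in-addr.arpa"):
        # IPv4 reverse names carry exactly four labels in reversed order.
        labels = name[: -len(".in-addr.arpa")].split(".")
        if len(labels) != 4:
            return None
        candidate = ".".join(reversed(labels))
    elif name.endswith(".ip6.arpa"):
        # IPv6 reverse names carry 32 single hex nibbles in reversed order.
        nibbles = name[: -len(".ip6.arpa")].split(".")
        if len(nibbles) != 32 or any(len(n) != 1 for n in nibbles):
            return None
        digits = "".join(reversed(nibbles))
        candidate = ":".join(digits[i : i + 4] for i in range(0, 32, 4))
    else:
        # Not a reverse-lookup name at all.
        return None
    try:
        return ipaddress.ip_address(candidate)
    except ValueError:
        # Labels were present but did not form a valid address.
        return None
```

A caller would check the result before using it, for example answering with an error response when `ptr_name_to_ip(name)` is None, rather than assuming every PTR query is well formed.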

@neersighted Hudu actually fixed it in 2.21. We upgraded a few days ago and have had no issues since.

Good to hear! The root cause will be fixed in Moby/Docker soon as well :smile: