Docker daemon unexpected shutdown

I’m quite new to Docker.

I wrote an application with four containers and left them running on an Ubuntu 20.04.1 LTS server.

Yesterday, the Docker daemon simply stopped. Fortunately, the application wasn't in production and nobody was using it.

Using "sudo journalctl -fu docker.service" I saw a log which, as far as I understand, says that the daemon stopped because of a broken pipe.

I don't know if I'm interpreting the log correctly. I also have no idea what this broken pipe means.

The log follows:

Dec 02 06:10:50 playip dockerd[272721]: time="2020-12-02T06:10:50.893272829Z" level=error msg="attach failed with error: error attaching stdout stream: write unix /run/docker.sock->@: write: broken pipe"
Dec 02 06:10:57 playip dockerd[272721]: time="2020-12-02T06:10:57.311405315Z" level=info msg="Container deb673757d6bd2df5b065fd2426c020df49ab2521802f9455a4a4eec5fe769eb failed to exit within 10 seconds >
Dec 02 06:10:57 playip dockerd[272721]: time="2020-12-02T06:10:57.313347547Z" level=info msg="Container 729ab27e181fe8946e68577633faee71599dcd1fcb1b573521eeffbe5bcf11bd failed to exit within 10 seconds >
Dec 02 06:10:57 playip dockerd[272721]: time="2020-12-02T06:10:57.510109356Z" level=info msg="ignoring event" module=libcontainerd namespace=moby topic=/tasks/delete type="*events.TaskDelete"
Dec 02 06:10:57 playip dockerd[272721]: time="2020-12-02T06:10:57.510202045Z" level=error msg="attach failed with error: error attaching stdout stream: write unix /run/docker.sock->@: write: broken pipe"
Dec 02 06:10:57 playip dockerd[272721]: time="2020-12-02T06:10:57.531745711Z" level=info msg="ignoring event" module=libcontainerd namespace=moby topic=/tasks/delete type="*events.TaskDelete"
Dec 02 06:10:57 playip dockerd[272721]: time="2020-12-02T06:10:57.832386907Z" level=info msg="stopping event stream following graceful shutdown" error="" module=libcontainerd namespace=moby
Dec 02 06:10:57 playip dockerd[272721]: time="2020-12-02T06:10:57.832814995Z" level=info msg="Daemon shutdown complete"
Dec 02 06:10:57 playip systemd[1]: docker.service: Succeeded.
Dec 02 06:10:57 playip systemd[1]: Stopped Docker Application Container Engine.
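In case it's useful to others, you can inspect just the window around the crash instead of following the log live; the timestamps below are just this incident's window:

sudo journalctl -u docker.service --since "2020-12-02 06:00:00" --until "2020-12-02 06:15:00"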

One of the containers was running a Python service, another was running NGINX, another was running Certbot and the last one was running a LibreSpeed server.

I read that if one container uses too much memory, the Docker daemon can be killed by the Linux OOM killer, thus killing all the other containers. Is that true?
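If memory pressure does turn out to be the issue, one way to guard against it is to cap each service's memory, so the kernel's OOM handling is scoped to the offending container's cgroup instead of picking a victim system-wide. A minimal sketch, assuming a version 2.x compose file ("web" and the nginx image are placeholders, not my actual setup):

version: "2.4"
services:
  web:
    image: nginx
    mem_limit: 512m
    memswap_limit: 512m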

However, I saw no indication of a memory problem, just the log above.

Here is the output of docker version:

Client:
 Version:           19.03.8
 API version:       1.40
 Go version:        go1.13.8
 Git commit:        afacb8b7f0
 Built:             Wed Oct 14 19:43:43 2020
 OS/Arch:           linux/amd64
 Experimental:      false

Server:
 Engine:
  Version:          19.03.8
  API version:      1.40 (minimum version 1.12)
  Go version:       go1.13.8
  Git commit:       afacb8b7f0
  Built:            Wed Oct 14 16:41:21 2020
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.3.3-0ubuntu2.1
  GitCommit:
 runc:
  Version:          spec: 1.0.1-dev
  GitCommit:
 docker-init:
  Version:          0.18.0
  GitCommit:

How can I prevent this from happening again?

Is there anything else that I should check to understand what happened?

The Docker daemon was not enabled in systemd, but the host machine was not restarted (uptime of 44 days), so I don't think that matters.
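For completeness, enabling the service so it also comes back after a reboot is just:

sudo systemctl enable docker.service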

I restarted the daemon with "sudo service docker start" and all containers started automatically and are working correctly now.
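I assume the containers came back on their own because they have a restart policy in the compose file; roughly something like this per service ("web" and the image are placeholders):

version: "2.4"
services:
  web:
    image: nginx
    restart: unless-stopped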

Thank you for any help.

I have found out what caused the problem, but I still don't understand its full extent.

I had used the command "sudo docker-compose up" through an SSH shell, without "--detach". The connection was lost after some time, causing the broken pipe.

However, shouldn't that have stopped only my containers, and not the Docker daemon?
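For anyone hitting the same thing: running compose detached avoids tying the containers' output streams to the SSH session in the first place, and you can still reattach to the logs afterwards:

sudo docker-compose up --detach
sudo docker-compose logs -f --tail=50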

Hey, it might be possible that the culprit is actually systemd. systemd may kill processes started by a user when their session ends (this can happen if you are using a rootless Docker setup).

Solution

You can enable linger, configure systemd to hold on to your processes, or both.

To enable linger:

sudo loginctl enable-linger $USER
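You can verify it took effect with the following; the output should show Linger=yes:

loginctl show-user $USER --property=Linger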

To configure systemd, add the following lines to the end of the file /etc/systemd/logind.conf:

UserStopDelaySec=infinity
KillUserProcesses=no
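For the change to take effect, restart logind (or reboot). As far as I know restarting it is safe for running services, but do it at a quiet moment:

sudo systemctl restart systemd-logind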

I wrote a blog post about it with a bit more detail if you want to check it out.

Official systemd Docs

The systemd docs have this to say about enabling linger.

enable-linger [USER...], disable-linger [USER...]

Enable/disable user lingering for one or more users. If enabled for a specific user, a user manager is spawned for the user at boot and kept around after logouts. This allows users who are not logged in to run long-running services. Takes one or more user names or numeric UIDs as argument. If no argument is specified, enables/disables lingering for the user of the session of the caller.

The docs mention the following when it comes to the configuration options.

UserStopDelaySec=

Specifies how long to keep the user record and per-user service user@.service around for a user after they logged out fully. If set to zero, the per-user service is terminated immediately when the last session of the user has ended. If this option is configured to non-zero rapid logout/login cycles are sped up, as the user’s service manager is not constantly restarted. If set to “infinity” the per-user service for a user is never terminated again after first login, and continues to run until system shutdown. Defaults to 10s.

KillUserProcesses=

Takes a boolean argument. Configures whether the processes of a user should be killed when the user logs out. If true, the scope unit corresponding to the session and all processes inside that scope will be terminated. If false, the scope is “abandoned”, see systemd.scope(5), and processes are not killed. Defaults to “no”, but see the options KillOnlyUsers= and KillExcludeUsers= below.
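If you want to double-check which logind settings are actually in effect on your machine (defaults plus any drop-ins), newer systemd versions can print the merged configuration:

systemd-analyze cat-config systemd/logind.conf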