Docker Community Forums

Docker image pull fails and makes docker daemon inaccessible until machine restart

I am facing a strange issue where the docker image pull command fails with this error:

[desai@brilp0017 ~]$ docker image pull nginx:latest
latest: Pulling from library/nginx
d121f8d1c412: Extracting [==================================================>]  
27.09MB/27.09MB
ebd81fc8c071: Download complete 
655316c160af: Download complete 
d15953c0e0f8: Download complete 
2ee525c5c3cc: Download complete 
failed to register layer: Error processing tar file(exit status 1): Error cleaning up after pivot: remove /.pivot_root534731447: device or resource busy

After this error the docker daemon is no longer accessible, and all docker commands return the following error:

[desai@brilp0017 ~]$ docker info
Client:
 Debug Mode: false

Server:
ERROR: Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?
errors pretty printing info

The command systemctl status docker, however, shows it as running:

[desai@brilp0017 ~]$ systemctl status docker
● docker.service - Docker Application Container Engine
    Loaded: loaded (/usr/lib/systemd/system/docker.service; disabled; vendor preset: disabled)
    Active: active (running) since Fri 2020-09-11 14:25:53 BST; 14min ago
TriggeredBy: ● docker.socket
    Docs: https://docs.docker.com
Main PID: 2380 (dockerd)
    Tasks: 14
    Memory: 249.5M
    CGroup: /system.slice/docker.service
            └─2380 /usr/bin/dockerd -H fd:// --containerd=/run/containerd/containerd.sock
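Since systemctl reports the unit as active while the CLI cannot reach it, probing the API socket directly can show whether dockerd is dead or merely hung. A small diagnostic sketch (the socket path is the default one from the error message above; `/_ping` is the Docker Engine API health endpoint, which answers `OK` when the daemon is responsive):

```shell
# Probe the daemon socket directly, bypassing the docker CLI.
# A healthy daemon answers "OK"; a hung one times out or refuses.
curl --silent --unix-socket /var/run/docker.sock http://localhost/_ping \
    || echo "daemon did not answer on the socket"
```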

The journalctl log contains the same error line as the pull command:

Sep 11 14:25:52 brilp0017 systemd[1]: Starting Docker Application Container Engine...
Sep 11 14:25:52 brilp0017 dockerd[2380]: time="2020-09-11T14:25:52.345006155+01:00" level=info msg="Starting up"
Sep 11 14:25:52 brilp0017 dockerd[2380]: time="2020-09-11T14:25:52.348597478+01:00" level=info msg="parsed scheme: \"unix\"" module=grpc
Sep 11 14:25:52 brilp0017 dockerd[2380]: time="2020-09-11T14:25:52.348667479+01:00" level=info msg="scheme \"unix\" not registered, fallback to default scheme" module=grpc
Sep 11 14:25:52 brilp0017 dockerd[2380]: time="2020-09-11T14:25:52.348733420+01:00" level=info msg="ccResolverWrapper: sending update to cc: {[{unix:///run/containerd/containerd.sock 0  <nil>}] <nil>}" module=grpc
Sep 11 14:25:52 brilp0017 dockerd[2380]: time="2020-09-11T14:25:52.348765306+01:00" level=info msg="ClientConn switching balancer to \"pick_first\"" module=grpc
Sep 11 14:25:52 brilp0017 dockerd[2380]: time="2020-09-11T14:25:52.353865701+01:00" level=info msg="parsed scheme: \"unix\"" module=grpc
Sep 11 14:25:52 brilp0017 dockerd[2380]: time="2020-09-11T14:25:52.353908904+01:00" level=info msg="scheme \"unix\" not registered, fallback to default scheme" module=grpc
Sep 11 14:25:52 brilp0017 dockerd[2380]: time="2020-09-11T14:25:52.353944835+01:00" level=info msg="ccResolverWrapper: sending update to cc: {[{unix:///run/containerd/containerd.sock 0  <nil>}] <nil>}" module=grpc
Sep 11 14:25:52 brilp0017 dockerd[2380]: time="2020-09-11T14:25:52.353988191+01:00" level=info msg="ClientConn switching balancer to \"pick_first\"" module=grpc
Sep 11 14:25:52 brilp0017 dockerd[2380]: time="2020-09-11T14:25:52.497701794+01:00" level=info msg="[graphdriver] using prior storage driver: overlay2"
Sep 11 14:25:52 brilp0017 dockerd[2380]: time="2020-09-11T14:25:52.816295801+01:00" level=warning msg="Your kernel does not support cgroup rt period"
Sep 11 14:25:52 brilp0017 dockerd[2380]: time="2020-09-11T14:25:52.816318357+01:00" level=warning msg="Your kernel does not support cgroup rt runtime"
Sep 11 14:25:52 brilp0017 dockerd[2380]: time="2020-09-11T14:25:52.816442165+01:00" level=info msg="Loading containers: start."
Sep 11 14:25:53 brilp0017 dockerd[2380]: time="2020-09-11T14:25:53.101411528+01:00" level=info msg="Default bridge (docker0) is assigned with an IP address 172.17.0.0/16. Daemon option --bip can be used to set a preferred IP address"
Sep 11 14:25:53 brilp0017 dockerd[2380]: time="2020-09-11T14:25:53.125378601+01:00" level=info msg="Loading containers: done."
Sep 11 14:25:53 brilp0017 dockerd[2380]: time="2020-09-11T14:25:53.291896277+01:00" level=warning msg="Not using native diff for overlay2, this may cause degraded performance for building images: kernel has CONFIG_OVERLAY_FS_REDIRECT_DIR enabled" storage-driver=overlay2
Sep 11 14:25:53 brilp0017 dockerd[2380]: time="2020-09-11T14:25:53.292711063+01:00" level=info msg="Docker daemon" commit=48a66213fe graphdriver(s)=overlay2 version=19.03.12-ce
Sep 11 14:25:53 brilp0017 dockerd[2380]: time="2020-09-11T14:25:53.293190069+01:00" level=info msg="Daemon has completed initialization"
Sep 11 14:25:53 brilp0017 dockerd[2380]: time="2020-09-11T14:25:53.340381428+01:00" level=info msg="API listen on /run/docker.sock"
Sep 11 14:25:53 brilp0017 systemd[1]: Started Docker Application Container Engine.
Sep 11 14:32:38 brilp0017 dockerd[2380]: time="2020-09-11T14:32:38.011501405+01:00" level=info msg="ignoring event" module=libcontainerd namespace=moby topic=/tasks/delete type="*events.TaskDelete"
Sep 11 14:33:11 brilp0017 dockerd[2380]: time="2020-09-11T14:33:11.592234770+01:00" level=info msg="ignoring event" module=libcontainerd namespace=moby topic=/tasks/delete type="*events.TaskDelete"
Sep 11 14:34:52 brilp0017 dockerd[2380]: time="2020-09-11T14:34:52.864254519+01:00" level=info msg="Attempting next endpoint for pull after error: failed to register layer: Error processing tar file(exit status 1): Error cleaning up after pivot: remove /.pivot_root534731447: device or resource busy"

After this, the error remains even after stopping and starting the docker service multiple times with systemctl. Only after fully restarting the laptop and starting the docker service does it work as expected again, until the next time the docker pull command is used.
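The "device or resource busy" text suggests a stale mount left behind by runc's pivot_root step during layer extraction. Before resorting to a full reboot, one thing worth inspecting (a diagnostic sketch, not a confirmed fix) is whether any leftover pivot_root mount points are still visible in the current mount namespace:

```shell
# Look for leftover .pivot_rootXXXX mount points that could keep the
# overlay layer busy; the name pattern comes from the error message above.
grep 'pivot_root' /proc/self/mountinfo \
    || echo "no stale pivot_root mounts visible in this namespace"
```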

I have searched for a solution on the internet, but the majority of results point to the user not being in the docker group, which is not the case for me:

[desai@brilp0017 ~]$ groups
sys network power vboxusers wireshark sambashare docker lp wheel desai

Here is the output of docker version (captured before the crash) for version details:

[desai@brilp0017 ~]$ docker version
Client:
Version:           19.03.12-ce
API version:       1.40
Go version:        go1.14.5
Git commit:        48a66213fe
Built:             Sat Jul 18 01:33:21 2020
OS/Arch:           linux/amd64
Experimental:      false

Server:
Engine:
Version:          19.03.12-ce
API version:      1.40 (minimum version 1.12)
Go version:       go1.14.5
Git commit:       48a66213fe
Built:            Sat Jul 18 01:32:59 2020
OS/Arch:          linux/amd64
Experimental:     false
containerd:
Version:          v1.4.0.m
GitCommit:        09814d48d50816305a8e6c1a4ae3e2bcc4ba725a.m
runc:
Version:          1.0.0-rc92
GitCommit:        ff819c7e9184c13b7c2607fe6c30ae19403a7aff
docker-init:
Version:          0.18.0
GitCommit:        fec3683

I am using Manjaro Linux:

Operating System: Manjaro Linux
KDE Plasma Version: 5.19.4
KDE Frameworks Version: 5.73.0
Qt Version: 5.15.0
Kernel Version: 4.19.141-2-MANJARO
OS Type: 64-bit
Processors: 8 × Intel® Core™ i7-8550U CPU @ 1.80GHz
Memory: 31.2 GiB of RAM
Graphics Processor: Mesa Intel® UHD Graphics 620

Any help on this would be appreciated.

This is something that happens for us almost every day.

The machines are shut down by an automated script; when they are started again in the morning, the docker service on some of the machines fails to start.
We start it again using systemctl, and the docker service then starts, but many containers are unable to start.

One major difference from my other environments: this is running on an Azure VM that has a data disk. Due to performance issues, Azure recommended hosting docker on the data disk rather than the OS disk.
So in daemon.json we set "data-root" to point to the mounted data disk.
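For reference, a minimal daemon.json of that shape might look like the following (the /mnt/docker path is taken from the docker info output further down; written under /tmp here purely for illustration, since the real file lives at /etc/docker/daemon.json). Validating the JSON before restarting the daemon avoids a silent startup failure:

```shell
# Sketch: a minimal daemon.json moving docker's data-root to the data disk.
mkdir -p /tmp/docker-example
cat > /tmp/docker-example/daemon.json <<'EOF'
{
  "data-root": "/mnt/docker"
}
EOF
# Validate before restarting dockerd; malformed JSON prevents the daemon from starting.
python3 -m json.tool /tmp/docker-example/daemon.json
```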

FSTAB config:
UUID={REDACTED} /mnt ext4 defaults,nofail 1 2
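With nofail in that fstab line, boot proceeds even if the data disk comes up late or not at all, so dockerd can start before /mnt is mounted and then fail exactly as in the logs below. One possible mitigation (a sketch, assuming the data-root is /mnt/docker as in the docker info output) is a systemd drop-in that orders docker.service after the mount; `RequiresMountsFor=` is a standard [Unit] directive for this. The file is written under /tmp here for illustration; the real drop-in belongs at /etc/systemd/system/docker.service.d/, followed by `systemctl daemon-reload`:

```shell
# Sketch: make docker.service wait for the filesystem backing its data-root.
mkdir -p /tmp/docker.service.d
cat > /tmp/docker.service.d/wait-for-mnt.conf <<'EOF'
[Unit]
# Start dockerd only once the data-disk mount is in place
RequiresMountsFor=/mnt/docker
EOF
cat /tmp/docker.service.d/wait-for-mnt.conf
```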

Journalctl shows the following logs:

Jul 09 06:01:46 prod-serv dockerd[59946]: time="2020-07-09T06:01:46.290544908Z" level=error msg="stream copy error: reading from a closed fifo"
Jul 09 06:01:46 prod-serv dockerd[59946]: time="2020-07-09T06:01:46.291066423Z" level=error msg="stream copy error: reading from a closed fifo"
Jul 09 06:01:46 prod-serv dockerd[59946]: time="2020-07-09T06:01:46.308654806Z" level=error msg="Container not cleaned up from containerd from previous run" container=bb3ed15b048a4e238e621c6132fc3eb
Jul 09 06:01:46 prod-serv dockerd[59946]: time="2020-07-09T06:01:46.350653261Z" level=error msg="stream copy error: reading from a closed fifo"
Jul 09 06:01:46 prod-serv dockerd[59946]: time="2020-07-09T06:01:46.350694862Z" level=error msg="stream copy error: reading from a closed fifo"
Jul 09 06:01:46 prod-serv dockerd[59946]: time="2020-07-09T06:01:46.438887887Z" level=error msg="0275e9fe31a598f36f33baba545504639570d155af7e8e1c5d86ab3474b22607 cleanup: failed to delete container
Jul 09 06:01:46 prod-serv dockerd[59946]: time="2020-07-09T06:01:46.438921088Z" level=error msg="Failed to start container 0275e9fe31a598f36f33baba545504639570d155af7e8e1c5d86ab3474b22607: mkdir /ru
Jul 09 06:01:46 prod-serv dockerd[59946]: time="2020-07-09T06:01:46.452774269Z" level=error msg="Container not cleaned up from containerd from previous run" container=a6eef1a1fd61cb0315dd65add4e8fb8
Jul 09 06:01:46 prod-serv dockerd[59946]: time="2020-07-09T06:01:46.503689269Z" level=error msg="stream copy error: reading from a closed fifo"
Jul 09 06:01:46 prod-serv dockerd[59946]: time="2020-07-09T06:01:46.503738371Z" level=error msg="stream copy error: reading from a closed fifo"
Jul 09 06:01:46 prod-serv dockerd[59946]: time="2020-07-09T06:01:46.581512709Z" level=error msg="200654d8ceeac5f08c1bef83e541fa460fd432f816be7f5be1274f854f821ca9 cleanup: failed to delete container
Jul 09 06:01:46 prod-serv dockerd[59946]: time="2020-07-09T06:01:46.581547810Z" level=error msg="Failed to start container 200654d8ceeac5f08c1bef83e541fa460fd432f816be7f5be1274f854f821ca9: mkdir /ru
Jul 09 06:01:46 prod-serv dockerd[59946]: time="2020-07-09T06:01:46.586983159Z" level=error msg="9477ba5bc530b58fddb10983ba7d502ee3b52eafdd6de485d973d2d062fbc90d cleanup: failed to delete container
Jul 09 06:01:46 prod-serv dockerd[59946]: time="2020-07-09T06:01:46.587013560Z" level=error msg="Failed to start container 9477ba5bc530b58fddb10983ba7d502ee3b52eafdd6de485d973d2d062fbc90d: mkdir /ru
Jul 09 06:01:46 prod-serv dockerd[59946]: time="2020-07-09T06:01:46.611438932Z" level=error msg="Container not cleaned up from containerd from previous run" container=eefa30ddc7f1a58a81496c64bd1bc2c
Jul 09 06:01:46 prod-serv dockerd[59946]: time="2020-07-09T06:01:46.629294423Z" level=error msg="f579ea5779d73a9e698f62221385d2ddc8aa271ffceadb2c485fbda5b6eafc3d cleanup: failed to delete container
Jul 09 06:01:46 prod-serv dockerd[59946]: time="2020-07-09T06:01:46.629333324Z" level=error msg="Failed to start container f579ea5779d73a9e698f62221385d2ddc8aa271ffceadb2c485fbda5b6eafc3d: mkdir /ru
Jul 09 06:01:46 prod-serv dockerd[59946]: time="2020-07-09T06:01:46.659101042Z" level=error msg="stream copy error: reading from a closed fifo"
Jul 09 06:01:46 prod-serv dockerd[59946]: time="2020-07-09T06:01:46.659146444Z" level=error msg="stream copy error: reading from a closed fifo"
Jul 09 06:01:46 prod-serv dockerd[59946]: time="2020-07-09T06:01:46.736744077Z" level=error msg="bb3ed15b048a4e238e621c6132fc3eb551c9abdd615a0daa990a1a641ab44afc cleanup: failed to delete container
Jul 09 06:01:46 prod-serv dockerd[59946]: time="2020-07-09T06:01:46.736782078Z" level=error msg="Failed to start container bb3ed15b048a4e238e621c6132fc3eb551c9abdd615a0daa990a1a641ab44afc: mkdir /ru
Jul 09 06:01:46 prod-serv dockerd[59946]: time="2020-07-09T06:01:46.843428611Z" level=error msg="a6eef1a1fd61cb0315dd65add4e8fb8598e9e863fa1dfa37b3d402229eebf068 cleanup: failed to delete container
Jul 09 06:01:46 prod-serv dockerd[59946]: time="2020-07-09T06:01:46.843461112Z" level=error msg="Failed to start container a6eef1a1fd61cb0315dd65add4e8fb8598e9e863fa1dfa37b3d402229eebf068: mkdir /ru
Jul 09 06:01:46 prod-serv dockerd[59946]: time="2020-07-09T06:01:46.864845100Z" level=error msg="eefa30ddc7f1a58a81496c64bd1bc2c57cdd32b929f91015016133277a3a6a87 cleanup: failed to delete container
Jul 09 06:01:46 prod-serv dockerd[59946]: time="2020-07-09T06:01:46.864876501Z" level=error msg="Failed to start container eefa30ddc7f1a58a81496c64bd1bc2c57cdd32b929f91015016133277a3a6a87: mkdir /ru
Jul 09 06:01:46 prod-serv dockerd[59946]: time="2020-07-09T06:01:46.864921902Z" level=info msg="Loading containers: done."
Jul 09 06:01:46 prod-serv dockerd[59946]: time="2020-07-09T06:01:46.893609391Z" level=warning msg="Not using native diff for overlay2, this may cause degraded performance for building images: kernel
Jul 09 06:01:46 prod-serv dockerd[59946]: time="2020-07-09T06:01:46.894162006Z" level=info msg="Docker daemon" commit=afacb8b7f0 graphdriver(s)=overlay2 version=19.03.8
Jul 09 06:01:46 prod-serv dockerd[59946]: time="2020-07-09T06:01:46.895777950Z" level=info msg="Daemon has completed initialization"
Jul 09 06:01:46 prod-serv dockerd[59946]: time="2020-07-09T06:01:46.914413963Z" level=info msg="API listen on /var/run/docker.sock"
Jul 09 06:01:46 prod-serv systemd[1]: Started Docker Application Container Engine.
– Subject: Unit docker.service has finished start-up

similar to #597

Output of docker version:

Client: Docker Engine - Community
Version: 19.03.8
API version: 1.40
Go version: go1.12.17
Git commit: afacb8b7f0
Built: Wed Mar 11 01:25:46 2020
OS/Arch: linux/amd64
Experimental: false

Server: Docker Engine - Community
Engine:
Version: 19.03.8
API version: 1.40 (minimum version 1.12)
Go version: go1.12.17
Git commit: afacb8b7f0
Built: Wed Mar 11 01:24:19 2020
OS/Arch: linux/amd64
Experimental: false
containerd:
Version: 1.2.13
GitCommit: 7ad184331fa3e55e52b890ea95e65ba581ae3429
runc:
Version: 1.0.0-rc10
GitCommit: dc9208a3303feef5b3839f4323d9beb36df0a9dd
docker-init:
Version: 0.18.0
GitCommit: fec3683

Output of docker info:

Client:
Debug Mode: false

Server:
Containers: 208
Running: 208
Paused: 0
Stopped: 0
Images: 1
Server Version: 19.03.8
Storage Driver: overlay2
Backing Filesystem:
Supports d_type: true
Native Overlay Diff: false
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
Volume: local
Network: bridge host ipvlan macvlan null overlay
Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 7ad184331fa3e55e52b890ea95e65ba581ae3429
runc version: dc9208a3303feef5b3839f4323d9beb36df0a9dd
init version: fec3683
Security Options:
apparmor
seccomp
Profile: default
Kernel Version: 5.3.0-1032-azure
Operating System: Ubuntu 18.04.4 LTS
OSType: linux
Architecture: x86_64
CPUs: 4
Total Memory: 15.59GiB
Name: prod-serv
ID: RQJD:7ZKW:YLDY:QCHP:MMHB:2X2S:RIUD:AVV6:OH74:C7D6:2VOX:UJU6
Docker Root Dir: /mnt/docker
Debug Mode: false
Registry: https://index.docker.io/v1/
Labels:
Experimental: false
Insecure Registries:
127.0.0.0/8
Live Restore Enabled: false

WARNING: No swap limit support

Additional environment details (AWS, VirtualBox, physical, etc.)
Azure VM, Ubuntu 18.04
