Docker won't work with my current /var/lib/docker

Hi everyone.
I don’t know where to ask for help, so I guess I’ll ask here.
Recently, during the night, there was a power outage at my house, so my Debian 10 server powered off.
When I powered it back on, Docker was not working anymore. I got the following error message:

Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the Docker daemon running?

I tried a lot of things and uninstalled/re-installed Docker multiple times, but nothing worked.
So I made a backup of my /var/lib/docker, then deleted the original folder, and Docker now works perfectly.
So it seems that my /var/lib/docker makes Docker crash. When I replace the new working /var/lib/docker folder with my backup, I get the error message again.

Please, I have a lot of important containers, volumes and networks that I use daily and I just can’t lose them.
Is there any way to get Docker working again with my backup?

Thank you

Edit: I have read about some people hitting the same issue after a kernel upgrade. But I don’t remember upgrading my kernel, so I don’t think that’s the problem here.

I can’t speak definitively to your system, but when I get that message on PhotonOS, it’s telling me that the Docker service is not running, and I can easily fix that by entering “systemctl enable docker” and then “systemctl start docker”. You can also reboot instead of running the second command, since once enabled the service should start on boot.

I have noted that once or twice a tdnf update to Docker has rendered it non-startable, and I had to repeat this process on a system which previously worked.

Thanks for your answer.
Sorry, I should have mentioned this at the beginning, but I completely forgot.
I already tried systemctl enable docker and systemctl start docker, but I get this error message:

Job for docker.service failed because the control process exited with error code.
See "systemctl status docker.service" and "journalctl -xe" for details.

And if I type systemctl status docker.service, I get this:

● docker.service - Docker Application Container Engine
Loaded: loaded (/lib/systemd/system/docker.service; enabled; vendor preset: enabled)
Active: activating (auto-restart) (Result: exit-code) since Tue 2020-02-04 18:17:50 CET; 1s ago
Docs: https://docs.docker.com
Process: 3506 ExecStart=/usr/bin/dockerd -H fd:// --containerd=/run/containerd/containerd.sock (code=exited, status=2)
Main PID: 3506 (code=exited, status=2)

Sorry but what’s tdnf?

I use a PhotonOS VM from VMware, where tdnf is the command used to update packages. What I meant was that, in my experience, platform updates to Docker can stop Docker from starting automatically, which is annoying. But you’ve un/reinstalled Docker and already tried the commands I wrote, so this likely isn’t where your problem lies.

Your error messages indicate that Docker is attempting to activate a container but something is broken, so it’s exiting. This leads me to believe the problem isn’t with Docker running, but with a container it’s trying to run. I haven’t had to remove a container from Docker when Docker won’t start; I suspect this might justify my use of --restart unless-stopped rather than --restart always.

So I did a little Googling and if I understand this correctly: How to prevent docker from starting a container automatically on system startup? - Stack Overflow, you might need to dig into the docker container configuration and modify the --restart parameter. It sounds to me like something may have gotten corrupted in your power failure, so hopefully you can narrow down which container is preventing the Docker daemon from starting, and then restore just that container from a backup.
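To narrow down which containers the daemon would try to restart, here is a minimal read-only sketch. It assumes the usual on-disk layout, where each container has a file at /var/lib/docker/containers/<id>/hostconfig.json holding a compact "RestartPolicy":{"Name":"..."} entry; check that this matches your system before relying on it. The function name list_restart_policies is made up for this example.

```shell
#!/bin/sh
# List the restart policy recorded on disk for each container.
# ASSUMPTION: layout is <root>/containers/<id>/hostconfig.json and the JSON
# is written compactly (no spaces inside the RestartPolicy entry).
list_restart_policies() {
    root=${1:-/var/lib/docker}
    for f in "$root"/containers/*/hostconfig.json; do
        # Skip the unexpanded glob when no container directories exist.
        [ -f "$f" ] || continue
        id=$(basename "$(dirname "$f")")
        policy=$(grep -o '"RestartPolicy":{"Name":"[^"]*"' "$f" \
            | sed 's/.*"Name":"//; s/"$//')
        # Print "unknown" when the entry is missing or formatted differently.
        printf '%s %s\n' "$id" "${policy:-unknown}"
    done
}

# Example (as root, pointing at the backup copy rather than the live folder):
#   list_restart_policies /path/to/backup/of/var/lib/docker
```

This only reads files, so it is safe to run against a backup. Once the daemon is up again, docker update --restart=no followed by the container name changes the policy without editing any files by hand.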

Hmmm…
Just tried that, but it doesn’t seem to be working :frowning:

Okay so the issue has evolved now.
I deleted my /var/lib/docker and I have created all of my docker containers again.
Now it works, but when I restart my server and then try a docker command such as docker ps, I get the same error message:

Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?

And here’s what I get when I enter sudo systemctl status docker:

● docker.service - Docker Application Container Engine
Loaded: loaded (/lib/systemd/system/docker.service; enabled; vendor preset: enabled)
Active: inactive (dead) since Wed 2020-02-05 20:57:29 CET; 51s ago
Docs: https://docs.docker.com
Process: 980 ExecStart=/usr/bin/dockerd -H fd:// --containerd=/run/containerd/containerd.sock (code=exited, status=0/SUC
Main PID: 980 (code=exited, status=0/SUCCESS)

févr. 05 20:57:28 S-R610 dockerd[980]: time="2020-02-05T20:57:28.626024396+01:00" level=info msg="Loading containers: done
févr. 05 20:57:28 S-R610 dockerd[980]: time="2020-02-05T20:57:28.670936015+01:00" level=error msg="failed to get event" er
févr. 05 20:57:28 S-R610 dockerd[980]: time="2020-02-05T20:57:28.671009260+01:00" level=error msg="failed to get event" er
févr. 05 20:57:28 S-R610 dockerd[980]: time="2020-02-05T20:57:28.902618417+01:00" level=warning msg="failed to retrieve co
févr. 05 20:57:28 S-R610 dockerd[980]: time="2020-02-05T20:57:28.942225336+01:00" level=info msg="Docker daemon" commit=63
févr. 05 20:57:28 S-R610 dockerd[980]: time="2020-02-05T20:57:28.959645445+01:00" level=info msg="Daemon has completed ini
févr. 05 20:57:29 S-R610 dockerd[980]: time="2020-02-05T20:57:29.026118513+01:00" level=info msg="API listen on /var/run/d
févr. 05 20:57:29 S-R610 dockerd[980]: time="2020-02-05T20:57:29.026976974+01:00" level=info msg="Daemon shutdown complete
févr. 05 20:57:29 S-R610 systemd[1]: docker.service: Succeeded.
févr. 05 20:57:29 S-R610 systemd[1]: Stopped Docker Application Container Engine.

Of course, I already enabled the Docker daemon to start on boot with the sudo systemctl enable docker command, and as we can see, it doesn’t work.
However, if I manually start the Docker daemon with sudo systemctl start docker, it works!
What can I do to solve this, please?

Thank you

So Docker starts manually, but throws an error when it starts up with the host… yuck. I did some googling on the docker.sock errors and came up with install issues and a bug in December that’s since been fixed. I presume you’re running the latest version of Docker? Did you uninstall & reinstall Docker after deleting /var/lib/docker/? Would it be possible to migrate your containers into another Docker host (maybe a temporary sandbox VM) to see if the issues persist?

Hmmm, that’s interesting, especially since I’ve had Docker installed for months and haven’t updated it at all. I’ll look into that as well.

Yep, for both questions.
What can I do to completely remove everything related to docker and start the installation process from zero?

Ooh, that might be worth a try, I’ll do it tomorrow. Thanks for the suggestion and for trying to help me :slight_smile:

In general, when Docker fails to initialize, systemctl status docker can help but is usually not sufficient. Use journalctl -u docker instead to read the logs emitted by the Docker daemon and see exactly why it failed.
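As a rough sketch of that approach: the daemon logs can be long, so it can help to keep only the error-level lines. The helper name docker_log_errors is invented for this example, and it assumes the level=... format seen in the dockerd output earlier in this thread.

```shell
#!/bin/sh
# Keep only error, fatal and panic lines from dockerd log output on stdin.
# ASSUMPTION: log lines carry a level=<name> field, as in the output above.
docker_log_errors() {
    grep -E 'level=(error|fatal|panic)'
}

# Example: logs from the current boot only, without the pager.
#   journalctl -u docker -b --no-pager | docker_log_errors
```

The -b option limits the output to the current boot, which is useful when the failure only happens at startup.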

For debugging, I also suggest starting the Docker daemon with the --debug and/or --log-level options. You can do this by modifying the systemd service unit for Docker, usually found at /lib/systemd/system/docker.service. In that file, add the desired daemon options to the line ExecStart=/usr/bin/dockerd -H fd:// --containerd=/run/containerd/containerd.sock. Then run systemctl daemon-reload so systemd picks up the change, restart Docker (e.g., systemctl restart docker), and look at the logs (journalctl -u docker).
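One possible variation, sketched here under assumptions: instead of editing the packaged unit file (which a package upgrade may overwrite), put the change in a systemd drop-in. The function name docker_debug_dropin and the file name debug.conf are made up for this example, and the ExecStart line is copied from the unit shown in this thread, so adjust it if yours differs.

```shell
#!/bin/sh
# Print a systemd drop-in that re-declares ExecStart with --debug appended.
docker_debug_dropin() {
    cat <<'EOF'
[Service]
# An empty ExecStart= clears the packaged command before redefining it.
ExecStart=
ExecStart=/usr/bin/dockerd -H fd:// --containerd=/run/containerd/containerd.sock --debug
EOF
}

# Usage (as root), left commented out so the sketch has no side effects:
#   mkdir -p /etc/systemd/system/docker.service.d
#   docker_debug_dropin > /etc/systemd/system/docker.service.d/debug.conf
#   systemctl daemon-reload
#   systemctl restart docker
#   journalctl -u docker -b --no-pager
```

Deleting the drop-in file and running systemctl daemon-reload again returns the service to its packaged configuration.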

Hope that helps!