OCI runtime exec failed: ... fork/exec /proc/self/fd/7: no such file or directory

Hello

I have a Docker Swarm setup with 10 containers.
Docker Swarm runs in a VMware VM in a data center.
The swarm consists of only one machine.
I do not need high availability.

When I restart the VM, several containers start with an error.

OCI runtime exec failed: exec failed: unable to start container process: 
error starting setns process: fork/exec /proc/self/fd/7: no such file or directory: unknown

Does anyone have a solution for this error?
How can I debug the error?

System:

Debian 12 - 6.1.0-37-amd64 (up to date)
Docker version 28.3.1, build 38b7060 (up to date)
1 vCPU
4 GB RAM
200 GB HDD

journalctl -u docker

https://pastebin.com/tNAYnMr6

Regards fguser

Did you enable “live-restore” in Docker daemon.json?

I proceeded as follows:

  1. I installed docker
  2. I stopped docker (systemctl stop docker)
  3. I created the new root directory.
  4. I copied the original files there.
  5. I created daemon.json
  6. I started docker again.

I did not delete “/var/lib/docker”.

systemctl stop docker
mkdir -p /opt/docker-root
cp -R /var/lib/docker/* /opt/docker-root/
vi /etc/docker/daemon.json
{
 "data-root": "/opt/docker-root"
}

The only thing we have changed subsequently is the virtual hardware:
2vCPU → 1 vCPU
8GB RAM → 4GB RAM

Copying existing content from the default data-root location to another location is not a good idea, even more so, when container configurations exist, as some configuration paths are absolute, and not relative to the value of data-root.

Try with an empty data-root folder, and check whether the problem still exists. It shouldn’t.

If I understand it correctly, you have running containers and the errors occur only when you restart the machine. So I’m not sure if that is caused by the data root copy, but I guess it is possible.

@meyay could be right about the absolute paths, but I don’t remember any, except maybe when I set a custom source path for a volume, which I then need to change too. I have copied the data root multiple times, but simply using “cp” might not keep all the attributes of the files. There is a “-p” flag to “preserve”, an “-a” flag to “archive”, and a couple of other useful flags. You could also use rsync, which has similar flags. cp -a also includes “-R”. You can find more in the help output of the commands. Without the right flags, you can break symlinks and probably other special files.
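A minimal sketch of an attribute-preserving copy, run against a scratch directory rather than a real data root. The `copy_tree` helper and the scratch paths are made up for illustration; only `cp -a` itself comes from the post above.

```shell
#!/bin/sh
# Hypothetical helper: copy the contents of directory $1 into $2 with
# "cp -a", which keeps symlinks, permissions, ownership (when run as
# root) and timestamps. The trailing "/." also picks up dotfiles.
copy_tree() {
    src=$1
    dst=$2
    mkdir -p "$dst"
    cp -a "$src/." "$dst/"
}

# Demo on a scratch directory (not a real Docker data root).
work=$(mktemp -d)
mkdir -p "$work/src"
echo "data" > "$work/src/file"
ln -s file "$work/src/link"

copy_tree "$work/src" "$work/dst"

# The copy should still contain "link" as a symlink pointing at "file".
ls -l "$work/dst"
```

With rsync, `rsync -a "$src/" "$dst/"` should behave similarly. Either way, Docker should be stopped while the files are copied.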

Yes, exactly.
If I restart the container in Portainer, it works afterwards…

I didn’t know that the necessary data is recreated when docker is started.
That’s why I copied the directory.

As soon as I create an empty root folder, I have no more errors.
Unfortunately, I now have to recreate all services.

Thanks for your help.

Only the base folder structure is recreated. If you have all the services defined in a compose file, or you have the commands to recreate your containers, Docker will of course pull the images and recreate the containers the same way as they were before.

Since all containers are reproducible that way, as they should be, I think I copied the Docker data after I deleted all containers with “docker compose down”, and I just ran “docker compose up -d” after the data was on the new disk, so I didn’t have to copy containers, only images.

But it’s great that everything works in a new data root!

Of course I am :slight_smile: This is the file I meant in my last post:

<data-root>/containers/<container-id>/config.v2.json

The values of the keys "LogPath", "HostnamePath", "HostsPath" and "ResolvConfPath" are absolute paths and need to be updated when the configuration is moved to another location.
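One way to spot such leftovers after a copy, sketched with plain find and grep. The `find_stale_configs` helper is hypothetical and only reports files; it does not modify anything. The demo uses a scratch tree that mimics the layout above, with /var/lib/docker assumed as the old data root.

```shell
#!/bin/sh
# Hypothetical helper: list config.v2.json files under the new data root
# that still contain an absolute path below the old data root.
# $1 = old data root, $2 = new data root
find_stale_configs() {
    old_root=$1
    new_root=$2
    # grep -l prints only file names, -F treats the pattern as a fixed
    # string. The pattern is a double quote followed by "<old_root>/",
    # so only JSON string values starting with the old path match.
    find "$new_root/containers" -name config.v2.json \
        -exec grep -lF "\"$old_root/" {} + 2>/dev/null
}

# Demo against a scratch tree, not a real data root.
new_root=$(mktemp -d)
mkdir -p "$new_root/containers/aaa" "$new_root/containers/bbb"
# Container "aaa" still points at the old location.
printf '{"LogPath":"/var/lib/docker/containers/aaa/aaa-json.log"}\n' \
    > "$new_root/containers/aaa/config.v2.json"
# Container "bbb" already points at the new location.
printf '{"LogPath":"%s/containers/bbb/bbb-json.log"}\n' "$new_root" \
    > "$new_root/containers/bbb/config.v2.json"

find_stale_configs /var/lib/docker "$new_root"
```

On a real host the call would be something like `find_stale_configs /var/lib/docker /opt/docker-root`, run as root with Docker stopped.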

I never copy the data-root. I used to move it in the past, but for a couple of years now I have preferred to start with a fresh data-root. Back up named volumes (that actually persist in <data-root>/volumes/<volume name>/_data) from the old data-root instance, then deploy the containers/services using their compose files with the new data-root instance, stop the compose deployment, then finally restore the content of the volumes. When I write backup and restore, I mean by using a helper container. I never access anything in the folder directly.
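A rough sketch of what backup and restore with a helper container could look like. This is not necessarily the exact method meant above: the `alpine` image, the tar archive format, the function names and the `DOCKER` override are all assumptions made for illustration.

```shell
#!/bin/sh
# Set DOCKER="echo docker" for a dry run that only prints the commands.
DOCKER=${DOCKER:-docker}

# Hypothetical helper: archive named volume $1 into $2/<volume>.tar.gz.
# The volume is mounted read-only in a throwaway helper container.
backup_volume() {
    volume=$1
    backup_dir=$2
    $DOCKER run --rm \
        -v "$volume":/data:ro \
        -v "$backup_dir":/backup \
        alpine tar czf "/backup/$volume.tar.gz" -C /data .
}

# Hypothetical helper: unpack $2/<volume>.tar.gz into named volume $1.
restore_volume() {
    volume=$1
    backup_dir=$2
    $DOCKER run --rm \
        -v "$volume":/data \
        -v "$backup_dir":/backup:ro \
        alpine tar xzf "/backup/$volume.tar.gz" -C /data
}

# Intended order, following the description above:
#   backup_volume myvol /backups    # while Docker still uses the old data-root
#   (switch data-root, deploy the stack, then stop it again)
#   restore_volume myvol /backups   # with Docker on the new data-root
```

Here `myvol` and `/backups` are placeholder names. The backup directory has to live outside both data roots so it survives the switch.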

As usual :slight_smile: I thought you meant the target of the symlinks and I forgot about the config file. Since I never had a problem with it, that confirms I always instinctively deleted the containers before moving the data root.

Moving the data root could save time in environments with a slow network when it was urgent, but I have also noticed that I forgot to mount the data disk before starting the Docker daemon and pulling images.

Hello again

I have now done the following:

Created a new empty “docker-root” folder.
Renamed /var/lib/docker to docker.old.
Then started Docker.

After starting I noticed that docker created a folder /var/lib/docker/network, why?
I changed the docker root in the daemon.json to /opt/docker-root

So I have now done the following:

/etc/docker/daemon.json -> deleted
ln -s /opt/docker-root /var/lib/docker 
systemctl start docker

Docker is running so far

I have now recreated all services.
The problem still exists.

Here is an excerpt from the logs

There are several errors that I do not understand:

  • level=warning msg="Error (Unable to complete atomic operation,
  • level=error msg="network portainer_agent_network remove failed: error while removing network: unknown network
  • msg="error creating cluster object" error="name conflicts with an existing object" module=node

I understand the following error, but I don’t know how to solve it:

  • msg="Error getting v2 registry: Get \"https://registry.company.ch/v2/\": dial tcp 10.22.22.11:443: connect: connection refused"

I use a self-hosted runner via GitHub. The image is sent to the local registry via CI/CD.
When I restart the server it tries to download the image via the local registry. But it is missing the login data. How can I configure this so that docker has the login data?

As I said, when I restart the server, various containers continue to be started with the following error:

OCI runtime exec failed: exec failed: unable to start container process: error starting setns process: fork/exec /proc/self/fd/7: no such file or directory: unknown

I am grateful for any help.

Greetings fguser

It should create it in <data-root>/network. If it is creating it in the default data-root folder /var/lib/docker, but not in the one specified in your daemon.json settings, then this pretty much smells like a bug.

This looks awfully wrong to me…

What does docker info show as data root? Can you test it again?

I’m not sure, but I think symlinking the docker data root can break some features. It is just a vague memory.

I tested it a minute ago on Ubuntu 24.04 and docker-ce 28.1.1 and 28.3.2.

What I did as root:

systemctl stop docker docker.socket

mkdir /opt/docker
cat <<EOF >/etc/docker/daemon.json
{
 "data-root": "/opt/docker"
}
EOF
mv /var/lib/docker /var/lib/docker.old

systemctl start docker

The default folder /var/lib/docker is not re-created.
When I created a network and started a container attached to the network, the folder /var/lib/docker was still not re-created.

I was able to reproduce it in another VM.
The folder /var/lib/docker/network is created after I install Portainer (Business), although I have set the data root to /opt/docker-root in daemon.json.

root@deb-docker:/var/lib/docker/network/files/lb_68jnizjky0f2c068n2ti5xuka# ls -la
total 20
drwxr-xr-x 2 root root 4096 10. Jul 10:02 .
drwxr-xr-x 3 root root 4096 10. Jul 09:52 ..
-rw-r--r-- 1 root root  158 10. Jul 09:52 hosts
-rw-r--r-- 1 root root  356 10. Jul 09:52 resolv.conf
-rw-r--r-- 1 root root   71 10. Jul 09:52 resolv.conf.hash

root@deb-docker:/var/lib/docker/network/files/lb_68jnizjky0f2c068n2ti5xuka# docker info -f '{{ .DockerRootDir}}'
/opt/docker-root

root@deb-docker:/var/lib/docker/network/files/lb_68jnizjky0f2c068n2ti5xuka# cat /etc/docker/daemon.json
{
  "data-root": "/opt/docker-root"
}

Can anyone reproduce this?

Instructions are here:

curl -L https://downloads.portainer.io/ee-lts/portainer-agent-stack.yml -o portainer-agent-stack.yml

vi portainer-agent-stack.yml
# edit volumes : /opt/docker-root/volumes:/opt/docker-root/volumes
# remove /var/lib/docker/volumes:....

docker stack deploy -c portainer-agent-stack.yml portainer

This would explain the following error:

The question now is why the network folder is created under /var/lib/docker and not under /opt/docker-root.

Regards fguser

Forgive me, but I am not going to follow you down the Portainer path, as I have no particular interest in re-learning a tool I stopped using years ago. I lack the required patience for clickOps :wink: I am sure it’s a great product and has its community. It’s just not the right tool for me.

I have no idea what causes your problem, as I am not able to reproduce it.

I leave this one for someone else to answer. Good luck, I hope you find the solution!

If Portainer creates something, that is a question for the Portainer community. I asked for the docker info output to focus on Docker.

If the only thing that is created is the network folder, it is not Docker that creates it as it would not stop there.

Thank you both for your help.

I have been able to fix the problem:

  • I have completely removed Docker
  • Then I reinstalled Docker, but pinned to a specific version, because Portainer apparently doesn’t work properly with the latest version.
  • I did not move the main Docker folder.

Errors are still present but at least all Docker containers start after a restart.

Removing

dpkg -l | grep -i docker
sudo apt-get purge -y docker-engine docker docker.io docker-ce docker-ce-cli docker-compose-plugin docker-buildx-plugin
sudo apt-get autoremove -y --purge docker-engine docker docker.io docker-ce docker-compose-plugin docker-buildx-plugin
dpkg -l | grep -i docker
sudo rm -rf /var/lib/docker /etc/docker
#sudo rm /etc/apparmor.d/docker
sudo groupdel docker
sudo rm -rf /var/run/docker.sock
sudo rm -rf /var/lib/containerd
sudo rm -r ~/.docker

Installing

Portainer Version: Business 2.27.9
Docker Version : 26.0.2, 27.0.3

apt-cache madison docker-ce | awk '{ print $3 }'

VERSION_STRING=5:27.0.3-1~debian.12~bookworm
apt-get install docker-ce=$VERSION_STRING docker-ce-cli=$VERSION_STRING containerd.io docker-buildx-plugin docker-compose-plugin

apt-mark hold docker-ce docker-ce-cli docker-ce-rootless-extras docker-compose-plugin

Regards fguser
