Hello,
I am attempting to run Docker in rootless mode on compute nodes that run RHEL 8.8 entirely in memory. The OS is not installed to a physical disk; it runs from RAM on the physical server.
The system information is below:
Kernel info
Linux compute_node01.example 4.18.0-477.10.1.el8_8.x86_64 #1 SMP Wed Apr 5 13:35:01 EDT 2023 x86_64 x86_64 x86_64 GNU/Linux
Docker info
Client: Docker Engine - Community
Version: 25.0.3
Context: default
Debug Mode: false
Plugins:
buildx: Docker Buildx (Docker Inc.)
Version: v0.12.1
Path: /usr/libexec/docker/cli-plugins/docker-buildx
compose: Docker Compose (Docker Inc.)
Version: v2.24.5
Path: /usr/libexec/docker/cli-plugins/docker-compose
Packages installed for docker:
docker-ce x86_64 3:25.0.3-1.el8
container-selinux noarch 2:2.205.0-2.module+el8.8.0+18438+15d3aa65
containerd.io x86_64 1.6.28-3.1.el8
docker-ce-cli x86_64 1:25.0.3-1.el8
fuse-common x86_64 3.3.0-16.el8
fuse-overlayfs x86_64 1.10-1.module+el8.8.0+18060+3f21f2cc
fuse3 x86_64 3.3.0-16.el8
fuse3-libs x86_64 3.3.0-16.el8
libcgroup x86_64 0.41-19.el8
libslirp x86_64 4.4.0-1.module+el8.8.0+18060+3f21f2cc
slirp4netns x86_64 1.2.0-2.module+el8.8.0+18060+3f21f2cc
docker-buildx-plugin x86_64 0.12.1-1.el8
docker-ce-rootless-extras x86_64 25.0.3-1.el8
docker-compose-plugin x86_64 2.24.5-1.el8
The steps I am taking to set up rootless dockerd:
Run dockerd-rootless-setuptool.sh
$ dockerd-rootless-setuptool.sh check
[INFO] Requirements are satisfied
$ dockerd-rootless-setuptool.sh install
[INFO] Creating /home/user/.config/systemd/user/docker.service
[INFO] starting systemd service docker.service
+ systemctl --user start docker.service
Job for docker.service failed because the control process exited with error code.
See "systemctl --user status docker.service" and "journalctl --user -xe" for details.
+ set +x
[ERROR] Failed to start docker.service. Run `journalctl -n 20 --no-pager --user --unit docker.service` to show the error log.
[ERROR] Before retrying installation, you might need to uninstall the current setup: `/usr/bin/dockerd-rootless-setuptool.sh uninstall -f ; /usr/bin/rootlesskit rm -rf /home/local-user/.local/share/docker`
If I switch (via su) to the same user I am already logged in as, the install command completes successfully with the following output:
$ dockerd-rootless-setuptool.sh install
[INFO] systemd not detected, dockerd-rootless.sh needs to be started manually:
PATH=/usr/bin:/sbin:/usr/sbin:$PATH dockerd-rootless.sh
[INFO] Creating CLI context "rootless"
Successfully created context "rootless"
[INFO] Using CLI context "rootless"
Current context is now "rootless"
[INFO] Make sure the following environment variable(s) are set (or add them to ~/.bashrc):
# WARNING: systemd not found. You have to remove XDG_RUNTIME_DIR manually on every logout.
export XDG_RUNTIME_DIR=/home/local-user/.docker/run
export PATH=/usr/bin:$PATH
[INFO] Some applications may require the following environment variable too:
export DOCKER_HOST=unix:///home/local-user/.docker/run/docker.sock
If I then export those environment variables and launch dockerd-rootless.sh manually, I get the following:
$ PATH=/usr/bin:/sbin:/usr/sbin:$PATH dockerd-rootless.sh
+ case "$1" in
+ '[' -w /home/local-user/.docker/run ']'
+ '[' -d /home/local-user ']'
+ rootlesskit=
+ for f in docker-rootlesskit rootlesskit
+ command -v docker-rootlesskit
+ for f in docker-rootlesskit rootlesskit
+ command -v rootlesskit
+ rootlesskit=rootlesskit
+ break
+ '[' -z rootlesskit ']'
+ : /home/local-user/.docker/run/dockerd-rootless
+ : ''
+ : ''
+ : builtin
+ : auto
+ : auto
+ net=
+ mtu=
+ '[' -z '' ']'
+ command -v slirp4netns
+ slirp4netns --help
+ grep -qw -- --netns-type
+ net=slirp4netns
+ '[' -z '' ']'
+ mtu=65520
+ '[' -z slirp4netns ']'
+ '[' -z 65520 ']'
+ dockerd=dockerd
+ '[' -z '' ']'
+ _DOCKERD_ROOTLESS_CHILD=1
+ export _DOCKERD_ROOTLESS_CHILD
++ id -u
+ '[' 1000 = 0 ']'
+ command -v selinuxenabled
+ selinuxenabled
+ exec rootlesskit --state-dir=/home/local-user/.docker/run/dockerd-rootless --net=slirp4netns --mtu=65520 --slirp4netns-sandbox=auto --slirp4netns-seccomp=auto --disable-host-loopback --port-driver=builtin --copy-up=/etc --copy-up=/run --propagation=rslave /usr/bin/dockerd-rootless.sh
[rootlesskit:parent] error: failed to setup network &{logWriter:0xc000252ae0 binary:slirp4netns mtu:65520 ipnet:<nil> disableHostLoopback:true apiSocketPath: enableSandbox:true enableSeccomp:true enableIPv6:false ifname:tap0 infoMu:{w:{state:0 sema:0} writerSem:0 readerSem:0 readerCount:{_:{} v:0} readerWait:{_:{} v:0}} info:<nil>}: waiting for ready fd (/usr/bin/slirp4netns --mtu 65520 -r 3 --disable-host-loopback --enable-sandbox --enable-seccomp 159795 tap0): slirp4netns failed
[rootlesskit:child ] error: EOF
This points to a slirp4netns error. The logs show the following:
localhost dockerd-rootless.sh[159644]: + exec rootlesskit --state-dir=/run/user/1000/dockerd-rootless --net=slirp4netns --mtu=65520 --slirp4netns-sandbox=auto --slirp4netns-seccomp=auto --disable-host-loopback --port-driver=builtin --copy-up=/etc --copy-up=/run --propagation=rslave /usr/bin/dockerd-rootless.sh
May 8 03:14:15 localhost kernel: IPv6: ADDRCONF(NETDEV_UP): tap0: link is not ready
May 8 03:14:15 localhost kernel: IPv6: ADDRCONF(NETDEV_CHANGE): tap0: link becomes ready
May 8 03:14:16 localhost dockerd-rootless.sh[159644]: [rootlesskit:parent] error: failed to setup network &{logWriter:0xc000250ae0 binary:slirp4netns mtu:65520 ipnet:<nil> disableHostLoopback:true apiSocketPath: enableSandbox:true enableSeccomp:true enableIPv6:false ifname:tap0 infoMu:{w:{state:0 sema:0} writerSem:0 readerSem:0 readerCount:{_:{} v:0} readerWait:{_:{} v:0}} info:<nil>}: waiting for ready fd (/usr/bin/slirp4netns --mtu 65520 -r 3 --disable-host-loopback --enable-sandbox --enable-seccomp 159656 tap0): slirp4netns failed
May 8 03:14:16 localhost dockerd-rootless.sh[159656]: [rootlesskit:child ] error: EOF
May 8 03:14:16 localhost systemd[121379]: docker.service: Main process exited, code=exited, status=1/FAILURE
May 8 03:14:16 localhost systemd[121379]: docker.service: Killing process 159656 (exe) with signal SIGKILL.
May 8 03:14:16 localhost systemd[121379]: docker.service: Failed with result 'exit-code'.
May 8 03:14:16 localhost systemd[121379]: Failed to start Docker Application Container Engine (Rootless).
May 8 03:14:18 localhost systemd[121379]: docker.service: Service RestartSec=2s expired, scheduling restart.
May 8 03:14:18 localhost systemd[121379]: docker.service: Scheduled restart job, restart counter is at 3.
May 8 03:14:18 localhost systemd[121379]: Stopped Docker Application Container Engine (Rootless).
May 8 03:14:18 localhost systemd[121379]: docker.service: Start request repeated too quickly.
May 8 03:14:18 localhost systemd[121379]: docker.service: Failed with result 'exit-code'.
May 8 03:14:18 localhost systemd[121379]: Failed to start Docker Application Container Engine (Rootless).
May 8 03:14:43 localhost su[159679]: (to local-user) local-user on pts/2
May 8 03:16:38 localhost kernel: IPv6: ADDRCONF(NETDEV_UP): tap0: link is not ready
May 8 03:16:38 localhost kernel: IPv6: ADDRCONF(NETDEV_CHANGE): tap0: link becomes ready
While investigating, I noticed this line:
localhost dockerd-rootless.sh[159644]: [rootlesskit:parent] error: failed to setup network &{logWriter:0xc000250ae0 binary:slirp4netns mtu:65520 ipnet:<nil> disableHostLoopback:true apiSocketPath: enableSandbox:true enableSeccomp:true enableIPv6:false ifname:tap0 infoMu:{w:{state:0 sema:0} writerSem:0 readerSem:0 readerCount:{_:{} v:0} readerWait:{_:{} v:0}} info:<nil>}: waiting for ready fd (/usr/bin/slirp4netns --mtu 65520 -r 3 --disable-host-loopback --enable-sandbox --enable-seccomp 159656 tap0): slirp4netns failed
I suspect the PID (159656) passed to slirp4netns may be the issue, but I am not sure. I am stumped.
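For what it is worth, my understanding (an assumption on my part, based on slirp4netns taking a PID and a tap device name as its two positional arguments) is that 159656 is the process whose network namespace slirp4netns is asked to join, which the log above labels as the rootlesskit child. Below is a minimal sketch for pulling that command and PID out of the error line so the invocation can be inspected by hand. The function names extract_slirp_cmd and target_pid are made up for illustration:

```shell
# Hypothetical helpers, not part of Docker or rootlesskit.

# Read rootlesskit log lines on stdin and print the slirp4netns command
# found inside "waiting for ready fd (...): slirp4netns failed".
extract_slirp_cmd() {
    sed -n 's/.*waiting for ready fd (\(.*\)): slirp4netns failed.*/\1/p'
}

# Read a slirp4netns command line on stdin and print its second-to-last
# word, assumed to be the target PID (the last word being the tap name).
target_pid() {
    awk 'NF >= 2 { print $(NF - 1) }'
}

# Example use against the user journal:
#   journalctl --user --unit docker.service --no-pager | extract_slirp_cmd | target_pid
```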
Here is the server disk setup as well, with the OS running in memory and a locally attached disk named “scratch” for persistent storage:
Filesystem Size Used Avail Use% Mounted on
rootfs 189G 3.3G 186G 2% /
devtmpfs 189G 0 189G 0% /dev
tmpfs 189G 0 189G 0% /dev/shm
tmpfs 189G 18M 189G 1% /run
tmpfs 189G 0 189G 0% /sys/fs/cgroup
tmpfs 38G 0 38G 0% /run/user/0
tmpfs 38G 0 38G 0% /run/user/1000
/dev/mapper/scratch 931G 6.6G 925G 1% /scratch
When I follow these exact same steps on a VM with a normal disk-based install, there are no issues at all.
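Since the in-memory root filesystem is the most obvious difference from the working VM, here is a minimal sketch for checking what type of filesystem / is mounted on. The helper names root_fstype and is_in_memory_root are hypothetical. The link to the failure is only my assumption: the slirp4netns sandbox sets up its own mount namespace, and as far as I know pivot_root does not work when / is the initial rootfs, so that may be what differs here.

```shell
# Hypothetical helpers for illustration only.

# Print the filesystem type of / from a mounts table
# (fields: device mountpoint fstype options ...). Defaults to /proc/mounts.
# If / appears more than once, the last entry wins.
root_fstype() {
    awk '$2 == "/" { t = $3 } END { if (t != "") print t }' "${1:-/proc/mounts}"
}

# Succeed if / is on a RAM-backed filesystem type.
is_in_memory_root() {
    case "$(root_fstype "$@")" in
        rootfs|tmpfs|ramfs) return 0 ;;
        *) return 1 ;;
    esac
}

# Example use:
#   is_in_memory_root && echo "root filesystem is in memory: $(root_fstype)"
```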
Any help or troubleshooting advice is appreciated. I would use a different network driver, but I have a fairly hard requirement on slirp4netns. I am also aware that other software is better suited to rootless container execution.
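One thing I have not tried yet, noted here as an untested sketch: the two ": auto" lines in the dockerd-rootless.sh trace appear to be the defaults for DOCKERD_ROOTLESS_ROOTLESSKIT_SLIRP4NETNS_SANDBOX and DOCKERD_ROOTLESS_ROOTLESSKIT_SLIRP4NETNS_SECCOMP, which the script documents as accepting auto, true, or false. Setting them to false should keep slirp4netns as the network driver while dropping --enable-sandbox and --enable-seccomp. The wrapper name and its DRY_RUN switch are hypothetical, and whether this avoids the failure on an in-memory root is an assumption:

```shell
# Hypothetical wrapper: start rootless dockerd with slirp4netns kept as the
# network driver but its sandbox and seccomp protections turned off.
# With DRY_RUN=1 the command is printed instead of executed.
run_rootless_no_sandbox() {
    set -- env \
        DOCKERD_ROOTLESS_ROOTLESSKIT_NET=slirp4netns \
        DOCKERD_ROOTLESS_ROOTLESSKIT_SLIRP4NETNS_SANDBOX=false \
        DOCKERD_ROOTLESS_ROOTLESSKIT_SLIRP4NETNS_SECCOMP=false \
        dockerd-rootless.sh
    if [ "${DRY_RUN:-0}" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Example use:
#   DRY_RUN=1 run_rootless_no_sandbox   # show what would be run
#   run_rootless_no_sandbox             # actually start dockerd-rootless.sh
```

Disabling these protections weakens the isolation around slirp4netns, so I would treat this as a diagnostic step rather than a fix.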