Docker Rootless on diskless compute nodes

Hello,
I am attempting to run Docker rootless on compute nodes that run RHEL 8.8 entirely in memory. The OS is not installed to a physical disk; it runs from RAM on the physical server.

System information:

Kernel info
Linux compute_node01.example 4.18.0-477.10.1.el8_8.x86_64 #1 SMP Wed Apr 5 13:35:01 EDT 2023 x86_64 x86_64 x86_64 GNU/Linux

Docker info

Client: Docker Engine - Community
 Version:    25.0.3
 Context:    default
 Debug Mode: false
 Plugins:
  buildx: Docker Buildx (Docker Inc.)
    Version:  v0.12.1
    Path:     /usr/libexec/docker/cli-plugins/docker-buildx
  compose: Docker Compose (Docker Inc.)
    Version:  v2.24.5
    Path:     /usr/libexec/docker/cli-plugins/docker-compose

Packages installed for docker:

 docker-ce                    x86_64    3:25.0.3-1.el8                               
 container-selinux            noarch    2:2.205.0-2.module+el8.8.0+18438+15d3aa65    
 containerd.io                x86_64    1.6.28-3.1.el8                               
 docker-ce-cli                x86_64    1:25.0.3-1.el8                               
 fuse-common                  x86_64    3.3.0-16.el8                                 
 fuse-overlayfs               x86_64    1.10-1.module+el8.8.0+18060+3f21f2cc         
 fuse3                        x86_64    3.3.0-16.el8                                 
 fuse3-libs                   x86_64    3.3.0-16.el8                                 
 libcgroup                    x86_64    0.41-19.el8                                  
 libslirp                     x86_64    4.4.0-1.module+el8.8.0+18060+3f21f2cc        
 slirp4netns                  x86_64    1.2.0-2.module+el8.8.0+18060+3f21f2cc        
 docker-buildx-plugin         x86_64    0.12.1-1.el8                                 
 docker-ce-rootless-extras    x86_64    25.0.3-1.el8                                 
 docker-compose-plugin        x86_64    2.24.5-1.el8

The steps I am taking to set up dockerd-rootless:

Run dockerd-rootless-setuptool.sh

$ dockerd-rootless-setuptool.sh check 

[INFO] Requirements are satisfied
$ dockerd-rootless-setuptool.sh install

[INFO] Creating /home/user/.config/systemd/user/docker.service
[INFO] starting systemd service docker.service
+ systemctl --user start docker.service
Job for docker.service failed because the control process exited with error code.
See "systemctl --user status docker.service" and "journalctl --user -xe" for details.
+ set +x
[ERROR] Failed to start docker.service. Run `journalctl -n 20 --no-pager --user --unit docker.service` to show the error log.
[ERROR] Before retrying installation, you might need to uninstall the current setup: `/usr/bin/dockerd-rootless-setuptool.sh uninstall -f ; /usr/bin/rootlesskit rm -rf /home/local-user/.local/share/docker`

If I use su to switch to the same user I am already logged in as, the install command succeeds, but it no longer detects systemd:

$ dockerd-rootless-setuptool.sh install
[INFO] systemd not detected, dockerd-rootless.sh needs to be started manually:

PATH=/usr/bin:/sbin:/usr/sbin:$PATH dockerd-rootless.sh

[INFO] Creating CLI context "rootless"
Successfully created context "rootless"
[INFO] Using CLI context "rootless"
Current context is now "rootless"

[INFO] Make sure the following environment variable(s) are set (or add them to ~/.bashrc):
# WARNING: systemd not found. You have to remove XDG_RUNTIME_DIR manually on every logout.
export XDG_RUNTIME_DIR=/home/local-user/.docker/run
export PATH=/usr/bin:$PATH

[INFO] Some applications may require the following environment variable too:
export DOCKER_HOST=unix:///home/local-user/.docker/run/docker.sock
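
The second run reports "systemd not detected", which suggests that the shell obtained via su cannot reach a systemd user session. A rough way to compare the two shells is sketched below. It assumes systemctl is on PATH; the function name has_user_systemd is only illustrative, and this is not necessarily the same test the setup tool performs.

```shell
# Report whether a systemd user manager is reachable from this shell.
has_user_systemd() {
    # Both steps must succeed: systemctl exists, and it can talk to
    # the per-user manager.
    command -v systemctl >/dev/null 2>&1 &&
        systemctl --user show-environment >/dev/null 2>&1
}

if has_user_systemd; then
    echo "systemd user session: reachable"
else
    echo "systemd user session: not reachable (XDG_RUNTIME_DIR=${XDG_RUNTIME_DIR:-unset})"
fi
```

Running this once in the original login shell and once after su should show whether the two environments differ.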

If I then launch dockerd-rootless.sh manually after exporting the environment variables, I get the following:

$ PATH=/usr/bin:/sbin:/usr/sbin:$PATH dockerd-rootless.sh
+ case "$1" in
+ '[' -w /home/local-user/.docker/run ']'
+ '[' -d /home/local-user ']'
+ rootlesskit=
+ for f in docker-rootlesskit rootlesskit
+ command -v docker-rootlesskit
+ for f in docker-rootlesskit rootlesskit
+ command -v rootlesskit
+ rootlesskit=rootlesskit
+ break
+ '[' -z rootlesskit ']'
+ : /home/local-user/.docker/run/dockerd-rootless
+ : ''
+ : ''
+ : builtin
+ : auto
+ : auto
+ net=
+ mtu=
+ '[' -z '' ']'
+ command -v slirp4netns
+ slirp4netns --help
+ grep -qw -- --netns-type
+ net=slirp4netns
+ '[' -z '' ']'
+ mtu=65520
+ '[' -z slirp4netns ']'
+ '[' -z 65520 ']'
+ dockerd=dockerd
+ '[' -z '' ']'
+ _DOCKERD_ROOTLESS_CHILD=1
+ export _DOCKERD_ROOTLESS_CHILD
++ id -u
+ '[' 1000 = 0 ']'
+ command -v selinuxenabled
+ selinuxenabled
+ exec rootlesskit --state-dir=/home/local-user/.docker/run/dockerd-rootless --net=slirp4netns --mtu=65520 --slirp4netns-sandbox=auto --slirp4netns-seccomp=auto --disable-host-loopback --port-driver=builtin --copy-up=/etc --copy-up=/run --propagation=rslave /usr/bin/dockerd-rootless.sh
[rootlesskit:parent] error: failed to setup network &{logWriter:0xc000252ae0 binary:slirp4netns mtu:65520 ipnet:<nil> disableHostLoopback:true apiSocketPath: enableSandbox:true enableSeccomp:true enableIPv6:false ifname:tap0 infoMu:{w:{state:0 sema:0} writerSem:0 readerSem:0 readerCount:{_:{} v:0} readerWait:{_:{} v:0}} info:<nil>}: waiting for ready fd (/usr/bin/slirp4netns --mtu 65520 -r 3 --disable-host-loopback --enable-sandbox --enable-seccomp 159795 tap0): slirp4netns failed
[rootlesskit:child ] error: EOF

This points to a slirp4netns error. Looking at the logs, I see the following:

localhost dockerd-rootless.sh[159644]: + exec rootlesskit --state-dir=/run/user/1000/dockerd-rootless --net=slirp4netns --mtu=65520 --slirp4netns-sandbox=auto --slirp4netns-seccomp=auto --disable-host-loopback --port-driver=builtin --copy-up=/etc --copy-up=/run --propagation=rslave /usr/bin/dockerd-rootless.sh
May  8 03:14:15 localhost kernel: IPv6: ADDRCONF(NETDEV_UP): tap0: link is not ready
May  8 03:14:15 localhost kernel: IPv6: ADDRCONF(NETDEV_CHANGE): tap0: link becomes ready
May  8 03:14:16 localhost dockerd-rootless.sh[159644]: [rootlesskit:parent] error: failed to setup network &{logWriter:0xc000250ae0 binary:slirp4netns mtu:65520 ipnet:<nil> disableHostLoopback:true apiSocketPath: enableSandbox:true enableSeccomp:true enableIPv6:false ifname:tap0 infoMu:{w:{state:0 sema:0} writerSem:0 readerSem:0 readerCount:{_:{} v:0} readerWait:{_:{} v:0}} info:<nil>}: waiting for ready fd (/usr/bin/slirp4netns --mtu 65520 -r 3 --disable-host-loopback --enable-sandbox --enable-seccomp 159656 tap0): slirp4netns failed
May  8 03:14:16 localhost dockerd-rootless.sh[159656]: [rootlesskit:child ] error: EOF
May  8 03:14:16 localhost systemd[121379]: docker.service: Main process exited, code=exited, status=1/FAILURE
May  8 03:14:16 localhost systemd[121379]: docker.service: Killing process 159656 (exe) with signal SIGKILL.
May  8 03:14:16 localhost systemd[121379]: docker.service: Failed with result 'exit-code'.
May  8 03:14:16 localhost systemd[121379]: Failed to start Docker Application Container Engine (Rootless).
May  8 03:14:18 localhost systemd[121379]: docker.service: Service RestartSec=2s expired, scheduling restart.
May  8 03:14:18 localhost systemd[121379]: docker.service: Scheduled restart job, restart counter is at 3.
May  8 03:14:18 localhost systemd[121379]: Stopped Docker Application Container Engine (Rootless).
May  8 03:14:18 localhost systemd[121379]: docker.service: Start request repeated too quickly.
May  8 03:14:18 localhost systemd[121379]: docker.service: Failed with result 'exit-code'.
May  8 03:14:18 localhost systemd[121379]: Failed to start Docker Application Container Engine (Rootless).
May  8 03:14:43 localhost su[159679]: (to local-user) local-user on pts/2
May  8 03:16:38 localhost kernel: IPv6: ADDRCONF(NETDEV_UP): tap0: link is not ready
May  8 03:16:38 localhost kernel: IPv6: ADDRCONF(NETDEV_CHANGE): tap0: link becomes ready

While investigating, I noticed this line:
localhost dockerd-rootless.sh[159644]: [rootlesskit:parent] error: failed to setup network &{logWriter:0xc000250ae0 binary:slirp4netns mtu:65520 ipnet:<nil> disableHostLoopback:true apiSocketPath: enableSandbox:true enableSeccomp:true enableIPv6:false ifname:tap0 infoMu:{w:{state:0 sema:0} writerSem:0 readerSem:0 readerCount:{_:{} v:0} readerWait:{_:{} v:0}} info:<nil>}: waiting for ready fd (/usr/bin/slirp4netns --mtu 65520 -r 3 --disable-host-loopback --enable-sandbox --enable-seccomp 159656 tap0): slirp4netns failed

I suspect the PID (159656) passed to slirp4netns might be the problem, but I am not sure. I am stumped.
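
One way to narrow this down might be to rerun the manual launch with the slirp4netns sandbox and seccomp protections turned off, since the failing command line includes --enable-sandbox and --enable-seccomp. The sketch below assumes the DOCKERD_ROOTLESS_ROOTLESSKIT_SLIRP4NETNS_SANDBOX and DOCKERD_ROOTLESS_ROOTLESSKIT_SLIRP4NETNS_SECCOMP variables described in the header of dockerd-rootless.sh (they appear to be the two "auto" values in the trace above). The function name and its optional launcher argument are made up for illustration. This is a diagnostic step only, not a recommended configuration.

```shell
# Launch the rootless daemon with the slirp4netns sandbox and seccomp
# options forced off, for this one invocation only.
# $1 (optional, hypothetical): launcher to run, default dockerd-rootless.sh.
run_rootless_no_sandbox() {
    launcher="${1:-dockerd-rootless.sh}"
    # The assignments below apply only to the launched command's
    # environment; the calling shell is left unchanged.
    DOCKERD_ROOTLESS_ROOTLESSKIT_SLIRP4NETNS_SANDBOX=false \
    DOCKERD_ROOTLESS_ROOTLESSKIT_SLIRP4NETNS_SECCOMP=false \
    PATH="/usr/bin:/sbin:/usr/sbin:$PATH" \
        "$launcher"
}
```

Calling run_rootless_no_sandbox with no arguments starts dockerd-rootless.sh in the foreground. If the rootlesskit command line in the trace then shows --slirp4netns-sandbox=false and the network comes up, the sandbox setup is the likely culprit; if it still fails, the cause is elsewhere.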

Here is the server disk setup as well, with the OS running in memory and a locally attached disk named “scratch” for persistent storage:

Filesystem           Size  Used Avail Use% Mounted on
rootfs               189G  3.3G  186G   2% /
devtmpfs             189G     0  189G   0% /dev
tmpfs                189G     0  189G   0% /dev/shm
tmpfs                189G   18M  189G   1% /run
tmpfs                189G     0  189G   0% /sys/fs/cgroup
tmpfs                 38G     0   38G   0% /run/user/0
tmpfs                 38G     0   38G   0% /run/user/1000
/dev/mapper/scratch  931G  6.6G  925G   1% /scratch
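
To confirm what the kernel reports as the filesystem type of / (the df output above shows it as rootfs), something like the following could be run on both the diskless node and the working VM. This is a minimal sketch; the function name root_fstype and its optional file argument are illustrative only.

```shell
# Print the filesystem type of "/" as listed in a mounts table.
# The optional argument (default /proc/mounts) exists only so the
# function can be tried against a saved copy of the table.
root_fstype() {
    mounts_file="${1:-/proc/mounts}"
    # Field 2 is the mount point, field 3 the filesystem type.
    # If "/" is listed more than once, keep the last entry, which is
    # the mount currently on top.
    awk '$2 == "/" { fstype = $3 } END { if (fstype != "") print fstype }' "$mounts_file"
}
```

Calling root_fstype with no arguments reads /proc/mounts on the current machine, which makes it easy to compare the two environments side by side.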

When I follow these exact same steps on a VM with the OS installed to disk in the usual way, there are no issues at all.

Any help or troubleshooting advice is appreciated. I would use a different network driver, but slirp4netns is close to a hard requirement for me. I am also aware that other software is better suited to rootless container execution.