Docker is not an absolute path, error only occurs after killing a container

Hello everyone,

I’m running the latest release of docker on Fedora 36 with nvidia container toolkit. When I launch my centos or any other container with my mounts and gpu, it works fine. However, once I kill my container, trying to run any other container fails with the docker is not an absolute path error. What’s odd is that a reboot fixes it. Is there an error somewhere I am missing? It only happens after I kill a container that was launched with the gpu flag.

What is the exact error message? “docker is not an absolute path” does not make sense to me, since docker is not a path obviously. It is a software. Where do you exactly get the error message? Is it coming from a container or from the “docker” command?

It’s from running a container, actually from running any container after I kill one.

Here is full info:

 ai-fe  RVEHost  ~  docker info
Client:
 Context:    default
 Debug Mode: false
 Plugins:
  app: Docker App (Docker Inc., v0.9.1-beta3)
  buildx: Docker Buildx (Docker Inc., v0.9.1-docker)
  compose: Docker Compose (Docker Inc., v2.12.2)
  scan: Docker Scan (Docker Inc., v0.21.0)

Server:
 Containers: 0
  Running: 0
  Paused: 0
  Stopped: 0
 Images: 12
 Server Version: 20.10.21
 Storage Driver: btrfs
  Build Version: Btrfs v6.0
  Library Version: 102
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Cgroup Version: 1
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 io.containerd.runtime.v1.linux nvidia runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 1c90a442489720eec95342e1789ee8a5e1b9536f
 runc version: v1.1.4-0-g5fd4c4d
 init version: de40ad0
 Security Options:
  seccomp
   Profile: default
 Kernel Version: 6.0.5-200.fc36.x86_64
 Operating System: Fedora Linux 36 (Thirty Six)
 OSType: linux
 Architecture: x86_64
 CPUs: 16
 Total Memory: 15.53GiB
 Name: RVEHost
 ID: JWQB:PUBU:2IOE:G32N:3Q4K:EXJJ:5CXV:R7QF:KUMV:T47C:KWFU:TNSJ
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Registry: https://index.docker.io/v1/
 Labels:
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false

Now if I run my CentOS container once, it works:

ai-fe  RVEHost  ~  /home/ai-fe/.docker/container-scripts/CentOS-7-x86_64/CentOS-7-Systemmode-Fedora35
non-network local connections being added to access control list
systemd 219 running in system mode. (+PAM +AUDIT +SELINUX +IMA -APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ +LZ4 -SECCOMP +BLKID +ELFUTILS +KMOD +IDN)
Detected virtualization docker.
Detected architecture x86-64.

Welcome to CentOS Linux 7 (Core)!

Set hostname to <RVEContainer>.
[ OK ] Started Dispatch Password Requests to Console Directory Watch.
[ OK ] Reached target Swap.
[ OK ] Created slice Root Slice.
[ OK ] Listening on /dev/initctl Compatibility Named Pipe.
[ OK ] Listening on Delayed Shutdown Socket.
[ OK ] Created slice User and Session Slice.
[ OK ] Created slice System Slice.
[ OK ] Reached target Slices.
[ OK ] Listening on Journal Socket.
Mounting Huge Pages File System…
Starting Read and set NIS domainname from /etc/sysconfig/network…
Starting Availability of block devices…
Mounting FUSE Control File System…
Starting Configure read-only root support…
Starting Journal Service…
[ OK ] Reached target Local File Systems (Pre).
[ OK ] Started Forward Password Requests to Wall Directory Watch.
[ OK ] Reached target Paths.
[ OK ] Reached target Local Encrypted Volumes.
[ OK ] Started Availability of block devices.
[ OK ] Started Journal Service.
Starting Flush Journal to Persistent Storage…
[ OK ] Mounted FUSE Control File System.
[ OK ] Mounted Huge Pages File System.
[ OK ] Started Read and set NIS domainname from /etc/sysconfig/network.
[ OK ] Started Configure read-only root support.
[ OK ] Reached target Local File Systems.
Starting Load/Save Random Seed…
[ OK ] Started Flush Journal to Persistent Storage.
Starting Create Volatile Files and Directories…
[ OK ] Started Create Volatile Files and Directories.
Starting Update UTMP about System Boot/Shutdown…
[ OK ] Started Load/Save Random Seed.
[ OK ] Started Update UTMP about System Boot/Shutdown.
[ OK ] Reached target System Initialization.
[ OK ] Listening on D-Bus System Message Bus Socket.
[ OK ] Reached target Sockets.
[ OK ] Reached target Basic System.
[ OK ] Started ABRT Automated Bug Reporting Tool.
Starting Builds and install new kernel modules through DKMS…
[ OK ] Started D-Bus System Message Bus.
Starting Builds and install new kmods from akmod packages…
Starting Login Service…
Starting Permit User Sessions…
Starting LSB: Bring up/down networking…
[ OK ] Started dnf makecache --timer.
[ OK ] Started Daily Cleanup of Temporary Directories.
[ OK ] Reached target Timers.
[ OK ] Started Permit User Sessions.
[ OK ] Started Job spooling tools.
[ OK ] Started Command Scheduler.
[ OK ] Started Console Getty.
[ OK ] Reached target Login Prompts.
Starting Cleanup of Temporary Directories…
[ OK ] Started Login Service.
[ OK ] Started Cleanup of Temporary Directories.
[ OK ] Started Builds and install new kmods from akmod packages.
[ OK ] Started Builds and install new kernel modules through DKMS.
[ OK ] Started LSB: Bring up/down networking.
[ OK ] Reached target Network.
[ OK ] Reached target Multi-User System.
Starting Update UTMP about System Runlevel Changes…
[ OK ] Reached target Network is Online.
Starting dnf makecache…
[ OK ] Started Update UTMP about System Runlevel Changes.

CentOS Linux 7 (Core)
Kernel 6.0.5-200.fc36.x86_64 on an x86_64

RVEContainer login:

However, if I kill the container and try to run any other container, I get this error:

 ai-fe  RVEHost  ~  /home/ai-fe/.docker/container-scripts/CentOS-7-x86_64/CentOS-7-Systemmode-Fedora35 
non-network local connections being added to access control list
docker:  is not an absolute path.
See 'docker run --help'.

So this is what I meant

This is why it is very important to see the exact error message. It is not just “docker is not an absolute path”, it is “docker:  is not an absolute path” with two spaces before “is”. The "docker: " prefix shows you that it is docker that gives you the error message, and the message says that a space or maybe an empty string is not an absolute path. It could have been something like this:

docker: ./relative/path is not an absolute path

I don’t know why this error happens. How do you “kill” a container and how do you run one? The error message indicates that the docker command uses a parameter where it expects an absolute path, but it gets nothing.

This is my script for launching the docker

#!/bin/bash

xhost +local:root

docker network ls | grep hostonly > /dev/null 2>&1
if [ $? -ne 0 ]; then
    echo Create host-only network for docker
    docker network create -d bridge --internal hostonly
fi

#user should be a member of video and render to get full access to gpu

#export XAUTH_PROTO=$(xauth list | grep `hostname -s` | grep :0 |tail -1 |cut -d' ' -f3)
#export XAUTH_KEY=$(xauth list | grep `hostname -s` | grep :0 |tail -1 |cut -d' ' -f5)
#Do xauth list | grep unix:0
#inside docker shell xauth add :0 MIT-MAGIC... digest..

IMAGE=c7-systemd:latest

GIDS=( $(id -G) ) #All of my groups
unset GIDS[0] #remove primary group

for g in "${GIDS[@]}"
do
    G+=" --group-add=$g"
done

#RM=""
RM=" --rm "

U=""
#U=" --user $(id -u):$(id -g) $G"
containeruser="ai-centOS"

#VOLS=' --volume=/etc/group:/etc/group:ro '
#VOLS+='--volume=/etc/passwd:/etc/passwd:ro '
#VOLS+='--volume=/etc/shadow:/etc/shadow:ro '
#VOLS+='--volume=/etc/sudoers.d:/etc/sudoers.d:ro '
VOLS+='--volume=/tmp/.X11-unix:/tmp/.X11-unix:rw '
VOLS+="--volume=/home/.docker-home/CentOS-7-x86_64/home/:/home "
VOLS+="--volume=/home/.docker-home/CentOS-7-x86_64/root/:/root "
VOLS+='--volume=/opt/.docker-opt/CentOS-7-x86_64:/opt '
VOLS+='--volume=/opt/.docker-opt/rhce-x86_64:/rhce '
VOLS+='--volume=/opt/.docker-opt/RootFS:/RootFS '
VOLS+='--volume=/run/media/ai-fedora:/mnt '
VOLS+="--device=/dev/dri "
VOLS+="--device=/dev/snd "
VOLS+="--device=/dev/vga_arbiter "

NVS=( $(ls /dev/nvidia* 2>/dev/null) )
for N in "${NVS[@]}"
do
    VOLS+="--device=$N "
done

# NET='--network=host '
NET='--network=c7-net'

docker run $RM -it --cap-add=SYS_ADMIN --cap-add=SYS_PTRACE -v /sys/fs/cgroup:/sys/fs/cgroup:ro --log-driver none --shm-size=1g --ulimit nofile=262144:262144 --gpus all $U --env="DISPLAY" --env="XAUTHORITY=$XAUTHORITY" --env="XDG_RUNTIME_DIR=$XDG_RUNTIME_DIR" $VOLS -w="/home/${containeruser}" --ipc="host" $NET -w="/home/$containeruser" --hostname="RVEContainer" --name="CentOS7" ${IMAGE} /usr/sbin/init
# EOF

And I usually kill the container by opening another terminal and typing in docker kill CentOS7. The kill happens successfully. What I don’t understand is if docker fails to find some mount point, how does it work the first time around or after a reboot?

Please, edit your post and share the commands and the script in different code blocks instead of quotes, since it is hard to read this way and the forum can change some parts of the code without using code blocks.

You should not kill containers. If CTRL+C doesn’t stop the container in the terminal where it runs, use docker stop containername in the other terminal. docker stop will send a TERM signal (or whatever the container requires) so the container can stop properly, while docker kill just kills the container with a KILL signal immediately. Use it only when you really need to kill the container immediately regardless of what the processes are doing in it. For example, when you have an infinite loop, you don’t want to wait 10 seconds until docker stop times out and eventually kills the container with the KILL signal.
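To illustrate the difference between the two signals without involving Docker at all, here is a minimal bash sketch. The `worker` function and the marker file are hypothetical names made up for this example; the worker stands in for a process that cleans up when it is asked to terminate.

```shell
#!/bin/bash
# Sketch only: "worker" and "marker" are illustrative names, not part of Docker.
marker=$(mktemp)

worker() {
    # On TERM, record that cleanup ran, then exit normally.
    trap 'echo cleaned > "$marker"; exit 0' TERM
    while true; do sleep 0.1; done
}

worker &
pid=$!
sleep 0.5                 # give the worker time to install its trap
kill -TERM "$pid"         # the signal docker stop sends first
wait "$pid" || true
term_result=$(cat "$marker")

: > "$marker"             # reset the marker file
worker &
pid=$!
sleep 0.5
kill -KILL "$pid"         # the signal docker kill sends
wait "$pid" 2>/dev/null || true
kill_result=$(cat "$marker")

echo "after TERM: '${term_result}'"
echo "after KILL: '${kill_result}'"
```

The TERM run leaves the cleanup marker behind, while the KILL run leaves it empty because the process never gets a chance to run its handler.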

I didn’t mention mounts. I still don’t know what causes the problem, but since you are using variables, it could be that one of your variables becomes empty. It could be a volume path, but it could also be XDG_RUNTIME_DIR or XAUTHORITY.
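A quick way to test that theory is to check the variables before the docker command runs. The sketch below is only one way to do it; `check_vars` is a hypothetical helper name, and the variable list is taken from the script above.

```shell
#!/bin/bash
# check_vars is a hypothetical helper: it reports every named variable
# that is unset or empty and returns non-zero if it found any.
check_vars() {
    local name missing=0
    for name in "$@"; do
        # ${!name} is bash indirect expansion: the value of the variable named $name.
        if [ -z "${!name:-}" ]; then
            echo "empty or unset: $name" >&2
            missing=1
        fi
    done
    return "$missing"
}

check_vars DISPLAY XAUTHORITY XDG_RUNTIME_DIR \
    || echo "at least one variable used by the run command is empty"
```

Calling it right before `docker run` in the launch script shows whether the environment differs between the first run and the failing run.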

I used docker stop, same error. I removed all my additional volumes and still got the same error. Even docker run hello-world failed.

Have you checked those variables? Try to put an “echo” before the whole command so instead of running it, you will show the command:

echo docker run $RM -it --cap-add=SYS_ADMIN --cap-add=SYS_PTRACE -v /sys/fs/cgroup:/sys/fs/cgroup:ro --log-driver none --shm-size=1g --ulimit nofile=262144:262144 --gpus all $U --env="DISPLAY" --env="XAUTHORITY=$XAUTHORITY" --env="XDG_RUNTIME_DIR=$XDG_RUNTIME_DIR" $VOLS -w="/home/${containeruser}" --ipc="host" $NET -w="/home/$containeruser" --hostname="RVEContainer" --name="CentOS7" ${IMAGE} /usr/sbin/init
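One limitation of plain echo is that it joins everything with spaces, so an empty argument is easy to miss. As a variation on the same idea, the sketch below prints each argument on its own line in brackets; `show_args` is a hypothetical helper name, not a docker feature.

```shell
#!/bin/bash
# show_args is a hypothetical helper: it prints every argument it receives
# inside brackets, one per line, so an empty argument shows up as [].
show_args() {
    printf '[%s]\n' "$@"
}

EMPTY=""
# Stand-in for the real command line; put show_args where docker would be.
show_args run --env="X=$EMPTY" -w "$EMPTY" image
```

In the launch script, replacing the word docker with show_args on the run line would list the exact arguments docker receives, including any that expand to nothing.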

Hello, so I reinstalled nvidia-container-toolkit and I also removed the following

NVS=( $(ls /dev/nvidia* 2>/dev/null) )
for N in "${NVS[@]}"
do
    VOLS+="--device=$N "
done

It seems like this for loop results in an empty volume, hence the blank in the error. It is supposed to find all nvidia devices in /dev, but I guess it fails. I manually added all the paths with VOLS+= in my script and it works so far.
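One possible explanation, which is not confirmed anywhere in this thread, is that the glob starts matching a directory (for example something like /dev/nvidia-caps appearing after the first GPU container runs). When ls is given several names and one of them is a directory, it prints a "directory:" header followed by bare file names, and the loop would turn those words into --device flags with nothing after the colon or with a relative path. The sketch below reproduces that in a scratch directory and shows a glob-based alternative; `device_flags` is a hypothetical helper, and the scratch files only imitate device names.

```shell
#!/bin/bash
# device_flags is a hypothetical helper: it prints one --device flag per
# non-directory entry matching nvidia* in the given directory.
device_flags() {
    local dir=$1 path
    shopt -s nullglob                  # an unmatched glob expands to nothing
    for path in "$dir"/nvidia*; do
        if [ ! -d "$path" ]; then      # skip directories, keep everything else
            printf '%s\n' "--device=$path"
        fi
    done
    shopt -u nullglob                  # note: this turns nullglob off unconditionally
}

# Scratch directory imitating /dev (regular files stand in for device nodes).
scratch=$(mktemp -d)
touch "$scratch/nvidia0" "$scratch/nvidiactl"
mkdir "$scratch/nvidia-caps"
touch "$scratch/nvidia-caps/nvidia-cap1"

# What the ls-based loop iterates over once a directory matches the glob.
ls_words=( $(ls "$scratch"/nvidia* 2>/dev/null) )
printf 'ls word: %s\n' "${ls_words[@]}"

# What the glob-based helper produces for the same directory.
mapfile -t DEVICE_FLAGS < <(device_flags "$scratch")
printf 'flag: %s\n' "${DEVICE_FLAGS[@]}"
```

With the real /dev, the array could then be passed as "${DEVICE_FLAGS[@]}" on the docker run line instead of appending to a string, and a stricter check such as [ -c "$path" ] could be used to keep only character devices.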