I’m running the latest release of Docker on Fedora 36 with the NVIDIA Container Toolkit. When I launch my CentOS container (or any other container) with my mounts and GPU access, it works fine. However, once I kill that container, trying to run any other container fails with the “docker is not an absolute path” error. Oddly, a reboot fixes it. Is there an error somewhere that I am missing? It only happens after I kill a container that was launched with the GPU flag.
What is the exact error message? “docker is not an absolute path” does not make sense to me, since docker is obviously not a path; it is software. Where exactly do you get the error message? Is it coming from a container or from the “docker” command?
If I run my CentOS container once, it works:
ai-fe RVEHost ~ /home/ai-fe/.docker/container-scripts/CentOS-7-x86_64/CentOS-7-Systemmode-Fedora35
non-network local connections being added to access control list
systemd 219 running in system mode. (+PAM +AUDIT +SELINUX +IMA -APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ +LZ4 -SECCOMP +BLKID +ELFUTILS +KMOD +IDN)
Detected virtualization docker.
Detected architecture x86-64.
Welcome to CentOS Linux 7 (Core)!
Set hostname to .
[ OK ] Started Dispatch Password Requests to Console Directory Watch.
[ OK ] Reached target Swap.
[ OK ] Created slice Root Slice.
[ OK ] Listening on /dev/initctl Compatibility Named Pipe.
[ OK ] Listening on Delayed Shutdown Socket.
[ OK ] Created slice User and Session Slice.
[ OK ] Created slice System Slice.
[ OK ] Reached target Slices.
[ OK ] Listening on Journal Socket.
Mounting Huge Pages File System…
Starting Read and set NIS domainname from /etc/sysconfig/network…
Starting Availability of block devices…
Mounting FUSE Control File System…
Starting Configure read-only root support…
Starting Journal Service…
[ OK ] Reached target Local File Systems (Pre).
[ OK ] Started Forward Password Requests to Wall Directory Watch.
[ OK ] Reached target Paths.
[ OK ] Reached target Local Encrypted Volumes.
[ OK ] Started Availability of block devices.
[ OK ] Started Journal Service.
Starting Flush Journal to Persistent Storage…
[ OK ] Mounted FUSE Control File System.
[ OK ] Mounted Huge Pages File System.
[ OK ] Started Read and set NIS domainname from /etc/sysconfig/network.
[ OK ] Started Configure read-only root support.
[ OK ] Reached target Local File Systems.
Starting Load/Save Random Seed…
[ OK ] Started Flush Journal to Persistent Storage.
Starting Create Volatile Files and Directories…
[ OK ] Started Create Volatile Files and Directories.
Starting Update UTMP about System Boot/Shutdown…
[ OK ] Started Load/Save Random Seed.
[ OK ] Started Update UTMP about System Boot/Shutdown.
[ OK ] Reached target System Initialization.
[ OK ] Listening on D-Bus System Message Bus Socket.
[ OK ] Reached target Sockets.
[ OK ] Reached target Basic System.
[ OK ] Started ABRT Automated Bug Reporting Tool.
Starting Builds and install new kernel modules through DKMS…
[ OK ] Started D-Bus System Message Bus.
Starting Builds and install new kmods from akmod packages…
Starting Login Service…
Starting Permit User Sessions…
Starting LSB: Bring up/down networking…
[ OK ] Started dnf makecache --timer.
[ OK ] Started Daily Cleanup of Temporary Directories.
[ OK ] Reached target Timers.
[ OK ] Started Permit User Sessions.
[ OK ] Started Job spooling tools.
[ OK ] Started Command Scheduler.
[ OK ] Started Console Getty.
[ OK ] Reached target Login Prompts.
Starting Cleanup of Temporary Directories…
[ OK ] Started Login Service.
[ OK ] Started Cleanup of Temporary Directories.
[ OK ] Started Builds and install new kmods from akmod packages.
[ OK ] Started Builds and install new kernel modules through DKMS.
[ OK ] Started LSB: Bring up/down networking.
[ OK ] Reached target Network.
[ OK ] Reached target Multi-User System.
Starting Update UTMP about System Runlevel Changes…
[ OK ] Reached target Network is Online.
Starting dnf makecache…
[ OK ] Started Update UTMP about System Runlevel Changes.
CentOS Linux 7 (Core)
Kernel 6.0.5-200.fc36.x86_64 on an x86_64
RVEContainer login:
However, if I kill the container and then try to run any other container, I get this error:
ai-fe RVEHost ~ /home/ai-fe/.docker/container-scripts/CentOS-7-x86_64/CentOS-7-Systemmode-Fedora35
non-network local connections being added to access control list
docker: is not an absolute path.
See 'docker run --help'.
This is why it is very important to see the exact error message. It is not just “docker is not an absolute path”, it is “docker:  is not an absolute path” with two spaces before “is”. The “docker: ” prefix shows that it is the docker command that reports the error, and the message says that a space, or perhaps an empty string, is not an absolute path. It could have been something like this:
docker: ./relative/path is not an absolute path
I don’t know why this error happens. How do you “kill” a container, and how do you run one? The error message indicates that the docker command received a parameter where it expects an absolute path, but the value it got was empty.
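One way to see what the docker command actually receives is to print the final argument list instead of running it. The following is a minimal sketch, not part of the original script: show_args is a hypothetical helper, and EMPTY_DEV is a made-up variable standing in for any value that unexpectedly expands to nothing. Whether docker reports exactly the message above for an empty device path is an assumption.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print each argument on its own line, wrapped in <>
# so that empty values are easy to spot.
show_args() {
  printf '<%s>\n' "$@"
}

# Illustrative values only. EMPTY_DEV stands in for a variable that
# unexpectedly expanded to an empty string.
IMAGE=c7-systemd:latest
EMPTY_DEV=""
VOLS="--volume=/tmp/.X11-unix:/tmp/.X11-unix:rw --device=$EMPTY_DEV "

# Same word splitting as the real command line, but nothing is started.
show_args docker run -it $VOLS "$IMAGE" /usr/sbin/init
```

In the printed list, the line <--device=> is a flag whose value is empty. Running the same check before and after the failing state would show which argument changes.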
#!/bin/bash
xhost +local:root
docker network ls | grep hostonly > /dev/null 2>&1
if [ $? -ne 0 ]; then
    echo Create host-only network for docker
    docker network create -d bridge --internal hostonly
fi
#user should be a member of video and render to get full access to gpu
#export XAUTH_PROTO=$(xauth list | grep "$(hostname -s)" | grep :0 | tail -1 | cut -d' ' -f3)
#export XAUTH_KEY=$(xauth list | grep "$(hostname -s)" | grep :0 | tail -1 | cut -d' ' -f5)
#Do xauth list | grep unix:0
#inside docker shell xauth add :0 MIT-MAGIC... digest..
IMAGE=c7-systemd:latest
GIDS=( $(id -G) ) #All of my groups
unset GIDS[0] #remove primary group
for g in "${GIDS[@]}"
do
    G+=" --group-add=$g"
done
#RM=""
RM=" --rm "
U=""
#U=" --user $(id -u):$(id -g) $G"
containeruser="ai-centOS"
#VOLS=' --volume=/etc/group:/etc/group:ro '
#VOLS+='--volume=/etc/passwd:/etc/passwd:ro '
#VOLS+='--volume=/etc/shadow:/etc/shadow:ro '
#VOLS+='--volume=/etc/sudoers.d:/etc/sudoers.d:ro '
VOLS+='--volume=/tmp/.X11-unix:/tmp/.X11-unix:rw '
VOLS+="--volume=/home/.docker-home/CentOS-7-x86_64/home/:/home "
VOLS+="--volume=/home/.docker-home/CentOS-7-x86_64/root/:/root "
VOLS+='--volume=/opt/.docker-opt/CentOS-7-x86_64:/opt '
VOLS+='--volume=/opt/.docker-opt/rhce-x86_64:/rhce '
VOLS+='--volume=/opt/.docker-opt/RootFS:/RootFS '
VOLS+='--volume=/run/media/ai-fedora:/mnt '
VOLS+="--device=/dev/dri "
VOLS+="--device=/dev/snd "
VOLS+="--device=/dev/vga_arbiter "
NVS=( $(ls /dev/nvidia* 2>/dev/null) )
for N in "${NVS[@]}"
do
    VOLS+="--device=$N "
done
# NET='--network=host '
NET='--network=c7-net'
docker run $RM -it --cap-add=SYS_ADMIN --cap-add=SYS_PTRACE -v /sys/fs/cgroup:/sys/fs/cgroup:ro --log-driver none --shm-size=1g --ulimit nofile=262144:262144 --gpus all $U --env="DISPLAY" --env="XAUTHORITY=$XAUTHORITY" --env="XDG_RUNTIME_DIR=$XDG_RUNTIME_DIR" $VOLS -w="/home/${containeruser}" --ipc="host" $NET -w="/home/$containeruser" --hostname="RVEContainer" --name="CentOS7" ${IMAGE} /usr/sbin/init
# EOF
I usually kill the container by opening another terminal and typing docker kill CentOS7. The kill succeeds. What I don’t understand is this: if docker fails to find some mount point, how does it work the first time around or after a reboot?
Please edit your post and share the commands and the script in separate code blocks instead of quotes. It is hard to read this way, and without code blocks the forum can alter parts of the code.
You should not kill containers. If CTRL+C in the terminal where the container runs doesn’t stop it, use docker stop containername in the other terminal. docker stop sends a TERM signal (or whatever stop signal the container requires) so the processes can shut down properly, while docker kill ends the container immediately with a KILL signal. Use it only when you really need the container gone immediately, regardless of what its processes are doing. For example, when a process is stuck in an infinite loop, you don’t want to wait 10 seconds until docker stop times out and eventually kills the container with the KILL signal.
I didn’t mention mounts. I still don’t know what causes the problem, but since you are using variables, it could be that one of your variables becomes empty. It could be a volume path, but it could also be XDG_RUNTIME_DIR or XAUTHORITY.
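The difference between the two signals can be illustrated without Docker. This is only an analogy in plain bash, under the assumption that a container's main process behaves like the worker function below (a hypothetical stand-in): a process can catch TERM and clean up, but it never gets the chance with KILL.

```bash
#!/usr/bin/env bash
# Hypothetical stand-in for a container's main process.
worker() {
  local marker=$1
  # On TERM: record that cleanup ran, then exit normally.
  trap "echo cleaned > '$marker'; exit 0" TERM
  while :; do sleep 0.1; done
}

tmp=$(mktemp -d)
worker "$tmp/term.marker" & pid_term=$!
worker "$tmp/kill.marker" & pid_kill=$!
sleep 0.5                      # give both workers time to install their traps

kill -TERM "$pid_term"         # roughly what a graceful stop sends first
kill -KILL "$pid_kill"         # cannot be caught, no cleanup possible
wait "$pid_term" || true
wait "$pid_kill" 2>/dev/null || true

term_cleaned=no
kill_cleaned=no
if [ -f "$tmp/term.marker" ]; then term_cleaned=yes; fi
if [ -f "$tmp/kill.marker" ]; then kill_cleaned=yes; fi
echo "after TERM: cleanup ran = $term_cleaned"
echo "after KILL: cleanup ran = $kill_cleaned"
rm -rf "$tmp"
```

Only the worker that received TERM leaves its marker file behind. With a real container, the equivalent of the graceful path is docker stop CentOS7.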
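A quick way to rule this out is to check the variables before docker run is called. The sketch below is one possible approach, not something from the original script: find_empty_vars is a hypothetical helper, and the list of names is taken from the --env options in the script above.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the name of every listed variable that is
# unset or empty. ${!name} is bash indirect expansion, i.e. the value of
# the variable whose name is stored in $name.
find_empty_vars() {
  local name
  for name in "$@"; do
    [ -n "${!name:-}" ] || printf '%s\n' "$name"
  done
}

# Names taken from the docker run line in this thread; extend as needed.
empty=$(find_empty_vars DISPLAY XAUTHORITY XDG_RUNTIME_DIR)
if [ -n "$empty" ]; then
  echo "Empty or unset variables:" $empty >&2
  # A real launcher script would probably exit here instead of continuing.
fi
```

If this prints nothing both before and after the failing state, the environment variables are not the cause and the volume and device options are the next place to look.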
Hello, so I reinstalled nvidia-container-toolkit and also removed the following:
NVS=( $(ls /dev/nvidia* 2>/dev/null) )
for N in "${NVS[@]}"
do
    VOLS+="--device=$N "
done
It seems this for loop results in an empty volume, hence the blank in the error message. It is supposed to find all nvidia device files in /dev, but I guess it fails. I manually added all the paths with VOLS+= in my script, and it works so far.
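One possible way the loop can misbehave, which the thread does not confirm: if one of the matches for /dev/nvidia* is a directory, ls prints a header line ending in a colon plus the bare names inside it, and all of that ends up in the array. The sketch below reproduces this with a temporary directory instead of the real /dev. The file names, including nvidia-caps, are assumptions for illustration, and DEVICE_ARGS is a hypothetical array name.

```bash
#!/usr/bin/env bash
# Fake /dev for illustration: two plain files and one directory.
fake_dev=$(mktemp -d)
touch "$fake_dev/nvidia0" "$fake_dev/nvidiactl"
mkdir "$fake_dev/nvidia-caps"
touch "$fake_dev/nvidia-caps/nvidia-cap1"

# Pattern from the thread: parse the output of ls into an array.
NVS=( $(ls "$fake_dev"/nvidia* 2>/dev/null) )
printf 'ls-based entry:  %s\n' "${NVS[@]}"
# The list contains ".../nvidia-caps:" (the header ls prints for a
# directory) and a bare "nvidia-cap1" without its directory.

# Alternative: let the shell expand the glob and keep the options in an array.
shopt -s nullglob              # an unmatched glob expands to nothing
DEVICE_ARGS=()
for dev in "$fake_dev"/nvidia*; do
  if [ -d "$dev" ]; then       # skip directories; on a real /dev one
    continue                   # could test [ -c "$dev" ] instead
  fi
  DEVICE_ARGS+=("--device=$dev")
done
shopt -u nullglob
printf 'glob-based arg:  %s\n' "${DEVICE_ARGS[@]}"

rm -rf "$fake_dev"
```

An entry such as --device=/dev/nvidia-caps: has nothing after the colon, which might explain an error with an empty value in front of “is not an absolute path”. With the array version, the options would be passed as "${DEVICE_ARGS[@]}" on the docker run line rather than appended to the VOLS string.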