Nvidia/cuda doesn't work on Docker Desktop but works on Docker Engine

Hi. I’ve combed through the forum, the documentation and Google search results, but nothing I’ve read has solved my problem so far.

Issue
“docker run --gpus all nvidia/cuda:11.0.3-base-ubuntu20.04 nvidia-smi” on Docker Desktop gives the following error:

docker: Error response from daemon: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error running hook #0: error running hook: exit status 1, stdout: , stderr: Auto-detected mode as ‘legacy’

nvidia-container-cli: initialization error: load library failed: libnvidia-ml.so.1: cannot open shared object file: no such file or directory: unknown.

But, using docker engine only, the container works just fine.
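A quick way to narrow this down is to check whether the library named in the error is visible to the dynamic linker on the host. This is only a sketch: `lists_nvml` is a hypothetical helper name, and it assumes `ldconfig` is on your PATH. If the host reports the library as present while the Docker Desktop run still fails, that suggests the hook is looking somewhere other than the host filesystem.

```shell
#!/bin/sh
# Hypothetical helper: reads `ldconfig -p` style output on stdin and
# succeeds if libnvidia-ml.so.1 appears in it.
lists_nvml() {
    grep -q 'libnvidia-ml\.so\.1'
}

# Check the host's linker cache. If ldconfig is not on PATH, the pipe is
# empty and this reports "not found", so treat that result with care.
if ldconfig -p 2>/dev/null | lists_nvml; then
    echo "host: libnvidia-ml.so.1 is in the linker cache"
else
    echo "host: libnvidia-ml.so.1 not found in the linker cache"
fi
```

This only inspects the host; it says nothing about what is installed wherever the Docker Desktop daemon actually runs.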

OS Version/Build

Docker Desktop 4.13.0 (89412) → installed through DEB package.
OS: Ubuntu 22.04.1 LTS 64-bit
Nvidia driver: 520.56.06
Package: nvidia-container-toolkit
Version: 1.12.0~rc.1-1
Priority: optional
Section: utils
Maintainer: NVIDIA CORPORATION <cudatools@nvidia.com>
Installed-Size: 2,172 kB
Depends: nvidia-container-toolkit-base (= 1.12.0~rc.1-1), libnvidia-container-tools (>= 1.12.0~rc.1-1), libnvidia-container-tools (<< 2.0.0), libseccomp2
Breaks: nvidia-container-runtime (<= 3.5.0-1), nvidia-container-runtime-hook
Replaces: nvidia-container-runtime (<= 3.5.0-1), nvidia-container-runtime-hook

Docker Desktop Version

Docker Desktop 4.13.0 (89412)
Client:
Context: desktop-linux
Debug Mode: false
Plugins:
app: Docker App (Docker Inc., v0.9.1-beta3)
buildx: Docker Buildx (Docker Inc., v0.9.1-docker)
compose: Docker Compose (Docker Inc., v2.12.0)
dev: Docker Dev Environments (Docker Inc., v0.0.3)
extension: Manages Docker extensions (Docker Inc., v0.2.13)
sbom: View the packaged-based Software Bill Of Materials (SBOM) for an image (Anchore Inc., 0.6.0)
scan: Docker Scan (Docker Inc., v0.21.0)

Server:
Containers: 0
Running: 0
Paused: 0
Stopped: 0
Images: 1
Server Version: 20.10.20
Storage Driver: overlay2
Backing Filesystem: extfs
Supports d_type: true
Native Overlay Diff: true
userxattr: false
Logging Driver: json-file
Cgroup Driver: cgroupfs
Cgroup Version: 2
Plugins:
Volume: local
Network: bridge host ipvlan macvlan null overlay
Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
Swarm: inactive
Runtimes: io.containerd.runc.v2 io.containerd.runtime.v1.linux runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 9cd3357b7fd7218e4aec3eae239db1f68a5a6ec6
runc version: v1.1.4-0-g5fd4c4d
init version: de40ad0
Security Options:
seccomp
Profile: default
cgroupns
Kernel Version: 5.15.49-linuxkit
Operating System: Docker Desktop
OSType: linux
Architecture: x86_64
CPUs: 6
Total Memory: 7.589GiB
Name: docker-desktop
ID: SUBM:3FEQ:CJPR:6B5D:ZBWD:MHK2:7UMM:HUBY:LMML:ZBWO:DWCG:UKH3
Docker Root Dir: /var/lib/docker
Debug Mode: false
HTTP Proxy: http.docker.internal:3128
HTTPS Proxy: http.docker.internal:3128
No Proxy: hubproxy.docker.internal
Registry: https://index.docker.io/v1/
Labels:
Experimental: false
Insecure Registries:
hubproxy.docker.internal:5000
127.0.0.0/8
Live Restore Enabled: false

Steps to reproduce

  1. clean install of Ubuntu 22.04.1 LTS
  2. update, upgrade, install curl, nvidia-driver-520, qemu, uidmap, ca-certificates, gnupg, lsb-release
  3. add user to kvm group
  4. restart PC
  5. create folder /etc/apt/keyrings
  6. curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /etc/apt/keyrings/docker.gpg
  7. create /etc/apt/sources.list.d/docker.list and put in "deb [arch=amd64 signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/ubuntu jammy stable"
  8. apt update and install docker-desktop-4.13.0-amd64.deb (sudo apt-get install docker-desktop-4.13.0-amd64.deb)
  9. add user to docker (usermod -aG docker $USER), then “newgrp docker”
  10. Install Nvidia container toolkit
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg

curl -s -L https://nvidia.github.io/libnvidia-container/experimental/ubuntu22.04/libnvidia-container.list | sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
         sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list

sudo apt-get update && sudo apt-get install -y nvidia-container-toolkit
  11. Activate Docker Desktop, found in the Ubuntu applications.
  12. In the terminal:
docker run --rm --gpus all nvidia/cuda:11.0.3-base-ubuntu20.04 nvidia-smi

When using Docker Engine (installed via https://get.docker.com) it works without a hitch. I’ll get by using Docker Engine for now, but I’d like to be able to use Docker Desktop in the future. Hope someone can help me fix this problem. Is Docker Desktop for Linux still in beta?

Sorry for the lengthy post. Thank you for reading!


I’m running into the same issue. It appears to be a permissions problem. Running docker as sudo works if you’re pulling an image from elsewhere, however sudo doesn’t see local images unless they’re built with sudo. And, running docker with sudo raises other issues/concerns.

@fantasy4 I think for your issue, you have two options.

  1. Either add your username to the “docker” group. This will allow you to run docker commands without sudo.
  2. Or run the daemon in rootless mode.

Hope this helps.
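To see whether the first option has actually taken effect in your current shell, something like the following may help. It is a small sketch, and `in_current_groups` is a hypothetical helper name. It uses `id -nG` with no user argument, which reports the groups of the running session rather than what is stored in the group database, so it shows whether a re-login or `newgrp docker` is still needed after `usermod`.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the current shell session already
# carries the named group.
in_current_groups() {
    id -nG | tr ' ' '\n' | grep -qx "$1"
}

if in_current_groups docker; then
    echo "this session is in the docker group"
else
    echo "this session is NOT in the docker group (log out and back in, or run: newgrp docker)"
fi
```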

I’m facing the same issue and get exactly the same error. Only two of these three will work at a time:

  1. Docker Desktop
  2. CUDA Compatibility
  3. Rootless docker calls

I can set up Docker in rootless mode and get CUDA working fine. I can install Docker Desktop and run rootless, but then I get the libnvidia-ml.so.1 error. I can install Desktop + CUDA, but am forced to use sudo, which uses the non-Desktop context.
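The context difference can be made explicit by running the same test against each context by name instead of relying on sudo. A rough sketch, assuming both Docker Engine and Docker Desktop are installed, that the Desktop context is called `desktop-linux` (as in the `docker info` output in the first post) and that the Engine is reachable through the built-in `default` context; `gpu_smoke_test` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: run the GPU smoke test from the first post against
# a named Docker context, using the global --context flag.
gpu_smoke_test() {
    docker --context "$1" run --rm --gpus all \
        nvidia/cuda:11.0.3-base-ubuntu20.04 nvidia-smi
}

# Only attempt this when the docker CLI is installed.
if command -v docker >/dev/null 2>&1; then
    docker context ls
    for ctx in default desktop-linux; do
        echo "== context: $ctx =="
        gpu_smoke_test "$ctx" || echo "GPU test failed in context '$ctx'"
    done
fi
```

Going by the reports in this thread, the expectation is that `default` succeeds and `desktop-linux` fails with the libnvidia-ml.so.1 error. To switch the active context permanently, `docker context use default` or `docker context use desktop-linux` should work.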

I’ve spent a week trying to resolve this. Different distros, installation routes, even tried switching to Podman (which sadly doesn’t support compose natively). Nothing works.

An issue has been filed over on the Nvidia side here

Good to see that someone understands the relation between contexts, Docker Desktop and Docker CE. I wanted to post it, but now I don’t have to.

I don’t use GPU in Docker Desktop, but I tried it on Windows before, when I read that WSL 2 started to support GPU.

I have no idea if the GPU works in Docker Desktop on any other operating system. I know you could pass the host’s GPU through to a KVM machine, but I don’t know if that is implemented in Docker Desktop.

Somehow I don’t think it is an issue with nvidia-docker. Working with Docker Desktop is almost like working with a remote machine, so I guess the GPU support needs to exist inside Docker Desktop’s virtual machine, not just on your physical host.
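The “remote machine” point can be checked by comparing the kernel the daemon reports with the kernel of the host. The `docker info` output in the first post shows `Kernel Version: 5.15.49-linuxkit`, which is not an Ubuntu kernel. A minimal sketch; `is_linuxkit_kernel` is a hypothetical helper name, and matching on the string "linuxkit" is a heuristic, not an official test.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if a kernel version string looks like the
# LinuxKit kernel that Docker Desktop's VM reports.
is_linuxkit_kernel() {
    case "$1" in
        *linuxkit*) return 0 ;;
        *) return 1 ;;
    esac
}

host_kernel=$(uname -r)
echo "host kernel:   $host_kernel"

# Query the daemon behind the active context only if the CLI is installed
# and the daemon answers.
if command -v docker >/dev/null 2>&1 &&
   daemon_kernel=$(docker info --format '{{.KernelVersion}}' 2>/dev/null); then
    echo "daemon kernel: $daemon_kernel"
    if is_linuxkit_kernel "$daemon_kernel"; then
        echo "The active context points at a LinuxKit VM, not at the host."
    fi
fi
```

If the two kernels differ, containers started through that context run inside the VM, so drivers and libraries installed on the host are not automatically visible to them.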

Please, anyone correct me if I am wrong :)

Maybe a won’t-fix issue? Docker Desktop for Linux uses qemu. If the GPU could be supported in Docker Desktop, it would mean qemu can use the GPU as well. As far as I know, passthrough currently cannot work for both the VM and the host at the same time with a single GPU.

So, Docker Desktop for Linux supports GPU = qemu can use GPU = gaming on Linux with a Windows VM = won’t fix?

Maybe this is a reason to prefer Docker Engine. I currently use Podman Desktop (podman is needed, but I don’t want to switch) to manage docker containers and images.
