Docker can only see GPUs with the --privileged flag

Docker can see GPUs when I use the --privileged flag:

$ docker run --rm --runtime=nvidia --gpus all --privileged nvidia/cuda:12.6.1-base-ubuntu22.04 nvidia-smi
Thu Sep 12 09:33:40 2024
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 560.35.03              Driver Version: 560.35.03      CUDA Version: 12.6     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA A100 80GB PCIe          Off |   00000000:18:00.0 Off |                    0 |
| N/A   40C    P0             48W /  300W |      14MiB /  81920MiB |      0%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+
|   1  NVIDIA A100 80GB PCIe          Off |   00000000:51:00.0 Off |                    0 |
| N/A   43C    P0             49W /  300W |      14MiB /  81920MiB |      0%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|  No running processes found                                                             |
+-----------------------------------------------------------------------------------------+

However, without the --privileged flag, it does not see them:

$ docker run --rm --runtime=nvidia --gpus all nvidia/cuda:12.6.1-base-ubuntu22.04 nvidia-smi
Failed to initialize NVML: Unknown Error
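
To make the comparison easy to repeat, here is a minimal sketch that runs both cases back to back and prints a one-word status for each. The image tag and docker flags are copied from the two commands above. The function names (classify_smi_output, run_case) and the --run switch are made up for this example, and the classification only matches the two outputs seen in this post.

```shell
#!/usr/bin/env bash
# Sketch: compare GPU visibility with and without --privileged.
# IMAGE and the docker flags come from the commands in this post; the helper
# names and the --run switch are hypothetical, chosen for illustration only.

IMAGE="nvidia/cuda:12.6.1-base-ubuntu22.04"

# Map the text printed by nvidia-smi to a short status word.
classify_smi_output() {
  case "$1" in
    *"Failed to initialize NVML"*) echo "nvml-error" ;;
    *"NVIDIA-SMI"*)                echo "gpus-visible" ;;
    *)                             echo "unknown" ;;
  esac
}

# Run nvidia-smi in a container; extra arguments are passed to docker run.
run_case() {
  local label="$1"
  shift
  local out
  out="$(docker run --rm --runtime=nvidia --gpus all "$@" "$IMAGE" nvidia-smi 2>&1)"
  printf '%-22s %s\n' "$label" "$(classify_smi_output "$out")"
}

# Only start containers when explicitly asked, so the file can be sourced safely.
if [ "${1:-}" = "--run" ]; then
  run_case "with --privileged" --privileged
  run_case "without --privileged"
fi
```

Saved under a hypothetical name such as gpu-check.sh, running `bash gpu-check.sh --run` prints one status line per case.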

Information about the setup:

$ docker version
Client: Docker Engine - Community
 Version:           27.2.1
 API version:       1.47
 Go version:        go1.22.7
 Git commit:        9e34c9b
 Built:             Fri Sep  6 12:08:10 2024
 OS/Arch:           linux/amd64
 Context:           default

Server: Docker Engine - Community
 Engine:
  Version:          27.2.1
  API version:      1.47 (minimum version 1.24)
  Go version:       go1.22.7
  Git commit:       8b539b8
  Built:            Fri Sep  6 12:08:10 2024
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.7.22
  GitCommit:        7f7fdf5fed64eb6a7caf99b3e12efcf9d60e311c
 nvidia:
  Version:          1.1.14
  GitCommit:        v1.1.14-0-g2c9f560
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0

$ docker info
Client: Docker Engine - Community
 Version:    27.2.1
 Context:    default
 Debug Mode: false
 Plugins:
  buildx: Docker Buildx (Docker Inc.)
    Version:  v0.16.2
    Path:     /usr/libexec/docker/cli-plugins/docker-buildx
  compose: Docker Compose (Docker Inc.)
    Version:  v2.29.2
    Path:     /usr/libexec/docker/cli-plugins/docker-compose

Server:
 Containers: 5
  Running: 0
  Paused: 0
  Stopped: 5
 Images: 5
 Server Version: 27.2.1
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Using metacopy: false
  Native Overlay Diff: true
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: systemd
 Cgroup Version: 2
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 nvidia runc
 Default Runtime: nvidia
 Init Binary: docker-init
 containerd version: 7f7fdf5fed64eb6a7caf99b3e12efcf9d60e311c
 runc version: v1.1.14-0-g2c9f560
 init version: de40ad0
 Security Options:
  apparmor
  seccomp
   Profile: builtin
  cgroupns
 Kernel Version: 6.8.0-40-generic
 Operating System: Ubuntu 22.04.4 LTS
 OSType: linux
 Architecture: x86_64
 CPUs: 48
 Total Memory: 188.1GiB
 ID: 1e65b890-0da9-4cee-9e43-02a7eefd1ad3
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false

$ nvidia-smi
Thu Sep 12 13:59:14 2024
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 560.35.03              Driver Version: 560.35.03      CUDA Version: 12.6     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA A100 80GB PCIe          Off |   00000000:18:00.0 Off |                    0 |
| N/A   42C    P0             49W /  300W |      14MiB /  81920MiB |      0%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+
|   1  NVIDIA A100 80GB PCIe          Off |   00000000:51:00.0 Off |                    0 |
| N/A   45C    P0             49W /  300W |      14MiB /  81920MiB |      0%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    0   N/A  N/A      1993      G   /usr/lib/xorg/Xorg                              4MiB |
|    1   N/A  N/A      1993      G   /usr/lib/xorg/Xorg                              4MiB |
+-----------------------------------------------------------------------------------------+

$ cat /proc/driver/nvidia/version
NVRM version: NVIDIA UNIX Open Kernel Module for x86_64 560.35.03 Release Build (dvs-builder@U16-I1-N07-12-3) Fri Aug 16 21:42:42 UTC 2024
GCC version: gcc version 13.1.0 (Ubuntu 13.1.0-8ubuntu1~22.04)

$ lspci | grep -i nvidia
18:00.0 3D controller: NVIDIA Corporation GA100 [A100 PCIe 80GB] (rev a1)
51:00.0 3D controller: NVIDIA Corporation GA100 [A100 PCIe 80GB] (rev a1)

$ uname -a
Linux 6.8.0-40-generic #40~22.04.3-Ubuntu SMP PREEMPT_DYNAMIC Tue Jul 30 17:30:19 UTC 2 x86_64 x86_64 x86_64 GNU/Linux

$ cat /etc/os-release
PRETTY_NAME="Ubuntu 22.04.4 LTS"
NAME="Ubuntu"
VERSION_ID="22.04"
VERSION="22.04.4 LTS (Jammy Jellyfish)"
VERSION_CODENAME=jammy
ID=ubuntu
ID_LIKE=debian

$ dpkg -l | grep nvidia-container-toolkit
ii nvidia-container-toolkit 1.16.1-1 amd64 NVIDIA Container toolkit
ii nvidia-container-toolkit-base 1.16.1-1 amd64 NVIDIA Container Toolkit Base