What is the latest proper way to use the Nvidia Container Toolkit with docker compose?

What is the equivalent of this docker command in Docker Compose?

docker run --rm -it --device=nvidia.com/gpu=all ubuntu:latest nvidia-smi

That command works for me:

+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.78                 Driver Version: 550.78         CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 4060 Ti     Off |   00000000:01:00.0 Off |                  N/A |
| 30%   28C    P0             26W /  165W |       1MiB /  16380MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|  No running processes found                                                             |
+-----------------------------------------------------------------------------------------+

However, with the following docker-compose.yml:

services:
  testing:
    image: ubuntu:latest
    command: nvidia-smi
    environment:
      NVIDIA_VISIBLE_DEVICES: all
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]

… and running docker-compose up I get the following output:

[+] Running 2/2
 ✔ Network testing_default      Created                                                                             0.1s
 ✔ Container testing-testing-1  Created                                                                             0.1s
Attaching to testing-1
Error response from daemon: could not select device driver "nvidia" with capabilities: [[gpu]]

It seems like things have been in flux. The above YAML worked for me with nvidia-docker, but not with the nvidia-container-toolkit. I did have to specify runtime: nvidia before, but now when I specify that I get Error response from daemon: unknown or invalid runtime name: nvidia.
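
For reference, this is roughly what the legacy setup looked like (a sketch from memory; it assumes the nvidia runtime is still registered with the Docker daemon, which it apparently no longer is on my machine):

services:
  testing:
    image: ubuntu:latest
    command: nvidia-smi
    runtime: nvidia
    environment:
      NVIDIA_VISIBLE_DEVICES: all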

Of course, the only machine on which I had an NVIDIA GPU broke yesterday… so I can't test, but I can say that nvidia-docker included nvidia-container-runtime too, and both of those projects are now archived.

The difference I see between your docker run command and your compose file is that the docker run command refers to all GPUs, while the compose file only asks for one. If you have another integrated GPU on the motherboard, count: 1 could actually mean that one. Try count: all or use device_ids, as sketched below.
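
For example (untested), the device reservation could look like:

deploy:
  resources:
    reservations:
      devices:
        - driver: nvidia
          count: all
          capabilities: [gpu]

or, to pin a specific card, replace count: all with something like device_ids: ['0'].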

By the way, for all GPUs with the docker run command, you can also use --gpus all instead of the --device option.
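
That is, something like:

docker run --rm -it --gpus all ubuntu:latest nvidia-smi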

Thanks to ereslibre on GitHub, I have a solution!

services:
  testing:
    image: ubuntu:latest
    command: nvidia-smi
    deploy:
      resources:
        reservations:
          devices:
            - driver: cdi
              device_ids:
                - nvidia.com/gpu=all
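
Note that nvidia.com/gpu=all is a CDI device name, so this assumes a CDI specification for the GPUs already exists on the host. If it doesn't, it can typically be generated with the toolkit (paths may vary by install):

sudo nvidia-ctk cdi generate --output=/etc/cdi/nvidia.yaml
# list the generated CDI device names
nvidia-ctk cdi list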

So Docker was installed using Nix?