Docker Community Forums

Share and learn in the Docker community.

How to execute docker-compose which has different services(--gpus)

I have a scenario where i should run services which include gpu and non-gpu.

For individual container run i’m can use

sudo docker run --gpus all --volume=docker-train:/src/ train

For docker-compose what can i include??

My docker-compose file looks as follows:

version 2.3 is required for NVIDIA runtime

version: ‘2.3’

services:
nvidia-driver:
# NVIDIA GPU driver used by the CUDA Toolkit
image: nvidia/driver:440.33.01-ubuntu18.04
environment:
- NVIDIA_VISIBLE_DEVICES=all
volumes:
# Do we need this volume to make the driver accessible by other containers in the network?
- nvidia_driver:/usr/local/nvidai/:ro # Taken from here: http://collabnix.com/deploying-application-in-the-gpu-accelerated-data-center-using-docker/
networks:
- net

nvidia-cuda:
depends_on:
- nvidia-driver
image: nvidia/cuda:10.1-base-ubuntu18.04
volumes:
# Do we need the driver volume here?
- nvidia_driver:/usr/local/nvidai/:ro # Taken from here: http://collabnix.com/deploying-application-in-the-gpu-accelerated-data-center-using-docker/
# Do we need to create an additional volume for this service to be accessible by the tensorflow service?
devices:
# Do we need to list the devices here, or only in the tensorflow service. Taken from here: http://collabnix.com/deploying-application-in-the-gpu-accelerated-data-center-using-docker/
- /dev/nvidiactl
- /dev/nvidia-uvm
- /dev/nvidia0
networks:
- net

tensorflow:
image: tensorflow/tensorflow:2.0.1-gpu # Does this ship with cuda10.0 installed or do I need a separate container for it?
runtime: nvidia
restart: always
privileged: true
depends_on:
- nvidia-cuda
environment:
- NVIDIA_VISIBLE_DEVICES=all
volumes:
# Volumes related to source code and config files
- ./src:/src
- ./configs:/configs
# Do we need the driver volume here?
- nvidia_driver:/usr/local/nvidai/:ro # Taken from here: http://collabnix.com/deploying-application-in-the-gpu-accelerated-data-center-using-docker/
# Do we need an additional volume from the nvidia-cuda service?
command: import tensorflow as tf; print(tf.reduce_sum(tf.random.normal([1000, 1000]))); print(“SUCCESS”)
devices:
# Devices listed here: http://collabnix.com/deploying-application-in-the-gpu-accelerated-data-center-using-docker/
- /dev/nvidiactl
- /dev/nvidia-uvm
- /dev/nvidia0
- /dev/nvidia-uvm-tools
networks:
- net

volumes:
nvidia_driver:

networks:
net:
driver: bridge