I am trying to run a Docker container using `nvidia/cuda:11.8.0-base-ubuntu22.04` as the base image, with PyTorch and CUDA-enabled dependencies, to execute a FastAPI application. The application works perfectly on my local machine and correctly detects CUDA. However, inside the container, `torch.cuda.is_available()` consistently returns `False`, and the message "CUDA is not available" is logged. The container otherwise runs correctly.
Environment Setup
Local Environment
- OS: Windows 11 with WSL2 enabled.
- CUDA Toolkit: `11.8.0`.
- GPU: NVIDIA RTX 3070.
- NVIDIA Driver Version: `560.94`.
- PyTorch Version: `2.0.1+cu118`.
Docker Environment
- Base Image: `nvidia/cuda:11.8.0-base-ubuntu22.04`.
- Docker Desktop: WSL2 backend with Ubuntu as the WSL integration.
- NVIDIA Container Toolkit installed: verified using `docker run --rm --gpus all nvidia/cuda:11.8.0-base-ubuntu22.04 nvidia-smi`.
What Has Been Tried So Far
1. Verified GPU Access in Docker
- Ran the following command to confirm that Docker can detect the GPU:
```bash
docker run --rm --gpus all nvidia/cuda:11.8.0-base-ubuntu22.04 nvidia-smi
```
- Output: the GPU is correctly detected, and the NVIDIA driver and CUDA versions are displayed (how the application container itself is launched is sketched after this step).
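The `nvidia-smi` test above only proves that an ad-hoc container can see the GPU; the FastAPI container needs the same `--gpus all` flag when it is started. A minimal sketch of building and launching the application container (the `stylecanvasai` image tag is just a placeholder, not the real name):

```bash
# Build the image and run it with the GPU explicitly requested; without
# --gpus all the container gets no GPU devices even if the image is correct.
docker build -t stylecanvasai .
docker run --rm --gpus all -p 8000:8000 stylecanvasai
```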
2. Ensured Compatibility Between CUDA and PyTorch
- Used PyTorch with CUDA version 11.8 (`2.0.1+cu118`) in both local and containerized environments.
- Updated the `DiffI2I_Environment.yml` file to specify (a comparison install line is sketched after the excerpt):
```yaml
dependencies:
  - python=3.9
  - cudatoolkit=11.8.0
  - pytorch=2.0.1
  - torchvision=0.15.2
  - torchaudio=2.0.2
  # Other dependencies
```
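The excerpt does not show which channels the environment file uses. If conda resolves `pytorch=2.0.1` to a CPU-only build (which can happen when the CUDA variant is not pinned), the symptoms would match exactly. For comparison, the install line documented for this PyTorch release pins the CUDA variant through the `pytorch-cuda` metapackage rather than `cudatoolkit`:

```bash
# Pinning PyTorch 2.0.1 to the CUDA 11.8 build via the pytorch and nvidia
# channels; pytorch-cuda prevents the solver from choosing a CPU-only build.
conda install pytorch==2.0.1 torchvision==0.15.2 torchaudio==2.0.2 \
    pytorch-cuda=11.8 -c pytorch -c nvidia
```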
3. Updated Dockerfile
Revised the Dockerfile to ensure compatibility and include the necessary steps:
```dockerfile
FROM nvidia/cuda:11.8.0-base-ubuntu22.04

# Install Miniconda
WORKDIR /app
RUN apt-get update && apt-get install -y wget bzip2 build-essential libgl1 libglib2.0-0 && \
    wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -O /miniconda.sh && \
    bash /miniconda.sh -b -p /opt/conda && \
    rm /miniconda.sh && \
    rm -rf /var/lib/apt/lists/*
ENV PATH="/opt/conda/bin:$PATH"

# Copy the Conda environment file and install
COPY DiffI2I_Environment.yml .
RUN conda install -n base -c conda-forge mamba && \
    mamba env update -f DiffI2I_Environment.yml && \
    conda clean --all --yes

# Set Conda environment
ENV PATH="/opt/conda/envs/StyleCanvasAI/bin:$PATH"
SHELL ["conda", "run", "-n", "StyleCanvasAI", "/bin/bash", "-c"]

# Copy application files
COPY . /app/

# Expose port for FastAPI application
EXPOSE 8000

# Entrypoint to launch the server
CMD ["uvicorn", "Diffi2i_Inference_Server:app", "--host", "0.0.0.0", "--port", "8000", "--log-level", "debug"]
```
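For hands-on debugging, a shell can be opened in the image built from this Dockerfile and the checks from step 4 repeated interactively (again, the image tag is a placeholder):

```bash
# Override the CMD with an interactive shell, keeping the GPU attached
docker run --rm -it --gpus all stylecanvasai /bin/bash
# Inside the container:
#   nvidia-smi
#   python -c "import torch; print(torch.__version__, torch.cuda.is_available())"
```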
4. Verified CUDA Setup Inside the Container
- Ran the following inside the container:
```bash
python -c "import torch; print(torch.cuda.is_available())"
```
- Output: `False`.
- Checked the CUDA version PyTorch was built against:
```bash
python -c "import torch; print(torch.version.cuda)"
```
- Output: `11.8` (further checks worth running in the same container are sketched below).
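These diagnostics would separate a packaging problem from a device-visibility problem (generic checks, not output from my logs):

```bash
# Is the CUDA driver library injected into the container at all?
ldconfig -p | grep -i libcuda || echo "libcuda not found"
# Does nvidia-smi work inside this image (not only in the clean base image)?
nvidia-smi || echo "nvidia-smi failed"
# How many devices does the installed torch build actually see?
python -c "import torch; print(torch.__version__, torch.version.cuda, torch.cuda.device_count())"
```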
5. Confirmed NVIDIA Toolkit Installation
- Ensured the NVIDIA Container Toolkit is installed and functional.
- Installed/updated the toolkit and restarted the Docker daemon (a quick runtime check is sketched below):
```bash
sudo apt-get install -y nvidia-container-toolkit
sudo systemctl restart docker
```
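To double-check that the daemon in use has actually registered the NVIDIA runtime (I'm not certain how Docker Desktop's WSL2 backend surfaces this, since it manages its own daemon, so treat it as a rough check):

```bash
# "nvidia" should appear among the registered runtimes if the container
# toolkit is hooked into the daemon the docker CLI is talking to.
docker info | grep -i runtimes
```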
6. Tested with a Clean CUDA Container
- Ran a clean test using:
```bash
docker run --rm --gpus all nvidia/cuda:11.8.0-base-ubuntu22.04 nvidia-smi
```
- Output: the GPU is detected, and `nvidia-smi` works.
7. Verified Dependencies
- Ensured PyTorch, torchvision, and torchaudio are installed correctly in the container.
- Verified that `torch.cuda.is_available()` works on the same configuration locally (a way to inspect the exact build installed in the image is sketched below).
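The build string of the `pytorch` package installed inside the image would also distinguish a packaging problem from a runtime problem: a CUDA build carries something like `py3.9_cuda11.8_cudnn8.7.0_0`, while a CPU-only build contains `cpu`. A sketch (the environment name comes from the Dockerfile above; the image tag is a placeholder):

```bash
# Print the build strings of the torch packages installed in the image
docker run --rm stylecanvasai conda list -n StyleCanvasAI | grep -iE '^(pytorch|torchvision|torchaudio)'
```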
Remaining Issue
Even after verifying GPU access and ensuring compatibility between CUDA, the NVIDIA driver, and PyTorch, the application inside the container consistently logs "CUDA is not available". It's unclear why the containerized PyTorch cannot detect CUDA.
Additional Context
- Running the application directly on the host system (outside Docker) works flawlessly, and `torch.cuda.is_available()` returns `True`.
- The same Conda environment and dependencies are used both locally and inside the container.
Help Needed
- Are there additional steps needed to enable GPU access for PyTorch in Docker?
- Is there a known issue with CUDA compatibility in containers when Docker runs with the WSL2 backend?
- Are there debugging steps or environment configurations I might have missed?