Lidar-Based Detection Code Runs Slower in Docker than Outside Docker

Will try to be brief here.

I have a codebase that uses data from a 128-channel lidar and ROS2 to run detections with OpenPCDet models.

Outside of Docker, I replay ros2 bag data at around 10 Hz and my code keeps up at the same rate, producing detections. Everything works fine.

I set up a Docker image based on nvidia/cuda:12.1.0-cudnn8-devel-ubuntu22.04

and set the following environment variables:

ENV TORCH_CUDA_ARCH_LIST="6.1;7.5;8.6;8.9"
ENV LANG=en_US
ENV FORCE_CUDA="1"

I use the exact same PyTorch version that we use outside Docker (though I'm using CUDA 12.1 in the container and 12.4 outside):

RUN pip3 install torch==2.4.0+cu121 torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu121

All of the pip dependencies are exactly the same as outside the container.
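For anyone who wants to compare the two environments, here is a minimal sketch of a check that could be run both inside and outside the container. The function name `build_report` is just my own placeholder; the torch calls are standard ones. The reported compute capability can also be compared against the TORCH_CUDA_ARCH_LIST values above.

```python
def build_report():
    """Return the PyTorch/CUDA build details visible to this interpreter."""
    try:
        import torch
    except ImportError:
        # torch is not installed in this environment
        return {"torch": None}

    report = {
        "torch": torch.__version__,               # e.g. the +cu121 wheel tag
        "cuda_build": torch.version.cuda,         # CUDA version torch was built with
        "cudnn": torch.backends.cudnn.version(),  # None if cuDNN is unavailable
        "cuda_available": torch.cuda.is_available(),
    }
    if report["cuda_available"]:
        report["device"] = torch.cuda.get_device_name(0)
        report["capability"] = torch.cuda.get_device_capability(0)
    return report


if __name__ == "__main__":
    for key, value in build_report().items():
        print(f"{key}: {value}")
```

Running it in both places and diffing the output should show whether the two setups really match beyond the pip package list.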

I initially saw extremely low performance and HUGE CPU usage (400-800% in htop), compared to around 120% outside Docker.

After a great deal of debugging, I found that setting the following variables fixed the CPU usage issue:

ENV OMP_NUM_THREADS=1
ENV MKL_NUM_THREADS=1

In terms of model performance I see no difference whether these variables are set to 1 or not, but the CPU usage is completely normal now.
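To see what the thread settings actually resolve to in each environment, something like the following sketch could be run in both. The helper name `thread_report` and the inclusion of OPENBLAS_NUM_THREADS are my own additions for illustration. Note that the affinity count reflects CPU pinning only, not CPU-time quotas.

```python
import os

# Thread-pool variables to inspect; OPENBLAS_NUM_THREADS is included as an
# extra, it is not one of the variables set in the Dockerfile above.
THREAD_VARS = ("OMP_NUM_THREADS", "MKL_NUM_THREADS", "OPENBLAS_NUM_THREADS")


def thread_report(environ=os.environ):
    """Collect CPU-visibility and thread-pool settings for this process."""
    report = {"cpu_count": os.cpu_count()}
    # sched_getaffinity is Linux-only and reflects CPU pinning (cpusets).
    if hasattr(os, "sched_getaffinity"):
        report["usable_cpus"] = len(os.sched_getaffinity(0))
    else:
        report["usable_cpus"] = None
    for name in THREAD_VARS:
        report[name] = environ.get(name, "<unset>")
    try:
        import torch
        report["torch_num_threads"] = torch.get_num_threads()
        report["torch_num_interop_threads"] = torch.get_num_interop_threads()
    except ImportError:
        report["torch_num_threads"] = None
        report["torch_num_interop_threads"] = None
    return report


if __name__ == "__main__":
    for key, value in thread_report().items():
        print(f"{key}: {value}")
```

If the thread counts differ between host and container with the variables unset, that would at least explain the CPU usage gap, even if not the frame rate.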

The remaining issue is that inside Docker the ros2 bag playback runs at a lower rate of around 8 Hz, and my detections come out at 4-6 Hz. (Note that this was the same performance before I added the variables above to fix the CPU usage. The performance in Docker has always been low; I just have normal CPU usage now.)
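To put consistent numbers on the rates above, a small rolling rate meter can be called from the point cloud callback and again wherever detections are published. This is only a sketch; `RateMeter` is a hypothetical helper of mine, not part of ROS2 or OpenPCDet.

```python
import time
from collections import deque


class RateMeter:
    """Rolling estimate of event frequency over the last `window` events."""

    def __init__(self, window=50):
        self.stamps = deque(maxlen=window)

    def tick(self, now=None):
        # Record one event; `now` can be injected for testing.
        self.stamps.append(time.monotonic() if now is None else now)

    def hz(self):
        # N timestamps span N-1 intervals.
        if len(self.stamps) < 2:
            return 0.0
        span = self.stamps[-1] - self.stamps[0]
        return (len(self.stamps) - 1) / span if span > 0 else 0.0
```

One meter on the input side and one on the output side would show whether frames already arrive late in the container or whether the slowdown happens in my own processing.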

I use Docker Compose with /dev:/dev mounted to access the actual lidar unit, network_mode set to host, ipc set to host, and GPU access enabled:

    environment:
      - DISPLAY=${DISPLAY}
      - "QT_X11_NO_MITSHM=1"
    volumes:
      - /tmp/.X11-unix:/tmp/.X11-unix
      - /dev:/dev:rw
      - ./configs:/configs:rw
    network_mode: host
    ipc: host
    privileged: true
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]

I am at a loss as to what could possibly be wrong, and whether it's a ROS thing, a Docker thing, or a CUDA thing.
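One way to narrow this down would be to time each stage of the pipeline separately in both environments. Below is a minimal sketch of such a timer. `StageTimer` and the stage names are hypothetical; for GPU stages I would pass `torch.cuda.synchronize` as `sync`, since CUDA kernels run asynchronously and the wall-clock time is otherwise misleading.

```python
import time
from collections import defaultdict
from contextlib import contextmanager


class StageTimer:
    """Accumulate wall-clock time per named pipeline stage."""

    def __init__(self, sync=None):
        # Optional callable run before the clock stops,
        # e.g. torch.cuda.synchronize for GPU work.
        self.sync = sync
        self.totals = defaultdict(float)
        self.counts = defaultdict(int)

    @contextmanager
    def stage(self, name):
        start = time.perf_counter()
        try:
            yield
        finally:
            if self.sync is not None:
                self.sync()
            self.totals[name] += time.perf_counter() - start
            self.counts[name] += 1

    def mean_ms(self):
        # Mean duration per stage in milliseconds.
        return {n: 1000.0 * self.totals[n] / self.counts[n] for n in self.totals}
```

Usage would look like `with timer.stage("preprocess"): ...` followed by `with timer.stage("inference"): ...`, then printing `timer.mean_ms()` every few hundred frames. If the inference time is the same on host and in the container, the problem is more likely on the ROS or Docker side than in CUDA.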

I am hoping that someone here might have some experience with issues like this. If so I would greatly appreciate the help.
