Not sure this is relevant:
Last night I ran
echo fs.inotify.max_user_watches=524288 | sudo tee -a /etc/sysctl.conf && sudo sysctl -p
when npm was raising some issues on a React application server.
Since this morning, docker-compose has been unusually slow and raises errors, nvidia-smi breaks,
and the system takes longer to reboot. I reverted the change made by the above command, but the issues persist.
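For anyone wanting to double-check a revert like this, here is a minimal sketch of how the appended line could be removed and counted. The helper names `remove_inotify_override` and `count_inotify_overrides` are made up for illustration, and it assumes the only change was the single line appended to the config file.

```shell
#!/bin/sh
# Hypothetical helper: delete every fs.inotify.max_user_watches line from a
# sysctl config file, leaving a .bak copy of the original next to it.
remove_inotify_override() {
  conf="$1"
  sed -i.bak '/^[[:space:]]*fs\.inotify\.max_user_watches[[:space:]]*=/d' "$conf"
}

# Hypothetical helper: print how many such lines remain in the file.
# grep -c exits non-zero when the count is 0, so the exit status is ignored.
count_inotify_overrides() {
  grep -c 'fs\.inotify\.max_user_watches' "$1" || true
}
```

Running this against /etc/sysctl.conf would need root. Note that `sysctl -p` only applies what is in the file, so once the line is gone the running kernel keeps the old value until it is set explicitly or the machine reboots; `sysctl fs.inotify.max_user_watches` shows the value currently in effect.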
- Nvidia DGX Station
- Docker version: 18.09.2
- Docker-Compose version: 1.24.1
- Nvidia Driver version: 418.87
- CUDA version: 10.1
When I spin up a container with docker-compose, it works at first, though slowly.
Then, after a while:
1- nvidia-smi stops working with the following error
$ nvidia-smi
Unable to determine the device handle for GPU 0000:0E:00.0: GPU is lost: Reboot the system to recover this GPU
2- Docker-Compose is too slow to respond and raises this sometimes
I am not sure how Docker is breaking nvidia-smi or why it is getting slower…
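One thing that might help narrow down the "GPU is lost" message is checking the kernel log for NVIDIA Xid entries around the time nvidia-smi fails. This is only a sketch: `filter_xid` is a made-up helper name, and it assumes the driver logs lines containing `NVRM: Xid`, which may not appear in every failure.

```shell
#!/bin/sh
# Hypothetical helper: keep only NVIDIA Xid lines from kernel-log text on stdin.
# grep exits non-zero when nothing matches, so the exit status is ignored.
filter_xid() {
  grep -i 'NVRM: Xid' || true
}

# Possible usage (reading the kernel log may require sudo):
#   dmesg | filter_xid
```

If this prints nothing, the log simply has no Xid entries in the buffer that was read, which does not by itself rule out a hardware or driver problem.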
Any thoughts are appreciated.