Hello,
I encountered an issue while building a Docker image for deep learning model training, specifically when attempting to install DeepSpeed.
Issue
When building the Docker image, the DeepSpeed installation fails, and the build log shows a warning that NVML cannot be initialized.
However, if I create a container from the same image and install DeepSpeed inside the container, the installation works without any issues.
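To compare the two situations, a small check like the one below could be run once from a `RUN` step during the build and once inside a started container. This is only a sketch for narrowing things down: the function name `gpu_visibility_report` is made up for illustration, and the idea that the GPU driver is not reachable during `docker build` is an assumption, not something confirmed by the attached log. It uses only the standard library plus the `nvidia-smi -L` command.

```python
import os
import shutil
import subprocess


def gpu_visibility_report() -> dict:
    """Collect a few hints about whether a GPU driver is reachable here.

    Hypothetical helper for illustration; it is not part of DeepSpeed or PyTorch.
    """
    report = {
        # Is the nvidia-smi binary present in this environment at all?
        "nvidia_smi_on_path": shutil.which("nvidia-smi") is not None,
        # Set to True only if `nvidia-smi -L` runs and exits with code 0.
        "nvidia_smi_ok": False,
        # Environment variables that may influence GPU detection or CUDA builds.
        "TORCH_CUDA_ARCH_LIST": os.environ.get("TORCH_CUDA_ARCH_LIST"),
        "NVIDIA_VISIBLE_DEVICES": os.environ.get("NVIDIA_VISIBLE_DEVICES"),
    }
    if report["nvidia_smi_on_path"]:
        try:
            result = subprocess.run(
                ["nvidia-smi", "-L"],  # lists the GPUs the driver can see
                capture_output=True,
                text=True,
                timeout=30,
            )
            report["nvidia_smi_ok"] = result.returncode == 0
        except (OSError, subprocess.TimeoutExpired):
            # Leave nvidia_smi_ok as False if the command cannot run.
            pass
    return report


if __name__ == "__main__":
    for key, value in gpu_visibility_report().items():
        print(f"{key}: {value}")
```

If the output differs between the build step and the running container (for example, `nvidia_smi_ok` is False only during the build), that would point to GPU visibility at build time rather than to DeepSpeed itself.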
Environment
Base Image: nvcr.io/nvidia/pytorch:23.01-py3
DeepSpeed Version: 0.16.2
Build Log
docker_build.log
Additional Context
The problem does not occur with the newer base image nvcr.io/nvidia/pytorch:24.05-py3.
Thank you.