DreamGen

UserWarning: CUDA initialization: Unexpected error from cudaGetDeviceCount(). Did you run some cuda

This is a recurring problem on RunPod.

This time with a 3090 -- I tried 3 different pods in the CA region (can't use the US region because it has maintenance soon...).
ID: wmwxn9onlckqus

root@fd08183704a5:~# nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2023 NVIDIA Corporation
Built on Mon_Apr__3_17:16:06_PDT_2023
Cuda compilation tools, release 12.1, V12.1.105
Build cuda_12.1.r12.1/compiler.32688072_0


root@fd08183704a5:~# nvidia-smi
Sat Mar 16 07:26:26 2024       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.85.12    Driver Version: 525.85.12    CUDA Version: 12.1     |
|-------------------------------+----------------------+----------------------+


root@fd08183704a5:~# python
Python 3.10.12 (main, Jun 11 2023, 05:26:28) [GCC 11.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> torch.__version__
'2.1.1+cu121'
Solution
You need to use the CUDA filter to select the correct CUDA version. CUDA is backwards compatible but not forwards compatible: the machine can have a higher CUDA version than your Docker image, but not a lower one. Select a machine whose CUDA version is the same as or higher than the CUDA version of your Docker image.
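
To check this from inside a pod, something like the sketch below could compare the two versions. It assumes the machine's CUDA version is the "CUDA Version" field in the nvidia-smi header and that the image's version is a plain major.minor string such as the one in `torch.version.cuda`. The helper names (`parse_version`, `driver_cuda_version`, `image_is_supported`) are made up for illustration, and the comparison follows the simple rule above; NVIDIA's own compatibility rules may be more lenient within one major version.

```python
import re


def parse_version(text):
    """Turn a 'major.minor' string such as '12.1' into a comparable tuple."""
    major, minor = text.strip().split(".")[:2]
    return int(major), int(minor)


def driver_cuda_version(smi_output):
    """Pull the 'CUDA Version: X.Y' field out of nvidia-smi's header text."""
    match = re.search(r"CUDA Version:\s*(\d+\.\d+)", smi_output)
    return match.group(1) if match else None


def image_is_supported(image_cuda, machine_cuda):
    """Simple rule from the answer: machine version must be >= image version."""
    return parse_version(image_cuda) <= parse_version(machine_cuda)


# Header line copied from the nvidia-smi output in the question.
header = "| NVIDIA-SMI 525.85.12    Driver Version: 525.85.12    CUDA Version: 12.1     |"
machine = driver_cuda_version(header)

print(machine, image_is_supported("12.1", machine))  # 12.1 True
print(image_is_supported("12.4", machine))           # False
```

In a real pod you would feed in the output of `nvidia-smi` (for example via `subprocess.run(["nvidia-smi"], capture_output=True, text=True).stdout`) and pass `torch.version.cuda` as the image version.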