Pod is unable to find/use GPU in Python
Hi,
I'm trying to connect to this pod:
RunPod Pytorch 2.2.10
ID: zgel6p985mjmmn
1 x A30
8 vCPU 31 GB RAM
runpod/pytorch:2.2.0-py3.10-cuda12.1.1-devel-ubuntu22.04
On-Demand - Community Cloud
Running
40 GB Disk
20 GB Pod Volume
Volume Path: /workspace
I can see with nvidia-smi that it has a GPU, and the CUDA and PyTorch versions seem correct, but I cannot use the GPU with torch...
Can anyone help?
Best
```
root@54be7382bee1:~# python
Python 3.10.12 (main, Nov 20 2023, 15:14:05) [GCC 11.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> torch.cuda.is_available()
/usr/local/lib/python3.10/dist-packages/torch/cuda/__init__.py:141: UserWarning: CUDA initialization: CUDA driver initialization failed, you might not have a CUDA gpu. (Triggered internally at ../c10/cuda/CUDAFunctions.cpp:108.)
  return torch._C._cuda_getDeviceCount() > 0
False
>>> torch.__version__
'2.2.0+cu121'
>>> exit()
root@54be7382bee1:~# nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2023 NVIDIA Corporation
Built on Mon_Apr__3_17:16:06_PDT_2023
Cuda compilation tools, release 12.1, V12.1.105
Build cuda_12.1.r12.1/compiler.32688072_0
```
Solution
@Dhruv Mullick I don't think it has to do with the image. When you pick a pod on the RunPod website, there is a filter button at the top with a drop-down menu where you can set "Allowed CUDA Versions" to 12.2.
As @ashleyk pointed out earlier, 'the machine is running CUDA 12.3 which is not production ready'. If I select 12.2, it works.
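For anyone hitting the same thing, here is a rough sketch for checking which CUDA version the host machine reports before digging into the image. Note that `nvcc --version` only shows the toolkit inside the container (12.1 above), while the `CUDA Version` field in the `nvidia-smi` header comes from the host driver. The helper names are made up for illustration, and the regex assumes the usual `CUDA Version: X.Y` text in the `nvidia-smi` header.

```python
import re
import shutil
import subprocess


def parse_driver_cuda_version(smi_output):
    """Return (major, minor) from a 'CUDA Version: X.Y' field, or None if absent."""
    match = re.search(r"CUDA Version:\s*(\d+)\.(\d+)", smi_output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def driver_cuda_version():
    """Run nvidia-smi (if it is on PATH) and parse the CUDA version it reports."""
    if shutil.which("nvidia-smi") is None:
        return None
    result = subprocess.run(
        ["nvidia-smi"], capture_output=True, text=True, check=False
    )
    return parse_driver_cuda_version(result.stdout)


def report():
    """Print the host-side CUDA version next to what PyTorch was built against."""
    print("host driver CUDA version:", driver_cuda_version())
    try:
        import torch
    except ImportError:
        print("torch is not installed in this environment")
        return
    print("torch version:", torch.__version__)
    print("torch built for CUDA:", torch.version.cuda)
    print("torch.cuda.is_available():", torch.cuda.is_available())
```

Calling `report()` inside the pod prints the values side by side. If the host line shows 12.3 and `is_available()` is False, that matches the situation in this thread, and redeploying with "Allowed CUDA Versions" set to 12.2 is the fix that worked here.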


