Immich•8mo ago

CUDA stopped working after upgrading to newer driver version

Hi all,

I'm running Immich in a Docker container inside a Proxmox LXC. The GPU (RTX 2060) is exposed to the LXC; the driver version is 570.144 with CUDA 12.8.

I used to use the 535 drivers from the Debian repo; the 570 drivers are now installed from the NVIDIA .run file. Other containers, such as Frigate and Beszel (monitoring), can see and use the GPU.
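
For anyone comparing setups, here is a minimal sketch for checking which NVIDIA device nodes are visible at each layer (Proxmox host, LXC, and inside the container). The node list and the helper name `check_nodes` are assumptions for illustration; the GPU index in particular may differ on other machines, and a missing node is only a hint, not a confirmed cause of the error.

```python
import os
import stat

# Device nodes the NVIDIA kernel modules commonly create.
# Assumed list for illustration: adjust the GPU index for your machine.
EXPECTED_NODES = (
    "/dev/nvidia0",
    "/dev/nvidiactl",
    "/dev/nvidia-uvm",
    "/dev/nvidia-uvm-tools",
)


def check_nodes(paths=EXPECTED_NODES):
    """Return {path: bool}, True when the path exists and is a character device."""
    result = {}
    for path in paths:
        try:
            mode = os.stat(path).st_mode
        except OSError:
            # Missing node, or not accessible from this environment.
            result[path] = False
        else:
            result[path] = stat.S_ISCHR(mode)
    return result


if __name__ == "__main__":
    for path, present in check_nodes().items():
        print(f"{path}: {'ok' if present else 'MISSING'}")
```

Running the same script on the host, in the LXC, and in the container makes it easy to see at which layer a node disappears.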

However, with Immich I am getting the following error:

[E:onnxruntime:Default, cuda_call.cc:118 CudaCall] CUDA failure 100: no CUDA-capable device is detected ; GPU=-1 ; hostname=f77b4a10892a ; file=/onnxruntime_src/onnxruntime/core/providers/cuda/cuda_execution_provider_info.cc ; line=66 ; expr=cudaGetDeviceCount(&num_devices); 
immich_machine_learning  | *************** EP Error ***************
immich_machine_learning  | EP Error /onnxruntime_src/onnxruntime/core/providers/cuda/cuda_execution_provider_info.cc:59 static onnxruntime::CUDAExecutionProviderInfo onnxruntime::CUDAExecutionProviderInfo::FromProviderOptions(const onnxruntime::ProviderOptions&) [ONNXRuntimeError] : 1 : FAIL : provider_options_utils.h:151 Parse Failed to parse provider option "device_id": CUDA failure 100: no CUDA-capable device is detected ; GPU=-1 ; hostname=f77b4a10892a ; file=/onnxruntime_src/onnxruntime/core/providers/cuda/cuda_execution_provider_info.cc ; line=66 ; expr=cudaGetDeviceCount(&num_devices);

Some outputs:

nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2025 NVIDIA Corporation
Built on Fri_Feb_21_20:23:50_PST_2025
Cuda compilation tools, release 12.8, V12.8.93
Build cuda_12.8.r12.8/compiler.35583870_0

nvidia-container-toolkit --version
NVIDIA Container Runtime Hook version 1.17.6
commit: e627eb2e21e167988e04c0579a1c941c1e263ff6

Docker compose:

immich-machine-learning:
    container_name: immich_machine_learning
    image: ghcr.io/immich-app/immich-machine-learning:${IMMICH_VERSION:-release}-cuda
    extends:
      file: hwaccel.ml.yml
      service: cuda

Any help would be greatly appreciated.
