Immich•4mo ago•
7 replies
Medo

Remote Machine Learning full of errors when processing starts

I want to use my GPU to process my media. On my Windows 11 machine I started the machine learning container using the CUDA image, then changed the settings in Immich to use this container as the machine learning server.
When I started the queued jobs, I could see that Immich connected to the remote machine learning server successfully, but a lot of errors were logged.
I attached a screenshot of the output of this command:
docker exec -it immich_machine_learning nvidia-smi

Does anybody know what the problem is?

docker-compose.yml:
name: immich_remote_ml
services:
  immich-machine-learning:
    container_name: immich_machine_learning
    image: altran1502/immich-machine-learning:v2.1.0-cuda
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities:
                - gpu
    volumes:
      - model-cache:/cache
    restart: always
    ports:
      - 3003:3003
volumes:
  model-cache:


Logs:
[10/21/25 12:21:37] INFO     Starting gunicorn 23.0.0                           
[10/21/25 12:21:37] INFO     Listening at: http://[::]:3003 (8)                          
2025-10-21 12:30:27.143106979 [E:onnxruntime:Default, cuda_call.cc:118 CudaCall] CUDNN failure 5000: CUDNN_STATUS_EXECUTION_FAILED ; GPU=0 ; hostname=b9f607173de1 ; file=/onnxruntime_src/onnxruntime/core/providers/cuda/nn/conv.cc ; line=455 ; expr=cudnnConvolutionForward(cudnn_handle, &alpha, s_.x_tensor, s_.x_data, s_.w_desc, s_.w_data, s_.conv_desc, s_.algo, workspace.get(), s_.workspace_bytes, &beta, s_.y_tensor, s_.y_data); 
2025-10-21 12:30:27.143238491 [E:onnxruntime:, sequential_executor.cc:516 ExecuteKernel] Non-zero status code returned while running Conv node. Name:'/visual/conv1/Conv' Status Message: CUDNN failure 5000: CUDNN_STATUS_EXECUTION_FAILED ; GPU=0 ; hostname=b9f607173de1 ; file=/onnxruntime_src/onnxruntime/core/providers/cuda/nn/conv.cc ; line=455 ; expr=cudnnConvolutionForward(cudnn_handle, &alpha, s_.x_tensor, s_.x_data, s_.w_desc, s_.w_data, s_.conv_desc, s_.algo, workspace.get(), s_.workspace_bytes, &beta, s_.y_tensor, s_.y_data); 
2025-10-21 12:30:27.145386284 [E:onnxruntime:Default, cuda_call.cc:118 CudaCall] CUDNN failure 5000: CUDNN_STATUS_EXECUTION_FAILED ; GPU=0 ; hostname=b9f607173de1 ; file=/onnxruntime_src/onnxruntime/core/providers/cuda/nn/conv.cc ; line=455 ; expr=cudnnConvolutionForward(cudnn_handle, &alpha, s_.x_tensor, s_.x_data, s_.w_desc, s_.w_data, s_.conv_desc, s_.algo, workspace.get(), s_.workspace_bytes, &beta, s_.y_tensor, s_.y_data); 
2025-10-21 12:30:27.145452290 [E:onnxruntime:, sequential_executor.cc:516 ExecuteKernel] Non-zero status code returned while running Conv node. Name:'/visual/conv1/Conv' Status Message: CUDNN failure 5000: CUDNN_STATUS_EXECUTION_FAILED ; GPU=0 ; hostname=b9f607173de1 ; file=/onnxruntime_src/onnxruntime/core/providers/cuda/nn/conv.cc ; line=455 ; expr=cudnnConvolutionForward(cudnn_handle, &alpha, s_.x_tensor, s_.x_data, s_.w_desc, s_.w_data, s_.conv_desc, s_.algo, workspace.get(), s_.workspace_bytes, &beta, s_.y_tensor, s_.y_data); 
[10/21/25 12:30:27] ERROR    Exception in ASGI application
image.png
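
Since the log repeats the same long onnxruntime lines many times, a small script can make it easier to see which failure and which model node are involved before posting or comparing runs. The following is only a rough sketch: the function name `summarize` and the two regular expressions are my own assumptions based on the line format shown in the logs above, not anything provided by Immich or onnxruntime, and the sample lines are shortened copies of the ones in this post.

```python
import re
from collections import Counter

# Assumed pattern: "CUDNN failure <code>: <STATUS>" as seen in the log lines above.
FAILURE_RE = re.compile(r"CUDNN failure (\d+): (\w+)")
# Assumed pattern: the node name printed by sequential_executor.cc, e.g. Name:'/visual/conv1/Conv'.
NODE_RE = re.compile(r"Name:'([^']+)'")


def summarize(log_text: str) -> Counter:
    """Count (status, node) pairs over all lines that report a cuDNN failure."""
    counts: Counter = Counter()
    for line in log_text.splitlines():
        failure = FAILURE_RE.search(line)
        if failure is None:
            continue  # not a cuDNN failure line (e.g. gunicorn INFO lines)
        node = NODE_RE.search(line)
        key = (failure.group(2), node.group(1) if node else "(no node named)")
        counts[key] += 1
    return counts


if __name__ == "__main__":
    # Shortened copies of two lines from the log in this post.
    sample = (
        "2025-10-21 12:30:27.143106979 [E:onnxruntime:Default, cuda_call.cc:118 CudaCall] "
        "CUDNN failure 5000: CUDNN_STATUS_EXECUTION_FAILED ; GPU=0 ; hostname=b9f607173de1\n"
        "2025-10-21 12:30:27.143238491 [E:onnxruntime:, sequential_executor.cc:516 ExecuteKernel] "
        "Non-zero status code returned while running Conv node. Name:'/visual/conv1/Conv' "
        "Status Message: CUDNN failure 5000: CUDNN_STATUS_EXECUTION_FAILED ; GPU=0\n"
    )
    for (status, node), n in summarize(sample).items():
        print(f"{n}x {status} at {node}")
```

To run it on the real log, the text could be saved to a file first (for example with `docker logs immich_machine_learning`) and passed to `summarize` as a string. The script only groups the lines; it does not say anything about the cause of the failure.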