Immich4mo ago
brconn

No Smart Search

- OS: Rocky9
- Deployment: Docker Compose
- Immich Version: v1.121.0 & 1.134.0
- HW:
  - AMD 3600
  - ASUS B550 mobo
  - Gigabyte 1080ti
  - Several TB free storage
- Reverse Proxy: SWAG

I was previously running Immich v1.121.0 when I noticed that smart search was no longer working. I assumed I just needed an update, so I updated to v1.134.0 and checked for updates to the docker-compose.yml & supporting files. I can't remember whether I checked the logs before updating, but I'm currently seeing the error(s) below. From what I can gather, the models are self-contained in the ML image, which rules out my first thought that I might be blocking the location hosting the models. I've also tried opening port 3003 on the ML container, and that didn't help. Any ideas?
immich | [Nest] 17 - 06/13/2025, 12:40:14 AM WARN [Api:MachineLearningRepository~0lkhpjdx] Machine learning request to "http://immich-machine-learning:3003" failed: fetch failed
immich | [Nest] 17 - 06/13/2025, 12:40:14 AM ERROR [Api:ErrorInterceptor~0lkhpjdx] Unknown error: Error: Machine learning request '{"clip":{"textual":{"modelName":"ViT-B-32__openai","options":{"language":"en-US"}}}}' failed for all URLs
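A "fetch failed" message only says the HTTP request never completed, so a quick probe of the ML service can help separate "container unreachable" from "container answered with an error". The following is a minimal sketch, not an official Immich tool: the `/ping` route is an assumption about the ML service's health endpoint, `probe` is a hypothetical helper name, and the hostname only resolves from inside the compose network.

```python
import urllib.error
import urllib.request

# Base URL copied from the log line above; resolvable only inside the compose network.
ML_URL = "http://immich-machine-learning:3003"


def probe(base_url: str, opener=urllib.request.urlopen, timeout: float = 5.0) -> str:
    """Return a one-line description of what happened when calling the service."""
    # Assumption: the ML service exposes a health route at /ping.
    url = base_url.rstrip("/") + "/ping"
    try:
        with opener(url, timeout=timeout) as response:
            body = response.read().decode("utf-8", errors="replace").strip()
            return f"reachable: HTTP {response.status} {body!r}"
    except urllib.error.HTTPError as exc:
        # The server answered, just not with a success status.
        return f"reachable but unhappy: HTTP {exc.code}"
    except (urllib.error.URLError, OSError) as exc:
        # DNS failure, connection refused, connection reset, or timeout.
        return f"unreachable: {exc}"
```

Calling `print(probe(ML_URL))` from a container on the same network would show which of the three cases applies. The `opener` parameter exists only so the function can be exercised without a live service.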
14 Replies
Immich
Immich4mo ago
:wave: Hey @brconn,

Thanks for reaching out to us. Please carefully read this message and follow the recommended actions. This will help us be more effective in our support effort and leave more time for building Immich :immich:.

References
- Container Logs: docker compose logs docs
- Container Status: docker ps -a docs
- Reverse Proxy: https://immich.app/docs/administration/reverse-proxy
- Code Formatting: https://support.discord.com/hc/en-us/articles/210298617-Markdown-Text-101-Chat-Formatting-Bold-Italic-Underline#h_01GY0DAKGXDEHE263BCAYEGFJA

Checklist
I have...
1. :ballot_box_with_check: verified I'm on the latest release (note that mobile app releases may take some time).
2. :ballot_box_with_check: read applicable release notes.
3. :ballot_box_with_check: reviewed the FAQs for known issues.
4. :ballot_box_with_check: reviewed Github for known issues.
5. :ballot_box_with_check: tried accessing Immich via local ip (without a custom reverse proxy).
6. :ballot_box_with_check: uploaded the relevant information (see below).
7. :ballot_box_with_check: tried an incognito window, disabled extensions, cleared mobile app cache, logged out and back in, different browsers, etc. as applicable
(an item can be marked as "complete" by reacting with the appropriate number)

Information
In order to be able to effectively help you, we need you to provide clear information to show what the problem is. The exact details needed vary per case, but here is a list of things to consider:
- Your docker-compose.yml and .env files.
- Logs from all the containers and their status (see above).
- All the troubleshooting steps you've tried so far.
- Any recent changes you've made to Immich or your system.
- Details about your system (both software/OS and hardware).
- Details about your storage (filesystems, type of disks, output of commands like fdisk -l and df -h).
- The version of the Immich server, mobile app, and other relevant pieces.
- Any other information that you think might be relevant.

Please paste files and logs with proper code formatting, and especially avoid blurry screenshots. Without the right information we can't work out what the problem is. Help us help you ;)

If this ticket can be closed you can use the /close command, and re-open it later if needed.
brconn
brconnOP4mo ago
IMMICH_VERSION=v1.134.0

immich-server:
  container_name: immich
  image: ghcr.io/immich-app/immich-server:${IMMICH_VERSION}
  deploy:
    replicas: 1
    resources:
      reservations:
        devices:
          - driver: nvidia
            count: 1
            capabilities:
              - gpu
              - compute
              - video
  env_file:
    - .env
  ports:
    - 2283:2283
  volumes:
    - ${UPLOAD_LOCATION}:/usr/src/app/upload
    - /etc/localtime:/etc/localtime:ro
  restart: unless-stopped
  depends_on:
    - redis
    - database

immich-machine-learning:
  container_name: immich_machine_learning
  image: ghcr.io/immich-app/immich-machine-learning:${IMMICH_VERSION}-cuda
  deploy:
    replicas: 1
    resources:
      reservations:
        devices:
          - driver: nvidia
            count: 1
            capabilities:
              - gpu
  env_file:
    - .env
  volumes:
    - model-cache:/cache
  restart: unless-stopped

redis:
  container_name: immich_redis
  image: docker.io/valkey/valkey:8-bookworm@sha256:a19bebed6a91bd5e6e2106fef015f9602a3392deeb7c9ed47548378dcee3dfc2
  healthcheck:
    test: redis-cli ping || exit 1
  deploy:
    replicas: 1
  restart: unless-stopped

database:
  container_name: immich_postgres
  image: ghcr.io/immich-app/postgres:14-vectorchord0.3.0-pgvectors0.3.0
  deploy:
    replicas: 1
  environment:
    POSTGRES_PASSWORD: ${DB_PASSWORD}
    POSTGRES_USER: ${DB_USERNAME}
    POSTGRES_DB: ${DB_DATABASE_NAME}
    POSTGRES_INITDB_ARGS: '--data-checksums'
    DB_STORAGE_TYPE: 'HDD'
  volumes:
    - ${DB_DATA_LOCATION}:/var/lib/postgresql/data
  restart: unless-stopped
This has occurred with the existing model-cache as well as with a brand-new, empty model-cache.
brconn
brconnOP4mo ago
FWIW, when I drop the -cuda suffix from the ML image tag, it works great.
bo0tzz
bo0tzz4mo ago
ERROR Worker (pid:9) was sent code 139!
It's segfaulting. What driver version are you running?
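For readers wondering how 139 maps to a segfault: under the common convention that a status above 128 means "killed by signal (status - 128)", 139 corresponds to signal 11, which is SIGSEGV on Linux. A small sketch of that arithmetic, with `describe_exit_code` being a hypothetical helper name:

```python
import signal


def describe_exit_code(code: int) -> str:
    """Translate a process exit status using the 128 + signal-number convention."""
    if code > 128:
        try:
            sig = signal.Signals(code - 128)
        except ValueError:
            # No signal with that number on this platform.
            return f"exit code {code} (no matching signal)"
        return f"killed by {sig.name} (signal {sig.value})"
    return f"exited with status {code}"


print(describe_exit_code(139))  # on Linux: killed by SIGSEGV (signal 11)
```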
brconn
brconnOP4mo ago
NVIDIA-SMI 550.100 Driver Version: 575.57.08 CUDA Version: 12.4
Running on a 1080 Ti, which should have compute capability 6.1.
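One detail in the header as pasted is that the NVIDIA-SMI version (550.100) and the driver version (575.57.08) differ. Whether that is significant, or simply a transcription slip, is not established in this thread. The sketch below only shows how the three fields could be pulled out of such a header and compared; the helper names are hypothetical and the regular expression assumes the header layout shown above.

```python
import re
from typing import NamedTuple, Optional


class SmiHeader(NamedTuple):
    smi_version: str
    driver_version: str
    cuda_version: str


# Matches the "NVIDIA-SMI ... Driver Version: ... CUDA Version: ..." header line.
_HEADER_RE = re.compile(
    r"NVIDIA-SMI\s+(?P<smi>[\d.]+)\s+"
    r"Driver Version:\s+(?P<driver>[\d.]+)\s+"
    r"CUDA Version:\s+(?P<cuda>[\d.]+)"
)


def parse_smi_header(text: str) -> Optional[SmiHeader]:
    """Extract the three version fields, or None if the header is not found."""
    match = _HEADER_RE.search(text)
    if match is None:
        return None
    return SmiHeader(match["smi"], match["driver"], match["cuda"])


def versions_differ(header: SmiHeader) -> bool:
    """True when the tool version and the driver version are not identical."""
    return header.smi_version != header.driver_version


header = parse_smi_header(
    "NVIDIA-SMI 550.100 Driver Version: 575.57.08 CUDA Version: 12.4"
)
print(header, versions_differ(header))
```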
bo0tzz
bo0tzz4mo ago
And do you meet the other requirements from https://immich.app/docs/features/ml-hardware-acceleration#cuda ?
brconn
brconnOP4mo ago
I believe so
dnf list installed | grep nvidia-container
libnvidia-container-tools.x86_64 1.17.8-1 @cuda-rhel9-x86_64
libnvidia-container1.x86_64 1.17.8-1 @cuda-rhel9-x86_64
nvidia-container-toolkit.x86_64 1.17.8-1 @cuda-rhel9-x86_64
nvidia-container-toolkit-base.x86_64 1.17.8-1 @cuda-rhel9-x86_64
bo0tzz
bo0tzz4mo ago
@sogan any idea?
brconn
brconnOP4mo ago
dnf list installed | grep nvidia
kmod-nvidia-latest-dkms.x86_64 3:575.57.08-1.el9 @cuda-rhel9-x86_64
libnvidia-container-tools.x86_64 1.17.8-1 @cuda-rhel9-x86_64
libnvidia-container1.x86_64 1.17.8-1 @cuda-rhel9-x86_64
libnvidia-gpucomp.x86_64 3:575.57.08-1.el9 @cuda-rhel9-x86_64
libnvidia-ml.x86_64 3:575.57.08-1.el9 @cuda-rhel9-x86_64
nvidia-container-toolkit.x86_64 1.17.8-1 @cuda-rhel9-x86_64
nvidia-container-toolkit-base.x86_64 1.17.8-1 @cuda-rhel9-x86_64
nvidia-driver.x86_64 3:575.57.08-1.el9 @cuda-rhel9-x86_64
nvidia-driver-libs.x86_64 3:575.57.08-1.el9 @cuda-rhel9-x86_64
nvidia-kmod-common.noarch 3:575.57.08-1.el9 @cuda-rhel9-x86_64
nvidia-modprobe.x86_64 3:575.57.08-1.el9 @cuda-rhel9-x86_64
mertalev
mertalev4mo ago
Segfaults are generally driver-related. I'm not sure why that specific driver would be problematic, though.
brconn
brconnOP4mo ago
Looks like that's the latest available NVIDIA driver on Rocky9 atm. I could try replacing it with the DKMS version, maybe? Actually, I'm already on DKMS.
mertalev
mertalev4mo ago
Do you have the latest nvidia-container-toolkit installed?
brconn
brconnOP9h ago
Yup, 1.17.8 is the latest per their GitHub.
I did have a kernel update to do, so I did that and let DKMS rebuild, but that didn't help.
@mertalev If I set LD_LIBRARY_PATH for the ML container to include the path to CUDA, I'm able to get further:
immich_machine_learning | 2025-10-02 22:04:45.616101011 [E:onnxruntime:Default, cuda_call.cc:118 CudaCall] CUDA failure 803: system has unsupported display driver / cuda driver combination ; GPU=-1 ; hostname=1aeb1e8be6a3 ; file=/onnxruntime_src/onnxruntime/core/providers/cuda/cuda_execution_provider_info.cc ; line=66 ; expr=cudaGetDeviceCount(&num_devices);
immich_machine_learning | *************** EP Error ***************
immich_machine_learning | EP Error /onnxruntime_src/onnxruntime/core/providers/cuda/cuda_execution_provider_info.cc:59 static onnxruntime::CUDAExecutionProviderInfo onnxruntime::CUDAExecutionProviderInfo::FromProviderOptions(const onnxruntime::ProviderOptions&) [ONNXRuntimeError] : 1 : FAIL : provider_options_utils.h:151 Parse Failed to parse provider option "device_id": CUDA failure 803: system has unsupported display driver / cuda driver combination ; GPU=-1 ; hostname=1aeb1e8be6a3 ; file=/onnxruntime_src/onnxruntime/core/providers/cuda/cuda_execution_provider_info.cc ; line=66 ; expr=cudaGetDeviceCount(&num_devices);
immich_machine_learning | when using ['CUDAExecutionProvider', 'CPUExecutionProvider']
immich_machine_learning | Falling back to ['CPUExecutionProvider'] and retrying.
immich_machine_learning | ****************************************
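The numeric code in that log is the useful part: 803 is `cudaErrorSystemDriverMismatch` in the CUDA runtime's error enum, and the log also shows ONNX Runtime falling back to the CPU provider. A minimal sketch for pulling those two facts out of such log text follows; the helper names are hypothetical, and the lookup table deliberately contains only the one code seen in this thread.

```python
import re
from typing import Optional, Tuple

# Only the code that appears in this thread; extend as needed.
KNOWN_CUDA_ERRORS = {803: "cudaErrorSystemDriverMismatch"}

# Captures the numeric code and the message up to the first ';' separator.
_FAILURE_RE = re.compile(r"CUDA failure (?P<code>\d+): (?P<message>[^;]+?)\s*;")


def parse_cuda_failure(line: str) -> Optional[Tuple[int, str, str]]:
    """Return (code, symbolic name, message) for the first CUDA failure in the line."""
    match = _FAILURE_RE.search(line)
    if match is None:
        return None
    code = int(match["code"])
    return code, KNOWN_CUDA_ERRORS.get(code, "unknown"), match["message"]


def fell_back_to_cpu(log_text: str) -> bool:
    """True when the log shows ONNX Runtime retrying with only the CPU provider."""
    return "Falling back to ['CPUExecutionProvider']" in log_text


sample = (
    "CUDA failure 803: system has unsupported display driver / cuda driver "
    "combination ; GPU=-1 ; hostname=1aeb1e8be6a3 ;"
)
print(parse_cuda_failure(sample))
```

If `fell_back_to_cpu` returns True, the container is running but the GPU is not being used.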
