Facial Recognition stopped working
I noticed that face detection has stopped working on my system.
I can go to Explore => Faces and see the results of older face detection, but newer pictures don't appear to get any face detection.
I see this in my logs:
| [Nest] 7 - 01/18/2025, 9:19:28 PM ERROR [Microservices:JobService] Unable to run job handler (faceDetection/face-detection): Error:>
| [Nest] 7 - 01/18/2025, 9:19:28 PM ERROR [Microservices:JobService] Error: Machine learning request '{"facial-recognition":{"detecti>
| at MachineLearningRepository.predict (/usr/src/app/dist/repositories/machine-learning.repository.js:41:15)
| at process.processTicksAndRejections (node:internal/process/task_queues:105:5)
| at async MachineLearningRepository.detectFaces (/usr/src/app/dist/repositories/machine-learning.repository.js:50:26)
| at async PersonService.handleDetectFaces (/usr/src/app/dist/services/person.service.js:235:52)
| at async JobService.onJobStart (/usr/src/app/dist/services/job.service.js:148:28)
| at async EventRepository.onEvent (/usr/src/app/dist/repositories/event.repository.js:134:13)
| at async Worker.processJob (/usr/src/app/node_modules/bullmq/dist/cjs/classes/worker.js:394:28)
| at async Worker.retryIfFailed (/usr/src/app/node_modules/bullmq/dist/cjs/classes/worker.js:581:24)
| [Nest] 7 - 01/18/2025, 9:19:28 PM ERROR [Microservices:JobService] Object:
| {
| "id": "0518ac24-b747-484e-962a-2dfa2602750d"
| }
Any ideas on where I should look next? I am able to see pictures in my library just fine.
:wave: Hey @satmandu,
Thanks for reaching out to us. Please carefully read this message and follow the recommended actions. This will help us be more effective in our support effort and leave more time for building Immich :immich:.
References
- Container Logs: docker compose logs
- Container Status: docker ps -a
- Reverse Proxy: https://immich.app/docs/administration/reverse-proxy
- Code Formatting: https://support.discord.com/hc/en-us/articles/210298617-Markdown-Text-101-Chat-Formatting-Bold-Italic-Underline#h_01GY0DAKGXDEHE263BCAYEGFJA
Checklist
I have...
1. :ballot_box_with_check: verified I'm on the latest release (note that mobile app releases may take some time).
2. :ballot_box_with_check: read applicable release notes.
3. :ballot_box_with_check: reviewed the FAQs for known issues.
4. :ballot_box_with_check: reviewed Github for known issues.
5. :ballot_box_with_check: tried accessing Immich via local ip (without a custom reverse proxy).
6. :blue_square: uploaded the relevant information (see below).
7. :blue_square: tried an incognito window, disabled extensions, cleared mobile app cache, logged out and back in, different browsers, etc. as applicable
(an item can be marked as "complete" by reacting with the appropriate number)
Information
In order to be able to effectively help you, we need you to provide clear information to show what the problem is. The exact details needed vary per case, but here is a list of things to consider:
- Your docker-compose.yml and .env files.
- Logs from all the containers and their status (see above).
- All the troubleshooting steps you've tried so far.
- Any recent changes you've made to Immich or your system.
- Details about your system (both software/OS and hardware).
- Details about your storage (filesystems, type of disks, output of commands like fdisk -l and df -h).
- The version of the Immich server, mobile app, and other relevant pieces.
- Any other information that you think might be relevant.
Please paste files and logs with proper code formatting, and especially avoid blurry screenshots.
Without the right information we can't work out what the problem is. Help us help you ;)
If this ticket can be closed you can use the /close command, and re-open it later if needed.
I have allocated 15 GB to machine learning like this:
  immich-machine-learning:
    container_name: immich_machine_learning
    # For hardware acceleration, add one of -[armnn, cuda, openvino] to the image tag.
    # Example tag: ${IMMICH_VERSION:-release}-cuda
    image: ghcr.io/immich-app/immich-machine-learning:${IMMICH_VERSION:-release}
    # extends: # uncomment this section for hardware acceleration - see https://immich.app/docs/features/ml-hardware-acceleration
    #   file: hwaccel.ml.yml
    #   service: cpu # set to one of [armnn, cuda, openvino, openvino-wsl] for accelerated inference - use the -wsl version for WSL2 where applicable
    volumes:
      - model-cache:/cache
    env_file:
      - .env
    restart: always
    healthcheck:
      disable: false
    deploy:
      restart_policy:
        condition: on-failure
        delay: 5s
      resources:
        limits:
          cpus: '0.50'
          memory: 15G
Doing a text search also gives me an error:
immich_server | Error: Machine learning request '{"clip":{"textual":{"modelName":"ViT-B-32__openai"}}}' failed for all URLs
immich_server | at MachineLearningRepository.predict (/usr/src/app/dist/repositories/machine-learning.repository.js:41:15)
immich_server | at process.processTicksAndRejections (node:internal/process/task_queues:105:5)
immich_server | at async MachineLearningRepository.encodeText (/usr/src/app/dist/repositories/machine-learning.repository.js:64:26)
immich_server | at async SearchService.searchSmart (/usr/src/app/dist/services/search.service.js:66:27)
And on switching face recognition models to antelope:
immich_server | [Nest] 7 - 01/18/2025, 9:52:42 PM ERROR [Microservices:JobService] Unable to run job handler (faceDetection/face-detection): Error: Machine learning request '{"facial-recognition":{"detection":{"modelName":"antelopev2","options":{"minScore":0.7}},"recognition":{"modelName":"antelopev2"}}}' failed for all URLs
immich_postgres |
immich_server | [Nest] 7 - 01/18/2025, 9:52:42 PM ERROR [Microservices:JobService] Error: Machine learning request '{"facial-recognition":{"detection":{"modelName":"antelopev2","options":{"minScore":0.7}},"recognition":{"modelName":"antelopev2"}}}' failed for all URLs
immich_server | at MachineLearningRepository.predict (/usr/src/app/dist/repositories/machine-learning.repository.js:41:15)
immich_server | at process.processTicksAndRejections (node:internal/process/task_queues:105:5)
immich_server | at async MachineLearningRepository.detectFaces (/usr/src/app/dist/repositories/machine-learning.repository.js:50:26)
immich_server | at async PersonService.handleDetectFaces (/usr/src/app/dist/services/person.service.js:235:52)
immich_server | at async JobService.onJobStart (/usr/src/app/dist/services/job.service.js:148:28)
immich_server | at async EventRepository.onEvent (/usr/src/app/dist/repositories/event.repository.js:134:13)
immich_server | at async Worker.processJob (/usr/src/app/node_modules/bullmq/dist/cjs/classes/worker.js:394:28)
immich_server | at async Worker.retryIfFailed (/usr/src/app/node_modules/bullmq/dist/cjs/classes/worker.js:581:24)
immich_server | [Nest] 7 - 01/18/2025, 9:52:42 PM ERROR [Microservices:JobService] Object:
immich_server | {
immich_server | "id": "0da4c392-09c3-4b3d-8659-ecfbfc97cf37"
immich_server | }
ok gunicorn is running, so I guess I will let it run overnight and see what happens tomorrow...
I don't think this is working though. The logs are full of this: [Nest] 7 - 01/18/2025, 10:08:02 PM ERROR [Microservices:JobService] Unable to run job handler (faceDetection/face-detection): Error: Machine learning request '{"facial-recognition":{"detection":{"modelName":"antelopev2","options":{"minScore":0.7}},"recognition":{"modelName":"antelopev2"}}}' failed for all URLs
immich_server | [Nest] 7 - 01/18/2025, 10:08:02 PM ERROR [Microservices:JobService] Error: Machine learning request '{"facial-recognition":{"detection":{"modelName":"antelopev2","options":{"minScore":0.7}},"recognition":{"modelName":"antelopev2"}}}' failed for all URLs
Are there any errors in the ML container? Also 0.5 cpu is super low for ML, that’s like nothing
Isn't 0.5 cpu 50% of the cpu? Those are the errors I'm getting from the ML container... I'm working on moving this to another machine though...
no, it's 50% of a core (sort of)
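For reference, a sketch of how that limit reads in the compose file: `cpus` is counted in cores, so `'0.50'` caps the container at half of one core, and allowing several cores means a value above 1. The `'4.0'` below is only an illustrative assumption, not a value recommended anywhere in this thread.

```yaml
services:
  immich-machine-learning:
    deploy:
      resources:
        limits:
          # Counted in CPU cores, not as a fraction of the whole machine:
          # '0.50' = half of one core, '4.0' = up to four cores (assumed value).
          cpus: '4.0'
          memory: 15G
```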
Ah! I was mistaken! I guess I need to figure out the compose deploy equivalent of the docker cpu-shares option. I just want to set a soft limit. I "only" have ~3 TB of photos, so I'm trying to figure out the best way to manage this. (I only added the deploy limits because I was having OOM issues on my system, so I figured it was better to add limits to keep random processes from being killed.)
In any case, I'm doing a zfs send to a new disk I'm going to attach to a faster machine (11th gen i5) with more RAM (64 GB), so maybe I can figure out the best way to set these limits there. Is there a supported way to run the Immich machine learning processes/server with nice to lower the priority? (I'm already running a self-hosted GitHub runner on this machine, and I have all builds there running in Docker with --cpu-shares 512, and that's been good about not interrupting other things I have been doing there.)
Other than limiting the resources of the container, no
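A minimal sketch of what limiting the container with a soft CPU limit could look like, assuming the Compose Specification's service-level `cpu_shares` key (the counterpart of `docker run --cpu-shares`) is honored by the Compose version in use. The value 512 mirrors the runner setup described above and is an assumption, not a tested setting. The default weight is 1024, and shares only take effect when CPUs are contended.

```yaml
services:
  immich-machine-learning:
    # Relative CPU weight (default 1024); only applies under contention,
    # so the container can still use idle cores. 512 is an assumed value.
    cpu_shares: 512
    deploy:
      resources:
        limits:
          # Hard memory cap kept from the original config to guard against OOM.
          memory: 15G
```

Unlike `cpus`, this does not cap throughput when the machine is otherwise idle, which is closer to the "soft limit" asked about.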