Immich machine learning not working
Hello all, I am running the Immich server on an Intel eighth-generation processor in OMV on Proxmox. When I ask Immich to run facial recognition, it runs the jobs. However, the logs show an ECONNRESET error. While debugging, I found that the Immich server can reach the machine learning container without any problems. The machine learning container is also running properly, because I can ping it in the browser and get a message from the machine learning container. However, when the server tries to process images, only the first request is processed; every request after that returns the error above. Please help me solve this. By the way, this worked properly for a couple of days and broke after that.
:wave: Hey @Hydrogen,
Thanks for reaching out to us. Please carefully read this message and follow the recommended actions. This will help us be more effective in our support effort and leave more time for building Immich :immich:.
References
- Container Logs: docker compose logs
- Container Status: docker ps -a
- Reverse Proxy: https://immich.app/docs/administration/reverse-proxy
- Code Formatting: https://support.discord.com/hc/en-us/articles/210298617-Markdown-Text-101-Chat-Formatting-Bold-Italic-Underline#h_01GY0DAKGXDEHE263BCAYEGFJA
Checklist
I have...
1. :ballot_box_with_check: verified I'm on the latest release (note that mobile app releases may take some time).
2. :ballot_box_with_check: read applicable release notes.
3. :ballot_box_with_check: reviewed the FAQs for known issues.
4. :ballot_box_with_check: reviewed Github for known issues.
5. :ballot_box_with_check: tried accessing Immich via local ip (without a custom reverse proxy).
6. :blue_square: uploaded the relevant information (see below).
7. :ballot_box_with_check: tried an incognito window, disabled extensions, cleared mobile app cache, logged out and back in, different browsers, etc. as applicable
(an item can be marked as "complete" by reacting with the appropriate number)
Information
In order to be able to effectively help you, we need you to provide clear information to show what the problem is. The exact details needed vary per case, but here is a list of things to consider:
- Your docker-compose.yml and .env files.
- Logs from all the containers and their status (see above).
- All the troubleshooting steps you've tried so far.
- Any recent changes you've made to Immich or your system.
- Details about your system (both software/OS and hardware).
- Details about your storage (filesystems, type of disks, output of commands like fdisk -l and df -h).
- The version of the Immich server, mobile app, and other relevant pieces.
- Any other information that you think might be relevant.
Please paste files and logs with proper code formatting, and especially avoid blurry screenshots.
Without the right information we can't work out what the problem is. Help us help you ;)
If this ticket can be closed you can use the /close command, and re-open it later if needed.
Successfully submitted, a tag has been added to inform contributors. :white_check_mark:

Please paste files and logs with proper code formatting, and especially avoid blurry screenshots.
Read the damn message




Thanks, that helps!
What did you set the ML URL to in Immich?
Details of the system: running Proxmox as the hypervisor on an Intel 8th-gen processor. No graphics card. Immich is running in Docker on OMV. I used Portainer to deploy Immich. Using the EXT4 file system.

That won't work
Change that back to
http://immich-machine-learning:3003
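Since the original symptom was that only the first request succeeded, one way to check the URL is to send several requests in a row and compare the outcomes. The following is a minimal sketch, not part of Immich: the function name `probe` and its parameters are made up for illustration, and it only tells you whether something answers HTTP at that address. It has to run from a container attached to the same Docker network, otherwise the service name will not resolve.

```python
import urllib.error
import urllib.request


def probe(url: str, attempts: int = 3, timeout: float = 5.0) -> list[str]:
    """Send several sequential GET requests and record each outcome.

    Hypothetical helper for illustration; it is not an Immich API.
    """
    results = []
    for _ in range(attempts):
        try:
            with urllib.request.urlopen(url, timeout=timeout) as resp:
                results.append(f"HTTP {resp.status}")
        except urllib.error.HTTPError as exc:
            # The server answered, but with an error status code.
            results.append(f"HTTP {exc.code}")
        except (urllib.error.URLError, OSError) as exc:
            # Covers DNS failures, refused or reset connections, and timeouts.
            reason = getattr(exc, "reason", exc)
            results.append(f"failed: {reason}")
    return results
```

For example, `probe("http://immich-machine-learning:3003")` returns one entry per attempt. If the first entry is an HTTP status and the later ones start with `failed:`, that matches the behaviour described at the top of the thread.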
BTW, I am very sorry for my mistake. I should have logged in to Discord on my PC.
I didn't know the devs on this program would be this responsive. This is my first time on Immich.
It's all good. It's just incredibly annoying having to ask for every single piece of information while there's one message asking for exactly what we need
Totally get it. Thanks a lot for your time!
That's fair, we're usually pretty quick on first response
doing it now.
Changed the URL. Shall I restart the Immich stack in Portainer?
Ideally
OK, restarted. I then ran smart search for missing items and am getting a timeout error now. This is from the server:
"id": "257abb88-4df1-4765-9e1f-35f0e8beea8b"
}
[Nest] 7 - 11/29/2024, 11:16:52 PM ERROR [Microservices:JobService] Unable to run job handler (smartSearch/smart-search): Error: Machine learning request to "http://immich-machine-learning:3003" failed with Error: connect EHOSTUNREACH 172.19.0.3:3003
[Nest] 7 - 11/29/2024, 11:16:52 PM ERROR [Microservices:JobService] Error: Machine learning request to "http://immich-machine-learning:3003" failed with Error: connect EHOSTUNREACH 172.19.0.3:3003
    at /usr/src/app/dist/repositories/machine-learning.repository.js:18:19
    at process.processTicksAndRejections (node:internal/process/task_queues:105:5)
    at async MachineLearningRepository.predict (/usr/src/app/dist/repositories/machine-learning.repository.js:17:21)
    at async MachineLearningRepository.encodeImage (/usr/src/app/dist/repositories/machine-learning.repository.js:41:26)
    at async SmartInfoService.handleEncodeClip (/usr/src/app/dist/services/smart-info.service.js:104:27)
    at async JobService.onJobStart (/usr/src/app/dist/services/job.service.js:151:28)
    at async EventRepository.onEvent (/usr/src/app/dist/repositories/event.repository.js:122:13)
    at async Worker.processJob (/usr/src/app/node_modules/bullmq/dist/cjs/classes/worker.js:394:28)
    at async Worker.retryIfFailed (/usr/src/app/node_modules/bullmq/dist/cjs/classes/worker.js:581:24)
[Nest] 7 - 11/29/2024, 11:16:52 PM ERROR [Microservices:JobService] Object:
{
"id": "3e9d10a1-db46-45fc-8103-1b1683c703d6"
}
On the machine learning container there are no errors:

(btw there's a link in the bot message for code block formatting, that makes it prettier :))
Attaching the network inspect output for the Immich server and machine learning containers here:


I have Twingate, cloudflared, and Nextcloud containers in Portainer, nothing else. Shall I pause Twingate and cloudflared now, just to see?
I don't see how that should make a difference
But I also don't understand why it cannot reach that host while on the same network
Yes sir, I am wondering the same. Shall I restart the entire OMV instance from Proxmox?
Maybe worth a shot tbh
After the restart, it is unable to access the database now.
It reports the same error for the database IP.

Did a re-pull and redeploy. Now Immich is able to load the pictures. Running smart search now. I don't see any errors from machine learning or the server.


This looks good!
And CPU usage is at 100%, so it seems like it's running fine now!
Thanks a lot, sir!
BTW, am I using the correct model for smart search? Will it be OK, or do you recommend anything else?
That's the best indicator haha
There isn't a "correct" model
But I don't know all the ins and outs of all the models
That's probably a good question for #immich or #off-topic :)
Ok thank youu !
The choice of model always seems to be a tradeoff between speed and quality.
if you ever see mertalev writing somewhere, he is the expert you can ask about that for some details 😄
The model you chose is very good and offers a nice balance between speed and quality. Compared to larger models, it will also be much quicker to load the first time you search in a 5-minute period.
The main thing to note is that it doesn’t understand any language besides English