Immich•2y ago
Chacsam

Face recognition Issue

Face recognition has been running fine (if you believe the interface), but the People section doesn't appear in the Explore menu. ip:2283/api/person only returns [], and there are no thumbnails (a curl sketch of that check follows the log below). Errors in the log of the machine learning container:
File "/opt/venv/lib/python3.10/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 397, in _create_inference_session
sess = C.InferenceSession(session_options, self._model_path, True, self._read_config_from_model)
onnxruntime.capi.onnxruntime_pybind11_state.InvalidProtobuf: [ONNXRuntimeError] : 7 : INVALID_PROTOBUF : Load model from /cache/models/buffalo_l/1k3d68.onnx failed:Protobuf parsing failed.
2023-05-18 19:13:56.676680506 [E:onnxruntime:Default, env.cc:251 ThreadMain] pthread_setaffinity_np failed for thread: 734, index: 2, mask: {3, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.
INFO: 172.18.0.8:57840 - "POST /facial-recognition/detect-faces HTTP/1.1" 500 Internal Server Error
ERROR: Exception in ASGI application
File "/opt/venv/lib/python3.10/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 397, in _create_inference_session
sess = C.InferenceSession(session_options, self._model_path, True, self._read_config_from_model)
onnxruntime.capi.onnxruntime_pybind11_state.InvalidProtobuf: [ONNXRuntimeError] : 7 : INVALID_PROTOBUF : Load model from /cache/models/buffalo_l/1k3d68.onnx failed:Protobuf parsing failed.
2023-05-18 19:13:56.676680506 [E:onnxruntime:Default, env.cc:251 ThreadMain] pthread_setaffinity_np failed for thread: 734, index: 2, mask: {3, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.
INFO: 172.18.0.8:57840 - "POST /facial-recognition/detect-faces HTTP/1.1" 500 Internal Server Error
ERROR: Exception in ASGI application
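For reference, checking that endpoint from a shell might look like the following, assuming an API key created in the web UI (the key value and host here are placeholders):

curl -H "x-api-key: YOUR_API_KEY" http://ip:2283/api/person
# an instance with no recognized people returns []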
50 Replies
Alex Tran
Alex Tran•2y ago
So this is an issue of the container failing to load the model. Can you perform these exact steps: 1. Clear the queue 2. docker-compose down 3. docker-compose up 4. Run the job 5. Check /api/person. Also, what are your system specs? OS? Architecture?
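A minimal sketch of that cycle from a shell, assuming the compose file is in the current directory and the queue/job steps are done from the admin UI:

# 1. clear the face recognition queue in the admin UI (Administration > Jobs)
docker-compose down      # 2. stop the stack
docker-compose up -d     # 3. bring it back up in the background
# 4. re-run the face recognition job from the admin UI
# 5. check /api/person again (see the curl example above)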
Chacsam
ChacsamOP•2y ago
Running on a Proxmox CT, Ubuntu 22.04, 3 GB RAM, 2 CPUs
Alex Tran
Alex Tran•2y ago
Not relevant, but you will need at least 5 GB of RAM for Immich for when all the ML jobs run
Kryptonian
Kryptonian•2y ago
I'm running into that too. onnxruntime.capi.onnxruntime_pybind11_state.InvalidProtobuf: [ONNXRuntimeError] : 7 : INVALID_PROTOBUF : Load model from /cache/models/buffalo_l/1k3d68.onnx failed:Protobuf parsing failed.
Alex Tran
Alex Tran•2y ago
[image attached]
Alex Tran
Alex Tran•2y ago
Try changing the type to host
Chacsam
ChacsamOP•2y ago
Still [] in the API. I can increase resources, even though that didn't look necessary
[image attached]
Chacsam
ChacsamOP•2y ago
Maybe relevant: the pictures are in a folder mounted from a Synology NAS
Alex Tran
Alex Tran•2y ago
That shouldn't be the issue; the error message shows that machine learning cannot load the facial recognition model
Alex Tran
Alex Tran•2y ago
What does the processor look like on your VM?
[image attached]
Chacsam
ChacsamOP•2y ago
Sorry, I don't have that host option. Where do you find it?
[image attached]
Kryptonian
Kryptonian•2y ago
Clicking advanced
Chacsam
ChacsamOP•2y ago
It's a container, not a VM
Kryptonian
Kryptonian•2y ago
Ah
Alex Tran
Alex Tran•2y ago
If it is a container then it should already be using the host processor, so that might not be relevant
Alex Tran
Alex Tran•2y ago
[image attached]
Alex Tran
Alex Tran•2y ago
Might be related to a resource issue
Chacsam
ChacsamOP•2y ago
Will try again with 6 GB of RAM for the sake of completeness
Alex Tran
Alex Tran•2y ago
Try increasing the RAM to 4 GB just to test; even better if you can go higher
Chacsam
ChacsamOP•2y ago
Cleared the queue; upgraded resources; docker compose down, docker compose up; waited a bit; restarted face recognition, but the API page is still blank 😦 Resources look fine
[image attached]
Alex Tran
Alex Tran•2y ago
OK, try this: bring down all the containers, then docker volume rm docker_model-cache, and bring the stack back up. That will trigger a re-download of the model
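A rough sketch of that sequence, assuming the compose project is named docker (the volume prefix depends on the project/folder name, which is why docker volume ls is checked below):

docker compose down                   # stop all containers in the stack
docker volume rm docker_model-cache   # remove the cached (possibly corrupted) models
docker compose up -d                  # models are re-downloaded on the next ML job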
Chacsam
ChacsamOP•2y ago
The rm was after a docker compose down, right? I get: Error response from daemon: get docker_model-cache: no such volume
Alex Tran
Alex Tran•2y ago
Correct. Try docker volume ls
Chacsam
ChacsamOP•2y ago
DRIVER VOLUME NAME
local 1a792da7738ecc56790a4e7a691cf30157a3c894c21bdda10567e242eece965f
local 3ad1cc6cae058e9ff406cc00a0a33d7e9145934d968c84a8dfeedae06fc64213
local 4ecfd384af3bb670f51bf9f7fc6b6a8d0eb77316217ffab8c0a54346e6de4b90
local 5d237f60cfbfd54ce23a63664ddcb99023f53f4d9f1b13d0eb048f877bcde4a2
local 9dd69779143cc084685d8cec3c216caca6d4566332e7285e11110eb0c8a6e9d7
local 101c0b4a394af5b6128dc6ed4f5f39d14ddb1f3a110703fc5a137e21261340eb
local d92fd775b75044604b67ddd6e04e66abd91aeef2182cfbf047389a38bb183331
local f3b87f699115aa7b4263e05ccf504b1b8d064d1483abc3f910d544464e649a17
local fde610f4c1cdb4e23c9d3422c25053fd8b17ce07785d132419e4cb1b98744cc8
local immich-docker_model-cache
local immich-docker_pgdata
local immich-docker_tsdata
local immich_model-cache
local immich_pgdata
local immich_tsdata
Kryptonian
Kryptonian•2y ago
I had to remove the memory limit and now it's happy.
Alex Tran
Alex Tran•2y ago
Good to know, thank you! Can you rm the immich-docker_model-cache and immich_model-cache volumes? You are running on k8s, right? From my testing, the machine learning container needs at least 5 GB of RAM to work correctly
Kryptonian
Kryptonian•2y ago
Correct, and limits of neither 1 nor 2 GB of RAM were enough for ML.
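For a Kubernetes setup like that, raising (or removing) the memory limit on the machine-learning workload might look roughly like this; the deployment name below is a placeholder:

# bump the limit above the ~4-5 GB the ML container needs with all models loaded
kubectl set resources deployment immich-machine-learning --limits=memory=6Gi
# or edit the deployment and drop resources.limits.memory entirely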
Chacsam
ChacsamOP•2y ago
So, I did: docker volume rm immich-docker_model-cache, docker volume rm immich_model-cache, docker compose up -d, and restarted the job, but the API page is still empty
Alex Tran
Alex Tran•2y ago
What does the machine learning log say?
Chacsam
ChacsamOP•2y ago
Hold on, something is coming up. The API is starting to display data:
[{"id":"dceb2d05-5a09-4182-8240-23baa7059821","name":"","thumbnailPath":"upload/thumbs/78295968-61b0-48a1-9306-b1fa648e532c/dceb2d05-5a09-4182-8240-23baa7059821.jpeg"},{"id":"f9912f67-1f35-4a14-bcee-b9da1cefa719","name":"","thumbnailPath":"upload/thumbs/78295968-61b0-48a1-9306-b1fa648e532c/f9912f67-1f35-4a14-bcee-b9da1cefa719.jpeg"},{"id":"d47fd5c8-5e7d-4ae7-80f0-366796e97990","name":"","thumbnailPath":"upload/thumbs/78295968-61b0-48a1-9306-b1fa648e532c/d47fd5c8-5e7d-4ae7-80f0-366796e97990.jpeg"},{"id":"80f54d94-34cb-463f-a27f-d98073f18854","name":"","thumbnailPath":"upload/thumbs/78295968-61b0-48a1-9306-b1fa648e532c/80f54d94-34cb-463f-a27f-d98073f18854.jpeg"},{"id":"b443f955-0f9b-451e-bbc5-f8c5200c2657","name":"","thumbnailPath":"upload/thumbs/78295968-61b0-48a1-9306-b1fa648e532c/b443f955-0f9b-451e-bbc5-f8c5200c2657.jpeg"},{"id":"a6bd693f-1697-4c63-bfe1-24141c7ec6be","name":"","thumbnailPath":"upload/thumbs/78295968-61b0-48a1-9306-b1fa648e532c/a6bd693f-1697-4c63-bfe1-24141c7ec6be.jpeg"},{"id":"23df9b98-cac0-4533-af00-595e02d8c6a2","name":"","thumbnailPath":"upload/thumbs/78295968-61b0-48a1-9306-b1fa648e532c/23df9b98-cac0-4533-af00-595e02d8c6a2.jpeg"},{"id":"3a678f87-0835-466a-88db-e128045cc54f","name":"","thumbnailPath":"upload/thumbs/78295968-61b0-48a1-9306-b1fa648e532c/3a678f87-0835-466a-88db-e128045cc54f.jpeg"}]
[{"id":"dceb2d05-5a09-4182-8240-23baa7059821","name":"","thumbnailPath":"upload/thumbs/78295968-61b0-48a1-9306-b1fa648e532c/dceb2d05-5a09-4182-8240-23baa7059821.jpeg"},{"id":"f9912f67-1f35-4a14-bcee-b9da1cefa719","name":"","thumbnailPath":"upload/thumbs/78295968-61b0-48a1-9306-b1fa648e532c/f9912f67-1f35-4a14-bcee-b9da1cefa719.jpeg"},{"id":"d47fd5c8-5e7d-4ae7-80f0-366796e97990","name":"","thumbnailPath":"upload/thumbs/78295968-61b0-48a1-9306-b1fa648e532c/d47fd5c8-5e7d-4ae7-80f0-366796e97990.jpeg"},{"id":"80f54d94-34cb-463f-a27f-d98073f18854","name":"","thumbnailPath":"upload/thumbs/78295968-61b0-48a1-9306-b1fa648e532c/80f54d94-34cb-463f-a27f-d98073f18854.jpeg"},{"id":"b443f955-0f9b-451e-bbc5-f8c5200c2657","name":"","thumbnailPath":"upload/thumbs/78295968-61b0-48a1-9306-b1fa648e532c/b443f955-0f9b-451e-bbc5-f8c5200c2657.jpeg"},{"id":"a6bd693f-1697-4c63-bfe1-24141c7ec6be","name":"","thumbnailPath":"upload/thumbs/78295968-61b0-48a1-9306-b1fa648e532c/a6bd693f-1697-4c63-bfe1-24141c7ec6be.jpeg"},{"id":"23df9b98-cac0-4533-af00-595e02d8c6a2","name":"","thumbnailPath":"upload/thumbs/78295968-61b0-48a1-9306-b1fa648e532c/23df9b98-cac0-4533-af00-595e02d8c6a2.jpeg"},{"id":"3a678f87-0835-466a-88db-e128045cc54f","name":"","thumbnailPath":"upload/thumbs/78295968-61b0-48a1-9306-b1fa648e532c/3a678f87-0835-466a-88db-e128045cc54f.jpeg"}]
Alex Tran
Alex Tran•2y ago
OK, now you should see something on the Explore page
Chacsam
ChacsamOP•2y ago
yes, I do indeed!
Alex Tran
Alex Tran•2y ago
So in your case I suspect the model got corrupted while it was downloading
Chacsam
ChacsamOP•2y ago
and CPU is working much harder now 😉
[image attached]
Alex Tran
Alex Tran•2y ago
yeah it should be 😄
Chacsam
ChacsamOP•2y ago
Anyway, thanks for the support! (and the great app by the way)
Alex Tran
Alex Tran•2y ago
No problem! Enjoy!
Sherlock79
Sherlock79•2y ago
Hi @Alex, so are you saying that Immich would need 5GB+ RAM?
Alex Tran
Alex Tran•2y ago
Yes. The machine learning container uses almost 4 GB of RAM when loading all the models into memory
Sherlock79
Sherlock79•2y ago
That's good to know. I think the docs mention 2 GB, and I was just about to get a 4 GB VPS or a low-end dedicated server.
Alex Tran
Alex Tran•2y ago
[image attached]
Alex Tran
Alex Tran•2y ago
Yes, it was outdated; I forgot to update it
Sherlock79
Sherlock79•2y ago
Does ML keep the models in memory all the time? Would I get away with 4 GB and some HDD swapping?
Alex Tran
Alex Tran•2y ago
It does right now, since we haven't figured out the best way to unload them from RAM yet
satish
satish•2y ago
What's the command to see that?
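One common way to watch per-container memory usage from the Docker host (a general Docker command, not Immich-specific):

docker stats   # live CPU and memory usage for each running container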
Chacsam
ChacsamOP•2y ago
My RAM usage has been pretty stable around 3 GB all night. Still 50% of my 45,000 assets to be recognized, but it's moving on (it takes much more time when it actually does something :-))
[image attached]
Alex Tran
Alex Tran•2y ago
Yeah, 3 CPUs will take a while
Chacsam
ChacsamOP•2y ago
I only have 4 and don't want to strangle the other apps on my machine. No problem, it's only the first catch-up that will be painful; daily uploads will go unnoticed, I'm sure. First results seem promising, but I'll let it finish its job before playing around
Alex Tran
Alex Tran•2y ago
Correct
Hullah
Hullah•2y ago
I had a similar issue as the OP when I first upgraded and ran Facial Recognition on all photos. I don't know if it's related or not, but wanted to share my experience of what I saw. Since I was running the full job, there were 3 tasks going on in parallel, which meant it was downloading the model from GitHub in each of those processes, and once it was downloaded, I assume it then attempted to use it. My question is: would downloading and loading the model in parallel like that cause issues and contention? I feel like that could cause collisions with the loaded model state. In my case, as soon as it downloaded the model and, I assume, loaded and tried to use it, it started throwing many errors. I immediately paused the job from the admin page and then cleared the jobs. I then restarted the Machine Learning service and tried running the job for all items again. This time, since the model was already downloaded, I didn't receive any errors and it completed successfully.
