Face recognition issue
Face recognition has been running fine (if you believe the interface), but the People menu doesn't appear in the explore menu.
The ip:2283/api/person endpoint only returns [], and no thumbnails are shown.
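(For reference, you can query the same endpoint from the command line; the x-api-key header is an assumption about how your instance is authenticated:)
curl -H "x-api-key: YOUR_API_KEY" http://ip:2283/api/person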
Errors in the log of the machine learning container:
So this is an issue with the container failing to load the model
Can you perform these exact steps (commands sketched below the list)?
1. clear the queue
2. docker-compose down
3. docker-compose up
4. run the job
5. check /api/person
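For steps 2 and 3, run from the directory containing your Immich docker-compose.yml (a sketch; -d keeps the stack detached):
docker-compose down
docker-compose up -d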
What is your system spec?
OS? Architecture?
Running in a Proxmox CT, Ubuntu 22.04, 3 GB RAM, 2 CPUs
Not relevant to this, but you will need at least 5 GB of RAM for Immich for when all the ML jobs run
I'm running into that too.
onnxruntime.capi.onnxruntime_pybind11_state.InvalidProtobuf: [ONNXRuntimeError] : 7 : INVALID_PROTOBUF : Load model from /cache/models/buffalo_l/1k3d68.onnx failed:Protobuf parsing failed.
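One way to check whether the cached model file is truncated (a sketch; immich_machine_learning is the default container name from Immich's docker-compose and may differ in your setup):
docker exec immich_machine_learning ls -lh /cache/models/buffalo_l/
A partially downloaded file will typically look suspiciously small.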

Try changing the CPU type to host
Still [] in the API. I can increase resources, even though that didn't look necessary

Maybe relevant: the pictures are in a folder mounted from a Synology NAS
That shouldn't be the issue; the error message shows that machine learning cannot load the facial recognition model
What does the processor configuration look like on your VM?

Sorry, I don't have that host option, where do you find it?

Clicking advanced
It's a container, not a VM
Ah
If it is a container then it should already be using the host processor
So that might not be relevant

Might be related to a resource issue
Will try again with 6 GB of RAM for the sake of completeness
Try increasing the RAM to 4 GB just to test it
even better
Cleared the queue;
Upgraded resources;
Docker compose down, docker compose up;
Waited a bit;
Restarted face recognition, but the API page is still blank 😦
Resources look fine

Ok try this
bring down all the containers
then
docker volume rm docker_model-cache
and bring up the stack
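Something like this as a full sequence (a sketch; the volume prefix depends on the directory your compose file lives in, so yours may not be docker_):
docker compose down
docker volume rm docker_model-cache
docker compose up -d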
it will trigger a re-download of the model
The rm was after a docker compose down, right?
I get:
Error response from daemon: get docker_model-cache: no such volume
Correct
try
docker volume ls
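For example, to find the exact name:
docker volume ls | grep model-cache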
I had to remove the memory limit and now it's happy.
Good to know, thank you!
can you rm the
immich-docker_model-cache
and immich_model-cache
volume?
You are running on k8s right?
From my testing, the machine learning container will need at least 5 GB of RAM to work correctly
Correct, and limits of neither 1 GB nor 2 GB of RAM were enough for ML.
So, I did:
docker volume rm immich-docker_model-cache
docker volume rm immich_model-cache
docker compose up -d
Restarted the job, but still an empty API page
What does the machine learning log say?
hold on, something is coming up
API starts to display data
Ok, now you should see something in the explore page
yes, I do indeed!
So in your case I suspect the model got corrupted while it was downloading
and CPU is working much harder now 😉

yeah it should be 😄
Anyway, thanks for the support! (and the great app by the way)
No problem! Enjoy!
Hi @Alex, so are you saying that Immich would need 5GB+ RAM?
Yes
The machine learning will use almost 4 GB of RAM when loading all the models into memory
That's good to know. I think the docs mention 2 GB and I was just about to get a 4 GB VPS or low-end dedicated server.

Yes, that was outdated; I forgot to update it
Does ML keep the models in memory all the time? Would I get away with 4 GB and some HDD swapping?
It does right now, since we haven't figured out the best way to unload them from RAM yet
What's the command to see that?
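(For reference, docker stats shows live per-container CPU and memory usage, which is one way to watch this:)
docker stats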
My RAM usage has been pretty stable around 3 GB all night.
Still 50% of my 45,000 assets to be recognized, but it's moving along (it takes much more time when it actually does something :-))

Yeah, 3 CPUs will take a while
I only have 4 and don't want to strangle the other apps on my machine. No problem, it's only the first catch-up that will be painful; daily uploads will go unnoticed, I'm sure.
First results seem promising, but I'll let it finish its job before playing around
Correct
I had a similar issue to the OP's when I first upgraded and ran Facial Recognition on all photos. I don't know if it's related or not, but I wanted to share what I saw. Since I was running the full job, there were 3 tasks going on in parallel, which meant the model was being downloaded from GitHub in each of those processes, and then once it was downloaded I assume each process attempted to use it.
My question is, would downloading and loading the model in parallel like that cause issues and contention? I feel like that could cause collisions with the loaded model state.
In my case, as soon as it downloaded the model and, I assume, tried to load and use it, it started throwing many errors. I immediately paused the job from the admin page and then cleared the jobs. I then restarted the Machine Learning service and tried running the job for all items again. This time, since the model was already downloaded, I didn't receive any errors and it completed successfully.