Facial recognition double faces, not deleting old ones.
Hello,
Today I opted to re-run the FR "all" job for the third time, hoping it will fix the double-face issues I'm having. As I understand it, the existing faces should be removed when the job starts; however, that is not what is happening for me. I'm afraid the faces will show up a third time, further cluttering my overview. I have been looking through the logs and I don't see any odd behavior from immich_microservices.
I could really use some help with this one as I'm beginning to wonder if my setup is stable currently.
Along with this post I have reported a bug on github, where my information is provided.
https://github.com/immich-app/immich/issues/2596
[BUG] Facial Recognition double faces. · Issue #2596 · immich-app/i...
The bug Facial recognition feature has marked people double in some of my images. See the image below for reference. I can't unfortunately check how this happened, my guess is when I ran the FR...
It doesn't do this to all images per se, but to some.
I'm starting to think that it might be a database issue again, just like when I moved over from the unraid app to docker compose.
I have started the job again, I will report back tomorrow morning with the results.
I don't think running it again will magically do something different
Well, the idea was to see if running it again would trigger a removal of the faces before starting
it did not
What version are you on?
latest, 1.58
Something is very strange. The confirmation modal had been released already...
I know that it still has the old faces because of one picture where I know the cover photo is the same as before.
I just checked and the demo env prompts before running all faces btw
So it works like it should
Wonder what is going on for me that it is erroring out like this.
or well, not working, no errors.
Is there a way for me to check in postgres perhaps?
Normally I know what to check, but I'm completely clueless as to how to resolve this one. The only thing I can think of is completely ditching the DB and creating a new one
and uploading all assets again via CLI
hm, the asset_faces table in the database is completely unreadable for a human.
It seems to remove a majority of the faces, however it also leaves some behind. Did you also check for that jrasm?
What is that a picture of?
at random
No I mean the database screenshot. What are the headers for each column?

I now have proper access to the microservices logs, thanks to the suggestion in general
the only errors I see at the moment are these:
Immich_machine learning is spamming this over and over:
I'm going to head offline now, I'm going to let the job run and check tomorrow what has happened and if the numbers in DB have gone up as well.
Alright, so I have just checked, and FR seems to be identical. I could not find any pictures with triple faces, however the pictures that had double assets before still have them now.
There are also assets that have some faces double and some not.
So it almost seems like some assets aren't properly done.

They are also separate people. The first one only has 1 image, so it probably didn't get deleted.

The same is the case with the pictures I sent to you @jrasm91 and @Alex. It seems like some are left behind by the job, although there is no way of confirming this, and it also seems like the DB isn't being cleared.
What is also odd is that the image that is presumably stuck, with me and my sister in it, shows first in both my view and my sister's, ignoring the timeline completely.
They also have a different ID
I will now have a quick look at DB

Numbers on faces have gone up as well
Perhaps I can query the new faces, so to speak
I have queried the DB with the following:
These are the new results, as far as I understand:

I have just tracked down the first personId. Unfortunately it just seems like FR has found a new person in the image, no doubles.
I did manage to find the assetId of this image, where I might've found some actual duplicates within DB.
If they are this similar, we could opt to search the DB for duplicates?
I have just now executed the following query:
This lists all the assetIds where the embedding is similar, as shown in the image below. As far as I can see there are around 6,100 doubles. However, I don't know how accurate my query is. I have checked an assetId that I could link to my account, since I can't easily view other users' libraries.
Being 6ab930f2-49e5-483b-9d85-cfb322b212ec, the results are as follows, as shown in double.png.
When going into the UI and searching for the corresponding image, I see that there are indeed two faces. With this I have also discovered it isn't an issue with just my account, but with all the members on my Immich instance (4). I checked how many times the UUID appears in my query results, and it was five. Below are the results, where I also see the 5 duplicates: https://i.imgur.com/s9wqK9F.png
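The actual queries in this thread are only in the screenshots, so as a stand-in here is a sketch of the idea: group on (assetId, embedding) and flag groups with more than one row. Table and column names (asset_faces, assetId, personId, embedding) are assumptions from the discussion, and it runs against an in-memory SQLite copy rather than the real Postgres instance:

```python
import sqlite3

# Toy stand-in for the asset_faces table; column names are assumptions
# based on this thread, not the real Immich schema.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE asset_faces (
        id        TEXT PRIMARY KEY,
        assetId   TEXT,
        personId  TEXT,
        embedding TEXT
    )
""")
conn.executemany(
    "INSERT INTO asset_faces VALUES (?, ?, ?, ?)",
    [
        ("f1", "asset-a", "person-1", "emb-xyz"),  # original face
        ("f2", "asset-a", "person-2", "emb-xyz"),  # duplicate: same asset, identical embedding
        ("f3", "asset-b", "person-1", "emb-abc"),  # unique face
    ],
)

# Any (assetId, embedding) group with more than one row is a double face.
dupes = conn.execute("""
    SELECT assetId, embedding, COUNT(*) AS n
    FROM asset_faces
    GROUP BY assetId, embedding
    HAVING COUNT(*) > 1
""").fetchall()
print(dupes)  # [('asset-a', 'emb-xyz', 2)]
```

Note that exact equality on the embedding only catches exact re-inserts, which matches what is described above (the regenerated rows appear to have identical embeddings).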
I hope this gives a bit more info, and also steps that perhaps can be taken by you on the test env.

The same AssetId is listed multiple times per double FR embedding.
Perhaps it can be achieved by deleting one of each with:
However, I don't know to what extent this will happen again when I rerun the job or add new images. So I would really like to know if you are able to find duplicates in your tables
I personally find this approach extremely risky, but I will create a backup beforehand (duhh) to make sure I have something to fall back on.
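The "keep one, delete the rest" idea could look roughly like this. Again a sketch on an in-memory SQLite stand-in with assumed column names; in the real Immich DB the ids are UUIDs, so which row MIN(id) keeps is arbitrary, and you'd absolutely want that backup first:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE asset_faces (id TEXT PRIMARY KEY, assetId TEXT, embedding TEXT)"
)
conn.executemany(
    "INSERT INTO asset_faces VALUES (?, ?, ?)",
    [
        ("f1", "asset-a", "emb-xyz"),
        ("f2", "asset-a", "emb-xyz"),  # duplicate of f1
        ("f3", "asset-b", "emb-abc"),
    ],
)

# Delete every row that is not the "first" (lowest id) of its
# (assetId, embedding) group, keeping exactly one face per group.
conn.execute("""
    DELETE FROM asset_faces
    WHERE id NOT IN (
        SELECT MIN(id) FROM asset_faces GROUP BY assetId, embedding
    )
""")
remaining = [r[0] for r in conn.execute("SELECT id FROM asset_faces ORDER BY id")]
print(remaining)  # ['f1', 'f3']
```

As noted above, this only cleans up the symptom: if the job queues assets twice, the duplicates will come back on the next run.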
I assume the person id is different though?
This could be explained by the face delete not working when you queue the all job.
Can you check the timestamps on the duplicates?
Oh I'm just seeing the last screenshot
Yes it is, since they have been regenerated. However the embedding is exactly the same it seems. The regeneration is an assumption.
Yeah I have outlined what steps I have taken
And how to reproduce
I will have to see if there is ever a reason why face delete would fail
In the end there are 6k duplicates in a 40k library.
You never saw errors in immich-microservices or postgres?
Not that I could see. What I can do is start the job and closely monitor the logs with dozzle?
Then we can really see if there are errors.
With job I'm referring to FR all
But this command can be run on dev
Right. That is all I can think of.
Alright, I will test this later today and report back.
I can check if I have duplicates.
Yes please
With that query it becomes apparent right away
To add it to this thread as well: I have just updated to 1.59 and ran the FR all job again. It prompted me that it would delete all faces, which it didn't. No errors in any of the containers.
Is there a way to set the logging level?
you can add LOG_LEVEL=verbose to the env file
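For reference, that goes straight into the stack's .env file (verbose is the value suggested here; debug also comes up later in this thread):

```shell
# .env for the immich stack
# Raise the log verbosity; restart the containers for it to take effect.
LOG_LEVEL=verbose
```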
Thank you, will test and restart the containers.
On job start
within immich_server
I also found that if you start, pause, start, the force=true turns into force=false.
Immich_web has reported an issue, however I don't know if I can safely put the logs in here.
@Alex It seems, I don't know if this is after 1.59 update, that immich_web is reporting the following error on startup:
Have you tried restarting the stack?
I restarted, and it does not do it again.
However, recognize faces is stuck on this

since 1.59
Seemingly no errors in any of the containers
You can remove the redis container, it will remove everything in the queue
I composed down the whole stack and built it back up again, hit the recognize faces all again.
same result
This is the usual when starting up the immich_machine-learning
None of the containers are showing any signs of errors, but the job is stuck at 1. I will let it keep running for a while and see if it does anything.
bring all the containers down
then remove the redis container
after that bring everything back up
Sure thing, but the redis container is also in the stack so it will take that down automatically.

I assume so, but sometimes the container doesn't necessarily get removed
so perform docker container prune to make sure
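The sequence described above, sketched as shell commands (assuming a docker compose setup like the one discussed; note that prune removes ALL stopped containers on the host, not just this stack's):

```shell
# Stop and remove every container in the stack, redis included,
# which drops the job queue state held in redis.
docker compose down

# Make sure no stopped containers linger (removes all stopped containers!)
docker container prune -f

# Bring the stack back up in the background
docker compose up -d
```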
alright, one moment.
Alright, I removed all the images to be sure and restarted the containers.
as well as the postgres container
okay now bring them up
did that, all fine.
run recognize faces all again?
Nope, still on active 1.
Very interesting
This is the only thing I see related to the FR job.
It simply refuses to execute the FR job; other jobs are executing without problems.
I do notice a ton of these error messages when I browse through my photos:
These are video streams getting stopped/destroyed midway and can be ignored
thank you
The compose is the same as reported in here:
TL;DR: the FR job is borked for me and the containers aren't reporting errors, even on verbose.
Yeah I am not sure which state the instance is in now @@
What would you need from me so that we can diagnose the issue?
I'm starting to think it is postgres related, however that is also not showing errors.
I'm almost at a point of nuking the postgres db and setting up a new instance.
Have you guys tested postgres 15?
I am not sure tbh
no we haven't
I am on 15, that is the only thing I can think of
OHHH
Hmm, nevermind.
I'm going to start the job with the debug logging on.
The delete is this

I mean, having double faces is one thing. But isn't it weird that the delete is not invoked on job start?
seemingly at least
That is very strange, because it should be. That's literally what the job is supposed to do.
It also gives me the heads up
will check right now and see what happens
LOG_LEVEL=debug
Yeah, I'd be interested to see if it shows people were deleted or not.
Currently the job is also not executing, seemingly since 1.59 oddly enough.
yes, same here.
hm
It is not really showing me debug logs.
I threw the log_level at the top of the .env file:
Why are the containers refusing to give me any information. This is so odd 😂
@jrasm91 Is the recognize all FR working for you on dev 1.59?
hey it failed
no log of it anywhere
hey this is odd

it is randomly starting to spin up the waiting queue
@jrasm91 I finally have a log!

I may have an idea of what is happening. Perhaps it takes so long to delete all the faces that the job times out.
It would be really nice to get logging on what it is doing, instead of just the debug line on deletion
17k people oh man
Because it has re-activated itself

Yes it deleted everyone
I think
But the weird thing is, I have 14k assets and we had 53k rows within asset_faces.
17k people is insane though
I will let this run and check for doubles again.
How many photos in your library again?
40k
well
37.7k

Can you send a screenshot from Administration > Server Stats
(total usage)
sure thing
Why is your queue at 80k
All my jobs have this
x2 assets?
When I activate a job it goes to double my assets
yes
Always?
yes
you want me to activate another job to show you?
I thought that @Alex mentioned that the double assets were normal on jobs?
Only on thumbnails
That queue generates a webp and a jpeg for each asset.
Doesn't it generate double assets for you?
But it actually only queues all the assets once up front. When an asset finishes with jpeg it then queues another one, so it shouldn't actually ever be that high.
Object detection actually does 2 jobs for each asset right from the start, so that would have 2x the asset count up front.
I have 15k assets locally in dev atm. I just ran "all"


It doesn't do it right away though, after like 5 min
sometimes less
it is also creating a massive amount of new people
Well, queueing all 80k is going to result in duplicates again most likely.
big chance
Can you test on postgres 14?
Well
How?
Because most likely I can't simply change postgres:15 to postgres:14?
No, not with the same volume.
I will kill the containers either way.
So you want me to rebuild?
You could do a backup of the data, spin up a 14 database, do a restore, then bring up the rest of the stack
https://immich.app/docs/administration/backup-and-restore#database
Backup and Restore | Immich
Database
good idea
let's do that
You don't even have to delete the pg volume, you could just point the stack to a new one
And switch it back afterwards
I will first create the backup
Yeah, it should be pretty quick.
alright the dump is done

that was quick work
Okay so now I move postgres to 14?
ah wait I see
one moment
ugh It backed up everything
Yeah, I'd just add a new volume to the docker compose file pgdata2
What do you mean?
These are all the databases

in 15, I have used freshrss momentarily the other ones I have never touched
I could just opt to remove them
Ah, you're saying it backup other stuff in addition to immich?
yes
giving me this wild mess

You probably just need the immich db
yes
How do I change the backup command to only include immich db?
Good question, time for Googling
rofl
Looks like
-d <dbname>
So -d immich
docker exec -t postgres_immich pg_dumpall -c -U root -d immich | gzip > "/mnt/user/appdata/immich/dump.sql.gz"
does that look right?
or should it use pg_dump instead of pg_dumpall?
will just test it.
hm that is 1kb
lol
oh wait
pg_dumpall > pg_dump
that looks better.
the command is
pg_dumpall
yes to drop all databases
I just want immich right?
with pg_dumpall it was 1kb
not to drop,
-d
indicates the database name
Oh, I think that's just what database to connect to originally, it'll still dump all of them.
oh
lol
pg_dump
does look like it'll do a single database
this is the answer I got
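Putting the correction together: a sketch of the single-database dump and the restore into a Postgres 14 container. The container name postgres_immich and user root come from the command earlier in this thread; the 14 container's name is hypothetical:

```shell
# pg_dump dumps ONE database; pg_dumpall dumps the whole cluster.
docker exec -t postgres_immich pg_dump -U root -d immich \
  | gzip > "/mnt/user/appdata/immich/dump.sql.gz"

# Restore into a fresh Postgres 14 container (hypothetical name) pointed
# at a new volume; the empty immich database must exist before restoring.
gunzip < "/mnt/user/appdata/immich/dump.sql.gz" \
  | docker exec -i postgres_immich_14 psql -U root -d immich
```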
Sounds good 🙂 TIL
alright time to gunzip this thing
You asked ChatGPT?
Nah, probably a warning/error for using
pg_dumpall
+ -d
Or did you?
restored on 14
Looks identical
Now bring up the rest of the stack and try again?
In this instance I did actually, it gives really well explained answers with links to docs these days.
since it has web search function.
Dang, that's pretty cool.
It is scary how well it codes :)
will see how that goes yes.
no apparent errors on startup it seems
this always takes a bit

yeah
same thing for me
Typesense is the database server we use for vector queries. It needs to load the asset collection from disk into memory, so it takes longer the bigger the collection is.
you pushed 1.59.1?
I will not update yet
Fixed the missing faces job 🙂
that is not out yet
Oh right, fixed the broken search page 🙂
alright I can access the web UI with postgres14
Great.
Lets hit the FR job
thus far it stays at 40k
🫣
How did we miss that you were on Postgres15?
ah shit, I've removed the logging level from the .env
I did not think of it before
Not a huge deal tbh.
In the end no, but it shows that 15 is not working as it should
Queuing the records 2x explains why you are getting duplicates.
I've done some janky stuff with this db
yes
I'm not 100% convinced it works yet, will wait it out.
your CPU must be hot
yes
I have messed a lot with this DB already, because I originally used the all-in-one application from the unraid store. The database paths were different.
so I had to move in the DB and change the paths
With this beauty I believe it was
The application used '/photos'
Okay, so it seem the job is running fine now.
Makes me wonder if I should also rerun all other jobs
Thank god this is fixed.
End result: don't use postgres15 when postgres14 is recommended 🤣
haha
This is the only job that deletes everything before running.
So I think you're OK.
I presume so as well
closed the ticket