Duplicate images being created
Background: the service was originally hosted on a Pi 4, then moved to a proper x86 platform.
I already sat down and cleaned up quite a few images, but there are still some I didn't clean up.
These images can be viewed here:
https://immich.assid.com/share/1561aad560df0883e93057ca88a435e78926d2f48c4e835e9b1cae30b7ab1daaf8b230b6c6f4212cb792b1f17ff6bae26a69
Example with 1 file:
find . -name 20221211_184746.heic
./mediaassets/library/06de71da-123e-4733-9448-77196690ac16/2022/2022-12-11/20221211_184746.heic
Only 1 file found
Exif extract provided
Can't easily check on my phone, but I can verify tomorrow. My guess is one has gps data and one does not.
there is only 1 physical file
find . -name "20221211_184746*"
./mediaassets/library/06de71da-123e-4733-9448-77196690ac16/2022/2022-12-11/20221211_184746+1.heic
./mediaassets/library/06de71da-123e-4733-9448-77196690ac16/2022/2022-12-11/20221211_184746.heic
apparently the new file has a +1 suffix.. how / why would that be generated?
For example,
You can upload a file, then edit the exif and upload it again
It is not the same file anymore
only uploading via the mobile device.. no changes are done on device level
Phones will strip gps data if location permission isn't enabled for the app
the gps data exists
On both?
no.. but there is only supposed to be 1 image
I think you have two versions. One with gps data and one without
not on my phone or google photos
Yeah, I understand that, but I'm pretty sure the same file has been uploaded twice to immich
here's what i think can happen
And it didn't deduplicate it because one had location data stripped out because the location permission wasn't included.
the original file which was uploaded initially.. has GPS data .. along with shutter speed
then.. i must have logged out.. and logged back in .. during which .. it must have tried to re-sync.. and uploaded the data without the gps information and shutter speed
the phone itself has only the original image.. not 2
Have you switched phones recently or anything like that?
Iphone to Android?
no.. i did logout / login multiple times..
and i also changed from the pi4 to x64 system
Did you recreate immich or just migrate hardware?
are you sure immich has no code to create a "+1" file
i moved the sql database and media assets ..
then i re-created my docker compose (since 1.51/1.52 changes with ML)
The +1 comes from the storage template code
ok.. so there you go.. its not device generated
Correct, it was originally uploaded with the same name twice
But it needs a different name to store it in the same directory
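To illustrate the behaviour described above, here is a minimal sketch of collision-avoiding naming. It is not Immich's actual storage template code: the helper `next_free_name` and its `+N` loop are assumptions that only mimic the `+1` suffix seen in the thread.

```python
from pathlib import Path
import tempfile


def next_free_name(directory: Path, filename: str) -> Path:
    """Return a path in `directory` that does not exist yet.

    If `filename` is taken, try `stem+1.ext`, `stem+2.ext`, and so on.
    Hypothetical helper; it only mimics the "+1" naming seen in the thread.
    """
    candidate = directory / filename
    stem, suffix = candidate.stem, candidate.suffix
    counter = 0
    while candidate.exists():
        counter += 1
        candidate = directory / f"{stem}+{counter}{suffix}"
    return candidate


with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    # First upload gets the plain name.
    first = next_free_name(folder, "20221211_184746.heic")
    first.write_bytes(b"original upload")
    # Second upload with the same name is stored under a "+1" name.
    second = next_free_name(folder, "20221211_184746.heic")
    second.write_bytes(b"second upload, GPS stripped")
    names = sorted(p.name for p in folder.iterdir())
    print(names)
```

The point is that the `+1` file is a symptom of a second upload arriving with the same name, not something the phone generated.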
we need to check how it parsed the information during upload. Why didn't it check the exif data correctly?
because there is only 1 original file to upload.. so what changed
I feel like i have said this a few times - if the location permission is not enabled for the app, Android will strip gps exif tags during upload automatically
So that's definitely one possibility
oh.. so even if the file exists with the exif information and uploaded as a blob/binary.. Android still strips it ?
Have you reinstalled from another source? APK vs fdroid vs play store?
Yes, truly annoying
yeah.. i used to have the fdroid version.. then moved to play store..
When you did that, it tried to reupload all the assets again.
i couldnt wrap my head around this.
is there an easy way to clean this up?
Depends how many duplicates
also can we add a permission check in the app on startup ?
i cleaned up around 100 yesterday
but i don't know if i deleted the original or the stripped one
Possibly, you can ask in general chat maybe. Or open a bug about it
You can SQL query for exif
Are you proficient with SQL?
i could also query for +1.heic files
Here is what I would do
Query for assets with the same device asset id
Join with the exif table and check for gps data
There should be two records for a single device asset id and hopefully one has gps and one does not
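The three steps above can be sketched as a single query. This is only a sketch: it runs against an in-memory SQLite database with made-up rows so it can be tried safely, whereas the real Immich database is Postgres. The table and column names (`assets`, `exif`, `"deviceAssetId"`, `"assetId"`, `latitude`) are the ones used elsewhere in this thread; everything else is an assumption.

```python
import sqlite3

# Toy stand-in for the real database: two copies of dev-1, one copy of dev-2.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE assets (id TEXT PRIMARY KEY, "deviceAssetId" TEXT);
CREATE TABLE exif ("assetId" TEXT, latitude REAL);
INSERT INTO assets VALUES ('a1', 'dev-1'), ('a2', 'dev-1'), ('a3', 'dev-2');
INSERT INTO exif VALUES ('a1', 12.97), ('a2', NULL), ('a3', 48.85);
""")

# COUNT(*) counts every copy; COUNT(e.latitude) counts only copies with GPS.
DUPLICATE_SUMMARY = """
SELECT a."deviceAssetId",
       COUNT(*)          AS copies,
       COUNT(e.latitude) AS copies_with_gps
FROM assets a
LEFT JOIN exif e ON e."assetId" = a.id
GROUP BY a."deviceAssetId"
HAVING COUNT(*) > 1
"""

summary = conn.execute(DUPLICATE_SUMMARY).fetchall()
print(summary)
```

A row with two copies and one copy with GPS matches the pattern being looked for here: one upload kept its location data and the other lost it.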
but then i need to be able to delete those physical files as well, right?
You could see how many there are at least
You could always start over if that is easier, or maybe do an API call to bulk delete
You could potentially select the IDs to delete and then send a http request to delete them, which would clean up the files
Well, it's past my bedtime. Let me know how it goes. Happy to help tomorrow if you're still stuck.
much appreciated
I think these guys are tired of the number of issues i keep running into
Lol you seem to be finding lots of bugs.
Did you figure out oauth btw?
yep
Sweet
apparently .. don't use regex even though it says it can handle it in authentik.. And don't enable "MOBILE REDIRECT URI OVERRIDE" in immich
xargs -I{} rm -r "{}" < /ssdroot/data/immich/deleteme  # delete the files listed in deleteme
delete from assets where "id" in ( select "assetId" from exif where "model" = 'SM-S906E' and "latitude" is null);  -- delete the records for those files
Ensure proper permissions in the app, then back up (reupload).
Lets see if this solves the problem
ok here's something interesting.. as i watch it populate the gallery once again, some images on the same date have gps information and some don't..
do you think it could be a difference in the background / foreground backup services ?
Do you currently have the location permission enabled in Immich?
yes
Did it detect new images to reupload?
after i deleted those images.. yes..
i could see those images being populated from another phone
They got reuploaded again?
yes.. cause i deleted those records and files remember
You deleted the ones without gps, right?
these
and i did it specifically for this phone.. cause i have all my images since the time i got this device
Did it upload all the same ones?
That you just removed?
yep. i saw it repopulate
wait i'll check if there are duplicates now
nope.. those are gone
i see some videos that are duplicate though..
Problem solved? Or you have more dupes?
video dupes still exist
Is one missing gps?
yep
Are the device asset IDs different?
same
Can you query the asset table and group by device asset id
can you make the query you want me to run ?
I can try, still on my phone lol
Select id, count (*) from assets group by deviceAssetId
Select "id", count ("deviceAssetId") from assets group by "deviceAssetId","id" having count("deviceAssetId") > 1;
(0 rows)
Hmm but you definitely have two videos with the same device asset id?
yes
that query is wrong
You should not be grouping by id
immich=> Select id,count ("deviceAssetId") from assets group by "deviceAssetId" having count("deviceAssetId") > 1;
ERROR: column "assets.id" must appear in the GROUP BY clause or be used in an aggregate function
LINE 1: Select id,count ("deviceAssetId") from assets group by "dev...
Maybe remove the having and add order by 2
Oh, i see.
Remove id from the select
that shows a lot of records
How many are over one?
all
if i do
Select count ("deviceAssetId") from assets group by "deviceAssetId" having count("deviceAssetId") > 1;
then i'll get 2 2 2 2 2 2
Cool
Can you see/count those results?
64 rows
Not too bad
yeah its whatever in the middle that got screwed
im guessing when i moved from f-droid to play store
Can you do another query of select where gps is null and device asset id in that list?
Should be able to use where device asset id in a sub select of that query we just did
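The sub-select idea can be sketched like this. As before, it is a sketch on an in-memory SQLite database with invented rows, not something to paste into the live Postgres instance unchanged. It reuses the working `group by ... having count(...) > 1` query from above as the inner query and keeps only the copies whose `latitude` is null.

```python
import sqlite3

# Toy data: dev-1 and dev-3 are duplicated, dev-2 is a single asset without GPS.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE assets (id TEXT PRIMARY KEY, "deviceAssetId" TEXT);
CREATE TABLE exif ("assetId" TEXT, latitude REAL);
INSERT INTO assets VALUES
  ('a1', 'dev-1'), ('a2', 'dev-1'),
  ('a3', 'dev-2'),
  ('a4', 'dev-3'), ('a5', 'dev-3');
INSERT INTO exif VALUES
  ('a1', 12.97), ('a2', NULL),
  ('a3', NULL),
  ('a4', 48.85), ('a5', NULL);
""")

# Only GPS-less rows whose deviceAssetId appears more than once are selected,
# so a3 (no GPS, but not a duplicate) is left alone.
GPSLESS_DUPLICATES = """
SELECT a.id
FROM assets a
JOIN exif e ON e."assetId" = a.id
WHERE e.latitude IS NULL
  AND a."deviceAssetId" IN (
      SELECT "deviceAssetId"
      FROM assets
      GROUP BY "deviceAssetId"
      HAVING COUNT(*) > 1
  )
ORDER BY a.id
"""

ids_to_delete = [row[0] for row in conn.execute(GPSLESS_DUPLICATES)]
print(ids_to_delete)
```

Restricting the result to duplicated device asset IDs is what keeps legitimately GPS-less assets out of the delete list. If several devices upload to the same library, the grouping would probably also need to take the device into account.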
664 rows
It is interesting though, same device asset id and same filename and same date. We might be able to detect that and auto recommend fixing
however.. there is no way to check if its from my phone .. or older
since the video data doesnt mention model number
Are there 600 dupes or 60 dupes?
600 dupes
no..
Ah.
600 records without GPS data which are video (fps=30)
You need to specifically look at only records from our duplicate device asset id query
select "assetId" from exif where "fps" = 30 and "latitude" is null and "dateTimeOriginal" > '2022-09-01'::timestamptz;
-- 62
I think the best way is to find duplicate device asset IDs and target those records
yeah.. or i'll just sit down this sunday, quickly look for videos which look wrong and delete them in the web interface..
That works too
its only 60+ . and i wont have to deal with re-uploading
It's 64 results for the duplicate device asset id, right?
yep .. i think so
Yeah, that's pretty small
at least the original 150+ images are cleaned up.. and i had manually cleaned around 100 before
Just make sure to delete the ones without gps
yeah.. that means i need to click... see info.. then delete .. and then move on
if the ui had info available on "select" .. then i could have saved a lot more time..
i think the application should have a location permission check on startup.. and a pre-hash and post-hash check to make sure the right files are being uploaded
The problem is, if permission is disabled when you read the file to complete the hash the location data is already missing
exactly.. thats why i said location permission check on app startup
Yeah that one is doable, but you can't detect this problem with a hash
yeah it's stressing my brain to understand how the app can't read the location data.. even though it has access to the file itself
It's a privacy "feature" implemented at the operating system level on some phones
i keep thinking it has file access.. it has file access
The very same fs.readFile call returns different data depending on that permission
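A small illustration of why a hash cannot catch this: if the operating system hands the app different bytes depending on the permission, the two reads hash differently and look like two unrelated files. The byte strings below are toy stand-ins, and SHA-1 is chosen only for illustration; the thread does not say which hash Immich uses.

```python
import hashlib

# Toy stand-ins for what the app receives when it reads the same photo:
# once with the location permission granted, once without it.
with_gps = b"HEADER|GPSLatitude=12.97|GPSLongitude=77.59|PIXELS"
without_gps = b"HEADER|PIXELS"


def checksum(data: bytes) -> str:
    """Hash the bytes exactly as received; the app never sees anything else."""
    return hashlib.sha1(data).hexdigest()


# A hash-based duplicate check compares these two digests.
same_file_detected = checksum(with_gps) == checksum(without_gps)
print(same_file_detected)  # False: the stripped copy looks like a new file
```

This is why a permission check before upload is the workable fix: by the time a hash can be computed, the location data is already gone.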
yeah
well i think a mandatory location permission check on startup, with a manual setting to keep it disabled for people who don't want location data .. should definitely help with this.
Why don't you open a feature request on GitHub about asking for location permission or an option to require it in the settings that defaults to true or something
cause i think half the people in the group are a bit tired of dealing with my crazy problems
You are correct in that several people have run into this now and it is super annoying and also hard to understand
and not everyone would be able to clean up the data the way i did for images..
Correct. I think a good feature would be to extend duplication to detect stuff like this and have a page to show them with some options to auto clean.
yep and to avoid it happening in the first place
That's obviously preferred
but tell me this.. i had 2-3 images on the same day.. 1 was uploaded with EXIF .. other 2 were not and was re-uploaded
why would that happen
I've never seen that. It is possible they were uploaded with exif, but it was missing from the database because of an error extracting it, or because the server was shut down while it was in the queue.
not 1 .. but quite a few were like that
You can always run extract metadata for missing
as i said , i was seeing it populate from another phone in real time
And check the microservices logs
In immich?
yes .
GitHub
[Feature] Location Permission check on startup · immich-app immich ...
The feature I suggest there should be a check and popup on application launch to confirm Location permission is available. Incase a person doesnt want it. they can explicitly set " I dont want...
Ty