Immich•2d ago
dotfortun3

Removing duplicates from Immich that are found in an external library.

Hi there, I am wondering if there is a way to remove duplicates that were uploaded to Immich, but exist in an external library without having to go one at a time (there are 35k duplicates).
38 Replies
Tempest
Tempest•2d ago
There is no official way to do this with Immich. Perhaps there's an open-source tool on GitHub that would let you do it? At that point, why not just use an internal library for everything?
dotfortun3
dotfortun3OP•2d ago
I'd rather keep them in the external library because that is already backed up to multiple locations and I want to avoid having to back up 35k files again. I would rather upload only the new ones to the internal library. But I was looking at the API, and I see there is a duplicates endpoint. It looks like that might return all of them?
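A minimal sketch of how a request to the duplicates endpoint might be prepared. The `/api/duplicates` path and the `x-api-key` header are assumptions not stated in this thread and should be checked against the Immich API documentation; `build_duplicates_request` and the example host are hypothetical. The request is only built here, not sent.

```python
import urllib.request


def build_duplicates_request(base_url: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a GET request for the duplicates endpoint."""
    # Assumption: the endpoint lives at <base_url>/api/duplicates.
    url = base_url.rstrip("/") + "/api/duplicates"
    return urllib.request.Request(
        url,
        # Assumption: the API key is passed in an x-api-key header.
        headers={"x-api-key": api_key, "Accept": "application/json"},
        method="GET",
    )
```

To actually fetch the data, the request could be passed to `urllib.request.urlopen` and the body parsed with `json.load`.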
Tempest
Tempest•2d ago
you could give it a try, I myself am not familiar with that endpoint
dotfortun3
dotfortun3OP•2d ago
Ok, I will... Do you happen to know which property would indicate it is an external library? I see the libraryId property, but it says that was deprecated.
Tempest
Tempest•2d ago
would originalPath not be sufficient to build a script around?
dotfortun3
dotfortun3OP•2d ago
That was my second thought, but wasn't sure if there was another property.
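The originalPath idea could be sketched roughly like this. It assumes the duplicates response is a list of groups, each with an `assets` list whose entries carry `id` and `originalPath`; that shape, the function name, and the example prefix are assumptions for illustration, not confirmed in this thread. The function only reports candidate IDs and deletes nothing.

```python
from typing import Iterable


def internal_ids_with_external_twin(
    groups: Iterable[dict], external_prefix: str
) -> list[str]:
    """Return IDs of internal assets that share a duplicate group
    with at least one asset stored under external_prefix."""
    ids: list[str] = []
    for group in groups:
        assets = group.get("assets", [])
        # Split the group by where each file lives on disk.
        external = [a for a in assets if a["originalPath"].startswith(external_prefix)]
        internal = [a for a in assets if not a["originalPath"].startswith(external_prefix)]
        # Only groups that span both locations are of interest.
        if external and internal:
            ids.extend(a["id"] for a in internal)
    return ids
```

For example, with a hypothetical mount point, `internal_ids_with_external_twin(groups, "/mnt/external/")` would list the uploaded copies whose duplicate group also contains a file under that mount.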
Mraedis
Mraedis•2d ago
deduplication does not work across internal/external libraries I'm afraid
dotfortun3
dotfortun3OP•2d ago
But it does detect the duplicates, so a tool could be made in theory. Right?
Tempest
Tempest•2d ago
if external libraries get hashed you could do a file search by hash - but then again I'm not sure they get hashed?
Mraedis
Mraedis•2d ago
Yes for the internal ones 🙃
dotfortun3
dotfortun3OP•2d ago
but this shows a duplicate as external
dotfortun3
dotfortun3OP•2d ago
[Image attachment, no description]
Mraedis
Mraedis•2d ago
updates brain
Tempest
Tempest•2d ago
silently wonders how the deduplication works: if the asset that's in an album gets deleted, does the other asset get added to that album automatically?
dotfortun3
dotfortun3OP•2d ago
It doesn't yet. There is a pull request working on that: https://github.com/immich-app/immich/pull/13851
Immich
Immich•2d ago
[Pull Request] feat(web): synchronise metadata and album membership between duplicate images (immich-app/immich#13851)
NoMachine
NoMachine•2d ago
That's the duplicate finder tool, which is not the same as the duplicate check during upload. The latter uses a hash comparison, while the former does a similarity comparison.
dotfortun3
dotfortun3OP•2d ago
But these would still get returned by the duplicates API, which I could in theory use to find duplicates where one is in the external library and delete the internal one.
My god, the duplicates response is 107MB lol
Tempest
Tempest•2d ago
key word here is similar images vs exact duplicates
dotfortun3
dotfortun3OP•2d ago
It returns the checksum, so I could use that to identify exact duplicates? Great, it looks like it does hash external library photos too
Zeus
Zeus•2d ago
No it doesn’t
dotfortun3
dotfortun3OP•2d ago
Oh
Zeus
Zeus•2d ago
The checksum for external lib is just the file path
dotfortun3
dotfortun3OP•2d ago
Ohhh, OK. I was looking through and none of them matched, so I was wondering.
I am familiar with hashing, but not well versed. Is it just an MD5? So I could generate a hash and compare myself?
Tempest
Tempest•2d ago
what's the plan here, are you going to be re-uploading duplicates to the internal storage, or is this a one-off?
NoMachine
NoMachine•2d ago
A simple hash comparison script between your library and the external library should be enough. If you find a match, just delete the asset in the external library and it will be removed from Immich during the refresh job.
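A hash comparison script along those lines might look like the sketch below. It assumes both libraries are reachable as plain directories on disk; the function names are hypothetical. SHA-1 is used here, but any stable digest would work for comparing files with each other. The script only reports matches and deletes nothing.

```python
import hashlib
import os


def file_digest(path: str, chunk_size: int = 1024 * 1024) -> str:
    """Return the hex SHA-1 of a file, read in chunks to limit memory use."""
    h = hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def index_by_digest(root: str) -> dict[str, list[str]]:
    """Map each content digest to the sorted list of files under root having it."""
    index: dict[str, list[str]] = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            index.setdefault(file_digest(path), []).append(path)
    return {digest: sorted(paths) for digest, paths in index.items()}


def find_exact_duplicates(internal_root: str, external_root: str) -> dict:
    """Return {digest: (internal_paths, external_paths)} for content in both trees."""
    internal = index_by_digest(internal_root)
    external = index_by_digest(external_root)
    shared = internal.keys() & external.keys()
    return {d: (internal[d], external[d]) for d in shared}
```

With 35k files, the full scan reads every file once, so expect it to be I/O bound.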
dotfortun3
dotfortun3OP•2d ago
It's a one-off problem. Basically, I had everything in PhotoPrism and switched to Immich, and I don't want to re-upload all the files to the cloud.
But everything going forward will be uploaded via the app.
Tempest
Tempest•2d ago
Personally I'd re-upload everything, as then Immich handles all of my media storage paths, allowing me to have everything together. Alternatively, why not bring PhotoPrism current, copy the assets over as an external library, sync devices, delete all images within Immich, and then continue using Immich from that point forwards? If you ever log out and back into the app you'll get all the duplicates again, but you would in this case anyway.
dotfortun3
dotfortun3OP•2d ago
oh so if I delete the internal images it will reupload them?
Tempest
Tempest•2d ago
I mean delete them from the web UI, and no, once the phone has done the sync it should not re-upload the images again (to my knowledge).
dotfortun3
dotfortun3OP•2d ago
ok, that is what I thought but wanted to verify
Zeus
Zeus•2d ago
We use SHA-1
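Given SHA-1, a file's digest can be computed locally as in the sketch below. The base64 encoding of the digest is an assumption about how the API's checksum field is represented; verify it against a known internal asset before relying on it. `sha1_base64` is a hypothetical helper name.

```python
import base64
import hashlib


def sha1_base64(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-1 of a file's contents, base64-encoded.

    Assumption: the checksum field in API responses is the raw SHA-1
    digest encoded as base64 (not hex).
    """
    h = hashlib.sha1()
    with open(path, "rb") as f:
        # Read in chunks so large videos do not have to fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return base64.b64encode(h.digest()).decode("ascii")
```

Hashing the files in the external library this way and looking the results up among the checksums of internal assets would identify exact duplicates, as opposed to merely similar images.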
dotfortun3
dotfortun3OP•2d ago
Ok, I will think about the approach and see what I can figure out. I think I have all the information I need.
NoMachine
NoMachine•2d ago
just to clarify, if you go a different route and decide to keep the assets in your exlib, removing them from the internal library, your phone will upload them again since there's no duplication check against exlib
dotfortun3
dotfortun3OP•2d ago
I have the external library mounted read-only... If I delete them from Immich, will it just remove them from the DB, or will it fail? Or do I have to make it RW?
Zeus
Zeus•2d ago
It will be put in the trash for 30 days, then it will come back when the library is rescanned.
dotfortun3
dotfortun3OP•2d ago
That was my next question. OK.
