Removing duplicates from Immich that are found in an external library.
Hi there, I am wondering if there is a way to remove duplicates that were uploaded to Immich, but exist in an external library without having to go one at a time (there are 35k duplicates).
There is no official way to do this with Immich. Perhaps there's an open-source tool on GitHub that would let you do this?
At that point why not just use an internal library for everything?
I'd rather keep them in the external library because that is already backed up to multiple locations and I want to avoid having to backup 35k files again, and would rather upload the new ones in the internal library.
But I was looking at the API, I see there is a duplicates endpoint. It looks like that might return all of them?
you could give it a try, I myself am not familiar with that endpoint
Ok, I will... Do you happen to know which property would indicate it is an external library? I see the libraryId property but it says that was deprecated
would originalPath not be sufficient to build a script around?
That was my second thought, but I wasn't sure if there was another property.
deduplication does not work across internal/external libraries I'm afraid
But it does detect the duplicates, so a tool could be made in theory. Right?
if external libraries get hashed you could do a file search by hash - but then again I'm not sure they get hashed?
Yes for the internal ones 🙃
but this shows a duplicate as external

updates brain
silently wonders
how does the deduplication work? If the asset that's in an album gets deleted, does the other asset get automatically added to that album?
It doesn't yet
There is a pull request working on that
https://github.com/immich-app/immich/pull/13851
[Pull Request] feat(web): synchronise metadata and album membership between duplicate images (immich-app/immich#13851)
that's the duplicate finder tool, which is not the same as the duplicate check mechanism during upload. The latter uses hash comparison, while the former does a similarity comparison
But these would still get returned by the duplicates API which I could in theory use to find duplicates where one is in the external library and delete the internal one
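A minimal sketch of that idea, assuming the duplicates response has already been saved and parsed as a JSON list of groups, each with an `assets` list whose entries carry `id` and `originalPath` (the field names are assumptions based on what is mentioned in this thread, and the `/mnt/external/` prefix and sample data are hypothetical). It only collects candidate IDs and deletes nothing. Because the groups come from the similarity-based duplicate finder, the results would still need checking before anything is removed.

```python
from typing import Any


def internal_duplicates_of_external(
    groups: list[dict[str, Any]], external_prefix: str
) -> list[str]:
    """Return IDs of non-external assets from groups that also hold an external asset.

    An asset counts as external when its originalPath starts with
    external_prefix (an assumption; adjust to your own mount path).
    """
    candidates: list[str] = []
    for group in groups:
        assets = group.get("assets", [])
        external = [a for a in assets if a["originalPath"].startswith(external_prefix)]
        internal = [a for a in assets if not a["originalPath"].startswith(external_prefix)]
        # Only groups that mix both kinds are interesting here.
        if external and internal:
            candidates.extend(a["id"] for a in internal)
    return candidates


# Hypothetical sample shaped like the assumed response structure.
sample_groups = [
    {
        "duplicateId": "group-1",
        "assets": [
            {"id": "ext-1", "originalPath": "/mnt/external/2020/img.jpg"},
            {"id": "int-1", "originalPath": "/usr/src/app/upload/img.jpg"},
        ],
    },
]
print(internal_duplicates_of_external(sample_groups, "/mnt/external/"))
```

Groups that contain only internal assets, or only external ones, are skipped, so the regular duplicate review is left untouched.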
my god, the duplicates response is 107MB lol
key phrase here is "similar images" vs exact duplicates
It returns the checksum, so I could use that to identify exact duplicates?
Great, it looks like it does hash external library photos too
No it doesn’t
Oh
The checksum for external lib is just the file path
ohhh
ok
I was looking through and none of them matched so I was wondering
I am familiar with hashing, but not well versed. Is it just an MD5? So I could generate a hash and compare myself?
what's the plan here, are you going to be re-uploading duplicates to the internal storage, or is this a one-off?
a simple hash comparison script between your internal library and the external library should be enough. If you find a match, just delete the asset in the external library and it will be removed from Immich during the refresh job
It's a one-off problem. Basically I had everything in PhotoPrism, switched to Immich, and don't want to reupload all the files to the cloud
But everything going forward will be uploaded via the app
personally I'd re-upload everything as then immich handles all of my media storage paths, allowing me to have everything together.
Alternatively, why not bring PhotoPrism up to date, copy the assets over as an external library, sync your devices, delete all images within Immich, and then continue using Immich from that point forward? If you ever log out and back into the app you'll get all the duplicates again, but you would in this case anyway
oh so if I delete the internal images it will reupload them?
I mean delete them from the webui, and no, after the phone has done the sync it should not re-upload the images (to my knowledge) again.
ok, that is what I thought but wanted to verify
We use SHA-1
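Since the hash is SHA-1, a do-it-yourself comparison of the files on disk could look roughly like the sketch below. It hashes every file under two directory trees and reports files in the first tree whose contents also exist in the second. The function names and any paths passed in are hypothetical, and it only lists matches without deleting anything.

```python
import hashlib
import os
from pathlib import Path


def sha1_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-1 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha1()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def hash_tree(root: Path) -> dict[str, list[Path]]:
    """Map each SHA-1 digest to the list of files under root with that content."""
    hashes: dict[str, list[Path]] = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            file_path = Path(dirpath) / name
            hashes.setdefault(sha1_of_file(file_path), []).append(file_path)
    return hashes


def exact_duplicates(internal_root: Path, external_root: Path) -> list[tuple[Path, Path]]:
    """Return (internal_file, external_file) pairs with byte-identical content."""
    external = hash_tree(external_root)
    matches: list[tuple[Path, Path]] = []
    for digest, paths in hash_tree(internal_root).items():
        if digest in external:
            # Pair every internal copy with the first external file of that content.
            matches.extend((p, external[digest][0]) for p in paths)
    return sorted(matches)
```

Calling `exact_duplicates(Path("/path/to/internal"), Path("/path/to/external"))` with your own directories returns the matching pairs. This catches only byte-identical files, so a photo that was re-encoded or had its metadata rewritten will not match.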
Ok, I will think about the approach and see what I can figure out. I think I have all the information I need.
just to clarify, if you go a different route and decide to keep the assets in your exlib, removing them from the internal library, your phone will upload them again since there's no duplication check against exlib
I have the external library mounted readonly... if I delete them from Immich will it just remove them from the DB, or will it fail?
Or do I have to make it RW
It will be put in the trash for 30 days, then it will come back when the library is rescanned
That was my next question
ok