Immich backup doubts
https://immich.app/docs/guides/template-backup-script/
The presently recommended backup method (for the database) is to use pg_dumpall to create a logical, SQL-based backup of the PG database, and then use borgbackup to deduplicate it and back it up somewhere.
Won't de-duplication fail to work on this pg_dumpall query file?
I'm assuming that there's no regular ordering for these files, and that even small DB changes might lead to the dump being changed in its middle. Then the whole file (or at least the contents after the edit) will appear to be new to borg.
(I'm aware that borg does de-duplication block-wise. I'm guessing this is how it works for text files)
Potentially a dump hundreds of MB in size could look entirely new after a few lines of DB changes, right?
:wave: Hey @Athul,
Thanks for reaching out to us. Please carefully read this message and follow the recommended actions. This will help us be more effective in our support effort and leave more time for building Immich :immich:.
References
- Container Logs: docker compose logs (docs)
- Container Status: docker ps -a (docs)
- Reverse Proxy: https://immich.app/docs/administration/reverse-proxy
- Code Formatting: https://support.discord.com/hc/en-us/articles/210298617-Markdown-Text-101-Chat-Formatting-Bold-Italic-Underline#h_01GY0DAKGXDEHE263BCAYEGFJA
Checklist
I have...
1. :ballot_box_with_check: verified I'm on the latest release (note that mobile app releases may take some time).
2. :ballot_box_with_check: read applicable release notes.
3. :ballot_box_with_check: reviewed the FAQs for known issues.
4. :ballot_box_with_check: reviewed GitHub for known issues.
5. :ballot_box_with_check: tried accessing Immich via local ip (without a custom reverse proxy).
6. :ballot_box_with_check: uploaded the relevant information (see below).
7. :ballot_box_with_check: tried an incognito window, disabled extensions, cleared mobile app cache, logged out and back in, different browsers, etc. as applicable
(an item can be marked as "complete" by reacting with the appropriate number)
Information
In order to be able to effectively help you, we need you to provide clear information to show what the problem is. The exact details needed vary per case, but here is a list of things to consider:
- Your docker-compose.yml and .env files.
- Logs from all the containers and their status (see above).
- All the troubleshooting steps you've tried so far.
- Any recent changes you've made to Immich or your system.
- Details about your system (both software/OS and hardware).
- Details about your storage (filesystems, type of disks, output of commands like fdisk -l and df -h).
- The version of the Immich server, mobile app, and other relevant pieces.
- Any other information that you think might be relevant.
Please paste files and logs with proper code formatting, and especially avoid blurry screenshots.
Without the right information we can't work out what the problem is. Help us help you ;)
If this ticket can be closed you can use the /close command, and re-open it later if needed.
Has someone here tried this? And practically how effective has borg's de-dup been for you?
Dedup might fail. But what's a few 100 MB of database content on (presumably) many many gigabytes of asset files?
I'm facing a similar issue, but with Duplicati and the assets file
I wouldn't worry about it
this was the last run

but it's a few hundred MB every time I do a backup (that's maybe like once a week). Since the media is always deduped, after a while the backup might have one copy of the media and hundreds of copies of the DB (assuming dedup is not good)
ohh right... which files/folders are you backing up? if you have just 5 versions and it's now taking 3 times the size, there's likely something wrong with the approach.
I followed this guide https://immich.app/docs/administration/backup-and-restore so
my script generates 2 files, one for the DB and the other for the assets, then those two files are backed up by Duplicati to an IDrive bucket
got it. How does your script generate 2 files, which you then back up?
@vazquezjm if you make it a compressed archive before backing up, then deduplication probably cannot work.
right! that's the problem. basically the bash script does this:
so, probably it'd be a better idea to expose the 3 directories mentioned in the doc to Duplicati instead of compressing them, right?
@vazquezjm yes!! almost surely. You'll save a lot of space I think.
anyone with answers to this though?
You’ll get better results if you dump it without gzip
Restic does totally fine at de duping small changes. I imagine Borg does as well
Yes. Same for Duplicati
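For anyone curious why gzip hurts so much here: a tiny change early in the dump reshuffles essentially the whole compressed stream, while the raw text stays byte-identical after the edit. A small self-contained Python illustration (toy data, not Immich's actual dump format):

```python
# Why compressing the whole dump before backup defeats deduplication:
# a one-row change near the start of the "dump" changes almost every
# byte of the gzip output, while the raw text realigns after the edit.
import gzip

dump_v1 = b"\n".join(b"INSERT INTO assets VALUES (%d, 'photo_%d.jpg');" % (i, i)
                     for i in range(2000))
# simulate a small DB change: one row edited near the start
dump_v2 = dump_v1.replace(b"photo_7.jpg", b"photo_7_edited.jpg", 1)

gz1, gz2 = gzip.compress(dump_v1), gzip.compress(dump_v2)

def common_tail(a: bytes, b: bytes) -> int:
    """Length of the identical suffix two byte strings share."""
    n = 0
    while n < min(len(a), len(b)) and a[-1 - n] == b[-1 - n]:
        n += 1
    return n

print("raw dumps share a tail of", common_tail(dump_v1, dump_v2), "bytes")
print("gzipped dumps share a tail of", common_tail(gz1, gz2), "bytes")
```

The raw dumps share tens of kilobytes of identical suffix that a chunk-based deduplicator can reuse; the gzipped versions share almost nothing. If you do want compression, something like GNU gzip's --rsyncable mode (where available), or simply letting borg/Duplicati compress per-chunk after deduplication, keeps dedup working.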
@Zeus also, it seems that the official backup script doesn't shut down immich before backing up with borg. I thought that would be mandatory, since you might get db or file changes during the backup otherwise. Should I add lines that stop (and finally restart) the immich containers to the backup script? If not, why isn't this a concern?
also, since I have multiple questions, it would be really useful if I could connect with the person/s who wrote the official backup script - Maybe via email. Would that be possible?
https://immich.app/docs/guides/template-backup-script/
Pg_dump can be taken during activity. Worst case you would have 1 or 2 files uploaded that aren’t tracked in the DB
No, we won’t provide email support
yeah I'm aware.. but I don't get why that's okay to do.. 😅
That’s just how Postgres works. It can safely dump while active
no not PG, the untracked files part. Having a borg backup with untracked files.. you said that's fine, but I would've thought that was a problem.
Anyways, adding "docker compose down" and "docker compose up -d" to the script should be okay, right?
also the script can be set to run at some odd hour in the night, where nobody uploads anything... but that's not a guarantee still.
If an asset isn't in the DB, the server doesn't know about it and so clients will try to upload it - even if they uploaded that file successfully in the past
So it'll just fix itself
Just learned the answer to this previous question.
Borg uses something called Content Defined Chunking (CDC), so it doesn't cut the user's data at fixed or predefined intervals while creating deduplication chunks. Instead, for each file it computes a rolling hash over the bytes of the data, and whenever the hash value at a byte ends in some fixed number k of zero bits, that byte is where a cut is made.
So editing a file in the middle will not affect more than one or two chunks, as the data after the edit still hashes the same and is therefore cut at the same points (even though those points have shifted within the file).
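To make that concrete, here's a toy Python sketch of content-defined chunking. Borg's real chunker is buzhash-based with configurable min/max chunk sizes; this simplified polynomial rolling hash is just to show the resynchronization effect:

```python
# Toy content-defined chunking: cut wherever a rolling hash over the
# last `window` bytes ends in six zero bits (~64-byte average chunks).
import random

def cdc_chunks(data: bytes, window: int = 16, mask: int = 0x3F):
    pw = pow(31, window, 1 << 32)  # 31**window, for removing the oldest byte
    cuts, h = [], 0
    for i, byte in enumerate(data):
        h = (h * 31 + byte) & 0xFFFFFFFF
        if i >= window:                      # slide the window forward
            h = (h - data[i - window] * pw) & 0xFFFFFFFF
        if i >= window - 1 and (h & mask) == 0:
            cuts.append(i + 1)
    pieces, prev = [], 0
    for c in cuts + [len(data)]:
        if c > prev:
            pieces.append(data[prev:c])
        prev = c
    return pieces

random.seed(0)
dump = bytes(random.randrange(256) for _ in range(5000))
# simulate a small change in the middle of a big dump
edited = dump[:2500] + b"UPDATE users SET name='x';" + dump[2500:]

before, after = set(cdc_chunks(dump)), set(cdc_chunks(edited))
print(f"{len(before & after)} of {len(after)} chunks unchanged")
```

Because each cut decision depends only on the last `window` bytes, the chunker resynchronizes shortly after the inserted text, so only the one or two chunks around the change look new to the backup; everything after them deduplicates even though it has shifted position in the file.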
fyi.. @bo0tzz @Zeus
Yep, this is what we meant haha