Immich4mo ago
Athul

Immich backup doubts

https://immich.app/docs/guides/template-backup-script/ The presently recommended backup method (for the database) is to use pg_dumpall to create a logical, SQL-based dump of the PG database, and then use borgbackup to deduplicate it and back it up somewhere. Won't de-duplication fail on this pg_dumpall file? I'm assuming the dump has no stable ordering, and that even small DB changes might alter the dump somewhere in its middle. Then the whole file (or at least everything after the edit) would appear new to borg. (I'm aware that borg de-duplicates block-wise; I'm guessing that's how it works for text files too.) So a dump of hundreds of MB could look entirely new after a few lines of DB changes, right?
22 Replies
Athul
AthulOP4mo ago
Has someone here tried this? And practically how effective has borg's de-dup been for you?
bo0tzz
bo0tzz4mo ago
Dedup might fail. But what's a few 100 MB of database content on (presumably) many many gigabytes of asset files?
vazquezjm
vazquezjm4mo ago
I'm facing a similar issue, but with Duplicati and the assets file
bo0tzz
bo0tzz4mo ago
I wouldn't worry about it
vazquezjm
vazquezjm4mo ago
this was the last run
[attached image, no description]
Athul
AthulOP4mo ago
but it's a few 100 MB every time I do a backup (that's maybe once a week). Since the media is always deduped, after a while the backup might hold 1 copy of the media and hundreds of copies of the DB (assuming dedup doesn't work well on the dump)
ohh right... which files/folders are you backing up? if you have just 5 versions and the backup is now 3 times the size, there's likely something wrong with the approach.
vazquezjm
vazquezjm4mo ago
I followed this guide https://immich.app/docs/administration/backup-and-restore so
/immich/library
/immich/upload
/immich/profile
my script generates 2 files, one for the DB and the other for the assets; those two files are then backed up by Duplicati to an IDrive bucket
Athul
AthulOP4mo ago
got it. How does your script generate 2 files, which you then back up? @vazquezjm if you make it a compressed archive before backing up, then deduplication probably cannot work.
vazquezjm
vazquezjm4mo ago
right! that's the problem. basically the bash script does this:
tar -czf "$DESTINATION/$BACKUP_FILE" $DIRECTORIES
so, probably it'd be a better idea to expose the 3 directories mentioned in the doc to Duplicati instead of compressing them, right?
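If a single archive file is still wanted instead of pointing the backup tool at the directories, leaving out the `-z` keeps the file contents uncompressed inside the archive. The following is a minimal Python sketch of that idea using the standard `tarfile` module. `write_plain_tar` and the example paths are hypothetical names for illustration, and how well the result deduplicates still depends on how the backup tool chunks its input.

```python
import tarfile
from pathlib import Path
from typing import Iterable


def write_plain_tar(destination: Path, directories: Iterable[Path]) -> None:
    """Write an uncompressed tar, comparable to `tar -cf` (no `-z`)."""
    # mode="w" stores file contents verbatim; mode="w:gz" would gzip them.
    with tarfile.open(str(destination), mode="w") as archive:
        for directory in directories:
            # arcname keeps paths inside the archive relative to the directory name.
            archive.add(str(directory), arcname=directory.name)


# Hypothetical usage, mirroring the directories listed in the thread:
# write_plain_tar(
#     Path("/backups/immich-assets.tar"),
#     [Path("/immich/library"), Path("/immich/upload"), Path("/immich/profile")],
# )
```

Because the contents are stored as-is, an unchanged photo occupies the same bytes in every run's archive, which is what a deduplicating tool needs in order to recognise it.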
Athul
AthulOP4mo ago
@vazquezjm yes!! almost surely. You'll save a lot of space I think. anyone with answers to this though?
Zeus
Zeus4mo ago
You’ll get better results if you dump it without gzip. Restic does totally fine at deduping small changes. I imagine Borg does as well
vazquezjm
vazquezjm4mo ago
Yes. Same for Duplicati
Athul
AthulOP4mo ago
@Zeus also, it seems that the official backup script doesn't shut down immich before backing up with borg. I thought that would be mandatory, since you might get db or file changes during the backup otherwise. Should I add lines that stop (and finally restart) the immich containers to the backup script? If not, why isn't this a concern? also, since I have multiple questions, it would be really useful if I could connect with the person/s who wrote the official backup script - Maybe via email. Would that be possible? https://immich.app/docs/guides/template-backup-script/
Zeus
Zeus4mo ago
pg_dump can be taken during activity. Worst case you would have 1 or 2 uploaded files that aren’t tracked in the DB. No, we won’t provide email support
Athul
AthulOP4mo ago
yeah I'm aware.. but I don't get why that's okay to do.. 😅
Zeus
Zeus4mo ago
That’s just how Postgres works. It can safely dump while active
Athul
AthulOP4mo ago
no not PG, the untracked files part. Having a borg backup with untracked files.. you said that's fine, but I would've thought that was a problem. Anyways, adding "docker compose down" and "docker compose up -d" to the script should be okay, right? also the script can be set to run at some odd hour in the night, where nobody uploads anything... but that's not a guarantee still.
bo0tzz
bo0tzz4mo ago
If an asset isn't in the DB, the server doesn't know about it, so clients will try to upload it again, even if they uploaded that file successfully in the past. So it'll just fix itself
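For anyone who still prefers to pause uploads during the backup, as asked above, here is a hedged Python sketch. It stops a single compose service rather than running a full `docker compose down`, since `down` would also stop the database container, which a `pg_dumpall` run through `docker exec` needs to be up. The service name `immich-server` and the helper names are assumptions, so check your own docker-compose.yml. The commands are expected to run from the directory that holds that file.

```python
import subprocess
from typing import Callable, Sequence


def run_checked(cmd: Sequence[str]) -> None:
    """Run a command and raise CalledProcessError if it exits non-zero."""
    subprocess.run(list(cmd), check=True)


def backup_with_pause(
    backup: Callable[[], None],
    service: str = "immich-server",  # assumption: service name in your compose file
    run: Callable[[Sequence[str]], None] = run_checked,
) -> None:
    """Stop one compose service, run the backup, then always start it again."""
    run(["docker", "compose", "stop", service])
    try:
        backup()  # e.g. the pg_dumpall and borg steps from the template script
    finally:
        # Runs even if the backup raised, so the service is not left stopped.
        run(["docker", "compose", "start", service])
```

The `try`/`finally` is the main design point: a failed backup at an odd hour should not leave Immich offline until someone notices.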
Athul
AthulOP4mo ago
Just learned the answer to my earlier question. Borg uses something called Content Defined Chunking (CDC), so it doesn't cut the user's data at fixed or predefined offsets when creating de-duplication chunks. Instead, for each file it computes a rolling hash over a sliding window of bytes, and wherever the hash value at a byte ends in some 'k' zero bits, that byte is where a cut is made. So an edit in the middle of a file typically affects no more than 1-2 chunks: the data after the edit still hashes the same and is therefore cut at the same content positions (even though those positions have moved inside the file). fyi.. @bo0tzz @Zeus
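The chunking idea described above can be tried out with a toy sketch. This is not Borg's implementation: Borg's chunker uses a different rolling hash (Buzhash) and much larger chunks, while the code below uses a simple polynomial rolling hash with made-up parameters (`WINDOW`, `MASK`, `BASE`, `MOD`) purely for illustration.

```python
import hashlib
import random
from typing import List, Set

WINDOW = 48            # bytes covered by the rolling hash (illustrative value)
MASK = (1 << 10) - 1   # cut where the low 10 bits are zero, about 1 KiB per chunk
BASE = 257
MOD = (1 << 61) - 1


def chunk(data: bytes) -> List[bytes]:
    """Split data wherever the rolling hash of the last WINDOW bytes hits the mask."""
    out_weight = pow(BASE, WINDOW - 1, MOD)  # weight of the byte leaving the window
    chunks, start, h = [], 0, 0
    for i, byte in enumerate(data):
        if i >= WINDOW:
            h = (h - data[i - WINDOW] * out_weight) % MOD  # drop the oldest byte
        h = (h * BASE + byte) % MOD                        # take in the newest byte
        if (h & MASK) == 0:
            chunks.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        chunks.append(data[start:])  # whatever is left after the last cut
    return chunks


def digests(data: bytes) -> Set[str]:
    """Identify each chunk by its SHA-256, as a deduplicating store would."""
    return {hashlib.sha256(c).hexdigest() for c in chunk(data)}


# Stand-in for a dump file: pseudo-random bytes with a fixed seed.
rng = random.Random(0)
old = bytes(rng.getrandbits(8) for _ in range(100_000))

# Simulate a small DB change landing in the middle of the dump.
middle = len(old) // 2
new = old[:middle] + b"-- a few changed rows --" + old[middle:]

old_ids, new_ids = digests(old), digests(new)
print("chunks in the new version:", len(new_ids))
print("chunks not already stored:", len(new_ids - old_ids))
```

Only the chunks touching the inserted bytes (plus the hash window right after them) should show up as new; everything further along is cut at the same content positions and hashes to chunks that are already stored.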
Zeus
Zeus4mo ago
Yep, this is what we meant haha
