Corrupt database - any options? (Solution: use the backups Immich already saves by default)

Evidently my DB got corrupted at some point in the last two months, and now I'm stuck trying to figure out if there's any hope of recovery or if I need to rebuild from scratch. I was previously on 1.126, I think, and had tried to update to 1.129 over the weekend, only to find that the main container wouldn't finish connecting to the DB container, and that the DB container logs showed it had been failing to fix some issues for quite some time. (The app itself never notified me that it couldn't connect to the server; that may be a separate feature request.) Logs in the thread:
Immich
Immich2mo ago
:wave: Hey @HashtagOctothorp, Thanks for reaching out to us. Please carefully read this message and follow the recommended actions. This will help us be more effective in our support effort and leave more time for building Immich :immich:.
References
- Container Logs: docker compose logs docs
- Container Status: docker ps -a docs
- Reverse Proxy: https://immich.app/docs/administration/reverse-proxy
- Code Formatting: https://support.discord.com/hc/en-us/articles/210298617-Markdown-Text-101-Chat-Formatting-Bold-Italic-Underline#h_01GY0DAKGXDEHE263BCAYEGFJA
Checklist - I have...
1. :blue_square: verified I'm on the latest release (note that mobile app releases may take some time).
2. :blue_square: read applicable release notes.
3. :blue_square: reviewed the FAQs for known issues.
4. :blue_square: reviewed Github for known issues.
5. :blue_square: tried accessing Immich via local ip (without a custom reverse proxy).
6. :blue_square: uploaded the relevant information (see below).
7. :blue_square: tried an incognito window, disabled extensions, cleared mobile app cache, logged out and back in, different browsers, etc. as applicable
(an item can be marked as "complete" by reacting with the appropriate number)
Information
In order to be able to effectively help you, we need you to provide clear information to show what the problem is. The exact details needed vary per case, but here is a list of things to consider:
- Your docker-compose.yml and .env files.
- Logs from all the containers and their status (see above).
- All the troubleshooting steps you've tried so far.
- Any recent changes you've made to Immich or your system.
- Details about your system (both software/OS and hardware).
- Details about your storage (filesystems, type of disks, output of commands like fdisk -l and df -h).
- The version of the Immich server, mobile app, and other relevant pieces.
- Any other information that you think might be relevant.
Please paste files and logs with proper code formatting, and especially avoid blurry screenshots. Without the right information we can't work out what the problem is. Help us help you ;)
If this ticket can be closed you can use the /close command, and re-open it later if needed.
HashtagOctothorp
HashtagOctothorpOP2mo ago
.env
UPLOAD_LOCATION=/volume1/immich/upload
DB_DATA_LOCATION=/volume1/immich/postgres
TZ=America/Phoenix
IMMICH_VERSION=release
DB_PASSWORD=immichpg******* #commented out for now
DB_USERNAME=postgres
DB_DATABASE_NAME=immich
iiokiirok
iiokiirok2mo ago
Do you have storage-level snapshots from (over) 2mo ago? There might be an option to roll back changes on Synology's software side.
HashtagOctothorp
HashtagOctothorpOP2mo ago
Logs on the main container that clued me into it:
Error: getaddrinfo ENOTFOUND database
at GetAddrInfoReqWrap.onlookupall [as oncomplete] (node:dns:120:26)
[Nest] 16 - 03/15/2025, 3:45:23 AM ERROR [ExceptionHandler] Error: getaddrinfo ENOTFOUND database
at GetAddrInfoReqWrap.onlookupall [as oncomplete] (node:dns:120:26) {
errno: -3008,
code: 'ENOTFOUND',
syscall: 'getaddrinfo',
hostname: 'database'
}
The DB docker container was also missing an IP, so I started digging into the logs and pulled a few interesting ones. Most of the logs from the last several weeks look like the 03-09 logs, with these missing-file messages. Then the first notice of an error: 03-15, with the issue.
HashtagOctothorp
HashtagOctothorpOP2mo ago
No I don't have any storage-level backups of the container
iiokiirok
iiokiirok2mo ago
I'm not exactly sure how experienced you are. How did you determine the db to be corrupted?
HashtagOctothorp
HashtagOctothorpOP2mo ago
All the subsequent logs:
2025-03-15 10:38:52.573 UTC [1] LOG: starting PostgreSQL 14.17 (Debian 14.17-1.pgdg120+1) on x86_64-pc-linux-gnu, compiled by gcc (Debian 12.2.0-14) 12.2.0, 64-bit
2025-03-15 10:38:52.573 UTC [1] LOG: listening on IPv4 address "0.0.0.0", port 5432
2025-03-15 10:38:52.573 UTC [1] LOG: listening on IPv6 address "::", port 5432
2025-03-15 10:38:52.672 UTC [1] LOG: listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
2025-03-15 10:38:52.769 UTC [26] LOG: database system was shut down at 2025-03-15 10:12:17 UTC
2025-03-15 10:38:52.769 UTC [26] LOG: invalid primary checkpoint record
2025-03-15 10:38:52.769 UTC [26] PANIC: could not locate a valid checkpoint record
[2025-03-15T10:38:52Z INFO service::utils::clean] Find directory "pg_vectors/indexes/17332".
[2025-03-15T10:38:52Z INFO service::utils::clean] Find directory "pg_vectors/indexes/17576".
[2025-03-15T10:38:52Z INFO service::utils::clean] Find directory "pg_vectors/indexes/17576/segments/ddb25af1-8278-4232-9e58-49c9e46c35cd".
[2025-03-15T10:38:52Z INFO service::utils::clean] Find directory "pg_vectors/indexes/17332/segments/b89b74f3-8935-444d-9f8d-847f6a26fca1".
2025-03-15 10:38:55.475 UTC [37] FATAL: the database system is starting up
2025-03-15 10:38:55.478 UTC [36] FATAL: the database system is starting up
2025-03-15 10:38:55.814 UTC [1] LOG: startup process (PID 26) was terminated by signal 6: Aborted
2025-03-15 10:38:55.814 UTC [1] LOG: aborting startup due to startup process failure
2025-03-15 10:38:55.817 UTC [1] LOG: database system is shut down
2025-03-15 11:11:05.462 UTC [1] LOG: starting PostgreSQL 14.17 (Debian 14.17-1.pgdg120+1) on x86_64-pc-linux-gnu, compiled by gcc (Debian 12.2.0-14) 12.2.0, 64-bit
2025-03-15 11:11:05.462 UTC [1] LOG: listening on IPv4 address "0.0.0.0", port 5432
2025-03-15 11:11:05.462 UTC [1] LOG: listening on IPv6 address "::", port 5432
2025-03-15 11:11:05.562 UTC [1] LOG: listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
2025-03-15 11:11:05.652 UTC [26] LOG: database system was shut down at 2025-03-15 10:12:17 UTC
2025-03-15 11:11:05.652 UTC [26] LOG: invalid primary checkpoint record
2025-03-15 11:11:05.652 UTC [26] PANIC: could not locate a valid checkpoint record
[2025-03-15T11:11:05Z INFO service::utils::clean] Find directory "pg_vectors/indexes/17332".
[2025-03-15T11:11:05Z INFO service::utils::clean] Find directory "pg_vectors/indexes/17576".
[2025-03-15T11:11:05Z INFO service::utils::clean] Find directory "pg_vectors/indexes/17332/segments/b89b74f3-8935-444d-9f8d-847f6a26fca1".
[2025-03-15T11:11:05Z INFO service::utils::clean] Find directory "pg_vectors/indexes/17576/segments/ddb25af1-8278-4232-9e58-49c9e46c35cd".
2025-03-15 11:11:08.568 UTC [1] LOG: startup process (PID 26) was terminated by signal 6: Aborted
2025-03-15 11:11:08.568 UTC [1] LOG: aborting startup due to startup process failure
2025-03-15 11:11:08.570 UTC [1] LOG: database system is shut down
iiokiirok
iiokiirok2mo ago
Did you in any way touch the setup during the last 2-3 days?
HashtagOctothorp
HashtagOctothorpOP2mo ago
Other than using the "rebuild stack" option to try to get the latest version, no
iiokiirok
iiokiirok2mo ago
"Rebuild the stack" is a Synology-specific thing? hm
HashtagOctothorp
HashtagOctothorpOP2mo ago
No, portainer
HashtagOctothorp
HashtagOctothorpOP2mo ago
[image attachment]
Zeus
Zeus2mo ago
If you don't have a backup, the only option is to do a full re-upload of all images
HashtagOctothorp
HashtagOctothorpOP2mo ago
Can I reuse the images already in the upload location?
Zeus
Zeus2mo ago
You can re-upload them or add them as an external library
HashtagOctothorp
HashtagOctothorpOP2mo ago
Are there docs for recommendations on backup patterns?
iiokiirok
iiokiirok2mo ago
My two clues are networking problems between the Immich and DB containers, and possible incompatibilities between the Immich and DB software versions and/or incomplete migrations. My gut says Immich was probing the DB, it failed a healthcheck, and the DB got into a restart loop. For backups: either fs-level backups work fine, or Immich has built-in DB (metadata-only) dumping, which you must combine with the images from the uploads dir. https://immich.app/docs/administration/backup-and-restore/ (In addition, keeping track of container SHAs/versions is usually not needed, but it increases sanity.)
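As a hedged aside on that built-in dumping: Immich runs a scheduled database dump job, and the dumps land under the upload location. Assuming the paths from the .env above (the exact path is an assumption based on the default layout), a quick check from the NAS shell might look like:
# Assumed location of Immich's automatic database dumps, derived from
# UPLOAD_LOCATION=/volume1/immich/upload in the .env above.
ls -lh /volume1/immich/upload/backups/
# If the backup job has been running, expect compressed SQL dumps (*.sql.gz)
# with timestamps in the filenames.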
HashtagOctothorp
HashtagOctothorpOP2mo ago
there was one log that seemed interesting too:
2025-03-15 09:55:09.661 UTC [26] LOG: database system was not properly shut down; automatic recovery in progress
2025-03-15 09:55:09.790 UTC [26] LOG: redo starts at 0/11D3A378
2025-03-15 09:55:09.790 UTC [26] LOG: invalid record length at 0/11D3A460: wanted 24, got 0
2025-03-15 09:55:09.790 UTC [26] LOG: redo done at 0/11D3A428 system usage: CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s
2025-03-15 09:55:12.367 UTC [1] LOG: database system is ready to accept connections
2025-03-15 09:55:54.747 UTC [54] ERROR: column "updateId" of relation "users" contains null values
2025-03-15 09:55:54.747 UTC [54] STATEMENT: ALTER TABLE "users" ALTER COLUMN "updateId" SET NOT NULL, ALTER COLUMN "updateId" SET DEFAULT immich_uuid_v7()
2025-03-15 09:55:54.827 UTC [56] LOG: could not receive data from client: Connection reset by peer
2025-03-15 09:56:23.655 UTC [71] ERROR: column "updateId" of relation "users" contains null values
2025-03-15 09:56:23.655 UTC [71] STATEMENT: ALTER TABLE "users" ALTER COLUMN "updateId" SET NOT NULL, ALTER COLUMN "updateId" SET DEFAULT immich_uuid_v7()
maybe it got shut down mid-upgrade or something? Not sure how I might have done that, though; I never force-shut-down any of the containers.
iiokiirok
iiokiirok2mo ago
lemme look at it on a computer; might be that the pg-vect SHA is pinned to an old version
HashtagOctothorp
HashtagOctothorpOP2mo ago
Prior to any LLM-prompted troubleshooting, I duplicated the postgres folder, hoping that would help any recovery efforts. I have since re-initialized with an empty folder to speed up my potential future of "manually recreate/re-upload everything", and will be more diligent about backups moving forward.
iiokiirok
iiokiirok2mo ago
meaning you want no further help and will recreate?
Zeus
Zeus2mo ago
You can back up the dumps however you want, such as with Borg or restic
HashtagOctothorp
HashtagOctothorpOP2mo ago
No, I'd like to recover if possible
Zeus
Zeus2mo ago
You should keep at least a year's worth, with interval pruning
iiokiirok
iiokiirok2mo ago
(Using the old data, not the newly-initted folder.) Can you run only the database container, with the healthcheck test removed? The goal is getting the database up ("database system is ready to accept connections") before asking it to do anything.
HashtagOctothorp
HashtagOctothorpOP2mo ago
checking.
iiokiirok
iiokiirok2mo ago
if it is anything close to being up, you can probably dump it (Portainer is docker afaik? so somewhere: docker compose exec database pg_dump immich > olddump.db) https://www.postgresql.org/docs/current/backup-dump.html
HashtagOctothorp
HashtagOctothorpOP2mo ago
actually, Q: how would I remove the healthcheck on the existing container?
iiokiirok
iiokiirok2mo ago
by modifying the compose file and commenting the 'healthcheck:' portion out
HashtagOctothorp
HashtagOctothorpOP2mo ago
ah, then recreating the stack
iiokiirok
iiokiirok2mo ago
although, disclaimer, I don't know anything about Portainer. According to the screenshot above, recreating the stack sounds like docker compose pull && docker compose up -d. If you can modify the compose file in between, then I guess yes.
HashtagOctothorp
HashtagOctothorpOP2mo ago
yeah, that button is on the editor page. Remove the entire healthcheck section?
iiokiirok
iiokiirok2mo ago
yes, you can just comment it out (adding # to the beginning of the lines). While you are at it, you can probably comment out the immich: (container) section as well, which would ensure that Immich doesn't try to access the DB meanwhile.
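A minimal sketch of what that edit could look like, assuming the stock Immich compose layout (the service names and the elided lines are assumptions; the point is only which sections get a # in front):
services:
#  immich-server:          # whole Immich section prefixed with # so it can't touch the DB
#    ...
  database:
    # ...existing image/env/volumes stay exactly as they are...
#    healthcheck:          # prefixed with # so failed checks don't keep restarting the DB
#      test: ...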
HashtagOctothorp
HashtagOctothorpOP2mo ago
this part?
HashtagOctothorp
HashtagOctothorpOP2mo ago
[image attachment]
iiokiirok
iiokiirok2mo ago
yes
HashtagOctothorp
HashtagOctothorpOP2mo ago
worth adding DB_SKIP_MIGRATIONS=true while I'm in here?
iiokiirok
iiokiirok2mo ago
doesn't matter right now, since Immich won't be running anyway (and thus will not migrate). But to ask again: you are sure you have 0 snapshots for the containers? That sounds like an unreasonable default to have on Synology, the set-it-and-everything-is-backed-up appliance (:
HashtagOctothorp
HashtagOctothorpOP2mo ago
I set up the stack through Portainer and probably don't have any backup stuff set up, herp derp. And the folder locations are on-disk rather than Docker volumes.
iiokiirok
iiokiirok2mo ago
yeah, but on-disk where?
HashtagOctothorp
HashtagOctothorpOP2mo ago
on the synology NAS, that's running the containers
iiokiirok
iiokiirok2mo ago
I'd expect the whole synology fs to be snapshotted (no clue where you'd check-restore backups, haven't had the need to touch synology either)
HashtagOctothorp
HashtagOctothorpOP2mo ago
hm let me check
HashtagOctothorp
HashtagOctothorpOP2mo ago
if you're talking about this: ... I don't have it set up
[image attachment]
HashtagOctothorp
HashtagOctothorpOP2mo ago
I don't think synology sets this up during setup by default. Either that or I missed it.
iiokiirok
iiokiirok2mo ago
what, are Synologys non-resilient to ransomware by default?
HashtagOctothorp
HashtagOctothorpOP2mo ago
seems like. Which is ironic, since being ransomwared (long story from a younger, dumber me) was the primary push to buy the NAS in the first place
iiokiirok
iiokiirok2mo ago
anything under 'backup tasks'? The docs for Synology are quite questionable as well, at least to me; I don't have a Synology to play around with.
HashtagOctothorp
HashtagOctothorpOP2mo ago
can't find anything re backup
iiokiirok
iiokiirok2mo ago
Hyper Backup and Snapshot Replication ¯\_(ツ)_/¯
HashtagOctothorp
HashtagOctothorpOP2mo ago
also, not installed
[image attachment]
iiokiirok
iiokiirok2mo ago
:^)
HashtagOctothorp
HashtagOctothorpOP2mo ago
🤡
iiokiirok
iiokiirok2mo ago
that's solidly dumb that there are no snapshots by default, anywhere. OK, well, let's see if the db container is alive on some terms
HashtagOctothorp
HashtagOctothorpOP2mo ago
ah yes. TLDR, prob not
2025-03-17 17:43:11.922 UTC [1] LOG: starting PostgreSQL 14.17 (Debian 14.17-1.pgdg120+1) on x86_64-pc-linux-gnu, compiled by gcc (Debian 12.2.0-14) 12.2.0, 64-bit
2025-03-17 17:43:11.922 UTC [1] LOG: listening on IPv4 address "0.0.0.0", port 5432
2025-03-17 17:43:11.922 UTC [1] LOG: listening on IPv6 address "::", port 5432
2025-03-17 17:43:12.304 UTC [1] LOG: listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
2025-03-17 17:43:12.428 UTC [26] LOG: database system was shut down at 2025-03-15 10:12:17 UTC
2025-03-17 17:43:12.428 UTC [26] LOG: invalid primary checkpoint record
2025-03-17 17:43:12.428 UTC [26] PANIC: could not locate a valid checkpoint record
[2025-03-17T17:43:12Z INFO service::utils::clean] Find directory "pg_vectors/indexes/17332".
[2025-03-17T17:43:12Z INFO service::utils::clean] Find directory "pg_vectors/indexes/17576".
[2025-03-17T17:43:12Z INFO service::utils::clean] Find directory "pg_vectors/indexes/17332/segments/b89b74f3-8935-444d-9f8d-847f6a26fca1".
[2025-03-17T17:43:12Z INFO service::utils::clean] Find directory "pg_vectors/indexes/17576/segments/ddb25af1-8278-4232-9e58-49c9e46c35cd".
2025-03-17 17:43:15.940 UTC [1] LOG: startup process (PID 26) was terminated by signal 6: Aborted
2025-03-17 17:43:15.940 UTC [1] LOG: aborting startup due to startup process failure
2025-03-17 17:43:15.943 UTC [1] LOG: database system is shut down
previously I'd tried setting a recovery.signal file, as per AI debugging recommendation. It didn't work.
iiokiirok
iiokiirok2mo ago
well, there's probably not much recovery unless you go surgical :) can you access /volume1/immich/postgres from a shell somewhere? do you know what a shell is?
HashtagOctothorp
HashtagOctothorpOP2mo ago
yeppers I have SSH keys setup to the NAS, don't worry
iiokiirok
iiokiirok2mo ago
the internet says that you may be able to recover, with partial metadata loss and/or a broken state, with pg_resetwal /volume1/immich/postgres. For pg_resetwal you have to have postgres (or just the tools) installed on the host.
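A cautious way to approach that (a sketch, assuming the client tools are available and the database container is stopped) is to dry-run it first, ideally against the copied folder:
pg_resetwal -n /volume1/immich/postgres   # -n / --dry-run: report what would change, write nothing
pg_resetwal /volume1/immich/postgres      # only if the dry-run output looks sane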
HashtagOctothorp
HashtagOctothorpOP2mo ago
I think there's a pg package I can install
iiokiirok
iiokiirok2mo ago
but the risk is we end up with an Immich instance that will break down the road. Probably the main things you have are album organizations? Otherwise, just recreate the instance and re-upload (with, say, immich-go) all the files from the old instance's upload directory.
HashtagOctothorp
HashtagOctothorpOP2mo ago
album was one of them, yeah.
iiokiirok
iiokiirok2mo ago
oh wait do you have the app
HashtagOctothorp
HashtagOctothorpOP2mo ago
yep
iiokiirok
iiokiirok2mo ago
or you can open it in a browser you have opened it in before. It might still have offline data, so you could list what you have in albums, although especially on a phone that is probably very inconvenient. OR, if we get the db online (doesn't hurt to try, even if it doesn't work with Immich), we can do it much more conveniently.
HashtagOctothorp
HashtagOctothorpOP2mo ago
as in, use the phone app to recreate the albums while reimporting the images from the folder?
iiokiirok
iiokiirok2mo ago
as in, use the phone app to view what the server used to be (the phone doesn't know the server is dead; it has cached its state, at least filenames, to some extent, probably). But try the pg_resetwal thing and see what pg thinks of it. It'd be smart, while running the resetwal command, to make sure the db container isn't running (and restarting) at the same time.
HashtagOctothorp
HashtagOctothorpOP2mo ago
oh neat... it seems postgres is already native on the NAS OS. Do you think the version matters? pg_resetwal (PostgreSQL) 11.11
Zeus
Zeus2mo ago
Holy yes do not use that
HashtagOctothorp
HashtagOctothorpOP2mo ago
lmao welp. Presumably I could copy this folder anywhere, run the commands, and pull it back? Going to make a Linux VM. Gotta do some real work tho, will pick this up tonight
iiokiirok
iiokiirok2mo ago
or you can modify the docker container start command (entrypoint is what you need) so that the container just starts without starting postgres in it
HashtagOctothorp
HashtagOctothorpOP2mo ago
command part, commented out... then get into the container shell?
iiokiirok
iiokiirok2mo ago
yes but no; if you comment it out, it will fall back to the default command and still start pg
HashtagOctothorp
HashtagOctothorpOP2mo ago
ah hmm
iiokiirok
iiokiirok2mo ago
set command to 'sleep infinity', for example, or straight to pg_resetwal /var/lib/postgresql/data, and make sure restart is "no" (with quotes)
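A hedged sketch of that override on the database service (path taken from the message above; the quotes around "no" keep YAML from reading it as a boolean):
  database:
    command: sleep infinity   # container stays up, but postgres itself is never started
    restart: "no"             # quoted, so the restart policy really is off
    # ...image/env/volumes unchanged...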
HashtagOctothorp
HashtagOctothorpOP2mo ago
sleep worked. I'm in the container. Hmm, know the pg user by chance?
pg_resetwal: error: cannot be executed by "root"
pg_resetwal: You must run pg_resetwal as the PostgreSQL superuser.
iiokiirok
iiokiirok2mo ago
immich pg_resetwal -U immich ... (the user is set in your env from above)
HashtagOctothorp
HashtagOctothorpOP2mo ago
invalid option -- 'U' oh, as the command hmm...
root@cb8b28e8bb81:/# pg_resetwal --help
pg_resetwal resets the PostgreSQL write-ahead log.

Usage:
pg_resetwal [OPTION]... DATADIR

Options:
-c, --commit-timestamp-ids=XID,XID
set oldest and newest transactions bearing
commit timestamp (zero means no change)
[-D, --pgdata=]DATADIR data directory
-e, --epoch=XIDEPOCH set next transaction ID epoch
-f, --force force update to be done
-l, --next-wal-file=WALFILE set minimum starting location for new WAL
-m, --multixact-ids=MXID,MXID set next and oldest multitransaction ID
-n, --dry-run no update, just show what would be done
-o, --next-oid=OID set next OID
-O, --multixact-offset=OFFSET set next multitransaction offset
-u, --oldest-transaction-id=XID set oldest transaction ID
-V, --version output version information, then exit
-x, --next-transaction-id=XID set next transaction ID
--wal-segsize=SIZE size of WAL segments, in megabytes
-?, --help show this help, then exit
iiokiirok
iiokiirok2mo ago
hmm, weird
HashtagOctothorp
HashtagOctothorpOP2mo ago
> su immich
su: user immich does not exist or the user entry does not contain all the required fields
iiokiirok
iiokiirok2mo ago
ah wait maybe? internet says gosu immich pg_resetwal /var/lib/postgresql/data
HashtagOctothorp
HashtagOctothorpOP2mo ago
still no user immich. I think immich is the PG user, not the OS user. There's a postgres user tho, gunna try that
iiokiirok
iiokiirok2mo ago
sorry, I didn't double-check; above it is DB_USERNAME=postgres, not immich :^)
HashtagOctothorp
HashtagOctothorpOP2mo ago
lmao that's what's in all the default .env files no?
iiokiirok
iiokiirok2mo ago
so using 'postgres' instead of 'immich' here then
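Putting it together, the in-container invocation is roughly this (a sketch; data path from the earlier message, and the dry run is optional but cheap):
gosu postgres pg_resetwal -n /var/lib/postgresql/data   # dry run as the postgres OS user
gosu postgres pg_resetwal /var/lib/postgresql/data      # the actual WAL reset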
HashtagOctothorp
HashtagOctothorpOP2mo ago
yep. Output: "Write-ahead log reset" 🤞
iiokiirok
iiokiirok2mo ago
I have mine set to something totally different, from an old habit to prevent multiple DBs getting mixed up between multiple instances ¯\_(ツ)_/¯. Well, try booting the database container up
HashtagOctothorp
HashtagOctothorpOP2mo ago
makes sense. Recreating with the default command, one sec, unless you want me to run the command manually
iiokiirok
iiokiirok2mo ago
all good, recreate it (but keep immich-server commented)
iiokiirok
iiokiirok2mo ago
meanwhile on my end :D
[image attachment]
iiokiirok
iiokiirok2mo ago
(i have very nice multiple backups and fs snapshots; still it refuses, seems to be a problem of uptime-kuma cross-machine offline migrations (my own troubleshooting, unrelated))
Zeus
Zeus2mo ago
V2 supports a real DB (sadly Maria but better than SQLite)
iiokiirok
iiokiirok2mo ago
sqlite is a real db!! :D And I'm still in wonder how CockroachDB is about the only db that just stays up, stays healthy, heals itself; you don't have to hold its hand for upgrades or anything. Sad part is its licensing makes it unattractive to take up in any project that's not a new corporate we-will-buy-it-in choice. I've had the most problems with Maria/MySQL (not mentioning msdb, that's just unreasonable :D ft. upgrade migrations); pg and sqlite have been so-so; mongo tends to be bad; redis tends to be ok, but abused by developers :D There is one other db, I don't remember the name of atm, and can't check :/ @HashtagOctothorp how is it going, do you have anything up, or logs?
HashtagOctothorp
HashtagOctothorpOP2mo ago
hmmm... main container:
[Nest] 7 - 03/17/2025, 11:37:08 AM LOG [Microservices:StorageService] Verifying system mount folder checks, current state: {"mountChecks":{"thumbs":true,"upload":true,"backups":true,"library":true,"profile":true,"encoded-video":true}}

[Nest] 7 - 03/17/2025, 11:37:08 AM LOG [Microservices:StorageService] Successfully verified system mount folder checks
But it's still "unhealthy", while the DB container says "starting" but has normal log output:
2025-03-17 18:35:53.673 UTC [1] LOG: starting PostgreSQL 14.17 (Debian 14.17-1.pgdg120+1) on x86_64-pc-linux-gnu, compiled by gcc (Debian 12.2.0-14) 12.2.0, 64-bit
2025-03-17 18:35:53.684 UTC [1] LOG: listening on IPv4 address "0.0.0.0", port 5432
2025-03-17 18:35:53.684 UTC [1] LOG: listening on IPv6 address "::", port 5432
2025-03-17 18:35:54.178 UTC [1] LOG: listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
2025-03-17 18:35:54.808 UTC [26] LOG: database system was shut down at 2025-03-17 18:35:38 UTC
[2025-03-17T18:35:54Z INFO service::utils::clean] Find directory "pg_vectors/indexes/17332".
[2025-03-17T18:35:54Z INFO service::utils::clean] Find directory "pg_vectors/indexes/17576".
[2025-03-17T18:35:54Z INFO service::utils::clean] Find directory "pg_vectors/indexes/17576/segments/ddb25af1-8278-4232-9e58-49c9e46c35cd".
[2025-03-17T18:35:54Z INFO service::utils::clean] Find directory "pg_vectors/indexes/17332/segments/b89b74f3-8935-444d-9f8d-847f6a26fca1".
2025-03-17 18:35:55.186 UTC [1] LOG: database system is ready to accept connections
iiokiirok
iiokiirok2mo ago
main container was supposed to be down? :) but ok
HashtagOctothorp
HashtagOctothorpOP2mo ago
oh sorry
iiokiirok
iiokiirok2mo ago
the db itself doesn't produce any more logs other than what it said?
HashtagOctothorp
HashtagOctothorpOP2mo ago
I didn't realize you wanted me not to start the main container when booting the pg container. And nope, no more logs.
iiokiirok
iiokiirok2mo ago
can you dump the db, if you know how, before proceeding? No worries, it's just that one step at a time is usually safer and saner
HashtagOctothorp
HashtagOctothorpOP2mo ago
from here?
docker compose down -v # CAUTION! Deletes all Immich data to start from scratch
## Uncomment the next line and replace DB_DATA_LOCATION with your Postgres path to permanently reset the Postgres database
# rm -rf DB_DATA_LOCATION # CAUTION! Deletes all Immich data to start from scratch
docker compose pull # Update to latest version of Immich (if desired)
docker compose create # Create Docker containers for Immich apps without running them
docker start immich_postgres # Start Postgres server
sleep 10 # Wait for Postgres server to start up
# Check the database user if you deviated from the default
gunzip < "/path/to/backup/dump.sql.gz" \
| sed "s/SELECT pg_catalog.set_config('search_path', '', false);/SELECT pg_catalog.set_config('search_path', 'public, pg_catalog', true);/g" \
| docker exec -i immich_postgres psql --dbname=postgres --username=<DB_USERNAME> # Restore Backup
docker compose up -d # Start remainder of Immich apps
iiokiirok
iiokiirok2mo ago
that's restoring the db, afaik; we want to back it up. We don't have a backup, but if something goes wrong (mainly dumb human mistakes) in the next 24h, we'd have something to come back to. So the answer is no.
HashtagOctothorp
HashtagOctothorpOP2mo ago
🤦‍♂️
[two image attachments]
iiokiirok
iiokiirok2mo ago
ig it should be docker compose exec database pg_dump -U postgres immich > mydump.xyz. Ah well :D You can make the backup anyway, but this looks a lot greener; I thought you didn't have any of those
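A hedged variant of that manual dump, compressed on the way out (assumes the compose service is named database and DB_USERNAME=postgres, as above):
docker compose exec -T database pg_dump -U postgres immich | gzip > immich-manual-dump.sql.gz
# -T skips TTY allocation so the output pipes cleanly into the file on the host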
HashtagOctothorp
HashtagOctothorpOP2mo ago
yall went straight to synology snapshots
iiokiirok
iiokiirok2mo ago
well, losing the WAL shouldn't be that much, but it might be smarter to recover anyway. After backing up our current state, I'd start restoring the db backups from the newest, testing, and if that fails, then an older copy (after backing up). Then yes, except that immich_postgres is database in your case
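For reference, a sketch of the docs snippet pasted above adapted to this stack; the dump path and filename are assumptions (use whichever auto-generated backup you pick), and per the docs the Postgres data location has to be empty before restoring:
docker compose down
# keep the resetwal'd data around instead of deleting it outright
mv /volume1/immich/postgres /volume1/immich/postgres.resetwal-copy
docker compose create
docker compose start database   # the Postgres service is "database" in this stack
sleep 10
gunzip < /volume1/immich/upload/backups/<newest-dump>.sql.gz \
  | sed "s/SELECT pg_catalog.set_config('search_path', '', false);/SELECT pg_catalog.set_config('search_path', 'public, pg_catalog', true);/g" \
  | docker compose exec -T database psql --dbname=postgres --username=postgres
docker compose up -d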
HashtagOctothorp
HashtagOctothorpOP2mo ago
kk. I'm going to pause for now because I REALLY need to do some work-work. Will pick this up tonight after the kids are down. But this is promising.
iiokiirok
iiokiirok2mo ago
yup, jfyi i might not be available, quite late atm and I'm planning to eat (smh cooking like 3rd or 4th time in 24h, I question where I can fit it all, quite skinny)
HashtagOctothorp
HashtagOctothorpOP2mo ago
Sounds like me in my youth... I was eating ~6000 calories a day and barely keeping the weight on. It's gunna be an 8-9 hour break XD
iiokiirok
iiokiirok2mo ago
kids → make it 3d-7d
HashtagOctothorp
HashtagOctothorpOP2mo ago
fml lol
[image attachment]
HashtagOctothorp
HashtagOctothorpOP2mo ago
BUT! The restore from the existing auto-generated backups worked... *chef's kiss* *hacker voice* I'm in. Thanks for your help, guys.
Immich
Immich2mo ago
This thread has been closed. To re-open, use the button below.
iiokiirok
iiokiirok4w ago
lol, totally forgot about discord existing meanwhile
