NAS completely froze overnight
Hi! I'm running Immich on my DS918+ with 8GB RAM, and tonight the entire system completely froze around 5AM, so I had to hard-restart it.
I already had a crash like this when I first started using Immich; after that I limited all jobs to a concurrency of 1 and restricted the CPU cores each container can use a bit, to not hammer the CPU as much. RAM usage is usually around 35%. Unfortunately the NAS did not manage to save any RAM usage charts for this night (probably due to the crash/hang..)
The only pointer I have as to why this could happen is the immich_server logs:
(in next message)
:wave: Hey @Thunder,
Thanks for reaching out to us. Please carefully read this message and follow the recommended actions. This will help us be more effective in our support effort and leave more time for building Immich :immich:.
References
- Container Logs: docker compose logs (docs)
- Container Status: docker ps -a (docs)
- Reverse Proxy: https://immich.app/docs/administration/reverse-proxy
- Code Formatting https://support.discord.com/hc/en-us/articles/210298617-Markdown-Text-101-Chat-Formatting-Bold-Italic-Underline#h_01GY0DAKGXDEHE263BCAYEGFJA
Checklist
I have...
1. :ballot_box_with_check: verified I'm on the latest release (note that mobile app releases may take some time).
2. :blue_square: read applicable release notes.
3. :blue_square: reviewed the FAQs for known issues.
4. :blue_square: reviewed GitHub for known issues.
5. :ballot_box_with_check: tried accessing Immich via local ip (without a custom reverse proxy).
6. :ballot_box_with_check: uploaded the relevant information (see below).
7. :ballot_box_with_check: tried an incognito window, disabled extensions, cleared mobile app cache, logged out and back in, different browsers, etc. as applicable
(an item can be marked as "complete" by reacting with the appropriate number)
Information
In order to be able to effectively help you, we need you to provide clear information to show what the problem is. The exact details needed vary per case, but here is a list of things to consider:
- Your docker-compose.yml and .env files.
- Logs from all the containers and their status (see above).
- All the troubleshooting steps you've tried so far.
- Any recent changes you've made to Immich or your system.
- Details about your system (both software/OS and hardware).
- Details about your storage (filesystems, type of disks, output of commands like fdisk -l and df -h).
- The version of the Immich server, mobile app, and other relevant pieces.
- Any other information that you think might be relevant.
Please paste files and logs with proper code formatting, and especially avoid blurry screenshots.
Without the right information we can't work out what the problem is. Help us help you ;)
If this ticket can be closed you can use the /close command, and re-open it later if needed.
logs for the immich_server container for that point in time:
docker-compose file is here
note that I've also been running immich-power-tools for a few weeks now, but I don't see much in its logs that points to errors
also notable: the system log of the NAS from the time of the hang:
what would the next steps be to try and find out where this system hang comes from? maybe it's not even related to Immich? (but I've only had 2 crashes like this in the time Immich has been on this system)
looks like it lost connection to the DB around that time, do you see anything in your postgres logs?
immich can be a resource hog when loading it up with images. check your memory limits on immich to make sure it doesn't crash your system. maybe you need more memory.
the logs for that container don't contain much unfortunately:
can you explain what you mean by "check your memory limits on immich to make sure it doesn't crash your system"?
> maybe you need more memory
unfortunately 8GB is the max officially supported amount of RAM for this NAS
just FYI, because of how Docker works it should be impossible for the whole system to hang because of Immich
So likely it’s just getting massively overloaded or there’s a hardware fault
massively overloaded could be the case tbh
I never really had issues with this NAS, but now with immich I managed to crash it twice
also redis has a warning in the logs when booting up?
# WARNING Memory overcommit must be enabled! Without it, a background save or replication may fail under low memory condition. Being disabled, it can also cause failures without low memory condition, see https://github.com/jemalloc/jemalloc/issues/1328. To fix this issue add 'vm.overcommit_memory = 1' to /etc/sysctl.conf and then reboot or run the command 'sysctl vm.overcommit_memory=1' for this to take effect.
That doesn’t matter
kk
You could check the database logs which are in the logs folder of the database volume
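For example, something like this over SSH (a sketch; the path and file names are assumptions, use wherever your compose file mounts the database volume):
```sh
# list the postgres log files, newest first (path is an assumption)
ls -lt ./postgres/log/
# tail the most recent one; actual file names will differ
tail -n 100 ./postgres/log/postgresql-*.log
```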
yeah, you mean the ones in the /immich/postgres/log folder?
seems like some are empty there

the latest logs look like this:
docker supports setting memory limits
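For example, in docker-compose.yml (a sketch; the service names follow the default Immich compose file, and the values are illustrative starting points, not recommendations):
```yaml
services:
  immich-server:
    mem_limit: 2g        # hard cap; Docker OOM-kills the container instead of starving the host
  immich-machine-learning:
    mem_limit: 2g
  database:
    mem_limit: 1g
  redis:
    mem_limit: 256m
```
With a hard limit in place, a runaway container gets killed and restarted by Docker rather than dragging the whole NAS down with it.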
I see, thanks. I'll look into it - though judging from the resource monitor history, RAM usage doesn't seem to be very high
cpu also seems alright to me hm..
memory

maybe try removing cpu_shares and cpuset and see what happens. I find it weird that a container is hanging the whole system, but maybe Synology's implementation is bugged
cpu

hm yeah, that was me trying to get the system to be more responsive when Immich is hammering it - my first crash happened before I added those settings, and they did manage to reduce CPU usage a bit and make the system more responsive again while ingesting images
checklist for myself:
- set sane memory limits for the containers... just in case
- investigate cpu_shares and cpuset.. either remove the settings, or alternatively leave a core completely free for the system and test that (see the sketch below)
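A sketch of that second option (hypothetical values; assuming the DS918+'s 4 cores, this pins the heavy containers to cores 0-2 and leaves core 3 for DSM):
```yaml
services:
  immich-server:
    cpuset: "0-2"      # only schedule on cores 0-2, keeping core 3 free for the system
    cpu_shares: 512    # lower relative CPU weight under contention (Docker's default is 1024)
  immich-machine-learning:
    cpuset: "0-2"
```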
what about IO?
volume utilization: volume1 is hard drives (with SSD read cache), volume2 is SSDs
(media is stored on the HDDs; thumbnails, the database and so on are on the SSDs)
note that DSM automatically kicked off data scrubbing after the hard reboot, which is why the usage was so high today
also note: this is total system load, so reads from the nightly backups also show up here

your database crashed / got OOM killed
it seems that it was able to recover on startup
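One way to confirm the OOM theory (the container name here is an assumption; use whatever `docker ps` shows for your database):
```sh
# true if the kernel OOM killer terminated the container's main process
docker inspect --format '{{.State.OOMKilled}}' immich_postgres
# the dmesg ring buffer was lost in the hard reboot, but DSM's persisted
# logs may still hold OOM entries (log path is an assumption)
grep -i "out of memory" /var/log/messages
```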
in general, linux mostly hangs because of memory. synology is probably using btrfs, so that could be causing it to crash, though I've only seen generic linux kernels (not synology) crash from btrfs.
yup, all volumes are set up as btrfs - so that sounds a bit like what you're describing
is docker memory usage perhaps not reported properly in the synology activity monitor?
I'll add some memory limits to my containers then.. are there some recommended values?
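On the reporting question: you can cross-check Synology's numbers against Docker itself over SSH; this is plain Docker, nothing Synology-specific:
```sh
# one snapshot of per-container memory and CPU as the container runtime sees it
docker stats --no-stream --format "table {{.Name}}\t{{.MemUsage}}\t{{.MemPerc}}\t{{.CPUPerc}}"
```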