What is my Immich doing now?
It's been processing lots of photos and video overnight.
At this moment I believe something weird is going on behind the scenes. The Facial Recognition count isn't moving. No other jobs are running.
Server is non-responsive. CPU cranks to 100%.
Ctrl-C has no effect.
I want to see if I can get some meaningful log and inspect what happened.
Please suggest some methods.

:wave: Hey @BitePa,
Thanks for reaching out to us. Please carefully read this message and follow the recommended actions. This will help us be more effective in our support effort and leave more time for building Immich :immich:.
References
- Container Logs: docker compose logs
- Container Status: docker ps -a
- Reverse Proxy: https://immich.app/docs/administration/reverse-proxy
- Code Formatting https://support.discord.com/hc/en-us/articles/210298617-Markdown-Text-101-Chat-Formatting-Bold-Italic-Underline#h_01GY0DAKGXDEHE263BCAYEGFJA
Checklist
I have...
1. :blue_square: verified I'm on the latest release (note that mobile app releases may take some time).
2. :blue_square: read applicable release notes.
3. :blue_square: reviewed the FAQs for known issues.
4. :blue_square: reviewed Github for known issues.
5. :blue_square: tried accessing Immich via local ip (without a custom reverse proxy).
6. :blue_square: uploaded the relevant information (see below).
7. :blue_square: tried an incognito window, disabled extensions, cleared mobile app cache, logged out and back in, different browsers, etc. as applicable
(an item can be marked as "complete" by reacting with the appropriate number)
Information
In order to be able to effectively help you, we need you to provide clear information to show what the problem is. The exact details needed vary per case, but here is a list of things to consider:
- Your docker-compose.yml and .env files.
- Logs from all the containers and their status (see above).
- All the troubleshooting steps you've tried so far.
- Any recent changes you've made to Immich or your system.
- Details about your system (both software/OS and hardware).
- Details about your storage (filesystems, type of disks, output of commands like fdisk -l and df -h).
- The version of the Immich server, mobile app, and other relevant pieces.
- Any other information that you think might be relevant.
Please paste files and logs with proper code formatting, and especially avoid blurry screenshots.
Without the right information we can't work out what the problem is. Help us help you ;)
If this ticket can be closed you can use the /close command, and re-open it later if needed.
The Immich server is still pounding the External Libraries server.
I think you may need to check the facial recognition settings and whether the paths are correct. I hit this once because I had put the facial recognition model in the wrong path. After I moved it to the correct path, rebuilt, and ran it again, the counts started moving.
This is the 2nd round of batch processing. So if this batch and the previous batch are in the same path, then there's something else wrong. I've not changed the path.
Ah... finally it caught my Ctrl-C.
it has this:
immich_redis | 1:M 24 Dec 2024 06:03:04.194 * Background saving started by pid 13342
immich_machine_learning | [12/24/24 14:27:54] CRITICAL WORKER TIMEOUT (pid:13306)
immich_redis | 13342:C 24 Dec 2024 06:27:25.513 * DB saved on disk
How should I proceed and get the log without starting Immich? Anyone please?
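On getting logs without starting Immich: Docker normally keeps the logs of stopped containers until the containers are removed, so they can usually be dumped to a file while the stack is down. A minimal sketch, assuming the default logging driver, that it is run from the folder holding the compose file, and that the containers have not been removed (for example by docker compose down). The function name dump_logs and the output file name are made up for illustration.

```shell
# Save the logs of every container in the compose project to one file.
# This reads existing logs only; it does not start any container.
dump_logs() {
    out_dir=${1:-.}
    out_file="$out_dir/immich-logs-$(date +%Y%m%d-%H%M%S).txt"
    # --no-color drops the colored service prefix, --timestamps adds Docker's own timestamps
    docker compose logs --no-color --timestamps > "$out_file" 2>&1
    echo "$out_file"
}

# Usage (not run here):
#   dump_logs /tmp
```

The resulting text file can then be attached to the thread instead of a screenshot.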
Any hardware details? Compose env files?
I’m using a Proxmox LXC container. I assigned tons of RAM and CPUs to it. The first batch ran fine, although it had fewer files than the 2nd batch.
docker-compose.yaml is as follows:
all standard stuff, except the 2 lines for External Libraries
Ok. I can't wait, so I restarted the docker container. Here is the error when it crashed the whole server:
anything that I can collect for this crash?
There are lots of "connection timeout".
compose seems OK, are the library mounts on the network or on the device?
Also are you still scanning every 20 minutes?
I guess I need to dig into postgres or redis.
no more 20 mins... every 6 hrs or something like that
network
remote server is intact.
Just wondering whether the remote device is terminating or the container
in machine-learning log, I got this:
[12/24/24 13:14:12] INFO Application startup complete.
[12/24/24 14:27:54] CRITICAL WORKER TIMEOUT (pid:13306)
[12/24/24 15:12:12] ERROR Worker (pid:13306) was sent SIGKILL! Perhaps out of memory?
no error in postgres log
redis log:
13342:C 24 Dec 2024 06:27:25.513 * DB saved on disk
13342:C 24 Dec 2024 06:27:46.931 * RDB: 1 MB of memory used by copy-on-write
1:M 24 Dec 2024 06:27:51.902 * Background saving terminated with success
1:C 26 Dec 2024 07:04:18.614 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
is it likely that the container died?
i'll check remote server
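The "Perhaps out of memory?" hint in that log is only a guess by the worker manager, so it may be worth checking whether Docker itself recorded an out-of-memory kill. A minimal sketch, assuming the container is named immich_machine_learning as in the log prefix above; the function name was_oom_killed is made up for illustration. A "false" here does not rule out memory pressure elsewhere, for example at the LXC or host level.

```shell
# Print "true" or "false" depending on whether Docker recorded an
# out-of-memory kill for the given container.
was_oom_killed() {
    docker inspect --format '{{.State.OOMKilled}}' "$1"
}

# Usage (not run here):
#   was_oom_killed immich_machine_learning
# The kernel log may also mention OOM kills:
#   dmesg -T | grep -i -E 'out of memory|oom'
```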
So there is remote ML?
nope. it is not configured. there is a container for ML, so I got the log
Ah, just docker compose logs next time, it gets logs for all containers
ok
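The machine-learning lines pasted earlier in this thread carry terminal color codes, which makes them hard to read and to search. A small sketch that strips those codes so the output can be filtered for the interesting lines. The function name strip_ansi is made up for illustration, and it only handles color/style sequences of the form ESC [ ... m.

```shell
# Remove ANSI color/style sequences (ESC [ ... m) from text on stdin.
strip_ansi() {
    esc=$(printf '\033')
    sed "s/${esc}\[[0-9;]*m//g"
}

# Usage (not run here): show only the CRITICAL and ERROR lines of all containers
#   docker compose logs --no-color 2>&1 | strip_ansi | grep -E 'CRITICAL|ERROR'
```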
the remote server hosting the External Libraries is mounted using sshfs
nothing weird in the remote server logs
sshfs 👀
I'm looking into sshfs a bit, and it seems remarkably unstable when "layered", i.e. used from within another application rather than directly on the CLI
i.e. through docker
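One way to tell a hung network mount apart from a crashed container is to probe the mount with a time limit from the Docker host. A minimal sketch, assuming GNU coreutils timeout is available (it is on Debian); the function name check_mount and the example path are made up for illustration. It only shows whether a directory listing returns in time, not why it might fail.

```shell
# Probe a directory with a time limit (default 5 seconds).
# A stale sshfs/SMB mount tends to hang forever; timeout turns that into a failure.
check_mount() {
    if timeout "${2:-5}" ls "$1" > /dev/null 2>&1; then
        echo "ok: $1"
    else
        echo "unresponsive or missing: $1"
        return 1
    fi
}

# Usage (not run here; /mnt/external-library is a placeholder path):
#   check_mount /mnt/external-library 10
```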
oh....
samba is not stable. cifs not stable. what do you suggest?
If they are all not stable maybe it's not the filesystems... 🙃
ummm...
i only run Debian 11 or 12.
not even Ubuntu
Sure, but your hardware or network might be messing things up
I think most people use SMB mounts here
maybe it stresses everything to the max during the first scan
In my case, my NUC literally thermally overheated and hard crashed during ingest 😛
Case closed. No concrete idea about the cause.