Immich SSD + HDD Performance Issue & crash
Hi, I would be very grateful if anyone could help.
The Core Problem:
The Immich container, running within a Proxmox LXC, consistently crashes and causes 100% CPU saturation on the host whenever a new asset is uploaded and the subsequent automatic Machine Learning (ML) jobs (Smart Search, OCR, Face Recognition) are triggered.
The system becomes completely unresponsive during this time.
The data storage is split between a slow NAS (QNAP via NFS) for the original photo library and a fast local SSD for temporary files and application data (PostgreSQL, thumbnails). See my Docker and .env configuration in the attached txt file for more details.
Diagnostic I/O tests were performed from inside the LXC to quantify the performance difference between the storage mounts. Here are the results.
Sequential Read Speed (bandwidth):
NAS QNAP (NFS), original photo library: ≈136 MiB/s
Local SSD, Immich / thumbnails: ≈1452 MiB/s
Random Read/Write IOPS:
NAS QNAP (NFS): ≈155 read IOPS / 65 write IOPS
Local SSD: ≈47.8k read IOPS / 20.5k write IOPS
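To put those numbers side by side, here is a small Python sketch that only divides the figures I measured above (nothing new is benchmarked, and the variable names are my own). It shows the gap is roughly 11x for sequential reads but around 300x for random I/O:

```python
# Figures copied from the test results above, in the units reported there.
seq_read_mib_s = {"nas": 136.0, "ssd": 1452.0}
rand_read_iops = {"nas": 155.0, "ssd": 47_800.0}
rand_write_iops = {"nas": 65.0, "ssd": 20_500.0}


def ratio(metric):
    """Return how many times faster the SSD is than the NAS for one metric."""
    return metric["ssd"] / metric["nas"]


if __name__ == "__main__":
    for name, metric in [
        ("sequential read", seq_read_mib_s),
        ("random read IOPS", rand_read_iops),
        ("random write IOPS", rand_write_iops),
    ]:
        print(f"{name}: SSD is about {ratio(metric):.0f}x the NAS")
```

So anything that does many small random accesses against the NFS mount will suffer far more than a plain file copy would.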
The crash occurs immediately after the asset is uploaded and the file is moved to its final location on the slow NAS, followed by the activation of the ML service.
Logs:
"immich_server | DEBUG [Microservices:StorageCore] Attempting to rename file: .../upload/...jpg => /usr/src/app/upload/library/admin/...jpg
immich_server | DEBUG [Microservices:StorageCore] Unable to rename file. Falling back to copy, verify and delete"
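For anyone not familiar with this fallback: the log does not say why the rename failed, but a common reason is that source and destination sit on different filesystems (here, SSD upload folder vs. NFS library), in which case the OS refuses the rename with EXDEV. The Python sketch below is not Immich's code. It is a minimal illustration of a "rename, else copy, verify and delete" move, where I assume the verify step is a checksum comparison and the helper names are hypothetical:

```python
import errno
import hashlib
import os
import shutil


def file_sha256(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large photos are not loaded into memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def move_with_fallback(src, dst):
    """Move src to dst and return "rename" or "copy" for the path taken."""
    try:
        # Cheap path: only works when src and dst are on the same filesystem.
        os.rename(src, dst)
        return "rename"
    except OSError as exc:
        if exc.errno != errno.EXDEV:
            raise  # Some other failure (permissions, missing directory, ...).

    # Cross-filesystem path: copy the whole file, verify it, then delete src.
    shutil.copy2(src, dst)
    if file_sha256(src) != file_sha256(dst):
        os.remove(dst)
        raise OSError(f"checksum mismatch after copying {src} to {dst}")
    os.remove(src)
    return "copy"
```

The point for my setup is that the copy path writes the whole file to the destination and, with a verify step like this one, reads it back again, so every upload generates extra traffic on the slow NAS mount.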
ML job initialization (machine learning log). The system freezes as soon as the intensive, I/O-bound analysis jobs start, even with the concurrency set to 1 (SMART_SEARCH_CONCURRENCY=1):
"immich_machine_learning | [11/26/25 01:08:52] INFO Downloading detection model 'PP-OCRv5_mobile' to /cache/ocr/PP-OCRv5_mobile/detection/model.onnx. This may take a while.
immich_machine_learning | [11/26/25 01:08:53] INFO Loading detection model 'antelopev2' to memory"
--> The system crash/freeze occurs here, certainly due to I/O wait caused by slow read requests to the NAS.
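One way to check the I/O wait hypothesis rather than assume it is to sample the aggregate "cpu" line of /proc/stat before and during an upload. The Python sketch below is a rough helper of my own (function names are hypothetical), and it assumes a Linux kernel that exposes the usual user, nice, system, idle, iowait, irq, softirq and steal counters. Reading it on the Proxmox host is probably more reliable than inside the LXC, where /proc may be virtualized:

```python
import time

# Order of the first eight counters on the aggregate "cpu" line of /proc/stat.
FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")


def parse_cpu_line(stat_text):
    """Extract the aggregate CPU counters from the text of /proc/stat."""
    for line in stat_text.splitlines():
        parts = line.split()
        if parts and parts[0] == "cpu":
            values = [int(v) for v in parts[1:1 + len(FIELDS)]]
            return dict(zip(FIELDS, values))
    raise ValueError("no aggregate 'cpu' line found")


def iowait_percent(before, after):
    """Share of CPU time spent waiting on I/O between two samples."""
    total = sum(after[key] - before[key] for key in FIELDS)
    if total <= 0:
        return 0.0
    return 100.0 * (after["iowait"] - before["iowait"]) / total


def sample_iowait(interval_s=5.0, path="/proc/stat"):
    """Take two samples interval_s seconds apart and return the iowait share."""
    with open(path) as handle:
        before = parse_cpu_line(handle.read())
    time.sleep(interval_s)
    with open(path) as handle:
        after = parse_cpu_line(handle.read())
    return iowait_percent(before, after)


if __name__ == "__main__":
    print(f"iowait over the last 5 s: {sample_iowait():.1f}%")
```

If the iowait share is high while the user and system shares stay low during the freeze, that supports the NAS explanation. If iowait is low and user time is high, the ML inference itself may be what saturates the CPU.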