Gunicorn out of memory errors

New Docker install with nothing but the suggested edits to .env is generating gunicorn out-of-memory errors in the terminal. From my troubleshooting, it appears to be linked to iOS photo uploads. After uploading some number of photos (more than 10, less than 100), the terminal on the server starts throwing gunicorn out-of-memory errors. I have checked RAM usage on the VM and there does not appear to be a RAM shortage. I have recreated this on several mobile clients. No edits to docker-compose or .env (outside of API keys and the mount directory, which is on NFS).
Alex Tran
Alex Tran3y ago
How much RAM do you allocate for the machine learning container? The container loads the model into memory at runtime, triggered by the first upload. So if it tries to load the model into memory and doesn't have enough RAM, it will crash. I recommend at least 4GB of RAM for the machine learning container; 6-8GB would be preferred for the upcoming release with the facial recognition model.
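As a rough sanity check, the 4GB figure above can be converted to kB, the unit the oom-killer log lines later in this thread report anon-rss in. A minimal sketch (the docker stats line is commented out and the container name is a guess, since it depends on your compose project name):

```shell
# The 4 GB minimum from this message, expressed in kB for comparison with
# the anon-rss values that kernel oom-killer lines report.
min_kb=$((4 * 1024 * 1024))
echo "recommended ML container minimum: ${min_kb} kB"
# To watch live container memory during the first upload (name is a guess):
# docker stats --no-stream immich_machine_learning
```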
oxfordcomputers
oxfordcomputersOP3y ago
I have 10000M allocated.
Alex Tran
Alex Tran3y ago
10GB?
oxfordcomputers
oxfordcomputersOP3y ago
Correct. You mentioned that RAM should be allocated for the machine learning container. I am running the ML container (Docker) alongside the other containers using the standard .env and docker-compose files. Do I need to manually specify the RAM for the ML container somewhere?
Alex Tran
Alex Tran3y ago
No, 10GB for the whole stack should be enough. What system are you running Immich on?
Kryptonian
Kryptonian3y ago
Currently the ml container eats all the ram you give it, and crashes because of how the models are kept in memory (and not unloaded at all).
Alex Tran
Alex Tran3y ago
I think the next release will fix this issue, as we have moved to FastAPI now.
Kryptonian
Kryptonian3y ago
Let's hope so. :)
oxfordcomputers
oxfordcomputersOP3y ago
Proxmox kernel Linux 5.15.85-1-pve #1 SMP PVE 5.15.85-1 on a Ryzen 7 2700X with 32GB RAM. I'd suspected this might be what was happening, given that no amount of RAM seems to satisfy it. Is the FastAPI build available on either branch? I'm currently on the latest release branch.
Alex Tran
Alex Tran3y ago
You can try using the tag main to test the FastAPI build.
oxfordcomputers
oxfordcomputersOP3y ago
So just change image: ghcr.io/immich-app/immich-machine-learning:release in .env to image: ghcr.io/immich-app/immich-machine-learning:main to try out the FastAPI build? Do I need to change any of the other Docker images to main?
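A sketch of that tag swap, demonstrated on a sample line rather than a real .env (the surrounding variable name here is made up; only the image string comes from the thread — in practice you would run the sed against your actual .env after backing it up):

```shell
# Demonstrate the :release -> :main swap on a sample .env-style line.
sample='IMAGE=ghcr.io/immich-app/immich-machine-learning:release'  # variable name is hypothetical
echo "$sample" | sed 's#immich-machine-learning:release#immich-machine-learning:main#'
```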
Alex Tran
Alex Tran3y ago
No, there is not; pulling should just work.
oxfordcomputers
oxfordcomputersOP3y ago
So docker pull ghcr.io/immich-app/immich-machine-learning:main in my VM? Will that replace the current ML container or create a second ML container? Sorry for the extra questions--I'm bad with docker stuff.
Alex Tran
Alex Tran3y ago
No, it just downloads another image. Then, when you use docker-compose up, it will use the corresponding image with the main tag; the release-tagged image is still on your machine.
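The workflow Alex describes, sketched as commands. The docker lines are commented out so the snippet is safe to paste and inspect first; the key point is that pulling :main does not remove the cached :release image, and compose simply recreates the container from whichever tag .env references:

```shell
ML_IMAGE=ghcr.io/immich-app/immich-machine-learning:main
echo "next compose up would use: $ML_IMAGE"
# docker pull "$ML_IMAGE"   # adds the :main image; :release stays cached locally
# docker-compose up -d      # recreates the container from the tag set in .env
```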
oxfordcomputers
oxfordcomputersOP3y ago
Alright, thanks Alex. FWIW, your project is amazing and I've got a monthly donation set up on Liberapay for the next year!
Alex Tran
Alex Tran3y ago
Hey man, thank you for the kind words I appreciate it
oxfordcomputers
oxfordcomputersOP3y ago
Wanted to provide an update. I switched to main as suggested but the same error persists with gunicorn.
oxfordcomputers
oxfordcomputersOP3y ago
(screenshot attached)
oxfordcomputers
oxfordcomputersOP3y ago
(screenshot attached)
oxfordcomputers
oxfordcomputersOP3y ago
(screenshot attached)
Kryptonian
Kryptonian3y ago
Yeah, that's what I thought would happen.
oxfordcomputers
oxfordcomputersOP3y ago
It seems that, for now, my only option to avoid these OOM errors is to just disable ML.
jrasm91
jrasm913y ago
What is actually killing the process?
oxfordcomputers
oxfordcomputersOP3y ago
How would I determine that?
Kryptonian
Kryptonian3y ago
dmesg is one thing to check, i.e. the kernel logs.
oxfordcomputers
oxfordcomputersOP3y ago
I've just tried dmesg | grep gunicorn but got no results. I have, however, had ML disabled since yesterday and gone through several reboots, which may have cleared such logs?
Kryptonian
Kryptonian3y ago
Yeah, then look in /var/log and grep all the files for oom.
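The search pattern, demonstrated on a sample kernel-log line like the ones that appear later in this thread. On the real server you would point the grep at /var/log/kern.log* instead; note that dmesg only covers the current boot, so the reboots mentioned above would indeed hide older entries from it:

```shell
# Extract "Killed process <pid> (<name>)" from a sample oom-killer log line.
sample='May  4 00:47:22 immich kernel: Out of memory: Killed process 1958 (gunicorn)'
echo "$sample" | grep -Eo 'Killed process [0-9]+ \([a-z-]+\)'
# On the server: grep -rhi 'out of memory' /var/log/kern.log*
```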
oxfordcomputers
oxfordcomputersOP3y ago
Done. I get hits in the kern and sys logs, but they're the exact same message types as in the screenshots above.
Kryptonian
Kryptonian3y ago
Look in kern.log for anything around those lines. Are you running it in an LXC container?
oxfordcomputers
oxfordcomputersOP3y ago
May 7 08:56:09 immich kernel: [ 115.847035] oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=docker-4985ad7a64045fc432f1d5371aa8919c6659883623f47324df66aa0d6eb97f16.scope,mems_allowed=0,global_oom,task_memcg=/system.slice/docker-4985ad7a64045fc432>
May 7 08:56:09 immich kernel: [ 115.847085] Out of memory: Killed process 873 (python) total-vm:4782964kB, anon-rss:2212072kB, file-rss:0kB, shmem-rss:0kB, UID:0 pgtables:6652kB oom_score_adj:0
^ That was when I had ML on but had reduced the RAM back to 4GB. Here's the output from kern.log when ML was on but RAM was set to 10GB:
May 4 00:47:22 immich kernel: [ 3903.349505] oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=docker-30261e4f97ab5c0777ebb554c66efdeb576feeabf3b99310f634c3c0d5ea3028.scope,mems_allowed=0,global_oom,task_memcg=/system.slice/docker-cd2a6534f36e61d858>
May 4 00:47:22 immich kernel: [ 3903.349531] Out of memory: Killed process 1958 (gunicorn) total-vm:2611424kB, anon-rss:951112kB, file-rss:0kB, shmem-rss:0kB, UID:0 pgtables:3140kB oom_score_adj:0
May 4 00:47:38 immich kernel: [ 3919.760475] typesense-serve invoked oom-killer: gfp_mask=0x100cca(GFP_HIGHUSER_MOVABLE), order=0, oom_score_adj=0
May 4 00:47:38 immich kernel: [ 3919.760480] CPU: 0 PID: 1759 Comm: typesense-serve Not tainted 5.10.0-22-amd64 #1 Debian 5.10.178-3
I am not running in an LXC container, no. So, I think the answer to this question is that typesense-serve is triggering the oom-killer, and gunicorn is the process being killed.
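One caveat when reading lines like these: the process named in "X invoked oom-killer" (here typesense-serve) is only the allocation that tripped the kernel's OOM killer; the kernel then chooses a victim by score, which in the earlier pair of lines was gunicorn. A small sketch that pulls the victim name and resident memory out of a "Killed process" line of this shape:

```shell
# Extract victim name, PID, and anon-rss from an oom-killer "Killed process" line.
line='Out of memory: Killed process 1958 (gunicorn) total-vm:2611424kB, anon-rss:951112kB, file-rss:0kB'
echo "$line" | sed -E 's/.*Killed process ([0-9]+) \(([^)]+)\).*anon-rss:([0-9]+)kB.*/victim=\2 pid=\1 rss_kb=\3/'
```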
Alex Tran
Alex Tran3y ago
Are you running this in a VM or an LXC, and what Linux distribution are you running it on?
oxfordcomputers
oxfordcomputersOP3y ago
VM in Proxmox Linux 5.15.85-1-pve; VM distribution is Debian 11.6 on kernel ver. 5.10.0-22
Alex Tran
Alex Tran3y ago
Thank you. Nothing out of the ordinary; pretty strange issue you are facing.
Kryptonian
Kryptonian3y ago
How have you limited the RAM? The VM's total, or the container itself?
oxfordcomputers
oxfordcomputersOP3y ago
I've allocated a minimum of 2GB and a maximum of 4GB to the VM itself. (I have played around with this a bit too, like trying min. 4GB / max 10GB, but got the same results.) If by container you mean the Docker container, I haven't applied any limit within Debian to Docker. From my understanding of Docker this shouldn't be relevant, but it can't hurt to ask: are there any underlying system packages/tools that Docker/Immich relies on that may be out of date that I should check?
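Worth noting with a min/max allocation like this: under Proxmox memory ballooning, the guest can be held near the 2GB minimum when the host wants memory back, so the stack may see far less than the 4GB maximum. A quick, Immich-agnostic way to check what the guest actually has:

```shell
# What the VM actually sees; with ballooning this can sit near the minimum.
grep MemTotal /proc/meminfo
free -m | head -n 2
```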
Kryptonian
Kryptonian3y ago
Okay so it runs out of ram in the VM entirely. :yikes:
oxfordcomputers
oxfordcomputersOP3y ago
Yes
