Gunicorn out of memory errors

New Docker install with nothing but the suggested edits to .env is generating gunicorn out-of-memory errors in the terminal. From my troubleshooting, it appears to be linked to iOS photo uploads. After uploading some number of photos (more than 10, less than 100), the terminal on the server starts throwing gunicorn out-of-memory errors. I have checked RAM usage on the VM and there does not appear to be a RAM shortage. I have recreated this on several mobile clients. No edits to docker-compose or .env (outside of API keys and the mount directory, which is on NFS).
Alex Tran
Alex Tran3y ago
How much RAM do you allocate for the machine learning container? The container loads the model into memory at runtime, triggered by the first upload. So if it tries to load the model into memory and doesn't have enough RAM, it will crash. I recommend at least 4GB of RAM for the machine learning container; 6-8GB would be preferred for the upcoming release with the facial recognition model.
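As a rough sanity check, the 4GB figure above can be converted to kB, the unit the oom-killer log lines later in this thread report anon-rss in. A minimal sketch (the docker stats line is commented out and the container name is a guess, since it depends on your compose project name):

```shell
# The 4 GB minimum from this message, expressed in kB for comparison with
# the anon-rss values that kernel oom-killer lines report.
min_kb=$((4 * 1024 * 1024))
echo "recommended ML container minimum: ${min_kb} kB"
# To watch live container memory during the first upload (name is a guess):
# docker stats --no-stream immich_machine_learning
```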
oxfordcomputers
oxfordcomputersOP3y ago
I have 10000M allocated.
Alex Tran
Alex Tran3y ago
10GB?
oxfordcomputers
oxfordcomputersOP3y ago
Correct. You mentioned that RAM should be allocated for the machine learning container. I am running the ML container (Docker) alongside the other containers using the standard .env and docker-compose files. Do I need to manually specify the RAM for the ML container somewhere?
Alex Tran
Alex Tran3y ago
No, 10GB for the whole stack should be enough. What system are you running Immich on?
Kryptonian
Kryptonian3y ago
Currently the ml container eats all the ram you give it, and crashes because of how the models are kept in memory (and not unloaded at all).
Alex Tran
Alex Tran3y ago
I think the next release will fix this issue, as we have moved to FastAPI now.
Kryptonian
Kryptonian3y ago
Let's hope so. :)
oxfordcomputers
oxfordcomputersOP3y ago
Proxmox kernel Linux 5.15.85-1-pve #1 SMP PVE 5.15.85-1 on a Ryzen 7 2700X with 32GB RAM. I'd suspected this might be what was happening, given that no amount of RAM seems to satisfy it. Is the FastAPI build available on either branch? I'm currently on the latest release branch.
Alex Tran
Alex Tran3y ago
You can try using the tag main to test the FastAPI build.
oxfordcomputers
oxfordcomputersOP3y ago
So just change image: ghcr.io/immich-app/immich-machine-learning:release in .env to image: ghcr.io/immich-app/immich-machine-learning:main to try out the FastAPI build? Do I need to change any of the other Docker images to main?
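A sketch of that tag swap, demonstrated on a sample line rather than a real .env (the surrounding variable name here is made up; only the image string comes from the thread — in practice you would run the sed against your actual .env after backing it up):

```shell
# Demonstrate the :release -> :main swap on a sample .env-style line.
sample='IMAGE=ghcr.io/immich-app/immich-machine-learning:release'  # variable name is hypothetical
echo "$sample" | sed 's#immich-machine-learning:release#immich-machine-learning:main#'
```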
Alex Tran
Alex Tran3y ago
No, there is not; pulling should just work.
oxfordcomputers
oxfordcomputersOP3y ago
So docker pull ghcr.io/immich-app/immich-machine-learning:main in my VM? Will that replace the current ML container or create a second ML container? Sorry for the extra questions--I'm bad with docker stuff.
Alex Tran
Alex Tran3y ago
No, it just downloads another image. Then, when you use docker-compose up, it will use the corresponding image with the main tag; the release-tagged image is still on your machine.
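The workflow Alex describes, sketched as commands. The docker lines are commented out so the snippet is safe to paste and inspect first; the key point is that pulling :main does not remove the cached :release image, and compose simply recreates the container from whichever tag .env references:

```shell
ML_IMAGE=ghcr.io/immich-app/immich-machine-learning:main
echo "next compose up would use: $ML_IMAGE"
# docker pull "$ML_IMAGE"   # adds the :main image; :release stays cached locally
# docker-compose up -d      # recreates the container from the tag set in .env
```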
oxfordcomputers
oxfordcomputersOP3y ago
Alright, thanks Alex. FWIW, your project is amazing and I've got a monthly donation set up on Liberapay for the next year!
Alex Tran
Alex Tran3y ago
Hey man, thank you for the kind words I appreciate it
oxfordcomputers
oxfordcomputersOP3y ago
Wanted to provide an update. I switched to main as suggested but the same error persists with gunicorn.
oxfordcomputers
oxfordcomputersOP3y ago
(screenshot attached)
oxfordcomputers
oxfordcomputersOP3y ago
(screenshot attached)
oxfordcomputers
oxfordcomputersOP3y ago
(screenshot attached)
Kryptonian
Kryptonian3y ago
Yeah, that's what I thought would happen.
oxfordcomputers
oxfordcomputersOP3y ago
It seems that, for now, my only option to avoid these OOM errors is to just disable ML.
jrasm91
jrasm913y ago
What is actually killing the process?
oxfordcomputers
oxfordcomputersOP3y ago
How would I determine that?
Kryptonian
Kryptonian3y ago
dmesg is one thing to check, i.e. the kernel logs.
oxfordcomputers
oxfordcomputersOP3y ago
I've just tried dmesg | grep gunicorn but got no results. I have, however, had ML disabled since yesterday and gone through several reboots, which may have cleared such logs?
Kryptonian
Kryptonian3y ago
Yeah, then look in /var/log and grep all the files for oom.
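The search pattern, demonstrated on a sample kernel-log line like the ones that appear later in this thread. On the real server you would point the grep at /var/log/kern.log* instead; note that dmesg only covers the current boot, so the reboots mentioned above would indeed hide older entries from it:

```shell
# Extract "Killed process <pid> (<name>)" from a sample oom-killer log line.
sample='May  4 00:47:22 immich kernel: Out of memory: Killed process 1958 (gunicorn)'
echo "$sample" | grep -Eo 'Killed process [0-9]+ \([a-z-]+\)'
# On the server: grep -rhi 'out of memory' /var/log/kern.log*
```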
oxfordcomputers
oxfordcomputersOP3y ago
Done. I get hits in the kern and sys logs, but they're the exact same message types as in the screenshots above.
Kryptonian
Kryptonian3y ago
Look in kern.log for anything around those lines. Are you running it in an LXC container?
oxfordcomputers
oxfordcomputersOP3y ago
May 7 08:56:09 immich kernel: [ 115.847035] oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=docker-4985ad7a64045fc432f1d5371aa8919c6659883623f47324df66aa0d6eb97f16.scope,mems_allowed=0,global_oom,task_memcg=/system.slice/docker-4985ad7a64045fc432>
May 7 08:56:09 immich kernel: [ 115.847085] Out of memory: Killed process 873 (python) total-vm:4782964kB, anon-rss:2212072kB, file-rss:0kB, shmem-rss:0kB, UID:0 pgtables:6652kB oom_score_adj:0
^ That was when I had ML on but had reduced the RAM back to 4GB. Here's the output from kern.log when ML was on but RAM was set to 10GB:
May 4 00:47:22 immich kernel: [ 3903.349505] oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=docker-30261e4f97ab5c0777ebb554c66efdeb576feeabf3b99310f634c3c0d5ea3028.scope,mems_allowed=0,global_oom,task_memcg=/system.slice/docker-cd2a6534f36e61d858>
May 4 00:47:22 immich kernel: [ 3903.349531] Out of memory: Killed process 1958 (gunicorn) total-vm:2611424kB, anon-rss:951112kB, file-rss:0kB, shmem-rss:0kB, UID:0 pgtables:3140kB oom_score_adj:0
May 4 00:47:38 immich kernel: [ 3919.760475] typesense-serve invoked oom-killer: gfp_mask=0x100cca(GFP_HIGHUSER_MOVABLE), order=0, oom_score_adj=0
May 4 00:47:38 immich kernel: [ 3919.760480] CPU: 0 PID: 1759 Comm: typesense-serve Not tainted 5.10.0-22-amd64 #1 Debian 5.10.178-3
I am not running in an LXC container, no. So, I think the answer to this question is that typesense-serve is triggering the oom-killer, and gunicorn is the process being killed.
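One caveat when reading lines like these: the process named in "X invoked oom-killer" (here typesense-serve) is only the allocation that tripped the kernel's OOM killer; the kernel then chooses a victim by score, which in the earlier pair of lines was gunicorn. A small sketch that pulls the victim name and resident memory out of a "Killed process" line of this shape:

```shell
# Extract victim name, PID, and anon-rss from an oom-killer "Killed process" line.
line='Out of memory: Killed process 1958 (gunicorn) total-vm:2611424kB, anon-rss:951112kB, file-rss:0kB'
echo "$line" | sed -E 's/.*Killed process ([0-9]+) \(([^)]+)\).*anon-rss:([0-9]+)kB.*/victim=\2 pid=\1 rss_kb=\3/'
```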
Alex Tran
Alex Tran3y ago
Are you running this in a VM or an LXC, and what Linux distribution are you running it on?
oxfordcomputers
oxfordcomputersOP3y ago
VM in Proxmox Linux 5.15.85-1-pve; VM distribution is Debian 11.6 on kernel ver. 5.10.0-22
Alex Tran
Alex Tran3y ago
Thank you. Nothing out of the ordinary; pretty strange issue you are facing.
Kryptonian
Kryptonian3y ago
How have you limited the RAM? The VM's total, or the container itself?
oxfordcomputers
oxfordcomputersOP3y ago
I've allocated a minimum of 2GB and a maximum of 4GB to the VM itself. (I have played around with this a bit too, like trying min. 4GB / max 10GB, but got the same results.) If by container you mean the Docker container, I haven't applied any limit within Debian to Docker. From my understanding of Docker this shouldn't be relevant, but it can't hurt to ask: are there any underlying system packages/tools that Docker/Immich relies on that may be out of date that I should check?
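Worth noting with a min/max allocation like this: under Proxmox memory ballooning, the guest can be held near the 2GB minimum when the host wants memory back, so the stack may see far less than the 4GB maximum. A quick, Immich-agnostic way to check what the guest actually has:

```shell
# What the VM actually sees; with ballooning this can sit near the minimum.
grep MemTotal /proc/meminfo
free -m | head -n 2
```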
Kryptonian
Kryptonian3y ago
Okay so it runs out of ram in the VM entirely. :yikes:
oxfordcomputers
oxfordcomputersOP3y ago
Yes
