Pod frequently hits OOM!
I am using an 8xA40 instance. Pod id: k3urxcxexkj989
Even though I am not running any heavy tasks, only unzipping a file and uploading some data to the pod with scp, the pod frequently runs out of memory (OOM). My pod has ~375GB of RAM, and I don't think my processes caused the problem. Could you look into the issue? Thanks
1 Reply
I have restarted the pod several times, but the issue still persists.