CUDA out of memory

Hello, I am using the Runpod PyTorch 2.1 template. I am trying to train a small model (Phi, about 1.5 GB), and whatever I do, I keep getting a CUDA out-of-memory error from a process I can't identify. I am using a 3090 GPU, so I don't understand where the problem is.
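A minimal first diagnostic, as a sketch (assuming a standard PyTorch setup like the Runpod template): list which processes nvidia-smi sees holding the card, and print PyTorch's own view of free versus total memory before the training script touches the GPU.

```python
import subprocess

import torch

# Show every process currently holding GPU memory; on a fresh pod the
# process table should be empty, so any entry is the "unknown" process.
print(subprocess.run(["nvidia-smi"], capture_output=True, text=True).stdout)

# PyTorch's own view of the device, before any model is loaded.
if torch.cuda.is_available():
    free, total = torch.cuda.mem_get_info()
    print(f"free {free / 1e9:.1f} GB of {total / 1e9:.1f} GB")
    print(torch.cuda.memory_summary(abbreviated=True))
```

One caveat: inside a container, nvidia-smi can hide process IDs from other namespaces, so a high "used" figure with an empty process table is not by itself proof of a rogue process.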
5 Replies
flash-singh · 6mo ago
You likely need more VRAM; whatever model you're running takes up too much VRAM.
RounMicLess · 6mo ago
But this Phi model is 1.5 GB, and I just tried an A40 and got the same problem. Moreover, I don't see any fluctuation in GPU utilization on the website. Yeah, I also tried it on an A100.
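For context, a checkpoint's file size understates the training footprint: with plain fp32 Adam you hold weights, gradients, and two optimizer states per parameter before any activations. A back-of-the-envelope sketch, assuming roughly 1.3B parameters for a Phi-class model (an assumed figure; the thread only gives the file size):

```python
# Rough fp32 Adam training-memory estimate; parameter count is assumed.
params = 1.3e9  # Phi-1.5 class models are around 1.3B parameters

bytes_per_param = {
    "weights (fp32)": 4,
    "gradients (fp32)": 4,
    "Adam exp_avg (fp32)": 4,
    "Adam exp_avg_sq (fp32)": 4,
}

for name, b in bytes_per_param.items():
    print(f"{name:26s}{params * b / 1e9:5.1f} GB")
total = params * sum(bytes_per_param.values())
print(f"{'total before activations':26s}{total / 1e9:5.1f} GB")
```

That comes to roughly 21 GB before a single activation is stored, which alone could explain an OOM on a 24 GB 3090, though it would not by itself account for the same failure on an A40; mixed precision or an 8-bit optimizer shrinks the figure considerably.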
flash-singh · 6mo ago
Something is wrong with the code then.
RounMicLess · 6mo ago
AutoTrain? Should I open an issue? It's fine either way; I just want to be sure it isn't related to Runpod or the container, since I am able to run it locally on my computer.
justin · 6mo ago
I think, considering that people like kopylk are able to run very large training sets, I'd be surprised if there was an issue with Runpod. Maybe something in the code keeps pushing to VRAM without releasing it. I've used up a large amount of memory before for image generation and LLMs, where I've definitely run out, but after bumping up to something like an A100 I haven't had an issue.
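One common way code keeps pushing to VRAM without releasing it is accumulating a loss tensor that still carries its autograd history, so graph references pile up across the loop; this is the canonical example in PyTorch's out-of-memory FAQ. A sketch of the pattern and its fix, with illustrative names not taken from the thread:

```python
import torch

model = torch.nn.Linear(512, 512).cuda()
opt = torch.optim.Adam(model.parameters())
criterion = torch.nn.MSELoss()

running_loss = 0.0
for step in range(100):
    x = torch.randn(64, 512, device="cuda")
    loss = criterion(model(x), x)

    opt.zero_grad()
    loss.backward()
    opt.step()

    # Leak: `running_loss += loss` would keep each step's autograd
    # history referenced for the whole loop. Converting to a Python
    # float first lets that memory be freed.
    running_loss += loss.item()

print(f"mean loss: {running_loss / 100:.4f}")
```

The same applies to evaluation: forgetting torch.no_grad() around validation passes builds graphs that are never used, which also grows VRAM steadily.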