Runpod · 4mo ago
gx

OutOfMemoryError: CUDA out of memory

I keep getting this error when trying to run various models (e.g., gpt-oss-20b, llama-3.3-70b) on pods. Even when running on GPUs with far more VRAM than the model requires (e.g., a 141 GB H200 for gpt-oss-20b), I still get this error. I have tried setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True, but that didn't fix it.
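For reference, here's roughly how I'm setting the variable (a minimal sketch; my understanding is the caching allocator only reads PYTORCH_CUDA_ALLOC_CONF when CUDA is first initialized, so it has to be in the environment before torch touches the GPU):

```python
import os

# Must be set before torch initializes CUDA. Setting it mid-session
# (e.g., in a notebook cell after the GPU is already in use) has no effect.
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "expandable_segments:True"

import torch  # imported only after the env var is in place

print(torch.cuda.get_device_name(0))
```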

[info] Pipeline stopped due to error: CUDA out of memory. Tried to allocate 42.49 GiB. GPU 0 has a total capacity of 139.72 GiB of which 38.96 GiB is free. Process 584678 has 100.75 GiB memory in use. Of the allocated memory 99.91 GiB is allocated by PyTorch, and 181.47 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation. See documentation for Memory Management (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)
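In case it helps with debugging, this is how I've been inspecting the allocator state right before the failing step (torch.cuda.memory_summary and torch.cuda.mem_get_info are stock PyTorch APIs):

```python
import torch

# Allocator stats for GPU 0; the "reserved but unallocated" bucket is the
# fragmentation the error message refers to.
print(torch.cuda.memory_summary(device=0, abbreviated=True))

# Raw free/total memory as CUDA reports it, in bytes.
free, total = torch.cuda.mem_get_info(0)
print(f"free {free / 2**30:.1f} GiB of {total / 2**30:.1f} GiB")
```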