Runpod
peanut_ · 8mo ago

Running out of memory

Hi, the OG kohya template from Runpod was taken down and not replaced, so now I'm using the InvokeAI template. I can't complete any training because it keeps crashing with out-of-memory errors. I've never had this happen before.
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 512.00 MiB. GPU 0 has a total capacty of 23.54 GiB of which 303.12 MiB is free. Process 2505369 has 384.00 MiB memory in use. Process 2505421 has 7.50 GiB memory in use. Process 2521763 has 15.35 GiB memory in use. Of the allocated memory 14.30 GiB is allocated by PyTorch, and 581.12 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
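(Not a confirmed fix, just a minimal sketch of what the error message itself suggests: setting max_split_size_mb via PYTORCH_CUDA_ALLOC_CONF before PyTorch initializes CUDA can reduce fragmentation. The value 512 below is only an example and may need tuning; it won't help if other processes are genuinely holding most of the 24 GiB.)

import os

# Must be set before torch initializes its CUDA allocator,
# i.e. before "import torch" (or exported in the shell before
# launching the training script).
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:512"

import torch

# Optional sanity check: report how much memory the allocator sees.
print(torch.cuda.get_device_properties(0).total_memory / 1024**3, "GiB total on GPU 0")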