That was exactly it. torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 20.00 MiB. GPU

That was exactly it.

torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 20.00 MiB. GPU 0 has a total capacity of 79.20 GiB of which 16.62 MiB is free. Process 2510593 has 79.18 GiB memory in use.,

training was successful on batch size=2 on h100 sxm
This thing is demanding indeed, consumed like 77gb vram
Was this page helpful?