Pod ran out of CPU RAM
I somehow managed to run out of RAM (not VRAM, system RAM)... right after a very compute-heavy operation (calculating quantized KV-cache scales)... while running model.save_pretrained... while the weights are still in VRAM... The pod is still running, but completely unresponsive.
Now that you're done laughing at my misfortune, is there anything at all I can do to save those weights? Even enabling some swap would be completely fine... I just want the weights to save to the networked drive...
Pod ID: tybrzp4aphrz3d