We are seeing many GPU pods (mainly A6000) stop working after 30 hours, and the VRAM content is lost as well.
We have reported this issue repeatedly, but there is still no solution and it keeps happening.
We have left a pod running (ID: cxquttq3m3kqvl) for you to debug. Can you please help?
Thanks