Issues when restarting stopped pod

For a few days, I've had multiple issues when restarting a stopped pod. It just hangs saying "Container is not running". Once I briefly caught an error in the system console about 'failed to start networking' and 'driver failed to program'. Is that an issue on the RunPod infra?

I should note that I'm running the exact same container image over and over again, and if I terminate the one that failed and re-create it from scratch, it works every time. But I thought you were supposed to be able to restart a stopped pod?

Oh, and I can confirm that the API says the pod was restarted with the GPU attached and is in 'RUNNING' state, but it still has the issue described above.

Here are two pod IDs with the approximate time of the failure for reference:
v8ncs5gdf40rm4: 6:44pm US ET
e7kere9fn8wyj6: 8:26pm US ET
