Support for terminating pods via SkyPilot
Hi, I want to let my training runs go overnight and terminate the pod once they have finished training. To do this, I am currently using SkyPilot. Whenever I try to stop a pod via SkyPilot, I get an error similar to
"Stopping is currently not supported for RunPod." Can RunPod please support this feature?
It would also be useful to be able to set image_id so I can use the template runpod/pytorch:2.4.0-py3.11-cuda12.4.1-devel-ubuntu22.04 instead of the default, which has an old version of CUDA.
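For reference, here is a rough sketch of what the request amounts to: a SkyPilot task file whose resources section pins a Docker image through image_id. The helper name render_task_yaml, the A100:1 accelerator, and the train.py command are placeholders made up for illustration. The docker: prefix is SkyPilot's general syntax for container images; as reported in this thread, the RunPod integration rejected image_id at the time, so treat this as the intended shape rather than something known to work on RunPod.

```python
def render_task_yaml(image: str, run_cmd: str, accelerators: str = "A100:1") -> str:
    """Render a minimal SkyPilot task YAML as plain text.

    Hypothetical helper: the accelerator default and the run command are
    placeholders, and RunPod may reject image_id (see the thread).
    """
    lines = [
        "resources:",
        "  cloud: runpod",
        f"  accelerators: {accelerators}",
        # SkyPilot's generic syntax for using a Docker image as the runtime.
        f"  image_id: docker:{image}",
        "",
        "run: |",
        f"  {run_cmd}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_task_yaml(
        "runpod/pytorch:2.4.0-py3.11-cuda12.4.1-devel-ubuntu22.04",
        "python train.py",
    ))
```

Saving the output as task.yaml and passing it to sky launch is the usual workflow; whether the launch succeeds depends on the integration accepting image_id.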
CC @Luke
If you’re using a network volume, there’s no need for a “stop” option since all your data is stored in the network volume. You can safely terminate the pod without losing data.
Regarding your second question, I didn’t quite follow. When modifying the template, you can specify any docker image you prefer.
I am trying to terminate the pod via the CLI using the SkyPilot integration, but I get an error that it's not supported.
Same for the template: I want to set it via the CLI using SkyPilot, but I get an error that it's not supported.
I am trying to build off of this tutorial, using the features in SkyPilot:
https://docs.runpod.io/tutorials/integrations/skypilot
it is not
I can post it later today. The main thing that differs is that I specify ‘image_id’ to try to get a torch 2.4 template, but it says it's not supported with RunPod.
Working backwards, are there any docs on specifying a template in SkyPilot with RunPod? Is there any way to auto-terminate a pod when it's idle (i.e. when the training run ends)?
That's what I used :p
I just don't think RunPod supports these features in the integration.