TL;DR: Pod always restarts Docker command, never leaves RUNNING state. Company blog claims otherwise.
Hi there! I was trying to run a one-off job using an ECR-based Docker image on RunPod.
I'd like to create a Pod that uses a Docker container from AWS ECR, run the command, and let the Pod finish the command. I want to poll the Pod status and terminate the Pod as soon as it finishes. The blog says:
Design your container or job to exit when finished. If it’s a one-off batch job, ensure the container’s command will naturally terminate (and not linger).
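A minimal sketch of the polling loop described above, to make the intent concrete. Everything here is an assumption for illustration: `wait_and_terminate`, the `get_status` and `terminate` callables, and the status names in `TERMINAL_STATUSES` are hypothetical and not confirmed RunPod values. The timeout is there so the pod is still cleaned up if it never leaves RUNNING.

```python
import time
from typing import Callable

# Status names treated as "finished". These are assumptions for the sketch,
# not confirmed RunPod status values.
TERMINAL_STATUSES = {"EXITED", "TERMINATED"}


def wait_and_terminate(
    get_status: Callable[[], str],
    terminate: Callable[[], None],
    poll_seconds: float = 10.0,
    timeout_seconds: float = 3600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll until the pod reports a terminal status or the timeout passes,
    then terminate it. Returns the last status seen."""
    deadline = time.monotonic() + timeout_seconds
    status = get_status()
    while status not in TERMINAL_STATUSES and time.monotonic() < deadline:
        sleep(poll_seconds)
        status = get_status()
    # Terminate in both cases: job finished, or timeout reached while still running.
    terminate()
    return status
```

In practice `get_status` and `terminate` would wrap whatever client is used to read and delete the pod (for example the RunPod Python SDK or the GraphQL API); the exact calls and the field that carries the status are left out here because they should be checked against the current docs.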
Problem: the created pod doesn't terminate. Instead, it restarts the container command over and over and never actually finishes. What I did:
1. Created a template that points to a container in my AWS ECR
2. Ran a pod
3. I observe the following logs:
start container for ***.dkr.ecr.us-east-1.amazonaws.com/***:jenkins-7264_env.gpu: begin
start container for ***.dkr.ecr.us-east-1.amazonaws.com/***:jenkins-7264_env.gpu: begin
start container for ***.dkr.ecr.us-east-1.amazonaws.com/***:jenkins-7264_env.gpu: begin
start container for ***.dkr.ecr.us-east-1.amazonaws.com/***:jenkins-7264_env.gpu: begin
start container for ***.dkr.ecr.us-east-1.amazonaws.com/***:jenkins-7264_env.gpu: begin
start container for ***.dkr.ecr.us-east-1.amazonaws.com/***:jenkins-7264_env.gpu: begin
Explains how to use Runpod’s API to run AI jobs on a schedule or on-demand, so GPUs are active only when needed. Demonstrates how scheduling GPU tasks can reduce costs by avoiding idle time while ensuring resources are available for peak workloads.