Runpod

We're a community of enthusiasts, engineers, and enterprises, all sharing insights on AI, Machine Learning and GPUs!

No space left on device on a serverless worker

I have trained a model with Replicate and used this article to create a Docker image from the newly trained model: https://blog.runpod.io/replicate-cog-migration/. The Docker image is ready on my serverless endpoint. I have attached storage ranging from 100 GB to 1 TB, but I still get this message on the worker (see the attached log file). What am I doing wrong? Thanks for your help.
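This error usually means the container disk is full, not the attached storage: a network volume mounts at a separate path (RunPod documents `/runpod-volume` on serverless workers), so downloads that default to the container filesystem still run out of space no matter how large the volume is. A minimal sketch under that assumption, with a hypothetical `pick_cache_dir` helper:

```python
import os

# Hypothetical helper: prefer the network volume for large model files so the
# (much smaller) container disk does not fill up. "/runpod-volume" is the
# mount path RunPod documents for network volumes on serverless workers.
def pick_cache_dir(volume_mounted: bool) -> str:
    if volume_mounted:
        return "/runpod-volume/model-cache"
    return "/tmp/model-cache"  # falls back to the container disk

cache_dir = pick_cache_dir(os.path.isdir("/runpod-volume"))
os.makedirs(cache_dir, exist_ok=True)
# e.g. point Hugging Face downloads at the volume before loading weights:
os.environ["HF_HOME"] = cache_dir
```

Frameworks like Cog may also cache under fixed paths baked into the image, so symlinking those directories onto the volume is another common workaround.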

Job completed 100% but stuck

I have a job that ran and finished without errors, reaching 100% via the job.progress_update API. But it stays at 100% and in the "in progress" state without ever moving to the completed state. The job actually seems complete, as I'm not billed and the endpoint goes idle....
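One likely cause, assuming the standard runpod-python handler contract: a job only moves from IN_PROGRESS to COMPLETED when the handler function returns, so `progress_update` can report 100% while the job stays in progress forever if the handler never returns. A sketch (the input shape is a placeholder):

```python
# Sketch of a RunPod serverless handler. The job leaves the IN_PROGRESS state
# only when this function returns; calling progress_update with "100%" updates
# the status text but does not complete the job by itself.
def handler(job):
    n = job["input"].get("n", 1)
    result = n * 2  # placeholder for the real work
    # In a real worker you would report progress via the runpod-python SDK:
    #   runpod.serverless.progress_update(job, "100%")
    return {"status": "done", "output": result}  # returning completes the job

# To serve it (requires the `runpod` package):
#   import runpod
#   runpod.serverless.start({"handler": handler})
```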

US-TX-3 completely down

Any ETA on when it will be restored? It's been a while now.

I want a list of CPU types by data center

The project I'm developing requires both GPU and CPU performance and needs to be serverless. I want to choose the CPU that's right for my project, so could you list the CPU types of GPU instances by data center?...

Veryyyyyy slow serverless vLLM

Considering moving away from RunPod; it's just insane how slow this is on a serverless RunPod 4090 GPU. Cold start of vLLM: model loading took 7.5552 GiB and 52.588290 seconds ...

Difficulty setting up ComfyUI Serverless.

hi, so I made a Dockerfile like so (I need this custom node to handle base64 I/O):

```dockerfile
FROM runpod/worker-comfyui:5.1.0-base
RUN comfy node install comfyui-tooling-nodes
```
...
Solution:
I figured it out: I needed to exclude docker.io/library/ from the URL when specifying my Docker image.

API endpoint URL

Alright, let me start by saying I'm very new to all this and am just trying to offload transcription for a specific program. I have set up an endpoint, but the software config only asks for an "endpoint URL". My question is: what do I put there so the transcription gets offloaded? As it is, I just get a 404 page not found in return. --- Remote Settings (Only used if TRANSCRIPTION_MODE=remote) --- URL of your running faster-whisper-server/speaches API...
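For a plain RunPod serverless endpoint, the request URL is built from the endpoint ID (e.g. `https://api.runpod.ai/v2/<endpoint_id>/runsync`) and every call needs an `Authorization: Bearer <api key>` header. Note that software expecting a faster-whisper-server/speaches-style API will still 404 against this unless the worker actually exposes that API. A sketch, with the endpoint ID and payload shape as placeholders:

```python
import json
import urllib.request

def runsync_url(endpoint_id: str) -> str:
    # Synchronous-request URL format for a RunPod serverless endpoint.
    return f"https://api.runpod.ai/v2/{endpoint_id}/runsync"

def call_endpoint(endpoint_id: str, api_key: str, payload: dict) -> bytes:
    req = urllib.request.Request(
        runsync_url(endpoint_id),
        data=json.dumps({"input": payload}).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
    # Network call; placeholder IDs will fail until filled in.
    with urllib.request.urlopen(req) as resp:
        return resp.read()
```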

Why does "costPerHr" in /v1/endpoints differ so much from the actual cost per hour?

It says the cost of running an NVIDIA A100 80GB PCIe is $1.64 per hour, which is a lie because it is actually $2.736 per hour. Also, GET /v1/endpoints does not include the name of the GPU the worker is running on: ```{ "createdAt":"2025-05-02T07:18:46.953Z",...

Queued Jobs not getting resolved

Does anyone know if there is a problem with serverless? A lot of my jobs are stuck in the queue, and even though I have idle workers, nothing seems to be running.

Container images loading stuck in a loop when launching

I have workers that never actually launch after pulling containers, and I have no idea how to debug this. I deleted and recreated the endpoint and get the same behavior. Any thoughts on how to resolve it? It is extra aggravating, as I had to spin this up because of the EU-SE-1 performance degradation and am now getting hit with this issue. Endpoint ID: d9b5s5qpbl0sfb. === Snip from the logs. This just loops repeatedly. I have the credentials set for Docker Hub, the image is published and available, etc. ===...

RTX-5090

When will we see the RTX 5090 in serverless? Currently it doesn't work because the vLLM version still doesn't support it. Will the RunPod team provide a nightly build of the container that runs on the RTX 5090?

environment variables

How do I get environment variables during the build? I set variables in the endpoint settings, but they are not available during the build. Please tell me if I missed something. ...
Solution:
You can either build the Docker image yourself, push it to a Docker registry, and deploy from there. Or, if you want to use the GitHub integration, you need to put your key inside the Dockerfile; this is the current solution.

Network Volumes for Custom Models with ComfyUI Serverless

Hey everyone, I'm trying to understand network volumes with RunPod serverless. If I create a ComfyUI serverless endpoint with the default worker image and attach a network volume, can I then launch a regular ComfyUI pod using that same volume to add my custom models and workflows? And will those custom models then be accessible when my serverless endpoint runs? Basically trying to extend the default image without modifying it.
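In general this workflow should work, assuming RunPod's documented mount paths: a network volume attaches to a pod at `/workspace` but to a serverless worker at `/runpod-volume`, so any absolute path baked into a workflow has to be translated between the two. A small sketch of that mapping (verify the paths for your setup):

```python
POD_MOUNT = "/workspace"             # where a network volume appears on a pod
SERVERLESS_MOUNT = "/runpod-volume"  # where the same volume appears on a serverless worker

def pod_to_serverless(path: str) -> str:
    """Translate a model path saved from a pod into its serverless equivalent."""
    if path.startswith(POD_MOUNT):
        return SERVERLESS_MOUNT + path[len(POD_MOUNT):]
    return path

# A checkpoint uploaded from the pod side:
print(pod_to_serverless("/workspace/models/checkpoints/custom.safetensors"))
# -> /runpod-volume/models/checkpoints/custom.safetensors
```

The worker image may also need its model search paths pointed at the volume (e.g. via ComfyUI's extra_model_paths mechanism); check the worker-comfyui docs for how it discovers models on an attached volume.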

build error (wrong path)

Hi, I think there is a bug in the build pipeline. I have set "Dockerpath" to "xx/yy/Dockerfile" in my serverless config, but when building I got an error: ```...

Change headers of webhook call

I would like to add an access token to the webhook call. In addition, my hosting service blocks requests with Go as the user agent 🤷...
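The webhook is just the URL passed in the /run request body, and I'm not aware of a documented way to set custom headers or the user agent on the callback. A common workaround is to put the token in the webhook URL itself as a query parameter and validate it on the receiving side. A sketch (the receiver URL and token are placeholders):

```python
import urllib.parse

def webhook_with_token(base_url: str, token: str) -> str:
    # Append ?token=... to the webhook URL so the receiver can authenticate
    # the callback without needing custom headers.
    parts = urllib.parse.urlsplit(base_url)
    query = urllib.parse.parse_qsl(parts.query)
    query.append(("token", token))
    return urllib.parse.urlunsplit(
        parts._replace(query=urllib.parse.urlencode(query))
    )

payload = {
    "input": {"prompt": "hello"},
    "webhook": webhook_with_token("https://example.com/hook", "s3cret"),
}
# POST payload to https://api.runpod.ai/v2/<endpoint_id>/run as usual.
```

A user-agent block on the hosting side would still have to be relaxed separately, since the callback's user agent is set by RunPod's client.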

Seeing a bunch of running workers but none of them is running jobs.

Hey, I've experienced a lot of serverless workers running, but none of them picks up requests to run jobs; all I see is the cost.

Regular "throttled" status

Hi, I've configured a serverless endpoint with the max_workers setting explicitly set to 1. I've observed that the single worker for this endpoint frequently enters and stays in the "Throttled" state. This seems to be causing significant delays in request processing, making them take much longer than the actual inference time. ...
Solution:
When you set max workers to 1, your worker is only deployed to a single machine. When you are not using it, we give that machine to other people, and when the machine is fully in use, your worker will be throttled. We highly suggest avoiding a max worker setting of 1.

Error running ComfyUI workflow from pod on serverless

I’m encountering an error when running a ComfyUI workflow from a pod on RunPod serverless. Previously, I was running the ComfyUI workflow on a pod with network storage mounted, and it worked fine. Now, I want to run the workflow via API, so I deployed the endpoint using the image: timpietruskyblibla/runpod-worker-comfy:3.4.0-base with my network storage mounted as the endpoint’s storage....

Requests stuck in queue

Hi, I am having issues with my serverless deployment: tasks are stuck in the queue for 6-10 minutes while there are idle workers (screenshot 1). I believe the issue is with how the container is started, not with the image itself....

Disk Volume Pricing for Serverless

I'm looking for clarification on disk pricing for serverless workers. The pricing page lists a Container Disk price of $0.10/GB/month for running pods (and $0.20/GB/month for idle pods). How does this translate to serverless workers? When I create a template for my endpoint I specify a Volume Disk (e.g. 20 GB); how am I being charged for this? 20 * $0.20 * number of workers per month (assuming the workers are idle)? ...
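Under the poster's reading of the pricing page (an assumption — actual serverless disk billing should be confirmed with RunPod, since serverless disk may only be billed while workers are active), the proposed arithmetic works out as:

```python
# Hypothetical cost estimate using the rates quoted in the question:
# $0.20/GB/month for idle disk, a 20 GB volume disk per worker.
IDLE_RATE_PER_GB_MONTH = 0.20
VOLUME_GB = 20

def monthly_disk_cost(num_workers: int) -> float:
    return VOLUME_GB * IDLE_RATE_PER_GB_MONTH * num_workers

print(monthly_disk_cost(3))  # 12.0 per month under these assumptions
```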