Runpod


We're a community of enthusiasts, engineers, and enterprises, all sharing insights on AI, Machine Learning and GPUs!


Workers and rate increasing

Hello everyone, can someone explain to me how the number of workers affects spend? As I understood it, even with 100 workers the cost wouldn't change, because the same capacity would be used. But it seems to me that I'm wrong, because that would be strange, wouldn't it?

what am i doing wrong, serverless workers optimization



"ComfyUI to API"-wizard

We’ve built a tool that takes your ComfyUI workflow (the full export via Comfy → File → Export, not the API export) and turns it into a ready-to-use serverless endpoint. It automates the complicated bits for you. How it works: 1. Upload the workflow.json exported from Comfy → File → Export (not the API export). 2. Analyze your workflow: click Analyze to run our comfy-agent. It detects custom nodes, locates required models and prepares a Dockerfile...
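If you're not sure which export you have, a quick structural check helps: the full "File → Export" format commonly has top-level `nodes` and `links` arrays, while the "Export (API)" format is a flat dict keyed by node id. This is a sketch based on the commonly seen shapes of the two formats, not an official schema:

```python
def is_full_export(workflow: dict) -> bool:
    # Full "File → Export": top-level "nodes" list plus "links";
    # API export: {"<node id>": {"class_type": ..., "inputs": ...}}.
    return isinstance(workflow.get("nodes"), list) and "links" in workflow

full = {"nodes": [{"id": 1, "type": "KSampler"}], "links": [], "version": 0.4}
api = {"1": {"class_type": "KSampler", "inputs": {}}}
print(is_full_export(full))  # True
print(is_full_export(api))   # False
```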

Stuck initializing vLLM

I'm using the official runpod vLLM image for a serverless endpoint, using all default settings (besides the model), and my workers are all stuck initializing. I have a job in the queue (just hitting /v1/models) for 20+ minutes now and there are five workers (3 regular + 2 extra) just spinning "initializing". I don't see anything in the worker logs. What am I doing wrong, and more to the point, how do I figure out what I'm doing wrong? Just seeing logs would be nice. (Endpoint id: 99ky9cmdcjj2hm)
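One place to look while worker logs are empty is the endpoint's health route (`GET /v2/{endpoint_id}/health` on `api.runpod.ai`, with an `Authorization: Bearer <API_KEY>` header), which reports worker and queue counts. The response shape below is sketched from memory, so treat the field names as assumptions:

```python
API_BASE = "https://api.runpod.ai/v2"  # Runpod serverless REST base

def health_url(endpoint_id: str) -> str:
    # GET this with an "Authorization: Bearer <API_KEY>" header.
    return f"{API_BASE}/{endpoint_id}/health"

def summarize_health(resp: dict) -> str:
    # Field names ("workers", "jobs", "initializing", ...) are assumed.
    w = resp.get("workers", {})
    j = resp.get("jobs", {})
    return (f"{w.get('initializing', 0)} initializing / "
            f"{w.get('ready', 0)} ready, {j.get('inQueue', 0)} queued")

sample = {"workers": {"initializing": 5, "ready": 0},
          "jobs": {"inQueue": 1, "inProgress": 0}}
print(health_url("99ky9cmdcjj2hm"))
print(summarize_health(sample))
```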

Get a cost with the result?

Can a serverless endpoint return the cost as well? Maybe I missed some documentation; I want to display the cost of every serverless execution on my frontend.
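The job status response includes an `executionTime` field (in milliseconds), so one approach is to compute the cost yourself from that and your GPU's per-second price. A minimal sketch, assuming a hypothetical price of $0.00044/s (look up the actual rate for your worker's GPU tier):

```python
def job_cost_usd(execution_time_ms: int, price_per_second: float) -> float:
    # executionTime on the /status response is reported in milliseconds.
    return round(execution_time_ms / 1000 * price_per_second, 6)

# e.g. a 12.5 s run at a hypothetical $0.00044/s
print(job_cost_usd(12_500, 0.00044))  # 0.0055
```

Note this only covers execution time; any idle/cold-start billing your endpoint config incurs would not be reflected.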

Export Serverless Metrics

Is it possible to export Serverless metrics, from the GUI or the API? It seems there was an API for this last year (found in an old post), but I'm getting a 401 when I try it today....
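A 401 on the GraphQL API (`https://api.runpod.io/graphql`) usually means the key isn't being passed the way the server expects; it has historically accepted either an `Authorization: Bearer` header or an `?api_key=` query parameter. The query fields below are illustrative only, not a verified schema:

```python
import json

API_URL = "https://api.runpod.io/graphql"

# Illustrative query -- field names are an assumption, not verified schema.
QUERY = """
query Endpoints {
  myself {
    endpoints { id name }
  }
}
"""

def build_request(api_key: str) -> dict:
    # Returns the pieces you'd hand to your HTTP client of choice.
    return {
        "url": API_URL,
        "headers": {"Content-Type": "application/json",
                    "Authorization": f"Bearer {api_key}"},
        "body": json.dumps({"query": QUERY}),
    }

req = build_request("YOUR_API_KEY")
print(req["url"])
```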

Workers getting stuck at Websocket receive time out. still waiting...

I am trying to run ComfyUI on serverless. I built a 40 GB Docker image and added it to serverless. My models and nodes load into memory, but the process gets stuck at "Websocket receive time out. still waiting..." I have provided my logs and the utilization screenshot....
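That log line typically comes from a handler that polls ComfyUI over a websocket: each `recv` has a short timeout, and the message repeats while the workflow is still executing (or ComfyUI has silently died). A generic sketch of that pattern with an overall deadline, shown here with a stub standing in for `ws.recv()`:

```python
import time

def receive_with_retry(recv, per_call_timeout=5.0, max_wait=60.0):
    """Call recv(timeout) until it returns a message or max_wait elapses.

    recv is expected to raise TimeoutError when nothing arrives in time.
    """
    deadline = time.monotonic() + max_wait
    while time.monotonic() < deadline:
        try:
            return recv(per_call_timeout)
        except TimeoutError:
            print("Websocket receive timed out, still waiting...")
    raise TimeoutError(f"no message within {max_wait}s; "
                       "check whether ComfyUI is still alive")

# Stub that succeeds on the third call, standing in for ws.recv():
calls = {"n": 0}
def fake_recv(timeout):
    calls["n"] += 1
    if calls["n"] < 3:
        raise TimeoutError
    return '{"type": "executing", "data": {"node": null}}'

msg = receive_with_retry(fake_recv, per_call_timeout=0.01)
print(msg)
```

If the message repeats forever, the usual suspects are the workflow genuinely taking longer than the overall deadline, or the ComfyUI process having crashed (check its own log, not just the handler's).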

So constant Crashes, outages, errors, lag

So are you guys dying? Not using our money to upgrade? What's the deal? And when are you going to start crediting accounts for this abysmal service?

Workers throttled and outdated despite many choices

I currently have one idle worker that is outdated and 4 other workers that are showing up as throttled. They all are H200 SXM workers, although my endpoint is configured to support many other cards and the GPU Configuration options I have selected say High Supply, Low Supply, and Medium Supply. Why is my outdated worker not being updated to the latest release?...

Best practices to deal with serverless throttling

Since we are not the only ones dealing with serverless throttling (https://discord.com/channels/912829806415085598/1413624731533443137/1425369322527920211), perhaps there are already best practices/recommendations to mitigate the issue? For example, run more workers, use different GPUs, use Instant Clusters, put some parameters in place, etc.?
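One of the mitigations mentioned above (spreading across GPU types/pools) can be sketched as a simple priority-ordered fallback: probe each endpoint and submit to the first one that isn't fully throttled. The endpoint ids below are hypothetical, and `is_throttled` stands in for whatever probe you use (e.g. the health route's worker counts):

```python
def pick_endpoint(endpoints, is_throttled):
    """Return the first endpoint whose workers aren't all throttled.

    endpoints: endpoint ids in priority order (e.g. one per GPU pool);
    is_throttled: callable probing each endpoint's current state.
    """
    for ep in endpoints:
        if not is_throttled(ep):
            return ep
    return None  # everything throttled: queue locally or alert

# Stub state: pretend the H200 pool is throttled, the A100 pool is not.
state = {"ep-h200": True, "ep-a100": False}
print(pick_endpoint(["ep-h200", "ep-a100"], state.__getitem__))  # ep-a100
```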

Did Runpod break vLLM bnb 4-bit?

I used Runpod a year ago and was able to load a Llama3-8B finetune into vLLM and quantize it on the fly to 4-bit using BNB. I've been trying that with a Qwen3-14B finetune recently and I can't seem to get it to work. I also merged my finetune to 4-bit bnb safetensors and it also refuses to load. Is there some new configuration I need to use to get this to work now?...
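One thing worth checking: vLLM's bitsandbytes support has changed across releases, and some versions required setting both the quantization and the load format for on-the-fly bnb. A sketch of the corresponding endpoint env vars for the Runpod vLLM worker; the variable names and the model repo id here are assumptions (check your worker image's README for the exact names it reads):

```python
# Hypothetical endpoint template env vars for the Runpod vLLM worker.
# QUANTIZATION maps to vLLM's --quantization flag; on some vLLM builds,
# on-the-fly bnb also required --load-format bitsandbytes (LOAD_FORMAT
# here). Exact variable names depend on the worker image version.
env = {
    "MODEL_NAME": "your-org/Qwen3-14B-finetune",  # placeholder repo id
    "QUANTIZATION": "bitsandbytes",
    "LOAD_FORMAT": "bitsandbytes",
    "DTYPE": "bfloat16",
}
for key, value in env.items():
    print(f"{key}={value}")
```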

running synchronous workload on serverless

I want to run ComfyUI synchronously using my web-server wrapper: https://github.com/ImmarKarim/comfyui-api-wrapper . It has functionality that is very useful to me. My backend sends an API request to the wrapper and expects the response over the same connection. To get this working on Runpod serverless, can you please help clarify: 1) can I deploy this wrapper out of the box and expect it to work in Runpod's serverless ecosystem?...
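For the "response in the same connection" requirement, Runpod serverless exposes a `/runsync` route that holds the HTTP connection open and returns the result in the same response (subject to a platform-side time limit), in contrast to `/run`, which returns a job id to poll via `/status`. A sketch of what the request looks like; the payload shape inside `input` is up to your handler:

```python
import json

def runsync_request(endpoint_id: str, payload: dict, api_key: str) -> dict:
    # /runsync blocks until the job finishes (or the sync time limit
    # is hit); /run + /status polling is the asynchronous alternative.
    return {
        "method": "POST",
        "url": f"https://api.runpod.ai/v2/{endpoint_id}/runsync",
        "headers": {"Authorization": f"Bearer {api_key}",
                    "Content-Type": "application/json"},
        "body": json.dumps({"input": payload}),
    }

req = runsync_request("YOUR_ENDPOINT_ID", {"workflow": {}}, "YOUR_API_KEY")
print(req["url"])
```

Note this is the platform's sync mechanism; a custom wrapper that expects to own the HTTP connection itself would instead need to be adapted into a handler, or run behind a load-balancer-style endpoint.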

Unauthorized error with admin role

Hi RunPod team, I’m running into an issue with a Serverless endpoint where: - I am logged in with my personal user account (confirmed Admin role in the team)...

98% Speed Optimization Achieved - Can We Go Further?

Current Setup & Results — Architecture: RunPod Serverless + ComfyUI InfiniteTalk I2V workflow (Image-to-Video with audio)...

Build Stuck in Pending for Hours - Need Help

Hi, I've been trying to deploy my GitHub repo to RunPod Serverless, but the build has been stuck in "Pending" status for over an hour. Workers show "initializing" but never start. Can someone help me figure out what's wrong? Thanks.

Best workflow to test if serverless docker containers work?

Currently, this is my workflow: make a code change locally in the Dockerfile → cloud build happens (30 min) → image pushed to Docker Hub → I restart Runpod serverless workers manually → I send a request using the test page → Runpod downloads the image (10 min) → model actually runs (10 min). Thus, it takes an hour to test any change I make to serverless Docker containers. What is a good way to build images or test changes more quickly?...
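One way to cut most of that loop: test the handler function directly on your machine before any image build, by calling it with a sample event shaped like a real serverless request. (I believe the runpod Python SDK also supports serving a handler locally over HTTP via `python handler.py --rp_serve_api`; check the SDK docs for the exact flag.) A minimal sketch with a hypothetical handler:

```python
# handler.py -- the same function the worker would run via
# runpod.serverless.start({"handler": handler}); the handler body
# here is a hypothetical stand-in for your model code.
def handler(event):
    prompt = event["input"].get("prompt", "")
    return {"echo": prompt.upper()}

if __name__ == "__main__":
    # Exercise the handler directly, without Docker or a GPU, using a
    # sample event shaped like a real serverless request.
    sample_event = {"id": "local-test", "input": {"prompt": "hello"}}
    print(handler(sample_event))  # {'echo': 'HELLO'}
```

Pairing this with moving model weights out of the image (e.g. onto a network volume) usually shrinks both the build and the worker download times considerably.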

Runpod Serverless Build Pending for HOURS

1. We pushed a new release to update our serverless endpoint.
2. The build has been pending for 2 hours with no logs.
3. This is the job ID — 0d26531f-a8cb-4de8-9d5f-256fef613ee2...

Setting up your own serverless AI video generator with ComfyUI + WAN 2.2

I recently experimented with building a serverless AI video generation workflow using ComfyUI + WAN 2.2 on RunPod, and I wanted to share what I learned. Here’s the video: YouTube Link I’d love to hear if anyone else has tried something similar, or if you have tips for optimizing serverless AI video generation workflows....

Jobs stuck in queue

Jobs seem to be stuck in a queue - workers are available but not processing requests

Intermittent 502 Bad Gateway Errors on Serverless Load Balancer Endpoints

Hi RunPod team and community, I’m experiencing intermittent 502 Bad Gateway errors on my serverless load balancer endpoints. The requests are usually processed normally, and my logs don’t show any clear pattern or error when the 502 occurs. The timeout settings are reasonable and not being hit. Has anyone else encountered this issue?...
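While waiting on a root cause, intermittent 502s from a load balancer are usually safe to retry from the client, since they indicate the gateway failed before your handler produced a response. A generic sketch of jittered exponential backoff on transient gateway errors, shown with a stub request:

```python
import random
import time

def call_with_backoff(do_request, max_attempts=4, base_delay=0.5):
    """Retry transient gateway errors (502/503/504) with jittered
    exponential backoff; return the first non-gateway-error response."""
    for attempt in range(max_attempts):
        status, body = do_request()
        if status not in (502, 503, 504):
            return status, body
        time.sleep(base_delay * (2 ** attempt) * random.uniform(0.5, 1.0))
    return status, body  # still failing after max_attempts

# Stub request that 502s twice, then succeeds:
responses = iter([(502, ""), (502, ""), (200, "ok")])
result = call_with_backoff(lambda: next(responses), base_delay=0.01)
print(result)
```

Only retry requests that are idempotent on your side, or deduplicate them in the handler, since the original request may still have partially executed.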