Runpod

We're a community of enthusiasts, engineers, and enterprises, all sharing insights on AI, Machine Learning and GPUs!

⚡|serverless

⛅|pods

🔧|api-opensource

📡|instant-clusters

🗂|hub

Extremely High Delay Time

📅 Date/Time: Sept 9, 2025 📊 Observed Behavior: At this timestamp, I noticed an extreme delay time of ~70,722,000 ms (≈ 19.6 hours) on my pod: • P70: 70,722,309 ms • P90: 70,722,363 ms...

Serverless endpoint deployment: Something went wrong. Please try again later or contact support.

Your UI at console.runpod.io keeps showing: Something went wrong. Please try again later or contact support....

How long is the delay on serverless?

I'm consistently getting 30 minutes of delay (often 1-2 hours or more) for requests to my serverless endpoint. Is this the default? This is totally unusable.

File caching question

For my processes I have a file that needs to be compiled and cached per GPU type. It's around 22 MB. Right now I only need to support 4 GPUs, but I might need more later. Should I bake all of the precompiled files into the Docker image, or have a script fetch them from a file bucket, or something else?
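A minimal sketch of the bucket approach, assuming the precompiled artifacts sit in an object-storage bucket keyed by GPU name and get cached on a writable volume; the bucket URL, file-naming scheme, and cache path are placeholders, not Runpod APIs:

```python
# Sketch only: fetch the precompiled file for the current GPU type at container
# start, downloading it once and caching it for later cold starts.
# BUCKET_URL, CACHE_DIR and the file-naming scheme are assumptions.
import os
import urllib.request

import torch

BUCKET_URL = "https://my-bucket.example.com/compiled"   # hypothetical bucket
CACHE_DIR = "/runpod-volume/compiled-cache"             # any writable, persistent path


def ensure_compiled_file() -> str:
    gpu_name = torch.cuda.get_device_name(0).replace(" ", "_")  # e.g. NVIDIA_A100
    local_path = os.path.join(CACHE_DIR, f"{gpu_name}.bin")
    if os.path.exists(local_path):
        return local_path                                        # already cached
    os.makedirs(CACHE_DIR, exist_ok=True)
    urllib.request.urlretrieve(f"{BUCKET_URL}/{gpu_name}.bin", local_path)
    return local_path
```

At ~22 MB per GPU type, baking all four files into the image is also perfectly reasonable; the bucket route mainly pays off once the number of GPU types grows.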

Help! I want to use the public endpoint. Crazy hallucination with the standard format

I followed the format Runpod provided for setting up the public endpoint, but the output is completely unrelated. How do we structure the code calling the public endpoint so we get output like the official website interface? Please help.
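For reference, a minimal sketch of calling an endpoint over the REST /runsync route. The fields inside "input" (here "prompt" and "max_tokens") are assumptions and must match the input schema documented for the specific public endpoint; a mismatched schema is a common cause of unrelated output:

```python
# Sketch only: synchronous call to a Runpod endpoint. ENDPOINT_ID and the
# "input" fields are placeholders; use the schema from your endpoint's docs.
import os

import requests

ENDPOINT_ID = "your-endpoint-id"                 # hypothetical
API_KEY = os.environ["RUNPOD_API_KEY"]

resp = requests.post(
    f"https://api.runpod.ai/v2/{ENDPOINT_ID}/runsync",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={"input": {"prompt": "Hello!", "max_tokens": 128}},
    timeout=120,
)
resp.raise_for_status()
print(resp.json())
```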

Workers stuck at "Running" indefinitely until removed by hand

It was all good for a month or so, but lately (the past 3-4 days) many of my workers have started randomly getting stuck at "Running" with errors like this (this is worker qv6a8l4769xx3q). They keep running like this, accumulating infinite uptime and seemingly affecting the queue (I noticed many jobs waiting for minutes despite having 5 idle workers ready). ```...

Workers are getting throttled

Hey guys, workers are getting throttled. I have a 50-worker limit and most of them are getting throttled. My application is being impacted heavily. As a note, it's mostly happening for US-based workers. I have no preference around GPU or CUDA, so it's starting workers randomly across the globe....

Serverless shared workers?

I have 2 endpoints, and they seem to share the same workers? So every time I do a new release, it gets applied to both of them...
Solution:
You probably created both endpoints from the same template, or cloned one. Try creating a new endpoint manually and configuring its settings yourself.

build stuck at pending before it starts building

Serverless build importing a repo from GitHub with a Dockerfile; the build gets stuck at the pending stage.

Intermittent "CUDA error: device-side assert triggered" on Runpod Serverless GPU Worker

I’m deploying a GPU-based service on Runpod Serverless. Most requests run fine, but after some time I start getting errors like: "error": "CUDA error: device-side assert triggered CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect. For debugging consider passing CUDA_LAUNCH_BLOCKING=1 Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.", "executionTime": 120575...
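One hedged debugging step, taken straight from the error message itself: setting CUDA_LAUNCH_BLOCKING=1 in the worker, before any CUDA library is imported, makes kernel launches synchronous so the stack trace points at the call that actually triggered the assert:

```python
# Sketch only: enable synchronous CUDA launches for accurate stack traces.
# This slows kernels down, so use it for debugging rather than production.
import os

os.environ["CUDA_LAUNCH_BLOCKING"] = "1"

import torch  # noqa: E402  -- must be imported after the env var is set
```

(TORCH_USE_CUDA_DSA, also mentioned in the message, is a compile-time option and can't simply be flipped on in a prebuilt wheel.)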

TypeScript handler functions

Does Runpod support TypeScript handler functions for serverless endpoints, or only Python? We're using Runpod to accelerate scientific workloads and just shelling out to C++ binaries in the serverless function. Since the rest of our backend is TypeScript, it's mildly unpleasant to support Python for this single use case....
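The handler SDK Runpod documents for serverless workers is the Python one; if TypeScript handlers aren't an option, a thin Python shim that just shells out to the C++ binary keeps the Python surface small. A minimal sketch, where the binary path and CLI flags are made-up placeholders:

```python
# Sketch only: thin Python handler that delegates the real work to a C++ binary.
# The binary path, flags, and input fields are placeholders.
import subprocess

import runpod


def handler(job):
    args = job["input"]  # whatever your binary expects, passed through as-is
    result = subprocess.run(
        ["/app/bin/compute", "--config", args.get("config", "default")],
        capture_output=True,
        text=True,
        check=True,
    )
    return {"stdout": result.stdout}


runpod.serverless.start({"handler": handler})
```

Job submission from the rest of the backend can stay in TypeScript, since that side only needs HTTP calls to the endpoint.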

Not receiving webhook requests after job finishes

For worker hhw01hbcmodswc. Example request IDs: 7be46aaf-633b-403b-b216-946ccf98958f-e1, fc40f664-c679-403e-a7f3-c9b7a28ad251-e1...

Am I billed for any of this?

4 workers are on idle and I haven't touched them for 10 minutes; the one that's initializing is extra, though, and I didn't ask for it. Do I only get billed while a request is being handled, from start to finish?

Unauthorized while pulling image for Faster Whisper Template from Hub

Suddenly getting the following error while using the Faster Whisper template from the Hub; it worked fine before: loading container image from cache Loaded image: registry.runpod.net/runpod-workers-worker-faster-whisper-main-dockerfile:bd500dc88 error pulling image: Error response from daemon: unauthorized...

ComfyUI Serverless Worker CUDA Errors

Some serverless workers run into runtime CUDA errors and fail silently. Is there any way to tackle this? Can I somehow get Runpod to fire a webhook so I can at least retry? Any solutions to make serverless more predictable? How are people deploying production-level ComfyUI inference on serverless? Am I doing something wrong?...
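On the webhook question: the async /run route accepts a webhook URL in the request body that Runpod calls when the job finishes, which gives your backend a hook to spot failed jobs and resubmit them. A minimal sketch, with the endpoint ID, workflow payload, and callback URL as placeholders (the exact "input" shape depends on the ComfyUI worker you deploy):

```python
# Sketch only: submit the job asynchronously with a webhook so your backend is
# notified when it completes (or errors) and can decide whether to retry.
import os

import requests

ENDPOINT_ID = "your-endpoint-id"                  # hypothetical
API_KEY = os.environ["RUNPOD_API_KEY"]
WORKFLOW = {}                                     # your ComfyUI workflow JSON goes here

resp = requests.post(
    f"https://api.runpod.ai/v2/{ENDPOINT_ID}/run",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={
        "input": {"workflow": WORKFLOW},
        "webhook": "https://my-backend.example.com/runpod-callback",  # placeholder
    },
    timeout=30,
)
print(resp.json()["id"])  # job id; the callback payload reports status and output
```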

CUDA error in ComfyUI

I'm getting this error; everything used to work until now, and I don't know what's wrong 🤔 id: dbfc886a-8465-4112-ba2c-e1c8e297bbb4-e2 Workflow execution error: Node Type: CLIPTextEncode, Node ID: 10, Message: CUDA error: no kernel image is available for execution on the device\nCUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.\nFor debugging consider passing CUDA_LAUNCH_BLOCKING=1\nCompile with TORCH_USE_CUDA_DSA to enable device-side assertions.\n\n"
Solution:
I'll try RTX 4090 now

Is there any way to force stop a worker?

So I have a project that requires a worker to run so I can access the app on that worker via its API. The problem is, this takes about a minute and then I don't need it anymore, but the worker keeps running for about 8 minutes when I don't need it. So I was wondering, is there any way to force stop it? I didn't find any API calls for this in the docs, the execution timeout option seems to do nothing, and even canceling the job and purging the queue via the API doesn't stop it. Help would be appreciated.

Long delay time

Hi, my serverless inference requests always have a long delay time of 40-50 seconds. What exactly is this delay time? My Docker image is quite big; would making it smaller reduce the delay time? Thank you.