Runpod

We're a community of enthusiasts, engineers, and enterprises, all sharing insights on AI, Machine Learning and GPUs!

Testing default "hello world" post with no response after 10 minutes

Attached a few pics of what I tried to do. I eventually cancelled it after a little under 10 minutes without ever getting a reply; it just stayed in the queue. I assume I'm doing something wrong. I left all endpoint settings at their defaults and set the Hugging Face URL to openai/gpt-oss-20b.
Solution:
Change your image tag to:
runpod/worker-v1-vllm:v2.8.0gptoss-cuda12.8.1
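
A quick way to verify the fixed endpoint is actually responding, as a minimal sketch: it assumes a queue-based endpoint, the endpoint ID is a placeholder, and RUNPOD_API_KEY is your API key.

import os
import requests

ENDPOINT_ID = "your-endpoint-id"  # placeholder
resp = requests.post(
    f"https://api.runpod.ai/v2/{ENDPOINT_ID}/runsync",
    headers={"Authorization": f"Bearer {os.environ['RUNPOD_API_KEY']}"},
    json={"input": {"prompt": "Hello world"}},  # standard serverless request shape
    timeout=120,
)
resp.raise_for_status()
print(resp.json())  # a completed job returns {"status": "COMPLETED", "output": ...}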

do the public endpoints support webhooks?

I'm not seeing anything in the documentation about webhooks for the public endpoints.
Solution:
Update: They do support webhooks :)
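
For reference, the webhook is passed as a top-level "webhook" field next to "input" in the run request. A minimal sketch; the endpoint ID and receiver URL are placeholders:

import os
import requests

payload = {
    "input": {"prompt": "Hello world"},  # placeholder input
    # Runpod POSTs the job result to this URL when the job finishes.
    "webhook": "https://example.com/runpod-callback",  # placeholder receiver
}
resp = requests.post(
    "https://api.runpod.ai/v2/ENDPOINT_ID/run",  # async run; returns a job id
    headers={"Authorization": f"Bearer {os.environ['RUNPOD_API_KEY']}"},
    json=payload,
)
print(resp.json())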

Serverless timeout issue

Hi guys, I need help with a serverless timeout issue. I have a serverless endpoint set up that keeps timing out after 60 seconds. I tried setting the timeout to 1200, tried disabling the timeout, and tried sending a timeout with the request { "input": {...
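
If it's the per-request limit that's biting, run requests accept a top-level "policy" object with "executionTimeout" expressed in milliseconds, so a value of 1200 would mean only 1.2 seconds. A minimal sketch; the endpoint ID and input are placeholders:

import os
import requests

payload = {
    "input": {"prompt": "test"},                  # placeholder input
    "policy": {"executionTimeout": 1200 * 1000},  # 1200 s, in milliseconds
}
resp = requests.post(
    "https://api.runpod.ai/v2/ENDPOINT_ID/run",
    headers={"Authorization": f"Bearer {os.environ['RUNPOD_API_KEY']}"},
    json=payload,
)
print(resp.json())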

RunPod worker is being launched, which ignores my container's ENTRYPOINT

Hello, I'm experiencing an issue with my serverless endpoint. Despite my endpoint being configured to use a 'Custom' worker with my own Docker image (ovyrlord/comfyui-runpod:v1.27), the logs show that a generic RunPod worker is being launched, which ignores my container's ENTRYPOINT. I have verified all my settings and pushed multiple new image tags, but the issue persists. Can you please investigate and clear any stuck configurations on your end for my endpoint?
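
One thing worth ruling out (not necessarily the cause here): queue-based workers are expected to launch the Runpod handler loop themselves from the image's own ENTRYPOINT/CMD. A minimal sketch of such an entrypoint script, with a placeholder handler body:

# handler.py, started by the image, e.g. CMD ["python", "-u", "handler.py"]
import runpod

def handler(job):
    # job["input"] carries the request payload; this echo is a placeholder.
    return {"echo": job["input"]}

runpod.serverless.start({"handler": handler})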

Load balancing Serverless Issues

Hello everyone, I was trying to switch from queue to load balancing. I tried the default template the site provides and even tried hitting the HTTP port, but requests just keep running indefinitely; unless I force the workers to stop, they keep running and incur charges. Any recommendations on how to properly hit the endpoint and actually get a response? It just hangs and doesn't really start the worker....
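
For anyone hitting the same wall: a load-balancing endpoint routes plain HTTP to a server your container runs itself, and the platform health-checks it before sending traffic, so a container that never opens the port will just hang. A minimal sketch, assuming the PORT env var and /ping health route described in the load-balancing docs; the /generate route is a placeholder:

# server.py, a minimal HTTP worker for a load-balancing endpoint
import os

import uvicorn
from fastapi import FastAPI

app = FastAPI()

@app.get("/ping")
def ping():
    # Health check the platform polls before routing traffic (assumed route).
    return {"status": "healthy"}

@app.post("/generate")
def generate(body: dict):
    # Placeholder inference route; replace with real work.
    return {"output": body}

if __name__ == "__main__":
    uvicorn.run(app, host="0.0.0.0", port=int(os.getenv("PORT", "80")))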

Access to remote storage from vLLM

I want to make an API call with a file that is on my RunPod remote storage, but vLLM tells me: Cannot load local files without --allowed-local-media-path ...
Solution:
It works with this (in case it helps others): "allowed_local_media_path": os.getenv('ALLOWED_LOCAL_MEDIA_PATH', '/runpod-volume'). Add this line in /worker-vllm/src/engine_args.py; that way you can set an ENV variable with the paths you want (by default it will be /runpod-volume)....
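
In context, the patch in /worker-vllm/src/engine_args.py looks roughly like this. A sketch only: the surrounding structure is illustrative, and just the allowed_local_media_path line comes from the solution above.

import os

# Inside worker-vllm's engine-args construction (illustrative context):
engine_args = {
    # ...existing vLLM engine arguments...
    # Let vLLM read media files from the mounted network volume.
    "allowed_local_media_path": os.getenv("ALLOWED_LOCAL_MEDIA_PATH", "/runpod-volume"),
}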

"In Progress" after completion

I have a serverless endpoint that trains LoRAs, but for some reason it is still "In Progress" after it finishes. The container has been removed and I am not being charged, yet the status never updates to completed.

Load Balancer Endpoint - "No Workers Available"

I tried using a Load Balancer Endpoint today and got it to work successfully. However, after those successful uses I noticed some annoying behavior. 1. When testing, you end up with lots of test runs, so you hit the endpoints many times; once you reach, say, the 12th time of sending a task to the worker, it returns "no workers available" despite having a worker in the "Idle" state. 2. When doing inference (for context, I use LatentSync, so it takes a good 2-5 minutes), I had to manually hit /ping to prevent the worker from becoming "Idle", which is kind of annoying....
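
As a stopgap for point 2, a background thread can keep hitting /ping while a long job runs. A minimal sketch; the base URL is a placeholder and the 20-second interval is an arbitrary choice:

import threading
import requests

BASE_URL = "https://your-endpoint-id.api.runpod.ai"  # placeholder endpoint URL

def keep_alive(stop: threading.Event, interval: float = 20.0):
    # Ping periodically so the worker isn't marked idle mid-inference.
    while not stop.is_set():
        try:
            requests.get(f"{BASE_URL}/ping", timeout=5)
        except requests.RequestException:
            pass  # best effort; don't fail the job over a missed ping
        stop.wait(interval)

stop = threading.Event()
threading.Thread(target=keep_alive, args=(stop,), daemon=True).start()
# ... run the 2-5 minute inference call here ...
stop.set()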

a

2025-08-09 13:32:40 [INFO] #10 0.357 update-alternatives: using /usr/bin/python3.10 to provide /usr/bin/python (python) in auto mode
2025-08-09 13:32:41 [INFO] #10 0.359 update-alternatives: using /usr/bin/pip3 to provide /usr/bin/pip (pip) in auto mode
2025-08-09 13:32:41 [INFO] #10 0.359 update-alternatives: warning: not replacing /usr/bin/pip with a link
2025-08-09 13:32:41 [INFO] #10 DONE 0.4s
2025-08-09 13:32:41 [INFO]...

Serverless Logs Inconsistent

Hello, We are currently testing various Docker files to ensure the stability and reliability of the systems we’ve built. However, we’ve encountered significant challenges with the logging system. At this time, logs only appear to function properly about 10% of the time. Additionally, telemetry data tends to reset whenever we open the details for individual workers, and the log output is blank in approximately 90% of cases....

long build messages don't wrap

Long build messages don't wrap in the Builds section preventing you from accessing the ellipsis menu.
No description

Failed to return job results

My serverless worker logs these errors throughout the process:
2025-08-07T09:05:53.399265616Z {"requestId": "7d8f9b4a-9caf-48cb-a798-e4047fe62a9b-e1", "message": "Failed to return job results. | 404, message='Not Found', url='https://api.runpod.ai/v2/mwbt52if15qdt0/job-done/nvyv3441xhr52v?gpu=NVIDIA+H100+80GB+HBM3&isStream=false'", "level": "ERROR"}
There are no progress updates on /status and no completed status either (the process completes successfully, though)...
Solution:
this was resolved via support

How to set max concurrency per worker for a load balancing endpoint?

I'm trying to configure the maximum concurrency for each worker on my serverless load balancing endpoint, but I can't seem to find the setting in the new UI.
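
For queue-based endpoints this is set in worker code rather than the UI, via a concurrency_modifier passed to the SDK; whether the load-balancing flavor exposes an equivalent setting in the new UI, I can't confirm. A minimal sketch of the queue-based mechanism, with a placeholder handler and an example cap of 4:

import runpod

def handler(job):
    return {"echo": job["input"]}  # placeholder handler

def concurrency_modifier(current_concurrency: int) -> int:
    # Return how many jobs this worker may run at once.
    return 4  # example cap; tune to your workload

runpod.serverless.start({
    "handler": handler,
    "concurrency_modifier": concurrency_modifier,
})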

Not getting all webhooks from requests

Some requests with webhooks fail and I'm not sure why; I can't see anything in the logs for this. For instance, this one finished perfectly in the worker but did not send the webhook response: https://api.runpod.ai/v2/whpwouwejfjrmq/status/655e8518-11ec-48f1-a2c1-25ca0e6c4ef4-u1 Request (details changed for privacy)...
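
Until the missing webhooks are explained, polling the status route is a reasonable fallback for jobs whose callback never arrives. A minimal sketch against the same /status URL shape as above:

import os
import time
import requests

def wait_for_job(endpoint_id: str, job_id: str, poll_s: float = 5.0) -> dict:
    # Poll until the job reaches a terminal state (fallback when a webhook is lost).
    url = f"https://api.runpod.ai/v2/{endpoint_id}/status/{job_id}"
    headers = {"Authorization": f"Bearer {os.environ['RUNPOD_API_KEY']}"}
    while True:
        status = requests.get(url, headers=headers, timeout=30).json()
        if status.get("status") in ("COMPLETED", "FAILED", "CANCELLED", "TIMED_OUT"):
            return status
        time.sleep(poll_s)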

What are the best practices when working with network volumes and large models

Hi Runpod! We've been using serverless pods for quite a while now. Most of our customer serving ran in the background, on demand, which meant we could tolerate the long warmup times. However, to meet our customers' demands we have made several key improvements in our generation times. That said, our main bottleneck today is the infrastructure itself. We use quite a few models to perform the work for our customers, and have tried 3 different paths: 1. Working with images from a private registry that contained the models - intolerable; the images kept re-downloading layers that were not altered, making it unfeasible to sustain through development unless we separate out only the models. And even then, whenever we need to add a new LoRA etc., it causes a lot of issues....
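
A common middle ground is keeping weights out of the image entirely and resolving them from the network volume at startup, downloading only on the first cold start. A minimal sketch; the mount path follows the /runpod-volume convention mentioned elsewhere in this list, and the model ID is hypothetical:

import os
from pathlib import Path

from huggingface_hub import snapshot_download  # assumes huggingface_hub is installed

VOLUME = Path(os.getenv("VOLUME_PATH", "/runpod-volume"))  # network volume mount
MODEL_ID = "org/some-large-model"  # hypothetical model id

def resolve_model() -> Path:
    # Download once into the shared volume; later cold starts only read from it.
    target = VOLUME / "models" / MODEL_ID.replace("/", "--")
    if not target.exists():
        snapshot_download(repo_id=MODEL_ID, local_dir=target)
    return target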

Some questions about Serverless workers and custom workflows

Hi all, I'm a newbie, so please help me with these questions. 1) How long will it take for a serverless worker to start with models around ~60 GB, and is it better to store the models on a network volume or bake them into the Docker container? 2) What is the simplest and fastest way to create my own serverless worker if I already have a ComfyUI workflow with custom nodes? ...
Solution:

Update Transformers Library

Hi, I am trying to run Qwen/Qwen3-Embedding-8B via serverless endpoints. 1. I select quick deploy, Infinity Vector Embeddings. 2. Set Qwen/Qwen3-Embedding-8B as the model. 3. Batch size 32, data type auto....

New Serverless UI Issue

Your new UI does not have the worker settings ("Max Workers", "Active Workers", etc.). As a result, I am unable to resolve the max-worker issue shown in the screenshot. Please roll back to the previous UI and don't push stuff like this to production.

serverless runpod/qwen-image-20b stays in initiating

Hi, I am trying to deploy a serverless endpoint for the image runpod/qwen-image-20b... it just stays in initializing. I created a template with a 20 GB disk using an NVIDIA GPU. Can anyone please help? Regards, Sameer...

Serverless Load-balancing

Good morning, I recently came across https://docs.runpod.io/serverless/load-balancing/overview and followed the instructions. Yet when I attempted to make an external HTTP request using n8n, it simply did not work. I've attached my worker logs below. Please let me know if I've done something wrong, or if it's possibly an issue with the documentation. Note: I used the following container image: runpod/vllm-loadbalancer:dev...
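
For anyone debugging the same flow: load-balancing endpoints are called directly over HTTP at the endpoint's own hostname rather than through the queue-based /run routes. A minimal sketch of the request shape; the hostname format follows the load-balancing docs, and the /generate route is a placeholder your worker would define:

import os
import requests

resp = requests.post(
    "https://ENDPOINT_ID.api.runpod.ai/generate",  # placeholder route on your worker
    headers={"Authorization": f"Bearer {os.environ['RUNPOD_API_KEY']}"},
    json={"prompt": "hello"},  # placeholder body
    timeout=300,
)
print(resp.status_code, resp.text)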