Runpod

We're a community of enthusiasts, engineers, and enterprises, all sharing insights on AI, Machine Learning and GPUs!

CPU Availability in North America?

I spent all day trying to create a new CPU serverless endpoint. It kept getting stuck on "Initializing" for many minutes at a time. After spending a few hours digging through my Docker pipeline, I realized that the actual reason no workers were available is because I was attempting to stand up the servers in North America. When I picked the entire world, I saw that I could only get CPU servers in Romania and Iceland. Specifically EU-RO-1 and EUR-IS-1. That's understandable, I guess, but the Serverless » New Endpoint UI shows "High" availability of CPU3 and CPU5 workers across the board, even when narrowing it down to a single datacenter in the US. I learned to rely on that label when picking GPU workers for a different endpoint. Can you please confirm if my intuition is correct? And if so, perhaps you could improve the labeling in the UI to reflect the true availability of those workers?...

Serverless run time (CPU 100%)

So, I have a ComfyUI workflow with a couple of custom nodes running. Most of the time my workflow takes about 6-8 minutes. The weird thing is that switching between 24 GB and 80 GB GPUs only makes a 1-2 minute difference. ...

Custom vLLM OpenAI compatible API

Hello, I'm running an OpenAI-compatible server using vLLM. In Runpod's serverless service you cannot choose which path POST requests are tracked on; it's /run or /runsync by default. My question is: how do I either change the Runpod configuration of this endpoint to /v1 (the OpenAI path), or how do I run the vLLM Docker image so that it is compatible with Runpod?...
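A common pattern here (a sketch, not a definitive answer: it assumes Runpod's vLLM worker proxies OpenAI-style routes under an /openai/v1 path on the endpoint, which you should verify against your worker version) is to leave /run and /runsync alone and instead point the OpenAI client's base URL at the endpoint:

```python
RUNPOD_API_BASE = "https://api.runpod.ai/v2"

def openai_base_url(endpoint_id: str) -> str:
    # Assumption: the vLLM worker exposes OpenAI-compatible routes
    # (e.g. /chat/completions) under /openai/v1 on the serverless endpoint.
    return f"{RUNPOD_API_BASE}/{endpoint_id}/openai/v1"

# Usage sketch with the official openai client (not imported here),
# authenticating with your Runpod API key instead of an OpenAI key:
#   client = OpenAI(base_url=openai_base_url("<endpoint_id>"), api_key=RUNPOD_API_KEY)
#   client.chat.completions.create(model=..., messages=[...])
```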

How to cache model download from HuggingFace - Tips?

Using Serverless (48 GB Pro) with Flashboot. I want to optimize for fast cold starts; is there a guide somewhere? It does not seem to be caching the download; it's always re-downloading the model entirely (and slowly)...
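One pattern that often comes up for this (a sketch, assuming a network volume is attached; on serverless it mounts at /runpod-volume): point the Hugging Face cache at the volume before any HF libraries are imported, so downloads persist across workers instead of landing on the ephemeral container disk, which is wiped between workers.

```python
import os

def configure_hf_cache(cache_dir: str = "/runpod-volume/hf-cache") -> str:
    """Point the Hugging Face cache at a persistent directory.

    Must run before importing transformers / huggingface_hub, since
    they read HF_HOME when they are imported.
    """
    os.makedirs(cache_dir, exist_ok=True)
    os.environ["HF_HOME"] = cache_dir
    return cache_dir
```

Call this at the top of your handler module, before the model-loading imports.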

ComfyUI stops working when using always active workers

Hi. I know it's strange, but here it is. I have a workflow that works flawlessly when using serverless workers that are NOT always active. That is, if I set "always active" to 0 and max workers to 1 or 2, it all works fine. For deployment, I put 1 worker as always active and 3 max workers. With this setup (and exactly the same code as before), things stop working. The ComfyUI server starts, but it looks like the endpoint never receives a request. If I set it back to 0 always-active workers, it works again. ...

Is it possible to send a request to a specific workerId in a serverless endpoint?

I need custom logic to distribute requests to available workers in the serverless endpoint. Is there a way to send a request to a specific worker using its workerId?

Error response from daemon: --storage-opt is supported only for overlay over xfs with 'pquota' mount

Here are the request ids: e5307e07-7f0e-4b82-b668-7560a9b7ad4b-u1 9a65646e-1b26-4177-8262-59080c9d8e24-u1...

Polish TAX ID invoices

Hi, how can I correctly set a Polish VAT ID so that I get an invoice when making a one-time credit purchase? I do not see any option to set this ID during the Stripe card checkout. I have this ID set in my profile options; is that sufficient for invoice generation?

How to cancel request

Here is my Python code for running the request. ############# run_request = endpoint.run(input_payload)...
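For reference, a minimal sketch of cancelling a queued job over REST (an assumption to verify: that the serverless API exposes POST /v2/{endpoint_id}/cancel/{job_id}; the job object returned by the SDK's endpoint.run() also exposes a cancel() method):

```python
import json
import urllib.request

API_BASE = "https://api.runpod.ai/v2"

def cancel_url(endpoint_id: str, job_id: str) -> str:
    # Assumption: the serverless cancel route is POST /v2/{endpoint}/cancel/{job}
    return f"{API_BASE}/{endpoint_id}/cancel/{job_id}"

def cancel_job(endpoint_id: str, job_id: str, api_key: str) -> dict:
    # POST with your account API key; the response typically carries
    # the job id and its updated status.
    req = urllib.request.Request(
        cancel_url(endpoint_id, job_id),
        method="POST",
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.load(resp)
```

The job_id is the id field returned when the job was submitted via /run.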

What is the normal network volume read speed? Is 3MB/s normal?

I've been seeing network volume read speeds of under 3 MB/s in EU-SE-1, which makes things difficult.
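To put a number on it yourself, a quick sequential-read benchmark (a rough sketch only; the OS page cache will inflate results on a second read of the same file):

```python
import time

def read_throughput_mb_s(path: str, chunk_mb: int = 8) -> float:
    # Sequentially read `path` in chunk_mb-sized chunks and return MB/s.
    chunk = chunk_mb * 1024 * 1024
    total = 0
    start = time.monotonic()
    with open(path, "rb") as f:
        while True:
            data = f.read(chunk)
            if not data:
                break
            total += len(data)
    elapsed = max(time.monotonic() - start, 1e-9)
    return total / (1024 * 1024) / elapsed

# Usage sketch: point it at a large file on the volume, e.g.
#   read_throughput_mb_s("/runpod-volume/some-large-file")
```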

Pods not getting started

Whenever my endpoint receives new requests and autoscales to create new pods, a few of the pods get stuck while booting and don't respond. While this happens I am still being charged, because somehow that counts as uptime. It's certainly not a fault in my code; multiple other pods boot fine.

First runs always fail

When using a serverless API endpoint (with ComfyUI installed), the first run always fails even though the following ones work fine. This is what the API returns on the first run:...

RunPod GPU Availability: Volume and Serverless Endpoint Compatibility

Hey everyone! Quick question about RunPod's GPU availability across different deployment types. I'm a bit confused about something: I created a volume in a data center where only a few GPU types were available. But when I'm setting up a serverless endpoint, I see I can select configs with up to 8 GPUs - including some that weren't available when I created my volume. Also noticed that GPU availability keeps fluctuating - sometimes showing low availability and sometimes none at all. So I'm wondering:...

How long does it normally take to get a response from your vLLM endpoints on RunPod?

Hello. I've tested a very tiny model (Qwen2.5-0.5B-Instruct) on the official RunPod vLLM image, but the job takes 30+ seconds each time; 99% of it is loading the engine and the model (counted as delay time), and the execution itself is under 1 s. Flashboot is on. Is this normal, or is there a setting or something else I should check to make Flashboot kick in? How long do your models and endpoints normally take to return a response?

This server has recently suffered a network outage

This server has recently suffered a network outage and may have spotty network connectivity. We aim to restore connectivity soon, but you may have connection issues until it is resolved. You will not be charged during any network downtime.

serverless health

https://api.runpod.ai/v2/adaejhk*****/health While cold starting, the health endpoint never indicates an "initializing" state? It just goes from idle/ready to running and back. Is there a way to tell that the serverless endpoint is warming up, so the application can show that to users?...

Monitoring Queue Runpod

I have had a lot of issues these past few days on Runpod. I'd like to be able to react to them quickly, with a notification when the queue of a given pod stays above 5 for X seconds. Is there an easy way to check that?...
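There's no built-in alerting I know of, but the endpoint's /health route can be polled from a small watcher; a sketch, assuming the response carries a queued-job count under jobs.inQueue (check the shape against your own endpoint first):

```python
import json
import urllib.request

def fetch_health(endpoint_id: str, api_key: str) -> dict:
    # Poll the serverless health route for the endpoint.
    req = urllib.request.Request(
        f"https://api.runpod.ai/v2/{endpoint_id}/health",
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.load(resp)

def queue_depth(health: dict) -> int:
    # Assumption: /health reports queued jobs under jobs.inQueue
    return health.get("jobs", {}).get("inQueue", 0)

def should_alert(health: dict, threshold: int = 5) -> bool:
    return queue_depth(health) > threshold
```

Run it in a loop or cron job and fire a webhook/notification once should_alert() has stayed true for your chosen time window.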

Why isn't there any A100 or H100 availability right now? :(

I don't understand. I have reserved about 7 H100s, but right now I don't see any H100 or A100 availability anywhere on Runpod. :(

Need help *paid

Hey, so I have a custom workflow and just can't run it on Runpod serverless. Currently I'm trying with this template, but I'm just getting the following error: {'type': 'invalid_prompt', 'message': 'Cannot execute because node IPAdapterUnifiedLoader does not exist.', 'details': "Node ID '#109'", 'extra_info': {}}...

Runpod requests fail with 500

Also, when I try to open my endpoint in the UI, it redirects to a 404. I didn't change anything....