Runpod


We're a community of enthusiasts, engineers, and enterprises, all sharing insights on AI, Machine Learning and GPUs!


⚡|serverless

⛅|pods

🔧|api-opensource

📡|instant-clusters

🗂|hub

ComfyUI Pod -> ComfyUI Serverless

What is the best way to transfer my running ComfyUI pod to a ComfyUI serverless endpoint? Does anyone have experience with this case?
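One common pattern (not an official migration path) is to keep the same ComfyUI container and put a thin serverless handler in front of it. A minimal sketch using the runpod Python SDK, assuming ComfyUI is already launched by the container entrypoint on its default port 8188 and that the job input carries a workflow exported via ComfyUI's "Save (API Format)":

```python
# handler.py - minimal sketch: drive a local ComfyUI instance from a
# RunPod serverless handler. Assumes ComfyUI is started by the container
# entrypoint and listens on 127.0.0.1:8188 (its default port).
import requests
import runpod

COMFY_URL = "http://127.0.0.1:8188"

def handler(job):
    # Assumption: job["input"]["workflow"] holds an API-format workflow JSON.
    workflow = job["input"]["workflow"]
    resp = requests.post(f"{COMFY_URL}/prompt", json={"prompt": workflow})
    resp.raise_for_status()
    # Returns the queued prompt id; polling /history and collecting the
    # generated images is left out of this sketch.
    return resp.json()

runpod.serverless.start({"handler": handler})
```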

Can anyone tell me when the vLLM repo in the serverless hub will be back?

My endpoints were deployed using the vLLM repos, and I noticed it was removed from the hub as well after a failed version update. Can anyone from the devs please inform me about the progress so far?

serverless endpoint stuck on Throttled or Initializing

Hello! Hope all is well. I'm trying to follow the load-balancing serverless endpoint guide with vLLM on H100s, but the workers are either Throttled or stuck Initializing. Can you please help? Here is the endpoint ID: vllm-kcj166ha6f00m1

ComfyUI to API wizard doesn't work for easy deploy

I followed these to convert my ComfyUI workflow to an API, create a Docker image in GitHub, and deploy from the wizard: https://comfy.getrunpod.io/wizard/ https://docs.runpod.io/community-solutions/comfyui-to-api/overview But I always get "Build failed" with no logs in the serverless deployment. Need help...

502 Error - Serverless Endpoint (id: z2v5nclomp5ubo) 1.1.4 Infinity Embeddings

It seems my pipelines that leverage RunPod Serverless Infinity Embeddings are unable to requisition a pod, likely due to volume? All of the workers permanently show a status of Initializing or Throttled in the UI, even though the GPU configuration has 4 valid GPUs selected, all of which are either "Medium Supply" or "High Supply".

I end up getting a 502 error in my pipeline in response to my request, which based on my research appears to mean the RunPod load balancer failed to find a backend pod to send jobs to; combined with the Throttled/Initializing statuses in the UI, that makes me think the RunPod service is overwhelmed. Is there a plan to improve the reliability of the service? I could also easily be missing something in my configuration, though it is essentially the Infinity Embeddings 1.1.4 quick start offered through the RunPod UI, running BAAI/BGE-m3 with minimal additional configuration. Is there a recommendation for how I can use a fallback service, an additional endpoint, anything from RunPod so that it can have time to self-heal?...
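Until the capacity issue is resolved, one stopgap is client-side failover between two endpoints running the same image. A minimal sketch, assuming a second Infinity Embeddings endpoint exists; the endpoint IDs, API key, payload shape, and timeout here are all placeholders or assumptions:

```python
# Minimal client-side failover sketch: try the primary serverless endpoint,
# fall back to a secondary one on 502s or network errors. The endpoint IDs
# and API key are placeholders, and the payload shape is an assumption.
import requests

API_KEY = "YOUR_RUNPOD_API_KEY"
ENDPOINTS = ["primary_endpoint_id", "fallback_endpoint_id"]

def embed(texts: list[str]) -> dict:
    payload = {"input": {"model": "BAAI/bge-m3", "input": texts}}
    headers = {"Authorization": f"Bearer {API_KEY}"}
    last_error = None
    for endpoint_id in ENDPOINTS:
        url = f"https://api.runpod.ai/v2/{endpoint_id}/runsync"
        try:
            resp = requests.post(url, json=payload, headers=headers, timeout=90)
            if resp.status_code == 502:  # load balancer found no backend pod
                last_error = RuntimeError(f"502 from {endpoint_id}")
                continue                 # try the next endpoint
            resp.raise_for_status()
            return resp.json()
        except requests.RequestException as exc:
            last_error = exc             # network error: try next endpoint
    raise RuntimeError("all endpoints failed") from last_error
```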

Slow serverless uploading

I've attached an image. The Docker image size is 13 GB (I would say that's a reasonable size?), but the RunPod logs say it's uploading at 1 MB/s, which works out to roughly 3.5 hours for the whole image. Is there any reason why it's so slow?

Model not Initializing anymore?

I made no changes (I've been using this container for several months) and now it is stuck during initializing. The container logs say the worker is ready, but it has not progressed for hours.

Ensuring Task Routing to Warm Workers for FlashBoot VRAM Persistence

Hi team, I’m using FlashBoot and my understanding is that the container should stay alive after a job finishes so that the model remains loaded in VRAM, reducing cold start time. However, after I activate worker A, the next task often gets scheduled on worker B instead. This defeats the purpose of FlashBoot because the preloaded model in worker A is never reused....
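For what it's worth, worker routing itself isn't something the handler can control, but loading the model at module scope (outside the handler) at least guarantees that any warm worker that does get reused skips the reload. A minimal sketch; load_model() is a placeholder for whatever your container actually loads:

```python
# Sketch: keep the model at module scope so a FlashBoot-warmed worker
# reuses it instead of reloading it per job. load_model() is a stand-in
# for your real loading code.
import runpod

def load_model():
    # Placeholder: load real weights onto the GPU here. This stub just
    # echoes its input so the sketch runs end to end.
    return lambda prompt: f"echo: {prompt}"

MODEL = load_model()  # runs once per worker process, not once per job

def handler(job):
    # On a warm worker MODEL is already resident in VRAM, so only cold
    # starts pay the load cost.
    return {"output": MODEL(job["input"]["prompt"])}

runpod.serverless.start({"handler": handler})
```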

Migrate Network Volume so Serverless workers have better GPUs

Hello guys, I've been on serverless for about a year and it's time to make sure my serverless endpoints provide the best experience on my web application. Currently I have 30 serverless workers, and this week some of them have taken up to 7 minutes each to run a workflow, which is way higher than the 15-second prediction time I expect. My serverless workers are connected to my network storage in region CA-MTL-1, and unfortunately I only get 1 useful GPU type available for my workers (A40). I want the workers to access better GPUs, and I think that means I need them connected to my network storage in different regions. Can I duplicate my network storage across several regions? Or have a central region with only some network storage folders copied over? Who can help me with this?...

Issue with llama-3.1:405b using https://console.runpod.io/hub/tanujdargan/runpod-worker-ollama

Hi, I am stuck on a "rollout in progress" spinning wheel with no logs to see what is going on, using this repo: https://console.runpod.io/hub/tanujdargan/runpod-worker-ollama...

can anybody help me build a custom Wan 2.2 I2V serverless

Looking for someone who can help me build a Wan 2.2 Image-to-Video model on serverless

Comfy AI workers problem

I'm brand new to RunPod and trying to get it to work after putting $20 on my account. I'm trying to use the Flux.1 Dev model from Hugging Face, guided by Lovable and Grok, but I can't get it to work. My endpoint: endpoint/19brh8a9a0bhcz Can someone review it and help me get this working please?...

VLLM repo not available in serverless

Hello! I am not able to select vLLM as a repo for serverless. It was there yesterday and the day before but did not work, probably because of the outage, and right now it is not listed in the available repos. How can I fix this?

Request to increase worker count

Hi, I've requested this through the UI but had no luck getting any response. Can I expedite the request here?...

Chat, why am I getting Disk Quota Exceeded

It wasn't the case before, I didn't even add anything lol, help me gang

Serverless Package Build Failed - Disk Quota Exceeded

TL;DR: RunPod deploy failed with a Disk Quota Exceeded error even though the volume has sufficient disk space.

Details:
1. I pushed a new release from my repository to submit a new build to RunPod Serverless.
2. When building the package, RunPod failed with the following error...

401 Unauthorized when accessing REST API (rest.runpod.io/v1) despite API key ALL permissions

Hey, everyone! I need some help. I know it's AI-generated in terms of formatting, but this is my problem: we're getting a 401 Unauthorized error when using the REST API (rest.runpod.io/v1) to manage serverless endpoints, even though our API key has Read/Write permissions enabled. What we're trying to do:...
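For reference, the REST API expects the key as a Bearer token, and a 401 with a valid key is often a malformed Authorization header or stray whitespace in the key. A minimal sanity-check sketch; the /endpoints list route is an assumption on my part, so double-check it against the current API reference:

```python
# Minimal sanity check for REST API auth: the key must be sent as a
# Bearer token. A 401 with a valid key often means a missing/garbled
# "Authorization" header or an extra space/newline in the key.
import os
import requests

API_KEY = os.environ["RUNPOD_API_KEY"].strip()  # strip() guards against a
                                                # trailing newline from copy/paste

resp = requests.get(
    "https://rest.runpod.io/v1/endpoints",  # assumption: endpoint list route
    headers={"Authorization": f"Bearer {API_KEY}"},
    timeout=30,
)
print(resp.status_code)
print(resp.json() if resp.ok else resp.text)
```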

Request in queue for over 10 hrs

Hello, I need some assistance. I connected the Hugging Face Meta LLM model to the V2.10.0 serverless endpoint, and I currently have a balance of $7.86. Please explain why my request has been stuck in the queue for more than 10 hours. How can I correct the error? Thank you....
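A general debugging step while waiting for support: inspect the stuck job and cancel it through the endpoint's /status and /cancel routes so it stops blocking the queue. A minimal sketch; the endpoint ID and job ID are placeholders:

```python
# Sketch: check a stuck job's status and cancel it. ENDPOINT_ID and
# JOB_ID are placeholders for your real values.
import os
import requests

API_KEY = os.environ["RUNPOD_API_KEY"]
ENDPOINT_ID = "your_endpoint_id"
JOB_ID = "your_job_id"
HEADERS = {"Authorization": f"Bearer {API_KEY}"}
BASE = f"https://api.runpod.ai/v2/{ENDPOINT_ID}"

status = requests.get(f"{BASE}/status/{JOB_ID}", headers=HEADERS, timeout=30)
print(status.json())  # IN_QUEUE for hours usually means no worker can
                      # start (all throttled/initializing)

cancel = requests.post(f"{BASE}/cancel/{JOB_ID}", headers=HEADERS, timeout=30)
print(cancel.json())
```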