Runpod


We're a community of enthusiasts, engineers, and enterprises, all sharing insights on AI, Machine Learning and GPUs!


⚡|serverless

⛅|pods

🔧|api-opensource

📡|instant-clusters

🗂|hub

RAG on a serverless LLM

I am running a serverless LLM and want to augment the model with a series of PDF files. On a dedicated GPU I can do this through the web UI by adding knowledge.
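One way to approach this without the web UI is to do the retrieval inside the handler itself. Below is a minimal sketch, not an official RunPod pattern: it assumes the PDFs are baked into the image under `/docs` and that `pypdf` and `sentence-transformers` are installed; the actual LLM call is left as a stub.

```python
# Sketch: retrieval over PDFs inside the handler itself. Assumes the
# PDFs are baked into the image under /docs and that pypdf and
# sentence-transformers are installed; the LLM call is left as a stub.
import glob

import numpy as np
import runpod
from pypdf import PdfReader
from sentence_transformers import SentenceTransformer

EMBEDDER = SentenceTransformer("all-MiniLM-L6-v2")

# Chunk and embed every PDF once, at worker start (paid as cold start).
CHUNKS = []
for path in glob.glob("/docs/*.pdf"):
    text = " ".join(page.extract_text() or "" for page in PdfReader(path).pages)
    CHUNKS += [text[i:i + 1000] for i in range(0, len(text), 1000)]
EMBEDDINGS = EMBEDDER.encode(CHUNKS, normalize_embeddings=True)

def handler(job):
    question = job["input"]["prompt"]
    q_vec = EMBEDDER.encode([question], normalize_embeddings=True)[0]
    # Cosine similarity reduces to a dot product on normalized vectors.
    top = np.argsort(EMBEDDINGS @ q_vec)[-3:]
    context = "\n".join(CHUNKS[i] for i in top)
    # Prepend `context` to the prompt for whatever LLM the worker serves.
    return {"context": context, "prompt": question}

runpod.serverless.start({"handler": handler})
```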

Unexpected Infinite Retries Causing Unintended Charges

I recently ran my serverless workload using my custom Docker image on RunPod, and I encountered an issue that resulted in significant unexpected charges. My application experienced failures, and instead of stopping or handling errors appropriately, it kept retrying indefinitely. This resulted in:
- $166.69 charged by OpenAI due to repeated API calls.
- $14.27 charged on RunPod for compute usage. ...
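One way to guard against this is to cap retries inside the handler and return an error payload instead of raising, since a returned `{"error": ...}` marks the job FAILED cleanly. A minimal sketch, with the OpenAI call left as a stub:

```python
# Sketch: cap retries and return an error payload instead of raising,
# so one bad job can't loop forever against a paid API.
import runpod

MAX_ATTEMPTS = 3

def call_openai_once(prompt):
    # Stub: replace with your real OpenAI call; let it raise on failure.
    return f"(stub reply to) {prompt}"

def handler(job):
    prompt = job["input"]["prompt"]
    last_err = None
    for attempt in range(MAX_ATTEMPTS):
        try:
            return {"output": call_openai_once(prompt)}
        except Exception as err:
            last_err = err  # keep the last failure for the report
    # A returned {"error": ...} marks the job FAILED without crashing
    # the worker, which is what can lead to platform-level retries.
    return {"error": f"gave up after {MAX_ATTEMPTS} attempts: {last_err}"}

runpod.serverless.start({"handler": handler})
```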

Serverless vLLM workers crash

Whenever I create a serverless vLLM endpoint (it doesn't matter what model I use), the workers all end up crashing with the status "unhealthy". I checked the vLLM supported-models page and I only use models that are supported. The last time I ran a serverless vLLM, I used meta-llama/Llama-3.1-70B with a proper Hugging Face token that allows access to the model. The result of trying to run the default "Hello World" prompt on this serverless vLLM is in the attached images. A worker has the status...

Meaning of -u1 / -u2 at the end of the request ID?

Would like to know what those mean. I saw -u2 and -u1 on both sync and async requests and couldn't understand what they are.

Ambiguity of handling runsync cancel from python handler side

Hi. What's the best way I can handle the "cancel" signal on the serverless server/handler side? Is the default cancel logic just stopping the container altogether?
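I'm not aware of a documented cancel hook in the Python handler; one workaround is a generator handler, so a cancel can take effect between yielded chunks rather than only when the whole job ends. A sketch, assuming that behavior:

```python
# Sketch: a generator handler yields partial results, so a cancel can
# land between chunks instead of only after the whole job finishes.
# "return_aggregate_stream" is a real SDK option; whether /cancel
# interrupts between yields is an assumption to verify.
import runpod

def handler(job):
    try:
        for step in range(100):
            # ... do one slice of work here ...
            yield {"progress": step}
    finally:
        # Runs if the generator is closed early; release files/GPU here.
        pass

runpod.serverless.start({
    "handler": handler,
    "return_aggregate_stream": True,  # lets /runsync collect the chunks
})
```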

Enabling CLI_ARGS=--trust-remote-code

I am trying to run some of the SOTA models and the error logs tell me that I need to enable this CLI flag. How can I do that?

CUDA profiling

Hey guys, how can I profile kernels on serverless GPUs? Say I have a CUDA kernel; how can I measure its performance using serverless GPUs like RunPod's? ...
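Full profilers such as Nsight may need container capabilities that serverless workers don't necessarily grant, so the low-friction option is CUDA events inside the handler. A sketch using PyTorch, with a matmul standing in for the kernel under test:

```python
# Sketch: CUDA event timing inside a handler; the matmul stands in for
# your kernel. torch.cuda.Event measures device-side elapsed time.
import runpod
import torch

def handler(job):
    n = int(job["input"].get("size", 4096))
    a = torch.randn(n, n, device="cuda")
    b = torch.randn(n, n, device="cuda")
    start = torch.cuda.Event(enable_timing=True)
    end = torch.cuda.Event(enable_timing=True)
    torch.cuda.synchronize()  # drop pending work from the measurement
    start.record()
    c = a @ b  # the kernel under test; swap in your own op or extension
    end.record()
    torch.cuda.synchronize()  # wait so elapsed_time is valid
    return {"ms": start.elapsed_time(end), "checksum": float(c.sum())}

runpod.serverless.start({"handler": handler})
```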

Serverless handler on Nodejs

Hi. I see there is an official SDK for serverless handlers, but only for Python. I don't see any handler API in the js-sdk.

RunPod Serverless Inter-Service Communication: Gateway Authentication Issues

I'm developing an application with two RunPod serverless endpoints that need to communicate with each other: Service A: A Node.js/Express API that receives requests and dispatches processing tasks Service B: A Python processor that handles data and needs to notify Service A when complete ...
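For the authentication part, every call to `api.runpod.ai/v2/<endpoint_id>/...` must carry your RunPod API key as a Bearer token. A sketch of Service B notifying Service A; the endpoint-ID env var and payload shape are placeholders:

```python
# Sketch: Service B calling Service A's endpoint through the gateway.
# The endpoint ID env var and payload shape are placeholders.
import os

import requests

RUNPOD_API_KEY = os.environ["RUNPOD_API_KEY"]
SERVICE_A_ENDPOINT = os.environ["SERVICE_A_ENDPOINT_ID"]  # placeholder

def notify_service_a(result: dict) -> dict:
    resp = requests.post(
        f"https://api.runpod.ai/v2/{SERVICE_A_ENDPOINT}/runsync",
        headers={"Authorization": f"Bearer {RUNPOD_API_KEY}"},
        json={"input": {"event": "processing_complete", "result": result}},
        timeout=60,
    )
    resp.raise_for_status()
    return resp.json()
```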

RunPod ComfyUI Serverless: Hugging Face Models field does nothing

When deploying a ComfyUI serverless endpoint, the attached screen appears, asking for Hugging Face models. However, when I checked the repo, that setting is not used at all: https://github.com/search?q=repo%3Arunpod-workers%2Frunpod-worker-comfy%20MODEL_NAME&type=code How do I download the required models (.safetensors) and Comfy nodes when deploying an endpoint? ...
Solution:
When you press Next, continue until you reach the environment variables step and check what gets added there. Then you can add the same env vars yourself, using the same Docker image in your own template.

Serverless ComfyUI -> "error": "Error queuing workflow: HTTP Error 400: Bad Request",

I am running Serverless ComfyUI with RunPod and it is not working. Can someone please help? I keep getting:
```
Job response: { "delayTime": 1009, "error": "Error queuing workflow: HTTP Error 400: Bad Request", ...
```

Error 404 on payload download.

Hi guys! I'm trying to download a file to my endpoint for processing, using the RunPod download utility, and sometimes (but not always) I get the message: ...
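A small retry wrapper around the download helper can paper over intermittent 404s while you investigate the source. This sketch assumes `rp_download.file(url)` as in current runpod-python; adjust attempts and backoff to taste:

```python
# Sketch: retry wrapper for intermittent 404s from the download helper.
# Assumes rp_download.file(url) as in current runpod-python.
import time

from runpod.serverless.utils import rp_download

def download_with_retry(url: str, attempts: int = 3, backoff: float = 2.0):
    for i in range(attempts):
        try:
            return rp_download.file(url)  # dict including "file_path"
        except Exception:
            if i == attempts - 1:
                raise  # give up and surface the original error
            time.sleep(backoff * (i + 1))  # linear backoff between tries
```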

Failed Faster-Whisper task

I continue to get this error and I can't figure out what's going on, please help ❤️
```
Job submitted: 3f6a6e02-5249-4faf-9fb3-49ac501c695d-u2
Job failed: {'delayTime': 163, 'executionTime': 184,
 'id': '3f6a6e02-5249-4faf-9fb3-49ac501c695d-u2', 'status': 'FAILED',
 'workerId': '187psw4ygtfrrm', 'error': {"error_type": "<class 'av.error.InvalidDataError'>",
 "error_message": "[Errno 1094995529] Invalid data found when processing input: '/tmp/tmpi89o0mcn.wav'",
 "hostname": "187psw4ygtfrrm-64410c48", "worker_id": "187psw4ygtfrrm", "runpod_version": "1.5.2"}}

error_traceback:
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/runpod/serverless/modules/rp_job.py", line 134, in run_job
    handler_return = handler(job)
  File "/usr/local/lib/python3.10/dist-packages/runpod/serverless/utils/rp_debugger.py", line 165, in __call__
    result = self.function(*args, **kwargs)
  File "/rp_handler.py", line 72, in run_whisper_job
    whisper_results = MODEL.predict(
  File "/predict.py", line 75, in predict
    segments, info = list(model.transcribe(str(audio),
  File "/usr/local/lib/python3.10/dist-packages/faster_whisper/transcribe.py", line 277, in transcribe
    audio = decode_audio(audio, sampling_rate=sampling_rate)
  File "/usr/local/lib/python3.10/dist-packages/faster_whisper/audio.py", line 46, in decode_audio
    with av.open(input_file, metadata_errors="ignore") as container:
  File "av/container/core.pyx", line 401, in av.container.core.open
  File "av/container/core.pyx", line 272, in av.container.core.Container.__cinit__
  File "av/container/core.pyx", line 292, in av.container.core.Container.err_check
  File "av/error.pyx", line 336, in av.error.err_check
av.error.InvalidDataError: [Errno 1094995529] Invalid data found when processing input: '/tmp/tmpi89o0mcn.wav'
```
...
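`InvalidDataError` here means `av` could not decode the temp file, i.e. the input audio was truncated or not audio at all. Probing the file before handing it to faster-whisper lets the job fail with a clean error; a sketch (the path comes from wherever the handler saved the input):

```python
import av  # already in the faster-whisper image, per the traceback

def probe_audio(path: str) -> None:
    """Raise early if `path` is not decodable audio."""
    # Same call the traceback shows failing inside faster-whisper; doing
    # it up front lets the handler return a clean {"error": ...} instead.
    with av.open(path, metadata_errors="ignore") as container:
        if not container.streams.audio:
            raise ValueError(f"no audio stream in {path}")
```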

Delete Serverless Endpoint via the API?

I am trying to delete the serverless endpoint via the API, but every time I make a request to the endpoint, I get an internal error. Via the Python API:
```
delete_endpoint_graphql = """mutation {{...
```
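For comparison, a sketch of the GraphQL call, assuming the mutation is `deleteEndpoint(id: ...)`; verify the exact name against RunPod's GraphQL schema before relying on it:

```python
# Sketch, assuming the mutation is deleteEndpoint(id: ...); verify the
# exact name against RunPod's GraphQL schema before relying on it.
import os

import requests

def delete_endpoint(endpoint_id: str) -> dict:
    query = 'mutation { deleteEndpoint(id: "%s") }' % endpoint_id
    resp = requests.post(
        "https://api.runpod.io/graphql",
        params={"api_key": os.environ["RUNPOD_API_KEY"]},
        json={"query": query},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()  # inspect for an "errors" key on failure
```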

Terminate worker

Hi y'all, is there any way to terminate a specific (serverless) worker via the API, or as an additional control in the handler return? I don't want to refresh the worker; I just want to terminate it on specific occasions. ...

Is it possible to respond with Transfer-Encoding: chunked?

Hello, I'm using serverless endpoints and currently return a JSON object. Is it possible to, say, directly return a WAV file with Transfer-Encoding: chunked, so the response headers would be Content-Type: audio/wav...
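To my knowledge the queue-based endpoints return JSON through the gateway, so raw chunked binary isn't exposed on `/runsync`; the usual workarounds are base64 in the JSON output or uploading to object storage and returning a URL. A sketch of the base64 route, with a stand-in synthesis step:

```python
# Sketch: base64-encode the WAV into the JSON output. generate_wav is
# a stand-in; replace it with your real synthesis step.
import base64
import io
import wave

import runpod

def generate_wav(params: dict) -> bytes:
    # Stand-in synthesis: one second of silence at 16 kHz mono.
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(16000)
        w.writeframes(b"\x00\x00" * 16000)
    return buf.getvalue()

def handler(job):
    return {
        "content_type": "audio/wav",
        "audio_b64": base64.b64encode(generate_wav(job["input"])).decode("ascii"),
    }

runpod.serverless.start({"handler": handler})
```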

Disk quota exceeded building serverless from GitHub

Hi, I'm getting a "disk quota exceeded" error when trying to build my RunPod serverless worker from a GitHub repo. It downloads a few models. Is there a maximum quota size? ...
Solution:
Okay, the build has timeouts too, so try to optimize for that.

Ollama serverless?

Is there any easy way to run Ollama over serverless?
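One pattern is to run `ollama serve` as a sidecar process in the worker and proxy prompts to its local REST API. A sketch, assuming the image installs Ollama and pulls the model at build time (11434 is Ollama's default port):

```python
# Sketch: Ollama as a sidecar inside the worker. Assumes the image
# installs ollama and runs `ollama pull <model>` at build time.
import subprocess
import time

import requests
import runpod

subprocess.Popen(["ollama", "serve"])  # one server per worker
time.sleep(5)  # crude; poll http://localhost:11434 in real code

def handler(job):
    resp = requests.post(
        "http://localhost:11434/api/generate",  # Ollama's default port
        json={
            "model": job["input"].get("model", "llama3"),
            "prompt": job["input"]["prompt"],
            "stream": False,  # return one JSON object, not a stream
        },
        timeout=300,
    )
    resp.raise_for_status()
    return resp.json()

runpod.serverless.start({"handler": handler})
```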

Serverless docker image deployment

Hi, I fine-tuned a LoRA from Llama 3.2 3B using Unsloth and want to deploy it on serverless. Using vLLM with the merged model degrades performance too much to be of use. I then followed the instructions at https://github.com/runpod-workers/worker-template/tree/main and created a serverless endpoint using the Docker image, but it keeps initializing and never completes a job; the job remains in the queue. I might be missing something. I also don't have much experience with Docker, so I might be making a mistake there, but I did test the Docker image locally before deploying. I would appreciate any help with this. ...
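A job that sits in the queue forever often means the worker never reached `runpod.serverless.start` (a crash on import, or the image not actually running the handler file), so it's worth testing the smallest possible handler first. A sketch matching the worker-template layout; run it locally with `python handler.py --test_input '{"input": {"prompt": "hi"}}'` before pushing the image:

```python
# Sketch: the smallest viable handler, matching the worker-template
# layout. Test locally before pushing the image:
#   python handler.py --test_input '{"input": {"prompt": "hi"}}'
import runpod

def handler(job):
    prompt = job["input"]["prompt"]
    # ... load the merged Llama 3.2 3B + LoRA and generate here ...
    return {"echo": prompt}

runpod.serverless.start({"handler": handler})
```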

Can you now run Gemma 3 in the vLLM container?

In serverless, it seems I'm getting an error; any help on this?