RunPod

We're a community of enthusiasts, engineers, and enterprises, all sharing insights on AI, Machine Learning and GPUs!

⚡|serverless

Enabling CLI_ARGS=--trust-remote-code

I am trying to run some of the SOTA models and the error logs tell me that I need to enable this CLI flag. How can I do that?
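
CLI_ARGS is just an environment variable that some worker images (e.g. the text-generation-webui ones) read and append to the launch command, so one hedged approach is to set it under Environment Variables in your endpoint template (key CLI_ARGS, value --trust-remote-code), or to bake it into a derived image. A minimal Dockerfile sketch, assuming your base image reads CLI_ARGS (the image name is a placeholder):

```
# Hypothetical sketch: bake the flag into a derived image.
# "your-base-worker:latest" is a placeholder for the image you deploy today.
FROM your-base-worker:latest
ENV CLI_ARGS="--trust-remote-code"
```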

CUDA profiling

Hey guys, how can I profile kernels on serverless GPUs? Say I have a CUDA kernel; how can I measure its performance on serverless GPUs like RunPod's...
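
Nsight-style profilers generally need elevated access you may not get on serverless, so a common fallback is timing with CUDA events from inside the handler. A minimal PyTorch sketch, assuming your kernel is callable from Python (e.g. via a torch extension):

```
import torch

def time_kernel_ms(fn, *args, iters=100, warmup=10):
    """Average wall time of a CUDA kernel launch + execution, in milliseconds."""
    for _ in range(warmup):          # warm up caches / lazy initialization
        fn(*args)
    torch.cuda.synchronize()
    start = torch.cuda.Event(enable_timing=True)
    end = torch.cuda.Event(enable_timing=True)
    start.record()
    for _ in range(iters):
        fn(*args)
    end.record()
    torch.cuda.synchronize()         # wait for all queued work to finish
    return start.elapsed_time(end) / iters
```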

Serverless handler on Nodejs

Hi. I see there is an official SDK for writing a serverless handler, but only for Python. I don't see any handler API in the js-sdk.
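
Until a JS handler API exists, one hedged workaround is a thin Python shim that runs the official worker loop and shells out to Node per job; worker.js here is a hypothetical script that reads the job input as JSON on stdin and prints a JSON result to stdout:

```
import json
import subprocess

import runpod  # official Python serverless SDK

def handler(job):
    # Delegate the actual work to Node. "worker.js" is your (hypothetical)
    # script that consumes the job input on stdin and emits JSON on stdout.
    proc = subprocess.run(
        ["node", "worker.js"],
        input=json.dumps(job["input"]),
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(proc.stdout)

runpod.serverless.start({"handler": handler})
```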

RunPod Serverless Inter-Service Communication: Gateway Authentication Issues

I'm developing an application with two RunPod serverless endpoints that need to communicate with each other:
- Service A: a Node.js/Express API that receives requests and dispatches processing tasks
- Service B: a Python processor that handles data and needs to notify Service A when complete ...
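
For what it's worth, calls between endpoints still go through the public gateway, so each request needs your RunPod API key in the Authorization header. A minimal sketch of Service B notifying Service A (the env-var names are placeholders):

```
import os

import requests

ENDPOINT_ID = os.environ["SERVICE_A_ENDPOINT_ID"]  # placeholder env vars
API_KEY = os.environ["RUNPOD_API_KEY"]

resp = requests.post(
    f"https://api.runpod.ai/v2/{ENDPOINT_ID}/run",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={"input": {"event": "processing_complete", "task_id": "123"}},
    timeout=30,
)
resp.raise_for_status()
print(resp.json())  # job id + status of the queued notification
```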

Runpod ComfyUI Serverless Huggingface Models does nothing

When deploying a ComfyUI serverless endpoint, the attached screen appears, asking for Hugging Face models. However, when I checked the repo, that setting is not used at all. https://github.com/search?q=repo%3Arunpod-workers%2Frunpod-worker-comfy%20MODEL_NAME&type=code How do I download the required models (.safetensors) and ComfyUI nodes when deploying an endpoint?...
Solution:
When you press Next, continue until you reach the environment-variables step; there you can see which variables the wizard adds. Then add the same environment variables to your own template that uses the same Docker image.

Serverless ComfyUI -> "error": "Error queuing workflow: HTTP Error 400: Bad Request",

I am running Serverless ComfyUI with RunPod and it is not working. Can someone please help? I keep getting Job response: { "delayTime": 1009, "error": "Error queuing workflow: HTTP Error 400: Bad Request",...
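
A 400 from workflow queuing usually means ComfyUI rejected the JSON it was handed. If this is the runpod-worker-comfy image, my understanding is that it expects a workflow exported with ComfyUI's "Save (API Format)" option, nested under input.workflow; a hedged request-shape sketch (verify against the worker's README):

```
import json
import os

import requests

# Hedged sketch: "workflow_api.json" is a placeholder for a workflow exported
# with ComfyUI's "Save (API Format)" option, not the regular save.
with open("workflow_api.json") as f:
    workflow = json.load(f)

resp = requests.post(
    f"https://api.runpod.ai/v2/{os.environ['ENDPOINT_ID']}/run",
    headers={"Authorization": f"Bearer {os.environ['RUNPOD_API_KEY']}"},
    json={"input": {"workflow": workflow}},
    timeout=30,
)
print(resp.json())
```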

Error 404 on payload download.

Hi guys! I'm trying to download a file to my endpoint for processing, using the runpod download utility, and sometimes, but not always, I get the message: ```...
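
Intermittent 404s on presigned or CDN URLs sometimes just mean the object isn't visible yet; a hedged workaround is a plain retry-with-backoff wrapper (generic requests, not the RunPod utility itself):

```
import time

import requests

def download_with_retry(url, dest, attempts=5, backoff=2.0):
    """Fetch url to dest, retrying transient 404s with linear backoff."""
    last = None
    for i in range(attempts):
        last = requests.get(url, timeout=60)
        if last.status_code == 200:
            with open(dest, "wb") as f:
                f.write(last.content)
            return dest
        time.sleep(backoff * (i + 1))   # give the object time to appear
    last.raise_for_status()             # surface the final failure
```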

Failed Faster-Whisper task

I continue to get this error and I can't figure out what's going on, please help ❤️
Job submitted: 3f6a6e02-5249-4faf-9fb3-49ac501c695d-u2
Job failed (delayTime: 163, executionTime: 184, status: FAILED, worker_id: 187psw4ygtfrrm, hostname: 187psw4ygtfrrm-64410c48, runpod_version: 1.5.2) with error_type <class 'av.error.InvalidDataError'>:
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/runpod/serverless/modules/rp_job.py", line 134, in run_job
    handler_return = handler(job)
  File "/usr/local/lib/python3.10/dist-packages/runpod/serverless/utils/rp_debugger.py", line 165, in call
    result = self.function(*args, **kwargs)
  File "/rp_handler.py", line 72, in run_whisper_job
    whisper_results = MODEL.predict(
  File "/predict.py", line 75, in predict
    segments, info = list(model.transcribe(str(audio),
  File "/usr/local/lib/python3.10/dist-packages/faster_whisper/transcribe.py", line 277, in transcribe
    audio = decode_audio(audio, sampling_rate=sampling_rate)
  File "/usr/local/lib/python3.10/dist-packages/faster_whisper/audio.py", line 46, in decode_audio
    with av.open(input_file, metadata_errors="ignore") as container:
  File "av/container/core.pyx", line 401, in av.container.core.open
  File "av/container/core.pyx", line 272, in av.container.core.Container.cinit
  File "av/container/core.pyx", line 292, in av.container.core.Container.err_check
  File "av/error.pyx", line 336, in av.error.err_check
av.error.InvalidDataError: [Errno 1094995529] Invalid data found when processing input: '/tmp/tmpi89o0mcn.wav'...
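
That InvalidDataError is PyAV saying the bytes in /tmp aren't decodable audio at all; a frequent cause is a download that saved an HTML error page under a .wav name. A hedged pre-check you could run before transcription:

```
import av  # PyAV, already a faster-whisper dependency

def is_decodable_audio(path):
    """Return True if PyAV can open the file and find an audio stream."""
    try:
        with av.open(path, metadata_errors="ignore") as container:
            return any(s.type == "audio" for s in container.streams)
    except av.error.InvalidDataError:
        return False  # not real audio: bad download, truncated file, etc.
```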

Delete Serverless Endpoint via the API?

I am trying to delete a serverless endpoint via the API, but every time I make the request I get an internal error. Via the Python API: ``` delete_endpoint_graphql = """mutation {{...
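
A minimal sketch of the raw GraphQL route, with heavy hedging: the exact mutation name below comes from my reading of RunPod's GraphQL schema, so verify it there, and note that deletion reportedly fails unless the endpoint's workers are first scaled to zero:

```
import os

import requests

API_URL = "https://api.runpod.io/graphql"
API_KEY = os.environ["RUNPOD_API_KEY"]

# Assumed mutation name; check the GraphQL schema if this errors.
# Also scale min/max workers to 0 before deleting.
query = """
mutation {
  deleteEndpoint(id: "ENDPOINT_ID")
}
"""

resp = requests.post(API_URL, params={"api_key": API_KEY}, json={"query": query})
print(resp.json())
```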

Terminate worker

Hi y'all, is there any way to terminate a specific (serverless) worker via the API, or as an additional control in the handler return? I don't want to refresh the worker; I just want to terminate it on my own special occasions....

Is it possible to response with Transfer-Encoding: Chunked

Hello, I'm using serverless endpoints. Currently I return a JSON object. Is it possible to, say, directly return a wav file with Transfer-Encoding: chunked, so the response headers would be Content-Type: audio/wav...
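
As far as I know, the queue-based endpoints always wrap responses in JSON, so chunked binary isn't an option there; the usual workarounds are base64-encoding the wav or uploading it to object storage and returning a URL. A minimal sketch of the base64 route (synthesize is a placeholder for your real audio step):

```
import base64

import runpod

def synthesize(params):
    # Placeholder for your real audio-generation step.
    return b"\x00" * 16

def handler(job):
    wav_bytes = synthesize(job["input"])
    # Responses travel as JSON, so encode the binary payload as base64;
    # for large files, upload to S3-compatible storage and return a URL instead.
    return {"audio_b64": base64.b64encode(wav_bytes).decode("ascii")}

runpod.serverless.start({"handler": handler})
```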

disk quota exceeded serverless runpod github

Hi, I'm getting a "disk quota exceeded" error when trying to build my RunPod serverless worker from a GitHub repo. It downloads a few models. Is there a maximum quota size? ...
Solution:
Okay, the build has timeouts too, so try to optimize for that.

Ollama serverless?

Is there any easy way to run Ollama on serverless?
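
A hedged sketch of one approach: run the Ollama server inside the worker and proxy jobs to its local REST API. This assumes your image has ollama installed and the model already pulled; a production version should poll /api/tags instead of sleeping:

```
import subprocess
import time

import requests
import runpod

# Start the Ollama server alongside the worker loop.
subprocess.Popen(["ollama", "serve"])
time.sleep(5)  # crude readiness wait; poll http://127.0.0.1:11434/api/tags in real code

def handler(job):
    r = requests.post(
        "http://127.0.0.1:11434/api/generate",
        json={
            "model": job["input"].get("model", "llama3"),  # assumed default model
            "prompt": job["input"]["prompt"],
            "stream": False,
        },
        timeout=600,
    )
    return r.json()

runpod.serverless.start({"handler": handler})
```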

Serverless docker image deployment

Hi, I fine-tuned a LoRA from Llama 3.2 3B using Unsloth and want to deploy it on serverless. Using vLLM with the merged model degrades performance too much to be of use. I then followed the instructions at https://github.com/runpod-workers/worker-template/tree/main and created a serverless endpoint using the Docker image, but it keeps initializing and never completes a job; the job remains in the queue. I might be missing something. I also don't have much experience with Docker, so I might be making a mistake there, but I did test the image locally before deploying. I would appreciate any help....
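
Jobs that sit in the queue forever usually mean the container never starts the RunPod worker loop (wrong CMD, crash on import, etc.). One sanity check is to strip the worker down to the template's minimal shape, confirm that completes, and only then add the model code back:

```
import runpod

def handler(job):
    # Echo the input back. If even this stays in_queue, the problem is the
    # image/CMD (e.g. CMD ["python", "-u", "rp_handler.py"]), not your model.
    return {"echo": job["input"]}

runpod.serverless.start({"handler": handler})
```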

Can you now run gemma 3 in the vllm container?

In serverless, it seems I'm getting an error; any help on this?

"Max Retries Reached"

For some reason, I see this error way more commonly now than before. Is there a reason?

"Something went wrong" trying to create a new endpoint

I'm trying to create a new serverless endpoint, and I just get an error saying "Something went wrong. Please try again later or contact support." when I hit the final Create Endpoint button. No choice seems to impact this (I get it regardless of which GPUs I choose, number of workers, network volume or not, etc.). Any ideas?

Faster-Whisper output "None" — log 400 "Bad Request"

Hi! I’m using the prebuilt Faster Whisper serverless endpoint to transcribe audio files. I send a transcription request using a direct-download Google Drive URL, and the job completes (status COMPLETED) but returns no transcription text — the output field is None. The logs show repeated 400 "Bad Request" errors when trying to return job results. It works locally, but it appears the prebuilt container isn't packaging the transcription result properly, or is returning it in a different format. Any ideas what could b...
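
One thing worth ruling out: a Google Drive "share" link serves an HTML page, not the audio bytes, which can produce an empty/None output. A hedged sketch of a direct-download URL plus the request shape I believe the prebuilt worker expects (the "audio" input key is an assumption; verify against the worker's docs):

```
import os

import requests

file_id = "YOUR_DRIVE_FILE_ID"  # placeholder
audio_url = f"https://drive.google.com/uc?export=download&id={file_id}"

resp = requests.post(
    f"https://api.runpod.ai/v2/{os.environ['ENDPOINT_ID']}/run",
    headers={"Authorization": f"Bearer {os.environ['RUNPOD_API_KEY']}"},
    json={"input": {"audio": audio_url}},  # assumed input key for the worker
    timeout=30,
)
print(resp.json())
```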

Can someone help me integrate a JS docker endpoint that executes FFMPEG?

No matter what I try, the job never gets processed and stays "in_queue". Please help

Anyone get vLLM working with reasonable response times?

Seems that no matter how I configure serverless with vLLM, the workers are very slow to pick up tasks, and even with warm containers, tasks sit in the queue for minutes for no obvious reason. Has anyone actually been able to use serverless vLLM for a production use case?