RunPod

We're a community of enthusiasts, engineers, and enterprises, all sharing insights on AI, Machine Learning and GPUs!

Kernel dying issue

Starting today, the kernel has suddenly stopped working properly, and it keeps dying or failing to run. I need to quickly check the results, but all my work has come to a halt. I need a quick response regarding this kernel dying issue.

Question about delay and execution time billing

Is the total billable time = delay + execution, or is the delay part of the execution time? In the example, am I being billed for 80s (execution time) or 95s (execution + delay)?...

Dependencies version issue between gradio and runpod

Hello everyone, I'm trying to rebuild the Docker image from the runpod-workers repository for A1111 to change the available model, and possibly add ADetailer and OpenPose to it. https://github.com/runpod-workers/worker-a1111 When I build the image I get a dependency conflict that I can't solve. The conflict is caused by:...

Serverless Deforum

Hello everyone! 👋 I'm new to RunPod and super excited to join this community. My goal is to build a simple website that multiple users can access with accounts to generate images and videos. The image generation part is already working—Serverless Stable Diffusion is up and running! ☀️ I'm really proud I got this working! 💪...

Video Editing

Hey, I currently have some Python scripts that automate video editing with MoviePy and some direct ffmpeg calls, and I was looking into using RunPod serverless for the speed of GPU encoding with NVENC, adding the functionality to our web tool so other members of my team can use it. Would this be cost-effective, and is the performance change worth it? Or should I just upgrade my VPS CPU?
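
For scale, a serverless handler for this kind of job can be as small as the sketch below. The input fields and file transport are hypothetical, and it assumes a worker image with an ffmpeg build that includes NVENC support:

```
import subprocess

import runpod

def handler(job):
    # Hypothetical input schema: paths on an attached network volume.
    src = job["input"]["src_path"]
    dst = job["input"]["dst_path"]
    # Re-encode with NVENC so the GPU does the heavy lifting.
    subprocess.run(
        ["ffmpeg", "-y", "-i", src, "-c:v", "h264_nvenc", "-preset", "p5", dst],
        check=True,
    )
    return {"output_path": dst}

runpod.serverless.start({"handler": handler})
```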

Execution time discrepancy

I built a custom text-embedding worker. When I time the request on the pod, it takes about 20 ms to process from start to finish. The request takes a lot longer (about 1.5 seconds), and RunPod returns executionTime: 1088ms in the response object. Do you know where this discrepancy might come from? As it is, it's really limiting the throughput of my worker, and there isn't much point in using a GPU if it's so heavily bottlenecked....
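
One way to narrow it down is to time the handler body itself and return that alongside the result, then compare it against the reported executionTime. A debugging sketch (embed() here is a stand-in, not the worker's actual code):

```
import time

import runpod

def embed(text):
    # Stand-in for the real model call from the post.
    return [0.0] * 768

def handler(job):
    t0 = time.perf_counter()
    vector = embed(job["input"]["text"])
    elapsed_ms = (time.perf_counter() - t0) * 1000
    # If handler_ms stays ~20 ms while executionTime is ~1000 ms, the gap
    # lives outside the handler: payload (de)serialization, queue pickup,
    # or per-request setup rather than the embedding itself.
    return {"embedding": vector, "handler_ms": elapsed_ms}

runpod.serverless.start({"handler": handler})
```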

Speed

Hey guys, just wondering if your RunPod serverless speeds were good. I'm running Llama 3.1 8B on 16 GB VRAM.

Understanding RunPod Serverless Pods: Job Execution and Resources Allocation

I'm new to RunPod and need clarification on how serverless pods work. Here's my understanding:
- RunPod serverless pods allow code to run when triggered, eliminating idle costs.
- Code is executed as a job by a worker, accessed through an endpoint.
- I can specify the number of jobs a worker can run....
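
That matches the basic model: the whole worker contract is a handler function registered with the SDK, and each triggered request arrives as a job. A minimal sketch:

```
import runpod

def handler(job):
    # Each queued request arrives as a job dict; whatever you return
    # becomes the job's output, retrievable via the endpoint.
    return {"echo": job["input"]}

runpod.serverless.start({"handler": handler})
```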

How to force /runsync over 60 secs

Need to keep /runsync alive for over 60 seconds. No webhooks, no async. I just want /runsync to work as-is, just with longer execution times.

CORS issues

Access to XMLHttpRequest at 'https://api.runpod.ai/v2/bsy98fzdbod86f/run' from origin 'https://**********prod.web.app/' has been blocked by CORS policy: Request header field access-control-allow-origin is not allowed by Access-Control-Allow-Headers in preflight response. Any solutions for this?...

Sync endpoint returns prematurely

The sync endpoint sometimes (about half of the time) responds prematurely with an "in progress" JSON body. The job does finish, however; I need the sync call not to respond until the job is done.
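
A common defensive pattern, assuming the premature response includes the job id, is to fall back to polling the /status route until the job settles. A sketch with placeholder credentials:

```
import time

import requests

API_KEY = "..."       # placeholder
ENDPOINT_ID = "..."   # placeholder
BASE = f"https://api.runpod.ai/v2/{ENDPOINT_ID}"
HEADERS = {"Authorization": f"Bearer {API_KEY}"}

def run_sync_with_fallback(payload):
    resp = requests.post(f"{BASE}/runsync", json=payload, headers=HEADERS).json()
    # If /runsync hands back an unfinished job, poll /status until it settles.
    while resp.get("status") in ("IN_QUEUE", "IN_PROGRESS"):
        time.sleep(1)
        resp = requests.get(f"{BASE}/status/{resp['id']}", headers=HEADERS).json()
    return resp
```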

Is it possible to see logs of a historical job ID?

I've had a user mention that their image didn't process due to a processing error, so I would like to see the logs leading up to the error. I have the job ID (from a day ago); can you advise how I can see the worker logs for that particular job in RunPod? FWIW, the job ID is 58f1b0ce-d4de-4711-b58a-1c42bb3d5017-u1...

Implement RAG with vllm API

Is it possible to implement RAG with the given vLLM API and our deployed model on the serverless endpoint?
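
In principle, yes: do the retrieval on your side and send the augmented prompt to the endpoint. A rough sketch, where retrieve() is a stand-in for your own vector search and the input schema follows the worker-vllm prompt format (double-check it against your deployment):

```
import requests

API_KEY = "..."       # placeholder
ENDPOINT_ID = "..."   # placeholder

def retrieve(query):
    # Stand-in for your vector store lookup (FAISS, pgvector, etc.).
    return ["...context chunk 1...", "...context chunk 2..."]

def rag_answer(query):
    context = "\n".join(retrieve(query))
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {query}\nAnswer:"
    resp = requests.post(
        f"https://api.runpod.ai/v2/{ENDPOINT_ID}/runsync",
        headers={"Authorization": f"Bearer {API_KEY}"},
        json={"input": {"prompt": prompt, "sampling_params": {"max_tokens": 256}}},
    )
    return resp.json()
```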

How to deploy flux.schnell to serverless?

Title says it all. Would be nice to have a guide on how to set up Flux on a serverless endpoint. Also, I'm planning to train some LoRAs and store them for future use...
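
Until there's an official guide, the usual route is a diffusers-based handler. A rough sketch (the dtype, step count, and base64 transport are choices you'd tune, not a confirmed recipe):

```
import base64
import io

import runpod
import torch
from diffusers import FluxPipeline

# Loaded once per worker, outside the handler, so warm requests skip it.
pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-schnell", torch_dtype=torch.bfloat16
).to("cuda")

def handler(job):
    prompt = job["input"]["prompt"]
    # schnell is distilled for few steps; 4 is the commonly cited setting.
    image = pipe(prompt, num_inference_steps=4, guidance_scale=0.0).images[0]
    buf = io.BytesIO()
    image.save(buf, format="PNG")
    return {"image_base64": base64.b64encode(buf.getvalue()).decode()}

runpod.serverless.start({"handler": handler})
```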

When ttl is not specified in policy, one gets 500 with {"error":"ttl must be >= 10,000 ms"}

For the last ~40 minutes, all my requests with 'executionTimeout': 120000 have been getting this error with HTTP status 500. Here is my repro:
```
curl -v -X POST "https://api.runpod.ai/v1/XXX/run" -H "Authorization: Bearer XXX" -H "Content-Type: application/json" -d '{"input": {"XXX": "XXX"}, "policy": {"executionTimeout": 120000}}'
```
...
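
If the server really is rejecting a defaulted ttl, one thing to try while it's broken is sending an explicit ttl alongside executionTimeout. An untested sketch (the explicit-ttl workaround is an assumption based on the error text, not confirmed behavior):

```
import requests

# Same placeholder endpoint/key as the curl repro above.
resp = requests.post(
    "https://api.runpod.ai/v1/XXX/run",
    headers={"Authorization": "Bearer XXX", "Content-Type": "application/json"},
    json={
        "input": {"XXX": "XXX"},
        # Explicit ttl >= 10,000 ms, so the server-side default is never used
        # (assumption: the 500 comes from a bad default, per the error text).
        "policy": {"executionTimeout": 120000, "ttl": 300000},
    },
)
print(resp.status_code, resp.text)
```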

Pushing a new release to my existing endpoint takes too long

I've pushed a new release to my endpoint with a new Docker image tag. This tag only modifies some app code, so all the heavy Docker layers should already be there. The system logs show "Pulling fs layer" for most of the layers except the first 5. Isn't RunPod caching the layers somewhere, or does it have to pull ALL layers every time I push a new release, even though only the last layer has changed...?...

Serverless worker doesn't run asynchronously until I request its status in local development

I'm following the docs and created a very simple handler.py with the following:
```
import runpod
...
```

Increase workers

I have a requirement of 50 API calls at a time. Currently there are only 5 workers on the serverless endpoint, and it's taking too long since each API response takes around 25 seconds (50 calls across 5 workers at ~25 s each means the last caller waits roughly 250 s). Any solutions? Anyone from the team, please reach out. Thank you!...

Do I need to base my serverless worker image from the official base image?

I have my own Dockerfile already optimized with only the things I need for inference. The RunPod docs say we should start by forking the worker-template, but building from it I end up with a HUGE image. Is there anything special in the runpod/base image, or can I just use my own and simply make sure I'm installing the runpod Python package and launching a handler function with CMD at the end?...