Runpod

We're a community of enthusiasts, engineers, and enterprises, all sharing insights on AI, Machine Learning, and GPUs!


⚡|serverless

⛅|pods

🔧|api-opensource

📡|instant-clusters

🗂|hub

Execution time discrepancy

I built a custom text embedding worker. When I time the request on the pod, it takes about 20 ms to process from start to finish. The request itself takes a lot longer (about 1.5 seconds), and RunPod returns executionTime: 1088ms in the response object. Do you know where this discrepancy might come from? As it is, it's really limiting the throughput of my worker, and there isn't much point in using a GPU if it's this heavily bottlenecked....
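
The usual first step with a gap like this is to split it into queueing/cold-start time versus handler time: RunPod's response object reports both delayTime and executionTime in milliseconds. A minimal client-side sketch (the endpoint ID, payload shape, and 120 s timeout are placeholder assumptions):

```
import os
import time

import requests

ENDPOINT_ID = "your-endpoint-id"  # placeholder
API_KEY = os.environ["RUNPOD_API_KEY"]

t0 = time.perf_counter()
resp = requests.post(
    f"https://api.runpod.ai/v2/{ENDPOINT_ID}/runsync",
    json={"input": {"text": "hello world"}},
    headers={"Authorization": f"Bearer {API_KEY}"},
    timeout=120,
)
wall_ms = (time.perf_counter() - t0) * 1000

body = resp.json()
# delayTime     = time spent queued / waiting for a worker (incl. cold start)
# executionTime = time the handler itself ran; both in milliseconds
print(f"wall clock:    {wall_ms:.0f} ms")
print(f"delayTime:     {body.get('delayTime')} ms")
print(f"executionTime: {body.get('executionTime')} ms")
```

If delayTime dominates, the bottleneck is scheduling or cold starts rather than the model; if executionTime is far above the 20 ms measured inside the handler, the overhead is in the worker wrapper (e.g. input deserialization or per-job setup).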

Speed

Hey guys, just wondering if your RunPod serverless speeds are good. I'm running Llama 3.1 8B on 16 GB of VRAM.

Understanding RunPod Serverless Pods: Job Execution and Resource Allocation

I'm new to RunPod and need clarification on how serverless pods work. Here's my understanding:
- RunPod serverless pods allow code to run when triggered, eliminating idle costs.
- Code is executed as a job by a worker, accessed through an endpoint.
- I can specify the number of jobs a worker can run....
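
That summary matches the serverless model: a worker is a container that pulls jobs from the endpoint's queue and hands each job's input to a handler function. A minimal sketch using the runpod Python SDK (the input field name is an assumption):

```
import runpod

def handler(job):
    # Each request submitted to the endpoint arrives as a "job";
    # job["input"] is the JSON payload sent to /run or /runsync.
    text = job["input"].get("text", "")
    return {"echo": text}

# Blocks and pulls jobs from the endpoint queue until the worker is scaled down.
runpod.serverless.start({"handler": handler})
```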

How to force /runsync past 60 seconds

Need to keep /runsync alive for over 60 seconds. No webhooks, no async. I just want /runsync to work as-is, only with longer execution times.
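
If /runsync's hold window can't be stretched far enough, the standard fallback that still feels synchronous to the caller is to submit via /run and block client-side on /status. A sketch (endpoint ID and payload are placeholders; the terminal status names follow RunPod's job states):

```
import os
import time

import requests

ENDPOINT_ID = "your-endpoint-id"  # placeholder
HEADERS = {"Authorization": f"Bearer {os.environ['RUNPOD_API_KEY']}"}
BASE = f"https://api.runpod.ai/v2/{ENDPOINT_ID}"

# Submit asynchronously, then block in the client until the job finishes.
job = requests.post(f"{BASE}/run", json={"input": {"text": "..."}}, headers=HEADERS).json()

while True:
    status = requests.get(f"{BASE}/status/{job['id']}", headers=HEADERS).json()
    if status["status"] in ("COMPLETED", "FAILED", "CANCELLED", "TIMED_OUT"):
        break
    time.sleep(1)

print(status.get("output") or status)
```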

CORS issues

Access to XMLHttpRequest at 'https://api.runpod.ai/v2/bsy98fzdbod86f/run' from origin 'https://**********prod.web.app/' has been blocked by CORS policy: Request header field access-control-allow-origin is not allowed by Access-Control-Allow-Headers in preflight response. Any solutions for this?...
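
Two things are going on in that error: Access-Control-Allow-Origin is a *response* header, so sending it as a request header makes the preflight fail; and calling api.runpod.ai directly from the browser exposes the API key anyway. The common fix is a small backend proxy. A minimal sketch assuming Flask and requests (the route name and endpoint ID are hypothetical, and a production version would also answer OPTIONS preflights):

```
import os

import requests
from flask import Flask, jsonify, request

app = Flask(__name__)
RUNPOD_URL = "https://api.runpod.ai/v2/<endpoint-id>/run"  # placeholder

@app.post("/generate")
def generate():
    # The browser talks to this server; the RunPod API key stays server-side.
    resp = requests.post(
        RUNPOD_URL,
        json={"input": request.get_json()},
        headers={"Authorization": f"Bearer {os.environ['RUNPOD_API_KEY']}"},
    )
    out = jsonify(resp.json())
    # CORS headers belong on the response, set by the server you control.
    out.headers["Access-Control-Allow-Origin"] = "*"
    return out
```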

Sync endpoint returns prematurely

The sync endpoint sometimes (about half of the time) responds prematurely with an 'in progress' JSON body. The job does finish, however; I need the sync call not to return until the job is done.

Is it possible to see logs of a historical job ID?

I've had a user mention that their image didn't process due to a processing error, so I would like to see the logs leading up to the error. I have the job ID (from a day ago); can you advise how I can see the worker logs for that particular job in RunPod? FWIW, the job ID is 58f1b0ce-d4de-4711-b58a-1c42bb3d5017-u1...

Implement RAG with the vLLM API

Is it possible to implement RAG with the vLLM API and our model deployed on a serverless endpoint?
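
RAG doesn't need anything special from the endpoint itself: retrieval happens client-side, and the retrieved context is folded into the prompt. A sketch assuming the worker-vllm image's OpenAI-compatible route and the openai Python client, with retrieval stubbed out (the endpoint ID, model name, and retrieve() helper are placeholders):

```
import os

from openai import OpenAI

client = OpenAI(
    base_url="https://api.runpod.ai/v2/<endpoint-id>/openai/v1",  # placeholder
    api_key=os.environ["RUNPOD_API_KEY"],
)

def retrieve(query: str) -> str:
    # Stub: swap in your vector store (FAISS, pgvector, ...) here.
    return "Relevant passages for: " + query

query = "What does the warranty cover?"
resp = client.chat.completions.create(
    model="<model-name>",  # the model the endpoint was deployed with
    messages=[
        {"role": "system", "content": f"Answer using only this context:\n{retrieve(query)}"},
        {"role": "user", "content": query},
    ],
)
print(resp.choices[0].message.content)
```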

How to deploy flux.schnell to serverless?

Title says it all. Would be nice to have a guide on how to set up Flux on a serverless endpoint. Also, I'm planning to train some LoRAs and store them for future use....
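
Pending an official guide, a handler for FLUX.1-schnell can be sketched with diffusers' FluxPipeline; the step count and base64 output format are assumptions, and the worker needs a GPU with enough VRAM for the model:

```
import base64
import io

import runpod
import torch
from diffusers import FluxPipeline

# Loaded once per worker, outside the handler, so warm requests skip it.
pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-schnell", torch_dtype=torch.bfloat16
).to("cuda")

def handler(job):
    prompt = job["input"]["prompt"]
    # schnell is distilled for few-step, guidance-free sampling.
    image = pipe(prompt, num_inference_steps=4, guidance_scale=0.0).images[0]
    buf = io.BytesIO()
    image.save(buf, format="PNG")
    return {"image_base64": base64.b64encode(buf.getvalue()).decode()}

runpod.serverless.start({"handler": handler})
```

For the LoRA part, diffusers pipelines expose load_lora_weights(), so weights stored on a network volume could be loaded at startup or per request.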

When ttl is not specified in the policy, one gets a 500 with {"error":"ttl must be \u003e= 10,000 ms"}

For the last ~40 minutes, all my requests with 'executionTimeout': 120000 have been getting this error with HTTP status 500. Here is my repro: curl -v -X POST "https://api.runpod.ai/v1/XXX/run" -H "Authorization: Bearer XXX" -H "Content-Type: application/json" -d '{"input": {"XXX": "XXX"}, "policy": {"executionTimeout": 120000}}'...
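
Going by the error text, the server now insists on a ttl whenever a policy object is present, so a possible workaround until this is fixed is to set one explicitly alongside executionTimeout (both in milliseconds). A Python equivalent of the repro above, with an assumed ttl value:

```
import os

import requests

resp = requests.post(
    "https://api.runpod.ai/v1/<endpoint-id>/run",  # same path shape as the repro
    headers={"Authorization": f"Bearer {os.environ['RUNPOD_API_KEY']}"},
    json={
        "input": {"prompt": "..."},
        "policy": {
            "executionTimeout": 120000,  # stop the job after 120 s of execution
            "ttl": 600000,  # expire the job from the queue after 10 min (assumed value)
        },
    },
)
print(resp.status_code, resp.json())
```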

Pushing a new release to my existing endpoint takes too long

I've pushed a new release to my endpoint with a new Docker image tag. This tag only modifies some app code, so all the heavy Docker layers should already be there. The system logs show "Pulling fs layer" for most of the layers except the first 5. Isn't RunPod caching the layers somewhere, or does it have to pull ALL layers every time I push a new release, even though only the last layer has changed...?...
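
Whatever the host-side cache does, the part you control is layer ordering: if dependencies are installed before the app code is copied in, only the final layers should differ between releases. A Dockerfile sketch (the base image tag and paths are placeholders):

```
# Heavy, rarely-changing layers first so they stay byte-identical across releases.
# Base image tag is a placeholder.
FROM runpod/base:<tag>

COPY requirements.txt /requirements.txt
RUN pip install --no-cache-dir -r /requirements.txt

# App code last: editing it should invalidate only this final layer.
COPY src/ /app/
CMD ["python", "-u", "/app/handler.py"]
```

If the pull logs still show every layer downloading on a fresh host, that is host-side cache behavior; the ordering above at least keeps registry uploads and warm-host pulls small.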

Serverless worker doesn't run asynchronously until I request its status in local development

I'm following the docs and created a very simple handler.py with the following:
```
import runpod
...
```
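
For local runs, the async behavior is easiest to check with the SDK's test flags rather than by polling the status route. A sketch of an async handler plus the local invocations (the sleep is a stand-in for real work; the flags come from the runpod SDK docs):

```
import asyncio

import runpod

async def handler(job):
    # With an async handler the worker can await I/O without blocking.
    await asyncio.sleep(1)  # stand-in for real async work
    return {"status": "done", "echo": job["input"]}

runpod.serverless.start({"handler": handler})

# Local testing (per the SDK docs):
#   python handler.py --test_input '{"input": {"text": "hi"}}'  # run one job inline
#   python handler.py --rp_serve_api                            # serve a local test API
```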

Increase workers

I have a requirement of 50 API calls at a time. Currently there are only 5 workers on the serverless endpoint, and it's taking too long since each API response takes around 25 seconds. Any solutions? Anyone from the team, please reach out. Thank you!...
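
Besides raising the endpoint's max worker count, the runpod SDK lets one worker take several jobs concurrently when the handler is async and not GPU-saturated, via a concurrency_modifier. A sketch (run_inference and the fixed concurrency of 4 are assumptions):

```
import asyncio

import runpod

async def run_inference(payload):
    # Hypothetical stand-in for your real async model call.
    await asyncio.sleep(25)  # roughly the per-call latency mentioned above
    return {"output": payload}

async def handler(job):
    return await run_inference(job["input"])

def concurrency_modifier(current_concurrency: int) -> int:
    # Called by the SDK to decide how many jobs this worker may run at once.
    # A real version could adapt to load instead of returning a constant.
    return 4

runpod.serverless.start({
    "handler": handler,
    "concurrency_modifier": concurrency_modifier,
})
```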

Do I need to base my serverless worker image on the official base image?

I have my own Dockerfile, already optimized with only the things I need to perform inference. The RunPod docs say we should start by forking the worker-template, but basing my image on it leaves me with a HUGE image. Is there anything special in the runpod/base image, or can I just use my own and simply make sure I install the runpod Python package and expose a handler function via CMD at the end?...
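
As far as the worker contract goes, nothing more seems to be required than the runpod SDK and a handler started via CMD, so a custom image along these lines should work (a sketch; python:3.11-slim is an assumption, and GPU inference would normally start from an nvidia/cuda or framework base instead):

```
# Minimal worker image that skips the template entirely (sketch).
FROM python:3.11-slim

COPY requirements.txt .
RUN pip install --no-cache-dir runpod -r requirements.txt

COPY handler.py .
# -u: unbuffered output so print/log lines appear promptly in the console.
CMD ["python", "-u", "handler.py"]
```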

Why is the Docker image for my serverless endpoint not updating?

Hi team, I pushed a new version of my Docker image to my personal Docker Hub, and I want to update my serverless endpoint to use the latest image. I clicked the new release in my endpoint settings, but it isn't working for me; my RunPod endpoint shows no sign of updating. Can anyone help?...

Worker keeps dying while training a LoRA model

Even after setting the worker to be active, it keeps dying after about 2 minutes. Is there a way to prevent this?

Long latencies

I have a 7B model that is supposed to be very fast (it checks whether a claim is supported by a context and gives a yes/no answer). If I rent an H100, I can process my prompt and get a response in 100 ms (for a prompt of about 1,400 words). But a very short prompt (about 200 words) takes about 1.3 to 1.5 seconds on serverless. I tried using active workers, but that didn't help. Any tips on how to reduce the latency?...

Edit endpoint with a new Docker image

Is it possible to update a deployed endpoint with a new Docker image linked to its template?

Running a Specific Model Revision on a Serverless vLLM Worker

How do I specify the model revision on serverless? I was looking through the readme in https://github.com/runpod-workers/worker-vllm and I see I can build a Docker image with the revision I want, but is that the only way to go about this? Specifically, I want to set up this Hugging Face model: https://huggingface.co/anthracite-org/magnum-v2-123b-exl2 edit: fixed the model link...