Runpod

We're a community of enthusiasts, engineers, and enterprises, all sharing insights on AI, Machine Learning and GPUs!

⚡|serverless

⛅|pods

🔧|api-opensource

📡|instant-clusters

🗂|hub

Failed to queue job

Why am I getting this error? I switched GPUs on serverless and ran the same API call. { "delayTime": 2379, "error": "Failed to queue job",...

ComfyUI ValueError: not allowed to raise maximum limit

I deployed ComfyUI with the comfyui_controlnet_aux plugin included. When I run the runpod test locally it works fine, but after deploying to serverless I get the following error.
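
This ValueError is raised by Python's resource.setrlimit when code asks for a limit above the container's hard cap, which serverless containers often set lower than a local machine. A minimal sketch of clamping the request, assuming the failing call is an RLIMIT_NOFILE bump somewhere in the plugin's startup path (that origin is an assumption):

```python
# Hedged sketch: clamp a file-descriptor limit bump to the container's hard cap
# so setrlimit never asks for more than the runtime allows.
import resource

def bump_nofile_limit(desired: int = 65535) -> None:
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    target = min(desired, hard)  # never exceed the hard limit
    if target > soft:
        resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))

bump_nofile_limit()
```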

Webhook duplicate requests

Hi, I noticed webhook requests being replayed starting last night. I would like to understand the webhook behavior: is it possible to receive the same request multiple times even when the first request was received successfully?...
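
Queue-backed webhook delivery is commonly at-least-once, so it is safest to treat duplicates as expected and deduplicate on the job id. A minimal sketch of an idempotent receiver, assuming the webhook payload carries an "id" field (check your actual payload):

```python
# Hedged sketch: ignore repeat webhook deliveries by remembering job ids.
from fastapi import FastAPI, Request

app = FastAPI()
seen_job_ids: set[str] = set()  # use Redis or a database in production

@app.post("/runpod-webhook")
async def runpod_webhook(request: Request):
    payload = await request.json()
    job_id = payload.get("id")  # assumption: the payload includes the job id here
    if job_id in seen_job_ids:
        return {"status": "duplicate ignored"}
    seen_job_ids.add(job_id)
    # ... process the result exactly once ...
    return {"status": "ok"}
```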

Request Format Runpod VLLM Worker

```json
{
  "conversation": {
    "id": "some_conversation_id",
    "messages": [...
```
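
For comparison, a minimal sketch of calling a vLLM worker endpoint with an OpenAI-style messages list wrapped in RunPod's standard input envelope. The "messages" and "sampling_params" keys are assumptions; check the vLLM worker's README for the schema your version expects.

```python
# Hedged sketch: POST an OpenAI-style chat payload to a RunPod vLLM endpoint.
import requests

ENDPOINT_ID = "your_endpoint_id"   # placeholder
API_KEY = "your_runpod_api_key"    # placeholder

payload = {
    "input": {
        "messages": [{"role": "user", "content": "Summarize this conversation."}],
        "sampling_params": {"max_tokens": 256, "temperature": 0.7},
    }
}

resp = requests.post(
    f"https://api.runpod.ai/v2/{ENDPOINT_ID}/runsync",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json=payload,
    timeout=120,
)
print(resp.json())
```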

image returns as base64

I'm using the SD Anything v5 model. When I make an API call (RUNSYNC/RUN) I get the image back as base64 text. How do I get it to return a link to the image? My only header besides authorization is Content-Type: application/json...
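
The endpoint returns whatever the handler returns, so one common workaround is to upload the image inside the handler and return a link instead of base64. A minimal sketch assuming an S3-compatible bucket and a handler you can modify (both assumptions, since the stock worker returns base64):

```python
# Hedged sketch: turn a base64 image into a presigned URL from inside the handler.
import base64
import uuid
import boto3

s3 = boto3.client("s3")          # credentials come from env vars / instance config
BUCKET = "my-output-bucket"      # placeholder bucket name

def image_to_url(image_b64: str) -> str:
    key = f"outputs/{uuid.uuid4()}.png"
    s3.put_object(Bucket=BUCKET, Key=key,
                  Body=base64.b64decode(image_b64), ContentType="image/png")
    # Presigned URL so the object does not have to be public.
    return s3.generate_presigned_url(
        "get_object", Params={"Bucket": BUCKET, "Key": key}, ExpiresIn=3600
    )
```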

Request stuck in "IN_QUEUE" status

Hi, I tried to set up an endpoint for the "rembg" package using cog. It seems to run fine locally, but when I hit the serverless endpoint the request is always "IN_QUEUE", even though the logs show that the server is up and running. Do you know what the issue could be?...

Runpod vLLM CUDA Out of Memory

Hi, I've been using the default runpod vLLM template with the Mixtral model loaded in the network volume. I'm encountering CUDA out of memory errors on cold starts. Here is the error log. ...
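
Cold-start OOM with a model this size usually means the weights plus KV cache do not fit on the selected GPU. In vLLM's own Python API the relevant knobs look like the sketch below; the RunPod vLLM worker exposes equivalents as environment variables, but the exact variable names depend on the worker version, so treat this as an illustration only.

```python
# Hedged sketch of the vLLM parameters that typically drive memory usage.
from vllm import LLM

llm = LLM(
    model="mistralai/Mixtral-8x7B-Instruct-v0.1",  # assumption: the Mixtral variant in use
    gpu_memory_utilization=0.90,  # leave headroom for the CUDA context
    max_model_len=8192,           # shorter context -> smaller KV cache
    tensor_parallel_size=2,       # shard across two GPUs if one card is too small
)
```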

Automate the generation of the ECR token in a Serverless endpoint?

I want to use AWS ECR to store my serverless images. However, the token expires and I can't find a way to automate its regeneration. Please let me know if there is a way to do this.
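
ECR authorization tokens expire after roughly 12 hours, so the usual approach is a scheduled job (cron, Lambda, etc.) that regenerates the token and writes the fresh credentials back to the endpoint's registry settings. A minimal sketch of the regeneration half; the write-back to RunPod is left as a placeholder since that call isn't shown here:

```python
# Hedged sketch: fetch a fresh ECR username/password pair with boto3.
import base64
import boto3

def get_ecr_credentials(region: str = "us-east-1") -> tuple[str, str, str]:
    ecr = boto3.client("ecr", region_name=region)
    auth = ecr.get_authorization_token()["authorizationData"][0]
    username, password = base64.b64decode(auth["authorizationToken"]).decode().split(":")
    return username, password, auth["proxyEndpoint"]

username, password, registry = get_ecr_credentials()
# TODO: push these credentials to the RunPod container-registry settings
# before the token expires (~12 hours).
```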

Worker handling multiple requests concurrently

I have an application where a single worker can handle multiple requests concurrently, but I can't find a way to allow this in runpod serverless. Multiple requests are always queued when using a single worker. Is this possible?...
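
The Python SDK can run more than one job per worker when the handler is async and a concurrency modifier is configured; support depends on the runpod-python version, so treat the option name below as something to verify against your SDK's docs.

```python
# Hedged sketch: an async handler plus a concurrency modifier so one worker can
# hold several jobs in flight (assumes an SDK version with "concurrency_modifier").
import asyncio
import runpod

async def handler(job):
    await asyncio.sleep(1)            # non-blocking work lets jobs overlap
    return {"echo": job["input"]}

def concurrency_modifier(current_concurrency: int) -> int:
    return 4                          # allow up to 4 concurrent jobs on this worker

runpod.serverless.start({
    "handler": handler,
    "concurrency_modifier": concurrency_modifier,
})
```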

Issue with a worker hanging at start

No code changes. I'm also hitting an issue where workers are stuck on "loading image from cache".

Serverless inference API

Does RunPod offer an Automatic1111 serverless API?

Do you get charged whilst your request is waiting on throttled workers?

I did everything I could to reduce cold start times on my end. I managed to install everything onto a Docker container (avoiding a Network Volume), and I'm using the official runpod A1111 worker with minor modifications. Unfortunately, the cold start times are still random. I noticed that despite setting 3 max workers on the endpoint, all 5 workers would get the "Throttled" status, which I'm guessing is why the cold start times are so random. If I'm getting a very high request time due to throttled workers - in this case, one basic request took 170 seconds - am I getting charged for the entire 170 seconds?...

Is there a way to send a request to cancel a job if it takes too long?

I'm trying to find a way to cancel an API request if it takes too long. This is so that I have a way to deal with requests that are stuck "IN_QUEUE". Preferably something like: https://api.runpod.ai/v2/{endpoint_id}/cancel_job/{job_id}...
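
As I read the public API docs, the cancel route is /cancel/{job_id} rather than /cancel_job/, and a per-job execution timeout can be set at submission time; both details are worth confirming against the current API reference. A minimal sketch:

```python
# Hedged sketch: cancel a stuck job, and cap runtime up front with an execution policy.
import requests

API_BASE = "https://api.runpod.ai/v2"
ENDPOINT_ID = "your_endpoint_id"   # placeholder
API_KEY = "your_runpod_api_key"    # placeholder
HEADERS = {"Authorization": f"Bearer {API_KEY}"}

def cancel_job(job_id: str) -> dict:
    resp = requests.post(f"{API_BASE}/{ENDPOINT_ID}/cancel/{job_id}", headers=HEADERS)
    return resp.json()

def run_with_timeout(payload: dict, timeout_ms: int = 120_000) -> dict:
    body = {"input": payload, "policy": {"executionTimeout": timeout_ms}}
    resp = requests.post(f"{API_BASE}/{ENDPOINT_ID}/run", headers=HEADERS, json=body)
    return resp.json()
```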

How to upload a file using an upload API in GPU serverless?

This is my current code; there is a separate FastAPI server running.

```python
automatic_session = requests.Session()
retries = Retry(total=10, backoff_factor=0.1, status_forcelist=[502, 503, 504])
automatic_session.mount('http://', HTTPAdapter(max_retries=retries))
...
```
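
Building on that snippet, a file upload through a retrying session is a standard multipart POST. A minimal sketch; the /upload route, port, and "file" field name are placeholders for whatever the FastAPI app actually defines:

```python
# Hedged sketch: multipart file upload through a requests session with retries.
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

session = requests.Session()
session.mount("http://", HTTPAdapter(max_retries=Retry(
    total=10, backoff_factor=0.1, status_forcelist=[502, 503, 504])))

def upload_file(path: str, url: str = "http://127.0.0.1:8000/upload") -> dict:
    with open(path, "rb") as fh:
        resp = session.post(url, files={"file": fh}, timeout=60)
    resp.raise_for_status()
    return resp.json()
```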

All of the workers are throttled even though it shows medium availability?

When we created a serverless endpoint, we noticed that none of our queries were being processed. When we looked inside the endpoint, we saw that all the workers were throttled. However, these machines appear to be available according to their availability status. How can we solve this?

Unreasonably high start times on serverless workers

I'm trying to deploy a serverless endpoint for A1111 instances using a preconfigured network volume. I've followed the steps shown in this tutorial: https://www.youtube.com/watch?v=gv6F9Vnd6io But my workers seem to run for multiple minutes with the container logs filled with the same message: "Service not ready yet. Retrying..." Am I missing something here?...

Using Same GPU for multiple requests?

Hello @here, I am using ComfyUI plus my own custom scripts to generate images. I have set it up on RunPod Serverless (A100 GPUs) in the following way: the request contains an image URL....
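
A minimal sketch of that pattern: the handler downloads the image from the URL in the request input before handing it to the ComfyUI workflow. The "image_url" key and the run_comfy_workflow call are placeholders for the custom scripts.

```python
# Hedged sketch: fetch the input image inside the handler, then run the workflow.
import tempfile
import requests
import runpod

def handler(job):
    image_url = job["input"]["image_url"]        # assumption: key used by the caller
    resp = requests.get(image_url, timeout=30)
    resp.raise_for_status()
    with tempfile.NamedTemporaryFile(suffix=".png", delete=False) as tmp:
        tmp.write(resp.content)
        local_path = tmp.name
    # result = run_comfy_workflow(local_path)    # placeholder for the custom pipeline
    return {"input_image": local_path}

runpod.serverless.start({"handler": handler})
```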

Creating serverless templates via GraphQL

I want to automatically create serverless templates from the command line / Python. I tried the example for creating a new pod here: https://docs.runpod.io/docs/create-pod-template which worked. However, when I try to create a serverless template and add isServerless: true to the GraphQL request (as per the PodTemplate documentation), it errors out with: ...
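
For reference, a sketch of how the request might look with isServerless added. The saveTemplate mutation and field names follow my reading of RunPod's GraphQL schema and should be treated as assumptions; volumeInGb is set to 0 here on the assumption that serverless templates don't use a pod volume.

```python
# Hedged sketch: create a serverless template via the GraphQL API.
import requests

API_KEY = "your_runpod_api_key"  # placeholder

query = """
mutation {
  saveTemplate(input: {
    name: "my-serverless-template",
    imageName: "myrepo/myimage:latest",
    dockerArgs: "",
    containerDiskInGb: 5,
    volumeInGb: 0,
    isServerless: true,
    env: [{ key: "EXAMPLE", value: "1" }]
  }) {
    id
    name
  }
}
"""

resp = requests.post(
    "https://api.runpod.io/graphql",
    params={"api_key": API_KEY},
    json={"query": query},
    timeout=30,
)
print(resp.json())
```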

streaming

I'm used to using OpenAI to stream text all in one async request. Why does runpod just spam requests to the /stream endpoint instead of making it one request and feeding data back as it's generated?
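
The /stream route is polling-based by design: the worker's generator handler yields chunks and the client fetches whatever has accumulated so far. A minimal sketch of a streaming handler, assuming a runpod-python version with generator-handler support (the return_aggregate_stream flag is also an assumption to verify):

```python
# Hedged sketch: each yield becomes a chunk served by /stream/{job_id}.
import runpod

def handler(job):
    prefix = job["input"].get("prompt", "")
    for token in (prefix + " streamed response").split():  # stand-in for real generation
        yield {"token": token}

runpod.serverless.start({
    "handler": handler,
    "return_aggregate_stream": True,  # also expose the combined output on the final result
})
```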

Issue with Worker Initiation Error Leading to Persistent "IN_PROGRESS" Job Status

Hi All, While testing the endpoint, I observed that when initiating a job with an empty input in the request body (using {"input": {}} or an empty JSON {}), the worker fails to start successfully. Despite this, the job status continually displays as "IN_PROGRESS". This problem persists without reaching a resolution or a timeout state, necessitating manual intervention to cancel the job. The log output indicates an error message: "Job has missing field(s): input." However, the job status doesn't reflect this error and remains indefinitely in progress....
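
Until that is fixed platform-side, a defensive pattern is to validate the input at the top of the handler and return an error payload so the job ends in a terminal state instead of hanging. Returning a dict with an "error" key is, as I understand the Python SDK, how a handler marks a job as failed; a minimal sketch:

```python
# Hedged sketch: fail fast on missing input instead of leaving the job IN_PROGRESS.
import runpod

def handler(job):
    job_input = job.get("input") or {}
    if not job_input:
        return {"error": "Job has missing field(s): input."}
    # ... normal processing ...
    return {"output": "ok"}

runpod.serverless.start({"handler": handler})
```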