RunPod

We're a community of enthusiasts, engineers, and enterprises, all sharing insights on AI, Machine Learning and GPUs!

Job response not loading

Hi guys, it seems like it's stuck loading for at least 1-2 minutes. Does anyone have any idea what's going on?

All of a Sudden, Error Logs

My serverless endpoint had been working fine up until yesterday. I woke up today and I'm getting these error logs. I'm pretty sure I didn't change anything in my code. Even when I send the request from the RunPod interface, I get the same error logs. Please, I need this fixed ASAP because I have people depending on the endpoint: { "delayTime": 6470, "error": "Error queuing workflow: <urlopen error [Errno 111] Connection refused>",...
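
If this endpoint is a ComfyUI worker, `[Errno 111] Connection refused` usually means the handler tried to queue the workflow before the local ComfyUI server was listening. A hedged check that rules out that race, assuming ComfyUI's default port 8188:

```python
import time
import urllib.request

def wait_for_comfyui(url="http://127.0.0.1:8188/", timeout=120):
    """Poll the local ComfyUI server until it accepts connections."""
    deadline = time.time() + timeout
    while time.time() < deadline:
        try:
            urllib.request.urlopen(url, timeout=5)
            return True          # server is up; safe to queue the workflow
        except OSError:          # URLError subclasses OSError (incl. Errno 111)
            time.sleep(2)
    return False                 # server never came up; surface a clear error
```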

Serverless upscale workflow is resulting in black frames.

Hi, I am attempting to run a simple FLUX upscale request using the serverless service. I built my image with Docker, but it gives me black frames in the output. I am using FLUX dev with FP8 and the standard VAE. Any ideas?

Failed to load docker package.

2025-01-02T05:04:09Z error pulling image: Error response from daemon: Head "https://ghcr.io/v2/ammarft-ai/img-inpaint/manifests/1.31": denied: denied

It was working before...

Serverless SGLang - 128 max token limit problem.

I'm trying to use the subject template. I always have the same problem: the number of tokens in the answer is limited to 128, and I don't know how to change the configuration... I've tried with Llama 3.2 3B and Mistral 7B, and the same problem happens with both. I've tried to set the following environment variables to numbers higher than 128, with no luck...
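
A hedged workaround while the env-var route is unclear: SGLang's per-request token cap is normally set in the sampling parameters, so passing it in the request body may override the 128 default. The `sampling_params` / `max_new_tokens` names below assume the worker forwards them to SGLang unchanged; check the template's README for the exact schema.

```python
import requests

# Placeholders: substitute your own endpoint ID and API key.
url = "https://api.runpod.ai/v2/<ENDPOINT_ID>/runsync"
headers = {"Authorization": "Bearer <RUNPOD_API_KEY>"}
payload = {
    "input": {
        "prompt": "Explain network volumes in one short paragraph.",
        # Assumed pass-through to SGLang; max_new_tokens is SGLang's
        # name for the generation cap.
        "sampling_params": {"max_new_tokens": 1024},
    }
}
resp = requests.post(url, headers=headers, json=payload, timeout=300)
print(resp.json())
```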

Too-large requests to serverless Infinity vector embedding cause errors

I keep running into "400 Bad Request" server errors for this service, and finally discovered that it was because my requests were too large and running into this constraint: https://github.com/runpod-workers/worker-infinity-embedding/blob/acd1a2a81714a14d77eedfe177231e27b18a48bd/src/utils.py#L14
```python
INPUT_STRING = StringConstraints(max_length=8192 * 15, strip_whitespace=True)
ITEMS_LIMIT = {
    "min_length": 1,...
```
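
One way to stay under these caps is to batch inputs client-side before calling the endpoint. A minimal sketch: the character limit mirrors `INPUT_STRING` above, while the batch size of 100 is an assumed value for `ITEMS_LIMIT` (the excerpt is truncated), so adjust it to whatever the repo actually pins.

```python
MAX_CHARS = 8192 * 15   # mirrors INPUT_STRING in src/utils.py
MAX_ITEMS = 100         # assumed item cap; verify against ITEMS_LIMIT

def chunk_texts(texts, max_items=MAX_ITEMS, max_chars=MAX_CHARS):
    """Yield batches that respect both the per-string and per-request limits."""
    batch = []
    for text in texts:
        if len(text) > max_chars:
            raise ValueError(f"single input exceeds {max_chars} characters")
        batch.append(text)
        if len(batch) == max_items:
            yield batch
            batch = []
    if batch:
        yield batch

# Usage: send one embedding request per batch instead of one giant request.
# for batch in chunk_texts(all_texts): send_embedding_request(batch)
```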

Cannot send request to one endpoint

I have deployed 4 endpoints on RunPod, each doing different work. I can send requests to three of my endpoints, but for one I can't: it gives me a timeout error, and the job status doesn't even change in the RunPod UI. I have tried deleting the endpoint and deploying it again, but the problem persists.

Settings to reduce delay time using SGLang for 4-bit quantized models?

I'm deploying the 4-bit AWQ quantized model casperhansen/llama-3.3-70b-instruct-awq. The delay time for parallel requests increases exponentially when using the SGLang template. What settings do I need to use to make sure the delay time is manageable?...

How to make API calls to the endpoints with a system prompt?

Hi everyone, I'm new to using RunPod's serverless endpoints for LLM calls. So far, I've only worked with OpenAI APIs, and we've built a product around GPT-4 models. Now we're planning to transition to open-source alternatives. I've successfully created serverless endpoints on RunPod for models like Qwen 14B Instruct and Llama 8B Instruct. I can get outputs from these models using both the RunPod SDK and the UI with JSON input like this: ```{...
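
Assuming these endpoints use the vLLM worker template, which exposes an OpenAI-compatible route, a system prompt can be passed exactly as with GPT-4, just with a different `base_url`. A sketch; the `/openai/v1` path applies to the vLLM template, and the model name is a placeholder to verify against the endpoint's configuration.

```python
from openai import OpenAI

# Point the standard OpenAI client at the RunPod endpoint's
# OpenAI-compatible route (vLLM worker template).
client = OpenAI(
    api_key="<RUNPOD_API_KEY>",
    base_url="https://api.runpod.ai/v2/<ENDPOINT_ID>/openai/v1",
)
resp = client.chat.completions.create(
    model="<MODEL_NAME>",  # e.g. the HF model id the endpoint was deployed with
    messages=[
        {"role": "system", "content": "You are a concise assistant."},
        {"role": "user", "content": "Summarize what a serverless endpoint is."},
    ],
)
print(resp.choices[0].message.content)
```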

Serverless GPUs unavailable

The serverless GPUs that I'm using are always unavailable. Are there any plans to make them more available in the near future, or is there any other solution?

Where to find the gateway-level URL for a serverless app

Hi folks, I have an app running on serverless infra and I want to use its HTTP endpoint, but I am not able to find a static host that I can use to access the app. When I go into the Workers tab under the endpoint, I can see an option to open an HTTP-based session, but that seems to be associated with the worker, not with the endpoint itself. I tried accessing it by endpoint ID as well, but it did not work. Would anyone please point me in the right direction?...

Attaching a network volume to a path inside a pod

Hey guys, I have an app running inside a container and I want a path from my network drive to be mounted as a path inside the container. For instance, I have the path /app/models inside my container. I want to keep some models on my network drive and have them used by the pod as /app/models. I'm not finding any solid documentation around this...
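
One common workaround: pods typically mount the network volume at /workspace (serverless workers see it at /runpod-volume), so a startup script can symlink the expected container path onto the volume. A hedged sketch; the /workspace/models layout is an assumption, so point it at wherever the models actually live on the volume.

```python
import os
import shutil

VOLUME_MODELS = "/workspace/models"  # assumed layout on the network volume
APP_MODELS = "/app/models"           # path the app expects inside the container

os.makedirs(VOLUME_MODELS, exist_ok=True)
if os.path.islink(APP_MODELS):
    os.unlink(APP_MODELS)            # drop a stale link from a previous run
elif os.path.isdir(APP_MODELS):
    shutil.rmtree(APP_MODELS)        # drop the directory baked into the image
os.symlink(VOLUME_MODELS, APP_MODELS)  # app now reads models from the volume
```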

Running a worker automatically once the Docker image has been pulled

Hello, I'm using a serverless RunPod worker to run a ComfyUI workflow. This is the GitHub repo: https://github.com/blib-la/runpod-worker-comfy What I'd like to implement is a warmup workflow that is initiated once the image has been successfully pulled. This will ensure my ComfyUI models are cached for future runs and save on cold start times...
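
One pattern that approximates this without touching the repo: anything at module scope in the handler file runs once per worker cold start, before the first job is pulled, so a warmup call there can pre-load and cache models. A sketch, with `warmup()` as a hypothetical stand-in for running a small ComfyUI workflow:

```python
import runpod

def warmup():
    """Hypothetical helper: execute a tiny workflow / load checkpoints so
    model weights are cached before the first real request arrives."""
    pass

warmup()  # module scope: runs once per worker cold start, before any job

def handler(job):
    # the worker's normal job handling goes here
    return {"warmed": True, "input": job["input"]}

runpod.serverless.start({"handler": handler})
```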

Mail provider

I am wondering whether I can run my email provider on this platform.

Build can't find requirements.txt

For some reason, when trying to create an endpoint, the build can't find the requirements file when running COPY requirements.txt /app/ in Docker, although it is in the same directory as the Dockerfile itself. It only happens when using RunPod; it works when using Docker locally. This is my first time doing something with Docker, so please don't roast me xD

Efficient serverless release with image caching

My current deploy system is to push an updated Docker image to Docker Hub (with the git hash as the tag), run "new release", and pass the updated Docker Hub image tag. However, it does a full download of the new image (20-30 GB) instead of using the cache, so every small update is currently very slow to roll out. Is there a more efficient way of doing deploys? Or of letting it know that a Docker image has been updated in place during dev? Thank you!

Hugging Face Space on Serverless: how to get the Gradio API string, which is the same as the worker ID?

I deployed a Hugging Face Space that uses Gradio. If I have the worker ID, then I can usually connect to the worker like https://${workerID}-proxy.runpod.net/ How can I either get the available worker IDs or forward my request from the serverless endpoint to the Gradio API, which uses something like: ```...
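
There doesn't appear to be a documented way to enumerate serverless worker IDs, so one hedged alternative is to forward requests from inside the handler to the Gradio app running on localhost in the same worker, using `gradio_client`. Port 7860 and `api_name="/predict"` are assumptions (as is the Gradio app already listening when the worker boots); `gradio.view_api()` lists the Space's actual named endpoints.

```python
import runpod
from gradio_client import Client  # pip install gradio_client

# Connect to the Gradio app inside this same worker; assumes it is
# already up on its default port when the handler file is imported.
gradio = Client("http://127.0.0.1:7860/")

def handler(job):
    prompt = job["input"]["prompt"]
    # api_name is an assumption; use gradio.view_api() to find the real one
    result = gradio.predict(prompt, api_name="/predict")
    return {"output": result}

runpod.serverless.start({"handler": handler})
```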

Has the issue of slow model loading from network volumes been resolved?

1. Previously, when using serverless, loading the model from a network volume was very slow. Has that issue been resolved?
2. Does a pod have the same issue when using a network volume?...

Environment Variables Crossing Serverless Endpoints

I have 3 endpoints that use the same serverless template, and I update the Docker image and env vars as needed. My issue is that the environment variables seem to sync between the 3 endpoints, i.e., I have a value X that should be A, B, and C across the three endpoints, but after setting them I can see the endpoints all have X=C....

A charge of 50 USD failed because I don't have enough money. My balance is 99 USD. Do I need to recharge?

I currently have a balance of $99 in my RunPod account and am using a serverless endpoint. Recently, an attempted charge of $50 failed due to insufficient funds. I will automatically get some money in a couple of days, but what happens if RunPod tries to take the $50 again and it fails? Will it stop the endpoint even though I have $99 in my RunPod balance?