Runpod

We're a community of enthusiasts, engineers, and enterprises, all sharing insights on AI, Machine Learning and GPUs!

Error response from daemon: Container is not paused.

Hello team, after deploying a new Docker image on a serverless endpoint I am getting the below error in my system log: 2024-07-30T11:56:27Z error starting: Error response from daemon: Container 2a638b70551885c464f48892d2d0fc9eed7eb590fbda42b33841d7e84b23b307 is not paused. Can someone please help me with this?...

The official a1111 worker fails to build

Attempts to build the main branch of https://github.com/runpod-workers/worker-a1111 fail. civit.ai no longer appears to allow unauthenticated model downloads, returning a 401. This was quite easy to fix. More importantly, the dependency chain appears to have regressed. Currently, building the repository as cloned with `docker build --platform linux/amd64 -t test .` results in...
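For the 401 part, a minimal sketch of the authenticated-download fix, assuming a Civitai API token in the environment (the `?token=` query parameter is Civitai's API-key mechanism; the model ID below is illustrative, not the one from the worker):

```python
# Hypothetical replacement for the unauthenticated model download.
# Assumes CIVITAI_TOKEN is set; the model ID is illustrative only.
import os
import requests

MODEL_URL = "https://civitai.com/api/download/models/128713"

resp = requests.get(
    MODEL_URL,
    params={"token": os.environ["CIVITAI_TOKEN"]},
    stream=True,
    timeout=600,
)
resp.raise_for_status()
with open("model.safetensors", "wb") as f:
    for chunk in resp.iter_content(chunk_size=1 << 20):
        f.write(chunk)
```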

RuntimeError: Found no NVIDIA driver on your system

I've been pulling my hair out over this for 2 weeks now. I'm building a container with ComfyUI and the IDM-VTON custom node, and whenever I run it (serverless and on a pod) it gives me the "Found no NVIDIA driver" message. This container runs without a problem on my home 4090, and I'm using a 4090 on Runpod too. When I run the same container without the IDM-VTON custom node, it runs fine. The container is the following: vazazon/comfyuivenv:dev. What am I missing?...
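One common culprit with custom nodes is their requirements install pulling in a CPU-only torch wheel over the CUDA build, which produces exactly this error. A quick diagnostic to run inside the failing container (a debugging aid, not a fix):

```python
# Prints the torch build, the CUDA version it was compiled against,
# and whether a driver/GPU is actually visible at runtime.
import torch

print("torch:", torch.__version__)
print("built for CUDA:", torch.version.cuda)      # None => CPU-only wheel
print("driver visible:", torch.cuda.is_available())
print("device count:", torch.cuda.device_count())
```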

Is the vLLM worker updated for Llama 3.1 yet?

If not, is anyone aware of a good serverless container that does support it?

How to create a network volume in the EU-NL and EU-SE regions?

How can I create a network volume in EU-NL-1/2 and EU-SE-1/2?

Getting timeout with network volume

I want to deploy the Llama 3.1 70B model on serverless, but the cold start takes too long (1-3 minutes). For this reason, I tried using a network volume, but then the model cannot be downloaded the first time: I keep getting a timeout after waiting 6-7 minutes. In short, the model cannot be downloaded from the HuggingFace servers and transferred to the network volume. I am using vLLM. Thanks for your help.
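One workaround, sketched under assumptions: pre-populate the network volume from a temporary pod (which has no request timeout) rather than letting the serverless worker download on first start. The repo ID and target path below are illustrative:

```python
# Run once on a pod with the network volume mounted (commonly /workspace).
# Assumes HF_TOKEN is set for the gated Llama repo.
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="meta-llama/Meta-Llama-3.1-70B-Instruct",  # assumed model
    local_dir="/workspace/models/llama-3.1-70b",
)
```

vLLM accepts a local path in place of a Hub ID, so nothing needs downloading at cold start.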

Running into this error while running IDM-VTON on Runpod

```
packages/huggingface_hub/utils/_validators.py", line 160, in validate_repo_id
2024-07-27T09:39:56.116114493Z     raise HFValidationError(
2024-07-27T09:39:56.116118532Z huggingface_hub.errors.HFValidationError: Repo id must use alphanumeric chars or '-', '_', '.', '--' and '..' are forbidden, '-' and '.' cannot start or end the name, max length is 96: './IDM-VTON'.
```
...
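The message says a filesystem path ('./IDM-VTON') is being passed where a Hub repo id is expected. In most loaders a string is only treated as a repo id when it does not exist as a local directory, so this often just means the expected ./IDM-VTON folder is missing inside the container. A minimal sketch of the distinction using huggingface_hub's own validator (the "yisol/IDM-VTON" id is an assumption about which checkpoint is meant):

```python
from huggingface_hub.errors import HFValidationError
from huggingface_hub.utils import validate_repo_id

validate_repo_id("yisol/IDM-VTON")   # a valid "user/name" repo id: passes

try:
    validate_repo_id("./IDM-VTON")   # a relative path, not a repo id: raises
except HFValidationError as e:
    print(e)
```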

Help Reducing Cold Start

Hi, I've been working with RunPod for a couple of months and it has been great. I know the image only downloads once, and I know there are two options for optimization: embedding the model in the Docker image, or using a network volume (with less flexibility, since it will be located in only one region). I'm embedding my model in the Docker image, plus executing scripts to cache the loading, config, and downloads. I'm using the whisper-large-v3 model with my own code, since it has a lot of optimizations. The cold start without any FlashBoot is between 15-45 seconds. My goal is to reduce this time as much as possible without depending on a high request volume. ...
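One pattern that reliably helps, as a hedged sketch: do all model loading at module import time rather than inside the handler, so only the very first request on a fresh worker pays the load cost and warm workers reuse the model. faster-whisper here is an assumption standing in for the poster's own optimized code:

```python
# Model is created at import time, outside the handler, so warm workers
# skip loading entirely. Swap WhisperModel for your own loader.
import runpod
from faster_whisper import WhisperModel

MODEL = WhisperModel("large-v3", device="cuda", compute_type="float16")

def handler(job):
    audio_path = job["input"]["audio"]  # assumed input schema
    segments, _info = MODEL.transcribe(audio_path)
    return {"text": " ".join(s.text for s in segments)}

runpod.serverless.start({"handler": handler})
```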

Is privileged mode possible?

I have an application that requires a kernel module to be loaded. Loading a kernel module from inside a container requires privileged mode on the host. Is there any way to get privileged mode enabled for my images so that I can load the kernel module?

Is there an easy way to take a python flask application as a serverless api hosting on Runpod??

I'm looking for a way to host a Flask app, which worked great using ngrok on my local machine, as serverless hosting (on-demand, by API call) without having to change very much (unlike AWS, which requires taking things apart and rebuilding them as Lambda functions). Is there a way to do this easily on Runpod?
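One low-change approach, as a sketch rather than an official recipe: keep the Flask app untouched and have a thin RunPod handler forward each job to it through Flask's built-in test client. The module name and input shape are assumptions:

```python
# Thin adapter: each serverless job becomes one request against the
# unchanged Flask app via its test client (no ngrok or open port needed).
import runpod
from myapp import app  # assumed: your existing Flask application object

def handler(job):
    inp = job["input"]
    with app.test_client() as client:
        resp = client.open(
            inp.get("path", "/"),
            method=inp.get("method", "POST"),
            json=inp.get("json"),
        )
        return {"status": resp.status_code, "body": resp.get_json(silent=True)}

runpod.serverless.start({"handler": handler})
```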

Llama 3.1 via Ollama

You can now use the tutorial on running Ollama on serverless environments (https://docs.runpod.io/tutorials/serverless/cpu/run-ollama-inference) in combination with Llama 3.1. We have tested this with Llama 3.1 8B, using a network volume and a 24 GB GPU PRO. Please let us know if this setup also works with other weights and GPUs....

Slow docker image download from GCP

Hi, I have been experimenting with Runpod recently. I tried to deploy a Whisper image to Runpod from my company's GCP Docker repo and found it pretty slow: it took almost 10 minutes to download an 11 GB image. I understand the image is huge, but I wonder whether there is anything I can do to speed up the process, for example the repo location (currently in Asia, as my company is in Asia).

Guide to deploy Llama 405B on Serverless?

Hi, can any experts on Serverless advise on how to deploy Llama 405B on Serverless?

How does the vLLM template provide an OAI route?

Hi, so the vLLM template provides an additional OpenAI-compatible route. As I'm currently looking into making my own serverless template for exl2, I wondered how this was achieved, since I currently don't see any description in the documentation about how to set it up, and looking into the source doesn't seem to provide much more insight. If I check for job.get("openai_route"), is that handled automatically, or how would I go about adding it to the handler (or elsewhere)?
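From a quick read, the OpenAI route does not appear to be platform magic: requests to the endpoint's /openai/... URL seem to arrive as ordinary jobs carrying extra keys that the handler dispatches on. A sketch of that dispatch, where the key names and their placement inside job["input"] are assumptions modeled on the vLLM worker, not a documented contract:

```python
# Hypothetical dispatch between the OpenAI-compatible path and native input.
import runpod

def run_openai(route, body):   # hypothetical exl2 logic
    ...

def run_native(inp):           # hypothetical exl2 logic
    ...

def handler(job):
    inp = job["input"]
    openai_route = inp.get("openai_route")      # e.g. "/v1/chat/completions"
    if openai_route:
        return run_openai(openai_route, inp.get("openai_input"))
    return run_native(inp)

runpod.serverless.start({"handler": handler})
```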

vllm

Any plans to update the vLLM worker image? I would like to test Phi-3 and Llama 3.1; currently both are unsupported with the current image. (serverless)

Serverless worker failing - how do I stop it

I have a couple of questions. I use Runpod Serverless to power a ComfyUI API. It works well most of the time, but today I noticed one of my serverless workers kept failing. The errors only occurred with one of the workers; the others performed fine. Why would this be, and is there a way of terminating specific workers? Also, how can I get notified if one of them is playing up? Thanks!...

Running Auto1111 getting - error creating container: cant create container; net

But it clears and eventually does run the item in the queue. I have network storage set up. Open to paid consultants on this. DM me if interested.

Why "CUDA out of memory" Today ? Same image to generate portrait, yesterday is ok , today in not.

"delayTime": 133684, "error": "CUDA out of memory. Tried to allocate 1.50 GiB (GPU 0; 23.68 GiB total capacity; 18.84 GiB already allocated; 1.47 GiB free; 20.46 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF", "executionTime": 45263, "id": "ae1e4066-e2b7-43c1-8f37-3525bda03893-e1",...

GPU memory issue

I have a question. Is there anyone from Runpod who can DM me so we can talk about it and dive into it? Thanks!

runpod IP for whitelisting for cloud storage

I have a Cloudinary account, and from Runpod I want to download images from Cloudinary. I also want this to be secure, so which IP(s) should I whitelist so that my Cloudinary account only accepts requests coming from my Runpod serverless workers?