Runpod

We're a community of enthusiasts, engineers, and enterprises, all sharing insights on AI, Machine Learning and GPUs!

Accept a new task while the worker continues to process the old one

Does anyone know whether I can accept a new task in a worker while it continues to process the old one (for example, uploading to S3)? I discovered that my GPU task takes 0.04 seconds, but uploading to S3 takes more than 1 second, and that is very unprofitable...

I want to use A100 with savings plans!

Are there any savings plans for the A100 GPU? I would like to pre-pay for a month, but can't find the option. Or is A100 not eligible for Savings Plans? I would appreciate it if you could let me know....

Custom Template Taking Hours To Initialize

I made a custom template with the docker template https://huggingface.co/spaces/rwitz/go-bruins-v2 and it is taking hours to initialize on serverless.

How to retire a worker and retry its job?

We're noticing that every so often, a worker gets corrupted, and doesn't produce correct output. It's easy enough for us to detect it inside the handler when it happens. Is there a built-in way to tell runpod the job failed, the worker is bad, and it should be refreshed and requeued? Or should I do this manually with "refresh_worker" and use the API to requeue?
Solution:
Ohhh, ashelyk is right, you can just return true to do it selectively. I had only read how to do it on start, not on return: Return refresh_worker=True as a top-level dictionary key in the handler return. This can be used selectively to refresh the worker based on the job return. Example:...
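
A minimal sketch of how a handler might use this, for illustration only. The `refresh_worker` top-level key comes from the answer above; `looks_corrupted`, `run_model`, and the `output` and `error` key names are hypothetical stand-ins, not confirmed parts of the Runpod SDK. Requeueing the job is not covered here.

```python
def looks_corrupted(output):
    """Hypothetical health check; replace with your own detection logic."""
    return output is None


def run_model(job_input):
    """Hypothetical stand-in for the real inference call."""
    return job_input.get("prompt")


def handler(job):
    output = run_model(job["input"])
    if looks_corrupted(output):
        # refresh_worker is a top-level key of the returned dictionary,
        # as described in the answer above. The "error" key is an assumption.
        return {"refresh_worker": True, "error": "worker produced corrupted output"}
    return {"output": output}


# In a real worker this function would be registered with the Runpod SDK,
# for example via runpod.serverless.start({"handler": handler}).
```

Because the flag is only set on the failure path, healthy jobs return normally and the worker stays warm.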

Best practices

Hey 👋🏻 I have a few questions regarding Runpod serverless, specifically related to image generation tasks (like Stable Diffusion). 1. Storage - NVME vs Network Volume: I've read in some posts that storing models directly in the docker container is more cost and speed-efficient compared to using a network volume. For tasks involving various Stable Diffusion templates, does this mean all templates must be stored within the Docker container?...

Problem with venv

Hi, this is my handler.py file:
```
#!/usr/bin/env python3
""" Example handler file. """
...
```
Solution:
Finally works, thanks for all the help! The issue was the base image that was different.

Experiencing huge execution time on Serverless

We are currently experiencing huge execution times, and jobs are actually not being processed on Serverless. Is this something Runpod is working on right now?...

Mount gpu in container

Hello, I'd like to use Runpod to perform Blender render tasks. I wish to use this Docker image: https://hub.docker.com/r/linuxserver/blender The documentation says this about hardware acceleration: this container is capable of supporting accelerated rendering with /dev/dri mounted in For Nvidia, it requires the installation of Nvidia container tools...

"Initializing" State Duration

Hi everyone, I've recently started exploring serverless but encountered an issue. When I try to deploy, my deployments remain in an "Initializing" state for at least half an hour. Is this a normal part of the process, or could it indicate a problem with my Docker image? For context, I've successfully launched a different image before, and Quick Deploy options seem to launch instantly. Any insights or advice would be greatly appreciated! Thanks!...

Issue with Dependencies Not Being Found in Serverless Endpoint

I am encountering an issue with a network volume I created. First, I created a network volume and used it to set up a pod. During this setup, I modified the network volume: in the directory where the network volume was mounted, I created and activated a virtual environment (venv), then installed various dependencies in this environment. Next, I created a serverless endpoint that uses this network volume. As far as I understand, this network volume is mounted on the directory runpod-volume. I activate the venv located in this directory and then start a program that is also stored there. However, I soon encounter a problem: the dependencies that I had installed are not being found....
Solution:
I think it is MUCH better to just bake the dependencies into the Dockerfile and activate it that way. Also, without seeing your Dockerfile it is hard...

Running script / ADetailer

Anyone got a way to install and run ADetailer?
Solution:
CMIIW: 1. git clone into extensions/adetailer 2. python3 -m install (what does it do?)...

Progress updates implementation for Automatic1111 / ComfyUI

Hi. I can't really wrap my head around how progress updates would work in a real-world implementation. Ideally I would get detailed feedback on where my job is, potentially even an intermediate image. The documentation on it is very short. There was a thread with some code examples: https://discord.com/channels/912829806415085598/1050627382110855198/1166028785234235444 ...

How much RAM do we have per Serverless endpoint?

I am curious how RAM usage works. On fly.io you can allocate RAM / CPU, but here on Runpod we only choose the GPU and whatever VRAM that GPU has. If I am doing something memory-intensive like video / audio processing, will it just crash at some point because of it?...
Solution:
RAM is at least the same as VRAM; most of the time it's 25% higher than VRAM.

Import PIL (pillow image library) in rp_handler.py

Hey everyone, I am trying to use a tool from the PIL module to process my generated image before sending it back in the handler. Unfortunately I get a "Module not found" error, although the library is installed in my network volume and is listed in the handler's requirements.txt. When I connect to my handler via the web terminal and start Python on the command line, I also cannot import PIL. So is it necessary to manually link some Python libraries from my network volume to my handler, or what is the best practice here?...
Solution:
Whatever is in your network volume isn't automatically available to Python on your serverless worker. This is because when the Docker container initializes and gets Python, it sets its package location to something like /somewhere/packages, not runpod-volume/package....
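
One possible workaround, sketched under assumptions: add the volume's package directory to `sys.path` before importing from it. The path in `VOLUME_SITE_PACKAGES` is hypothetical (it depends on where the venv was created and on its Python version), and `add_volume_packages` is an illustrative helper, not part of any Runpod API. Packages with compiled extensions may still fail if the worker's Python version differs from the one used to install them.

```python
import os
import sys


def add_volume_packages(site_packages_dir):
    """Put a package directory on sys.path if it exists and is not already there."""
    if os.path.isdir(site_packages_dir) and site_packages_dir not in sys.path:
        sys.path.insert(0, site_packages_dir)
        return True
    return False


# Hypothetical location of a venv created on the network volume.
VOLUME_SITE_PACKAGES = "/runpod-volume/venv/lib/python3.10/site-packages"

if add_volume_packages(VOLUME_SITE_PACKAGES):
    print("Using packages from", VOLUME_SITE_PACKAGES)
else:
    print("Volume packages not found; relying on packages baked into the image")
```

Installing the dependencies in the Docker image itself avoids this path handling entirely.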

Possible error in docs: Status of a job with python code

In the docs, there is this command to retrieve the status of a submitted job:
curl https://api.runpod.ai/v2/<your-api-id>/status/<your-status-id>
And in the docs, this should be the equivalent Python code, which requires installing runpod-python beforehand with `pip install runpod-python`...
Solution:
My apologies
```python
import runpod
runpod.api_key = "Your Key"
...
```
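
Independently of the SDK, the curl command from the question can be mirrored with the Python standard library. This is a sketch, not the documented client: `build_status_request` and `fetch_status` are illustrative names, the two IDs correspond to the `<your-api-id>` and `<your-status-id>` placeholders, and passing the API key as a Bearer token is an assumption.

```python
import json
import urllib.request

API_BASE = "https://api.runpod.ai/v2"


def build_status_request(endpoint_id, job_id, api_key):
    """Build (but do not send) the GET request that the curl command performs."""
    url = f"{API_BASE}/{endpoint_id}/status/{job_id}"
    # Assumption: the API key is passed as a Bearer token.
    headers = {"Authorization": f"Bearer {api_key}"}
    return urllib.request.Request(url, headers=headers, method="GET")


def fetch_status(endpoint_id, job_id, api_key):
    """Send the request and decode the JSON body."""
    request = build_status_request(endpoint_id, job_id, api_key)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))
```

Splitting request construction from sending makes it easy to inspect the URL and headers before making a network call.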

Image is generated successfully, but cannot be found for sending back

Hey everyone, I call my ComfyUI backend and receive this message in the logs:
{"requestId": null, "message": "Images generated successfully for prompt: 5cf0fe28-0abd-4eb1-8d6d-f2bad258baa5", "level": "INFO"}
But as an overall response I get:...
Solution:
I fixed it, there was a Preview Image in the Flow. I removed it and now it works.

Serverless Endpoint Streaming

I'm currently working with Llama.cpp for my inference and have set up my handler.py file to be similar to this guide: https://docs.runpod.io/docs/handler-generator My input and handler file look like this: ...

How to reduce cold start & execution time?

Hi, I have a serverless endpoint with about a 70 sec cold start and 50 sec execution time. I was trying to change the GPUs and something happened: it started to work really fast, with about 500 ms cold starts and 10 sec execution time, and the output was fine. How did that happen, do you guys have any idea? How can I achieve that again?...

How to edit/view handler from a cog on replicate?

I was following this guide: https://blog.runpod.io/replicate-cog-migration/ And it says: "Depending on the specifics of your application, you may need to modify the handler file before building." But how can I do this? Where do I see the handler?...

General advice on pricing and the use of serverless

Hello, I am not sure how it works exactly, so I have a few questions. ...