RunPod · ACiDGRiM

Serverless capability check
I want to add RunPod to a tier of load-balanced LLM models behind an app like openrouter.ai, with the routing decision made in our own infrastructure. When I invoke a serverless instance from my app and a task completes, how am I billed for idle time if the container unloads the model from GPU memory? In other words, I want to reduce costs and increase performance by only needing to reload the model after an idle timeout, paying only for the small app footprint in storage/memory.
Solution:
ashleyk · 44d ago
You are charged for the entire time the container is running, including cold start time, execution time, and idle timeout.
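The billing model described in the answer is simple arithmetic, which makes the idle timeout the main cost lever: you pay for it on every scale-down, and a cold worker also pays for model load time. A minimal sketch (the helper name and all numbers are illustrative, not RunPod's API or pricing):

```python
# Sketch of the billing model: a serverless worker bills for
# cold start + execution + idle timeout, for every second it runs.

def billed_seconds(cold_start_s: int, execution_s: int,
                   idle_timeout_s: int, warm: bool) -> int:
    """Total billed seconds for one request on a serverless worker."""
    if warm:
        # A warm worker skips the cold start (model already in GPU memory).
        return execution_s + idle_timeout_s
    return cold_start_s + execution_s + idle_timeout_s

# Example: 30 s cold start (model load), 5 s inference, 10 s idle timeout.
cold = billed_seconds(30, 5, 10, warm=False)  # 45 s billed
warm = billed_seconds(30, 5, 10, warm=True)   # 15 s billed
print(cold, warm)
```

So letting the worker scale to zero and eat a cold start on the next request only wins when requests are sparse enough that the saved idle seconds outweigh the repeated model-load time.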
ACiDGRiM · 44d ago
I thought so. Do the containers have the Docker capabilities needed to create a WireGuard interface?
ashleyk · 44d ago
You can't access the underlying Docker stuff on the host machine, if that's what you're asking.
ACiDGRiM · 44d ago
I don't mean the Docker socket. I mean I want to create a VPN tunnel to my AWS tenant, rather than dealing with PKI in the container.
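Whether a container can create a WireGuard interface comes down to whether it was granted CAP_NET_ADMIN, which an unprivileged container normally lacks. A minimal sketch for checking from inside the container (the bit position follows the standard Linux capability numbering; this only inspects /proc, it does not create an interface):

```python
# Creating a WireGuard interface (`ip link add wg0 type wireguard`)
# requires CAP_NET_ADMIN. This reads the effective capability bitmask
# of the current process from /proc to see whether that bit is set.
CAP_NET_ADMIN = 12  # bit position per linux/capability.h

def has_cap_net_admin(status_path: str = "/proc/self/status") -> bool:
    with open(status_path) as f:
        for line in f:
            if line.startswith("CapEff:"):
                cap_eff = int(line.split()[1], 16)  # hex bitmask
                return bool((cap_eff >> CAP_NET_ADMIN) & 1)
    return False

if __name__ == "__main__":
    print("CAP_NET_ADMIN:", has_cap_net_admin())
```

If the capability is absent, even a userspace implementation such as wireguard-go still needs access to a /dev/net/tun device, so an in-container tunnel generally isn't possible without host cooperation; terminating the VPN outside the worker (for example on a proxy you control) is the usual workaround.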