Recommended DC and Container Size Limits/Costs
Hello, I’m new to deploying web apps and currently using a persistent network drive along with serverless containers to generate images. My app requires at least 24GB of RAM, and I’ve encountered some challenges in my current region (EU-RO-1): there aren’t many A100 or H100 GPUs available, and most of the 4090 GPUs are throttled.
Recommended Data Centers: Are there specific geographic data centers you’d recommend for better GPU availability and performance?
Performance and Costs: Since my usage isn’t constant, the containers often ‘wake up’ from idle or after being used by someone else. When this happens, the models (ComfyUI) have to load, leading to generation times ranging from 20 seconds to 3-4 minutes. I assume this delay occurs because the models are loading from a network-mounted drive rather than locally.
If I preload the models onto the containers to avoid this transfer, will it increase my container costs?
Where can I find information about container size limits and associated pricing?
Additional Resources: Could you recommend sources to learn more about best practices, cost optimization, and efficient use of serverless containers for workloads like mine?
When you create a network storage volume, you can check which GPUs are available in which DC.
Yeah, it seems like H100s generally aren't available; they come and go. I will stick with EU-RO-1.
I appreciate you taking the time to respond! 🔥🤘🏆
It may or may not increase your container cost; if it's running (even just for moving files) it will be charged. But it's worth trying if your worker is active, because if your worker isn't always running, when it goes back to idle (off) the files will be lost and you'll have to move them again.
What container size limits? I don't think there are any, but if you have a larger Docker image (I don't know how big before it gets slow), it will surely take longer to load.
Oh, or do you mean having the model inside the image? Yeah, sure, people say it's faster as long as your models aren't too many or too huge.
Yeah, I'm thinking it's worth uploading my models (Flux1 and Shuttle3.1) once into the image and letting them load when the containers initialize. That way, when they get a job, loading is quick. I can see the large 2-minute delay happens when ComfyUI loads the model; it's currently on the mounted drive, and I think even though it's in the same DC, moving 23GB over the network is just too slow.
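For anyone curious, baking the models into the image could look roughly like this. This is a hedged sketch: the base image name, file names, and paths are placeholders, not the actual Flux/Shuttle locations or an official RunPod base image.

```dockerfile
# Hypothetical sketch: bake checkpoints into the image at build time
# so workers never pull 23GB from the network volume per cold start.
# Base image name and paths are placeholders -- adjust to your setup.
FROM runpod/worker-comfyui:latest

# Copy models from the build context into ComfyUI's checkpoint folder.
# (Alternatively, RUN wget/curl at build time to fetch them from storage.)
COPY models/flux1.safetensors      /comfyui/models/checkpoints/flux1.safetensors
COPY models/shuttle3.1.safetensors /comfyui/models/checkpoints/shuttle3.1.safetensors
```

The trade-off: a much larger image to build and push once, in exchange for skipping the per-worker network transfer.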
Yeah, so that is exactly what I am not sure about: is there a limit, or a cost tied to the size of the containers? I could not find anywhere to read about that.
You got it exactly; that's what I meant 👍
Any experience with that?
i don't mind having 100GB image, i only upload it once and let them deploy it.
I don't know if Runpod has a guide for best practices, but try checking the blog. Maybe use depot.dev (optional) to optimize your image, watch some YouTube videos on container building and some blogs on optimizing Docker images, and finally check the Runpod docs on loading models properly if you control the code (in this case, you want to launch ComfyUI before you call the serverless.start() function in your handler Python file).
Yeah, try it; if it's too slow, then maybe don't use that, or try splitting it into different images (maybe a bit difficult).
I do have some experience with putting a model inside the image, but mine wasn't that big. It works well.
Sure, I will try and report back 🙂 and it will be the first time I am useful to others on a discord channel haha 🙂
yeah, people are already complaining that sometimes it's 20 seconds and sometimes 4 minutes 🙂
Yup I'd like to hear that
So I need to ensure it is faster. Also, do you know what the 'always active' option on serverless is? Is that like a pod: always on, always charging?
seems good in terms of performance, but might not be smart to do at the start since my demand is still very low
not many people on the app yet.
I actually struggled with this last time too: long loading times with ComfyUI.
It was not only the model but the extensions too, but I halted my development, so I haven't looked any further yet.
Yes, it's always running, so the model you currently have loaded will still be there.
Yup agreed
I think since the data center is most likely an outsourced one, we can't trust that it's a real LAN with 1Gb speeds etc. So even though it's local, there's a 2-3 minute transfer each time, which seems like overkill.
Yeah, so this will be great for later; this way they'll be reserved for me too, and I can grab all the H100s I need over time.
I'd suggest, if you have multiple models, making sure they aren't unloaded, to keep it fast; you can use a higher-VRAM GPU so it accommodates them all.
I've never tried this, but it seems like a great idea to not unload the model every time.
FlashBoot on serverless actually seems to reduce cold starts by keeping the model warm on the GPU so it loads faster; that's why something like this can be used.
What do you mean?
I only use one model at a time, and only have two in total. The issue is that even if I don't unload it, when the container moves to idle the model is still there, but after 2 minutes or so the container refreshes or something. Not sure exactly how it works, but I noticed that if I queue lots of jobs it works faster since it doesn't unload, whereas if I wait 5 minutes between requests the container worker 'forgets' and needs to reload.
am i missing something?
Network storage on a different data center you're implying?
well, logically it is, but it could be a VLAN; you never know with these things. I'm originally a TCP networking and routing engineer, been around many data centers.
Yeah sometimes if you have a lot of requests the "flashboot" keeps your model loaded
exactly. I wonder if the FlashBoot reset time can be adjusted; I recall seeing something about it.
So when it "refreshes" it unloads.
You cannot, hahah, but the more requests you have (I'm guessing), the longer it'll keep it.
Or the more time your worker is active.
yes, can't set that.. unless I hack into their core system and find that parameter hahah
That execution timeout limits how long a job can stay "running".
yeah, that's good, since a generation can get stuck and charge me an arm and a leg :)
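As an illustration of why that timeout matters (this is not the RunPod setting itself, just a generic belt-and-braces pattern; `generate()` is a placeholder for the real ComfyUI call):

```python
# Generic safety-timeout pattern: cap how long a single generation may run,
# so a stuck job fails fast instead of billing indefinitely.
from concurrent.futures import ThreadPoolExecutor, TimeoutError

def generate(prompt):
    # Placeholder for the real image-generation call.
    return f"image for {prompt}"

def generate_with_timeout(prompt, timeout_s=120):
    with ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(generate, prompt)
        try:
            return future.result(timeout=timeout_s)
        except TimeoutError:
            # Caveat: the worker thread itself keeps running to completion;
            # this only stops the caller from waiting on it forever.
            return None
```

The platform-level execution timeout plays the same role one layer up: it hard-stops the billed "running" time for a job.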
I am still new to this, and I aim to learn every tiny bit of it; it's the only way to get the most out of it. Serverless is brilliant! It allows a business to grow 🙂 A normal 4090 in the cloud is like 2k monthly.
Oh, that's cool. Yeah, I've never looked into their core systems, but I believe it's in the same data center, just with low throughput so it can handle the massive demand.
Hope you like runpod hahah
I LOVE runpod.
And I am actually doing a project with one of their competitors, but they don't have serverless, and Runpod's SDK is easy and simple; they are ahead of the game.
Tested it with a heavier container with all models inside it. Such a huge difference, it loads super quick!!!
Nicee
So how big did it end up
90GB
Not the end of the world
Generating 4x1024 images on the Flux model in seconds! Using H100s/A100s or even 4090s.
Sent it to you in private to check out (don't want everyone here burning through my credit hahah)
Out of curiosity, is downloading the models part of the Dockerfile configuration? Didn't building and pushing the image take forever?
Not really, if you have a great internet connection; it also depends on how big your model is.
I don't think my internet connection is what matters. I'm concerned about the internet connection of the workers.
If every time a worker starts, it has to download a 90GB image, wouldn't that take a long time?
Let's test it first then; if it takes too long, come back here again.