Recommended DC and Container Size Limits/Costs
Hello, I’m new to deploying web apps and currently using a persistent network drive along with serverless containers to generate images. My app requires at least 24GB of RAM, and I’ve encountered some challenges in my current region (EU-RO-1): there aren’t many A100 or H100 GPUs available, and most of the 4090 GPUs are throttled.
Recommended Data Centers: Are there specific geographic data centers you’d recommend for better GPU availability and performance?
Performance and Costs: Since my usage isn’t constant, the containers often ‘wake up’ from idle or after being used by someone else. When this happens, the models (ComfyUI) have to load, leading to generation times ranging from 20 seconds to 3-4 minutes. I assume this delay occurs because the models are loading from a network-mounted drive rather than locally.
If I preload the models onto the containers to avoid this transfer, will it increase my container costs?
Where can I find information about container size limits and associated pricing?
Additional Resources: Could you recommend sources to learn more about best practices, cost optimization, and efficient use of serverless containers for workloads like mine?
30 Replies
Unknown User•10mo ago
Message Not Public
Sign In & Join Server To View
Yeah, it seems H100s are generally not available; they come and go. I will stick to EU-RO-1.
I appreciate you taking the time to respond! 🔥🤘🏆
Yeah, I am thinking it's worth uploading my models (Flux1 and Shuttle3.1) once to the image and letting them load once when the workers initialize. This way, when they get a job, the loading is quick. As far as I can see, the large 2-minute delay happens when ComfyUI loads the model. It's currently on the mounted drive, and I think that even though it's in the same DC, it's just too slow to move 23GB over the network.
Yeah, so that is exactly what I am not sure about: is there a limit on the size of the containers, or a cost tied to it? I could not find anywhere to read about that.
You got me exactly now. That's what I meant 👍
Any experience with that?
I don't mind having a 100GB image; I only upload it once and let them deploy it.
Sure, I will try and report back 🙂 and it will be the first time I am useful to others on a discord channel haha 🙂
Yeah, people are already complaining that sometimes it's 20 seconds and sometimes 4 minutes 🙂
So I need to ensure it is faster. Also, do you know what the 'always active' option on serverless is? Is that like a pod: always on, always charging?
It seems good in terms of performance, but it might not be smart to do at the start, as my demand is still very low.
Not many people are on the app yet.
I think that since the datacenter is most likely an outsourced one, we can't trust that it's a real LAN with 1GB speed etc. So even though the drive is local, it takes 2-3 minutes to transfer the file each time, which seems like overkill.
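For a rough sanity check of that 2-3 minute figure, here is a minimal back-of-the-envelope sketch. It assumes the 23GB model is read over a link of about 1 Gbit/s with no caching; the real link speed of the network drive is not known, and `transfer_seconds` and its `efficiency` parameter are hypothetical names used only for illustration.

```python
def transfer_seconds(size_gb: float, link_gbps: float, efficiency: float = 1.0) -> float:
    """Estimate the time to move `size_gb` gigabytes over a `link_gbps` gigabit/s link.

    `efficiency` (0 < efficiency <= 1) is an assumed fraction of the nominal
    link speed that is actually usable; 1.0 means the ideal case.
    """
    if size_gb < 0:
        raise ValueError("size_gb must be non-negative")
    if link_gbps <= 0:
        raise ValueError("link_gbps must be positive")
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    size_gigabits = size_gb * 8  # 1 byte = 8 bits
    return size_gigabits / (link_gbps * efficiency)


if __name__ == "__main__":
    # Assumed link speeds, chosen only to compare orders of magnitude.
    for gbps in (1, 10):
        secs = transfer_seconds(23, gbps)
        print(f"23 GB over {gbps} Gbit/s: about {secs:.0f} s ({secs / 60:.1f} min)")
```

Under these assumptions, 23GB at an ideal 1 Gbit/s works out to 184 seconds, roughly 3 minutes, which is in the same range as the delay seen when the model loads from the mounted drive.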
Yeah, so this will be great for later. This way they will be reserved for me too, and I can grab all the H100s I need over time.
I only use 1 model at a time, and I only have 2 in total. The issue is that even if I don't unload it, when the container moves to idle the model is still there, but after 2 minutes or so the containers refresh or something. I'm not sure how it works exactly, but I noticed that if I queue loads of jobs it works faster, as the model does not unload. If I wait 5 minutes between requests, the container worker 'forgets' and needs to reload.
am i missing something?
Well, logically it is, but it could be a VLAN. You never know with these things. I am originally a TCP networks and routing engineer and have been around many datacenters.
Exactly. I wonder if the FlashBoot reset time can be adjusted; I recall seeing something.
Yes, I cannot set that, unless I hack into their core system and find that parameter, hahah.
Yeah, that's good, since a generation can get stuck and charge me an arm and a leg :)
I am still new to this and I aim to learn every tiny bit of it. It's the only way to get the most out of it. Serverless is brilliant! It allows a business to grow 🙂 A normal 4090 on the cloud is like 2k monthly.
I LOVE runpod.
I am actually doing a project with one of their competitors, but they don't have serverless. RunPod's SDK is easy and simple, and they are ahead of the game.
Tested it with a heavier container with all the models inside it. Such a huge difference, it loads super quick!!!
90GB
Not the end of the world
Generating 4x1024 on flux model in seconds! Using H/A100 or even 4090s
I'll send it to you in private so you can check it out (don't want everyone here running up my credit, hahah)
Out of curiosity, is downloading the models part of the Dockerfile configuration? Didn't building and pushing the image take forever?
I don't think my internet connection is what matters. I'm concerned about the internet connection of the workers.
If every time a worker starts, it has to download a 90GB image, wouldn't that take a long time?