Disk Volume Pricing for Serverless
I'm looking for clarification on disk pricing for serverless workers. The pricing page lists a Container Disk price of $0.10/GB/month for running pods (and $0.20/GB/month for idle pods).
How does this translate to serverless workers? When I create a template for my endpoint I specify a Volume Disk (e.g. 20 GB); how am I being charged for this?
20 GB * $0.20 * (number of workers) per month (assuming the workers are idle)?
In that case, does it make more sense to use a network volume, shared between the workers of the endpoint?
Side question: is there a way to programmatically push data to a network volume?
Thanks for your help!
My understanding is you're only being billed for the existence of the Network Storage Disk, not for how you use it. Pods are priced differently; storage is priced as follows:
For Disks <= 1TB:
$0.07/GB/Month
For Disks > 1TB:
$0.05/GB/Month
The Create Network Volume page has a nice little calculator; the number there is all you should expect to be charged.
https://www.runpod.io/console/user/storage/create
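For a quick sanity check without opening the console, here's a minimal sketch of that tiered math in Python (assuming the rate applies to the whole allocation rather than marginally, which is how I read the tiers above):

```python
def network_volume_monthly_cost(size_gb: float) -> float:
    """$0.07/GB/month up to 1 TB, $0.05/GB/month above (flat, not marginal)."""
    rate = 0.07 if size_gb <= 1024 else 0.05
    return size_gb * rate

print(network_volume_monthly_cost(20))    # 1.4   -> the 20 GB example
print(network_volume_monthly_cost(2048))  # 102.4 -> a 2 TB volume
```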
PS: Not yet, but keep an eye out for an announcement here about using S3 APIs to access Network Volumes in a few weeks.
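Once that lands, access would presumably look like standard S3 tooling pointed at a RunPod endpoint. A purely speculative sketch with boto3; the endpoint URL, bucket name, and credentials below are placeholders, not a real API:

```python
import boto3

# Hypothetical: endpoint URL and bucket naming are placeholders until
# the actual S3 API for network volumes ships.
s3 = boto3.client(
    "s3",
    endpoint_url="https://s3.example-runpod-endpoint.io",  # placeholder
    aws_access_key_id="YOUR_KEY",
    aws_secret_access_key="YOUR_SECRET",
)

# Push a model file to the volume-backed bucket.
s3.upload_file("model.safetensors", "my-network-volume", "models/model.safetensors")
```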
Thank you for clarifying.
So I'm charged for the disk space specified in my Template, regardless of how many workers I have? So a Disk Volume of 20Gb in my Template comes down to 20 * $0.07 = $1.4 regardless of how many workers my endpoint has?
Disclaimer: I dont work at runpod, just a random techie, so take with a grain of salt
When you create a volume, you get a chunk of disk space in one of the regions. You pay for the space allocated, regardless of how much data there is (it might be full 100%, or empty, does not matter). It also does not matter how many workers are using your disk, as the disk space is mounted into a worker, not duplicated per worker. There might be a dip in speed when multiple workers try to read/write the same file
That's the Network Volume, right? There is also the Disk Volume https://docs.runpod.io/pods/storage/types#disk-volume which is specified when creating a template for a serverless endpoint.
I think the Disk Volume corresponds to the volumeInGb parameter in the Create Template API endpoint: https://rest.runpod.io/v1/docs#tag/templates/POST/templates, while the Network Volume would be specified in the networkVolumeId parameter of the Create Endpoint API endpoint https://rest.runpod.io/v1/docs#tag/endpoints/POST/endpoints
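For reference, a hedged sketch of those two calls; volumeInGb and networkVolumeId are the documented parameters, while the other fields (name, imageName, templateId) and the Bearer auth header are my assumptions about the schema:

```python
import requests

API = "https://rest.runpod.io/v1"
HEADERS = {"Authorization": "Bearer YOUR_API_KEY"}  # assuming Bearer auth

# Create Template: volumeInGb is the per-worker Disk Volume.
template = requests.post(f"{API}/templates", headers=HEADERS, json={
    "name": "my-worker-template",        # assumed field
    "imageName": "me/my-worker:latest",  # assumed field
    "volumeInGb": 20,                    # documented: per-worker disk
}).json()

# Create Endpoint: networkVolumeId attaches a shared network volume.
endpoint = requests.post(f"{API}/endpoints", headers=HEADERS, json={
    "templateId": template["id"],        # assumed field
    "networkVolumeId": "YOUR_VOLUME_ID", # documented: shared volume
}).json()
```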
As I understand it, the Disk Volume is different from the Network Volume, and I believe the Disk Volume is not shared between workers, so I am wondering how I am charged for disk space: once, or for every worker (since every worker will have its own separate disk space)?
The main reason I am asking is that I am downloading models onto my serverless workers when the handler receives its first request, which means that every time a worker gets its first request I have to wait for the models to download. Working with a Network Volume sounds like a better alternative, but I would need to be able to load the models onto it programmatically. I also wanted to understand the pricing of the two storage options.
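For context, the pattern I'm describing looks roughly like this (a sketch; the repo ID and paths are examples, the loader is a stand-in, and I'm assuming the standard runpod serverless SDK plus huggingface_hub):

```python
import runpod
from huggingface_hub import snapshot_download

MODEL = None  # loaded lazily on the first request a fresh worker sees

def load_model(path):
    # Stand-in for the real loading code (e.g. transformers / ComfyUI).
    raise NotImplementedError(path)

def handler(job):
    global MODEL
    if MODEL is None:
        # First request on a fresh worker: block while weights download.
        # Repo ID and target dir are examples, not my actual setup.
        path = snapshot_download("org/some-26gb-model", local_dir="/models")
        MODEL = load_model(path)
    return MODEL.run(job["input"])

runpod.serverless.start({"handler": handler})
```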
I'm using network volumes to store models. One piece of feedback: loading models from there seems 3x slower than on my local machine (I am using ComfyUI). Not sure if it would be different for disk as opposed to a network volume.
Anyway, using disk for storing models would increase cold starts, as you have to load 20 GB from the container cache as opposed to, say, 10 GB if the models are on the network volume. Your containers will also take much longer to deploy (for the same reason).
It's charged at per-minute, per-GB granularity, if I'm not wrong.
Container disk is charged like running pods, i.e. while your worker is running.
@DIRECTcut ▲ I got it to work with a network volume, but like you said it takes much longer to load the models into VRAM. Have you figured out a way to reduce it?
I'm going to test baking the models into the Docker image, but am worried about the performance of this when I need 40+ GB of models.
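For the record, "baking" here just means downloading the weights at image build time, e.g. a script like this invoked from a Dockerfile RUN step (a sketch; the repo IDs are placeholders, and I'm assuming huggingface_hub):

```python
# download_models.py -- invoked at build time, e.g. `RUN python download_models.py`
from huggingface_hub import snapshot_download

# Placeholder repo IDs; the weights end up baked into an image layer,
# so workers skip the runtime download entirely.
for repo in ["org/model-a", "org/model-b"]:
    snapshot_download(repo, local_dir=f"/models/{repo.split('/')[-1]}")
```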
The physical speed of the network volume is the bottleneck, so you can't reduce it.
Baking the model in will make the image start time bad if the image is not cached
So cold start times will be bad for some time
And then improve gradually as more hosts gets the images cached
Best way to know for sure is to test it
Just tested with baked model, not great either.
A dirty hack I'm doing is keeping the models on a network volume, then copying them to the container disk outside the handler; this adds about 5-10s when the worker is fresh. It's not super elegant, but model loading is much faster (13s vs 60s) and my image stays small 🤷‍♂️
Looking forward to having a cleaner solution than this in the future
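Concretely, the hack is just a module-level copy that runs once during worker startup, before the handler loop (a sketch; I'm assuming the network volume is mounted at /runpod-volume, which is where it usually shows up on serverless):

```python
import os
import shutil
import runpod

SRC = "/runpod-volume/models"  # network volume mount (assumed path)
DST = "/models"                # container disk: fast local storage

# Module level, so this runs once per fresh worker, before the first
# request is handled; afterwards, model loads come from local disk.
if not os.path.exists(DST):
    shutil.copytree(SRC, DST)

def handler(job):
    # Load models from DST here and run inference.
    return {"status": "ok"}  # placeholder

runpod.serverless.start({"handler": handler})
```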
That's strange
It should take a lot longer than 10 seconds
Assuming model size is 40 gigs
No, the models are around 26 GB, but I was planning on working with larger models down the line (40 GB).
How does the baked model perform?
My guess is it takes forever to pull the image but loads the model fast
True, seems faster. Will run some more tests later
Baked models definitely perform better; I'm getting more consistent response times. But I'm not loving the 30+ GB Docker images.