Runpod · 3w ago

Potential Issues When Scaling a Serverless Endpoint

Hello, I want some information about scaling a SaaS project on a Runpod serverless endpoint.
I am thinking of hiring a developer to build me a SaaS project that uses a Kokoro FastAPI Docker container.
In the endpoint configuration I want to set a 16 GB GPU as the first choice and a 24 GB GPU as the second.
I don't want to go above these, to keep the cost per generated audio down.
Each user will generate between 8 and 10 minutes of audio.

My only concern is being able to scale to hundreds, if not thousands, of simultaneous users as the use case grows.
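To make that concern concrete, here is a minimal back-of-envelope sketch of how many busy workers the endpoint might need. The real-time factor (`rtf`, GPU seconds per second of audio) and the `headroom` multiplier are assumptions, not figures from the post; you would benchmark the actual Kokoro container on the 16 GB and 24 GB GPUs to get real numbers.

```python
import math

def estimate_workers(requests_per_min: float,
                     audio_minutes: float = 10.0,
                     rtf: float = 0.1,
                     headroom: float = 1.5) -> int:
    """Rough steady-state worker count for a TTS endpoint.

    rtf: hypothetical real-time factor -- GPU minutes spent per
    minute of audio produced (benchmark your own container).
    headroom: extra capacity so bursts don't queue badly.
    """
    gpu_min_per_job = audio_minutes * rtf
    # Little's law: average jobs in flight = arrival rate x service time.
    avg_in_flight = requests_per_min * gpu_min_per_job
    return max(1, math.ceil(avg_in_flight * headroom))
```

For example, 60 requests per minute of 10-minute audio at an assumed 0.1 RTF works out to about 60 jobs in flight, so roughly 90 workers with 1.5x headroom; a trickle of one request every two minutes still needs at least one warm worker.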

Bottom line: I want to offer this service to websites that will connect via an API or a custom code snippet.
If their users are impacted, that in turn impacts the service I want to provide.
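For context, the client-side integration could look something like the sketch below. The `/run` URL shape and Bearer header follow Runpod's serverless API; the `"text"` input field is whatever the Kokoro container's handler expects and is an assumption here, as is the helper name.

```python
import json

RUNPOD_API_BASE = "https://api.runpod.ai/v2"

def build_run_request(endpoint_id: str, api_key: str, text: str):
    """Build (url, headers, body) for an async /run job submission.

    Hypothetical helper: a partner website would POST this with any
    HTTP client, then poll /status/{job_id} for the finished audio.
    """
    url = f"{RUNPOD_API_BASE}/{endpoint_id}/run"
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    # The handler-specific input schema ("text") is an assumption.
    body = json.dumps({"input": {"text": text}})
    return url, headers, body
```

Using the async `/run` route rather than `/runsync` matters at this audio length: 8-10 minute generations can outlast a synchronous HTTP timeout, so polling for the result is the safer pattern.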

Realistically, can Runpod achieve this, or will we most likely run into issues where users are impacted?

I know I can add more funds for more workers and more GPUs, so would that be sufficient, or would this require an enterprise-grade solution?