How to Speed Up S3 Uploads or Make Them Async in RunPod Serverless Deployments
I am currently evaluating RunPod as our primary in-house model deployment platform, as a replacement for Replicate (our current platform of choice). Our in-house models are mostly custom txt2img/img2img models.
One of the issues I'm facing while testing RunPod is long S3 upload times. For example, in one of our processes the prediction itself takes ~1 second, but the S3 upload then takes 4-5 seconds (depending on image size), so the upload accounts for most of the overall prediction time.
This causes two main problems:
- Long prediction times despite the GPU being free after just 1 second of actual processing
- Increased queue times as workers remain occupied during these long uploads
Note: I am already using RunPod's own S3 upload function, which uploads in chunks.
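To make the "async" part of the question concrete, here is a minimal sketch of the kind of overlap I have in mind: hand the upload to a background thread so the next prediction can start while the previous image is still uploading. Everything in it is a placeholder: `run_model`, `upload_to_s3`, `handle`, `process_batch`, and the `example-bucket` name are hypothetical stand-ins (with `time.sleep` simulating the work), not RunPod or boto3 APIs. I have not verified whether a RunPod worker keeps background threads alive after the handler returns, so this only illustrates the pattern.

```python
import time
from concurrent.futures import ThreadPoolExecutor

# One shared pool so uploads run off the thread that does inference.
_upload_pool = ThreadPoolExecutor(max_workers=4)


def run_model(prompt):
    """Hypothetical stand-in for the short GPU prediction step."""
    time.sleep(0.1)
    return f"image-bytes-for-{prompt}".encode()


def upload_to_s3(key, data):
    """Hypothetical stand-in for the slow S3 upload (not a real API call)."""
    time.sleep(0.4)
    return f"s3://example-bucket/{key}"


def handle(job_id, prompt):
    """Run the prediction, then start the upload without waiting for it."""
    image = run_model(prompt)
    # submit() returns immediately with a Future; the upload runs in the pool.
    return _upload_pool.submit(upload_to_s3, f"{job_id}.png", image)


def process_batch(prompts):
    """Predictions run back to back while earlier uploads are still in flight."""
    futures = [handle(i, p) for i, p in enumerate(prompts)]
    # Block only at the end, once all predictions have been issued.
    return [f.result() for f in futures]
```

In this toy version the predictions no longer wait on each upload, so total time is roughly the sum of the prediction times plus one upload rather than the sum of both for every job. What I don't know is whether something equivalent is possible (or safe) inside a RunPod serverless handler.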