Hi,
I’m running a RunPod Serverless endpoint and occasionally see it get throttled during inference.
The workload is GPU-heavy (video / image generation), and throttling seems to occur when requests run longer than usual or when CPU-side processing (e.g. ffmpeg preprocessing) is involved.
I’d like to understand:
• What are the common causes of throttling on Serverless endpoints?
• What are the recommended ways to mitigate throttling?
(e.g. limiting concurrency, splitting CPU/GPU workloads, adjusting endpoint settings, or using a different deployment type)
Any guidance or best practices would be appreciated. Thanks!