Regular "throttled" status
Hi,
I've configured a serverless endpoint with the max_workers setting explicitly set to 1.
I've observed that the single worker for this endpoint frequently enters and stays in the "Throttled" state. This seems to be causing significant delays in request processing, making them take much longer than the actual inference time.
Notably, this endpoint performed perfectly throughout the previous week. I only started noticing this frequent 'Throttled' status and the associated delays this week, starting around April 28th.
Could you provide some insights into potential factors that might be causing this frequent throttling?
Solution
When you set max_workers to 1, your worker is deployed to a single machine. While you're not using it, we give that machine to other people, and when the machine is fully utilized, your worker is throttled. We highly suggest avoiding a max_workers setting of 1.
3 Replies
Noticed this too, but only for endpoints with max_workers set to 1. It also seems to depend on the datacenter. I just bumped max_workers to 2-3 and the issue goes away.
As a small hack to speed up redeploys while keeping multiple workers: you can scale max_workers down to 0 and then back up to your target value, which immediately starts the redeploy process.
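The scale-down/scale-up trick above can be scripted against the platform's API. The sketch below is a minimal illustration, assuming a GraphQL-style endpoint with a saveEndpoint mutation and a workersMax field; the URL, mutation name, and field names are assumptions for illustration, so verify them against the current API documentation before using this.

```python
import json
import urllib.request

# Assumed API endpoint; verify against the current API docs.
API_URL = "https://api.runpod.io/graphql"


def build_set_max_workers_payload(endpoint_id: str, workers_max: int) -> dict:
    """Build a GraphQL request payload that sets max workers for an endpoint.

    NOTE: the `saveEndpoint` mutation and `workersMax` field names are
    assumptions for illustration, not a verified API spec.
    """
    query = (
        "mutation SetWorkers($input: EndpointInput!) {"
        "  saveEndpoint(input: $input) { id workersMax }"
        "}"
    )
    return {
        "query": query,
        "variables": {"input": {"id": endpoint_id, "workersMax": workers_max}},
    }


def force_redeploy(endpoint_id: str, workers_max: int, api_key: str) -> None:
    """Scale max workers to 0, then back to the target value, to trigger an
    immediate redeploy (the hack described above)."""
    for n in (0, workers_max):
        payload = build_set_max_workers_payload(endpoint_id, n)
        req = urllib.request.Request(
            API_URL,
            data=json.dumps(payload).encode("utf-8"),
            headers={
                "Authorization": f"Bearer {api_key}",
                "Content-Type": "application/json",
            },
        )
        # urlopen raises HTTPError on non-2xx responses.
        with urllib.request.urlopen(req, timeout=30) as resp:
            resp.read()


# Usage (hypothetical endpoint id; api key from your account settings):
# force_redeploy("my-endpoint-id", 3, api_key="...")
```

This only changes the max-worker count twice in sequence; it doesn't wait for the redeploy to finish, so poll the endpoint's status afterwards if you need confirmation.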
More demand on the datacenter probably
Shortage of GPUs