What does "throttled" mean?

My endpoint dashboard sometimes shows "1 Throttled" worker, and 0 other workers, except for queued ones. What does the "throttled" status mean, and how do I prevent the condition?
Solution
From my understanding, and this is by no means official:

Throttled means that other services are using the GPU. I recommend setting max workers to at least 2 (RunPod will then allocate 5 workers on your endpoint). Any of those workers can "potentially" pick up jobs, but the number working at the same time will never exceed the max you chose.

There is no way to prevent it unless you require some "minimum" number of workers to always be active.

In my experience, throttling can also happen when there are issues with RunPod itself, but that is rarer.

You can poll the /health endpoint to check that your endpoint has idle or active workers ready.
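Here is a rough sketch of what such a check could look like in Python. It is not official: the URL pattern, the Bearer authorization header, and the "workers" field names ("idle", "running", "throttled") are my assumptions about what /health returns, so check them against the RunPod docs and a real response first. The function names are made up for this example.

```python
import json
import urllib.request

# ASSUMPTION: URL pattern for the serverless /health endpoint.
HEALTH_URL = "https://api.runpod.ai/v2/{endpoint_id}/health"


def fetch_health(endpoint_id: str, api_key: str, timeout: float = 10.0) -> dict:
    """GET the /health payload for one endpoint and return it as a dict."""
    request = urllib.request.Request(
        HEALTH_URL.format(endpoint_id=endpoint_id),
        # ASSUMPTION: the API key is sent as a Bearer token.
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


def summarize_workers(health: dict) -> dict:
    """Count idle, running, and throttled workers in a /health payload.

    ASSUMPTION: the payload has a "workers" object with integer counts
    under these keys. Missing keys are treated as 0.
    """
    workers = health.get("workers", {})
    idle = int(workers.get("idle", 0))
    running = int(workers.get("running", 0))
    throttled = int(workers.get("throttled", 0))
    return {
        "idle": idle,
        "running": running,
        "throttled": throttled,
        # "available" means at least one worker is idle or already working.
        "available": idle + running > 0,
    }
```

You would call summarize_workers(fetch_health(endpoint_id, api_key)) on a timer and alert when "available" is False while "throttled" is above 0, which is the situation from the question.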