n8tzto

Intermittent Slow Performance Issue with GPU Workers

I am running into an intermittent issue where some GPU workers become significantly slower. I measured the time taken for a specific task on a designated GPU worker type (4090, 24 GB). Normally, when I send the exact same payload to the endpoint, execution takes about 1 minute. Occasionally, however, a worker becomes exceptionally slow: with the same payload, Docker image, tag, and GPU type, execution stretches to a few hours. Notably, during these runs the GPU utilization stays at 0% the whole time.
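
For reference, the timing measurement is roughly the sketch below. The endpoint ID, API key variable, and payload contents are placeholders rather than my real values, and the /runsync call assumes Runpod's standard serverless endpoint API:

```python
import os
import time

import requests

# Placeholders for illustration only.
ENDPOINT_ID = "<ENDPOINT_ID>"
API_KEY = os.environ["RUNPOD_API_KEY"]
URL = f"https://api.runpod.ai/v2/{ENDPOINT_ID}/runsync"

payload = {"input": {"prompt": "same test input on every run"}}

start = time.time()
resp = requests.post(
    URL,
    json=payload,
    headers={"Authorization": f"Bearer {API_KEY}"},
    timeout=7200,  # generous timeout so the slow runs still return
)
elapsed = time.time() - start

print(f"HTTP {resp.status_code}, wall time {elapsed:.1f}s")
print(resp.json())
```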

The output log confirms that inference is unusually slow whenever an affected worker handles the job. Has anyone experienced a similar problem, and if so, how did you resolve it?
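
In case it helps, this is the kind of diagnostic I have in mind for the start of the handler to confirm the GPU is actually visible inside the worker. It is only a rough sketch assuming a PyTorch-based worker; the function name is illustrative:

```python
import subprocess

import torch


def log_gpu_state():
    """Log whether CUDA is visible inside the worker.

    If this reports no CUDA device, a CPU fallback could explain the
    0% GPU utilization and hours-long runs on the affected workers.
    """
    print("torch.cuda.is_available():", torch.cuda.is_available())
    if torch.cuda.is_available():
        print("device:", torch.cuda.get_device_name(0))
    # nvidia-smi output as seen from inside the container
    try:
        result = subprocess.run(
            ["nvidia-smi"], capture_output=True, text=True, timeout=10
        )
        print(result.stdout)
    except (FileNotFoundError, subprocess.TimeoutExpired) as exc:
        print("nvidia-smi not usable:", exc)


log_gpu_state()
```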

Your insights and assistance in addressing this issue would be greatly appreciated. Thank you.