RunPod · 4mo ago
AC_pill

Idle time: high idle time set on worker, but it's not getting tasks from the queue

I'm testing workers with a high idle time to keep them alive and pick up new tasks, but the worker shows as idle and finished without pulling new tasks from the queue. Is there any event or state I need to add to the handler?
8 Replies
AC_pill · 4mo ago
My delay time is 60 seconds, but as you can see, each request's execution time is 120 seconds even though the task itself only takes 15s, and the worker is still hanging on the same task.
AC_pill · 4mo ago
[image attachment; no description]
justin · 4mo ago
Share ur endpoint ID? Also I'm unable to replicate on my own endpoint. Either way, you don't need a high idle time; if the idle time is greater than your cold start time, it isn't worth it. But as long as your handler returns, it should terminate the job. When the job is done and returns, do you have logs with timestamps? It's insanely weird that all your execution times line up exactly with the idle time. I set my idle time to 30 seconds, sent like 30 requests, and got varying execution times. Maybe also scale your max workers to 0 and back to your max, just to manually reset it all? Seems weird.
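(For context on the handler question above: returning from the handler is what marks the job finished and lets the worker poll for the next one; no extra event or state is required. A minimal sketch, assuming the standard `runpod` Python SDK; the `prompt` field is just an illustrative input payload:)

```python
def handler(job):
    # `job["input"]` holds whatever payload was submitted to the endpoint.
    prompt = job["input"].get("prompt", "")
    # ... do the actual work here (e.g. run inference) ...
    # Simply returning releases the job; the idle worker then polls
    # the queue on its own for the next task.
    return {"output": f"processed: {prompt}"}

# On a real serverless worker, the SDK's polling loop is started like this:
# import runpod
# runpod.serverless.start({"handler": handler})
```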
AC_pill · 4mo ago
Yeah, I'll do that, thanks. Scaling back to 0 and then up to max solved it, thanks @justin [Not Staff]
ashleyk · 4mo ago
Yeah, there is something wrong with the handling of throttled workers. One of my other endpoints was completely throttled for 2 days, and when I scaled it down to zero and back up again, it came right and doesn't have any throttled workers anymore.
justin · 4mo ago
I think his issue was that he set an idle time, but his workers were waiting for the idle time to complete before pulling jobs off the queue, so he got a run of identical 120-second execution times because they were waiting out his idle time lol. Weird.
AC_pill · 4mo ago
Seems to be working now, probably a cached Dockerfile? Not sure. Execution time is back to normal; from a batch of 20 tasks only one failed, and that must be a GPU issue because there is no log output.
[image attachment; no description]
AC_pill · 4mo ago
And yes, a lot of throttled workers pushes the cold-start workers into an unreliable state.