Load balancing + scaling
Hello,
Does anyone know how requests are balanced across workers?
This is important to understand in the context of autoscaling — especially if I’m using scaling based on queue delay and idle timeout.
I expected that a job would be taken by an active worker first, and only if no active workers are available, a scaled-up worker would take over.
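To make that expectation concrete, here is a minimal sketch of the dispatch order I had in mind. It is purely illustrative: `Worker` and `pick_worker` are hypothetical names, and nothing here reflects how the platform's balancer is actually implemented.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass
class Worker:
    name: str
    is_active: bool      # True for always-on ("active") workers, False for scaled-up ones
    busy: bool = False   # hypothetical flag: the worker is currently running a job


def pick_worker(workers: list[Worker]) -> Optional[Worker]:
    """Return a free active worker if one exists, otherwise a free scaled-up worker."""
    free = [w for w in workers if not w.busy]
    # Stable sort: active workers (key False) come before scaled-up ones (key True).
    free.sort(key=lambda w: not w.is_active)
    return free[0] if free else None
```

With this ordering, a scaled-up worker only receives a job when every active worker is busy, so it can sit idle and hit its idle timeout once the load drops.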
10 Replies
If I have two warm workers with num active = 1, how are jobs balanced between them?
It’s about autoscaling.
If I have even a single request that triggers scaling, then a new worker ends up running almost all the time — unless I set a very short idle timeout.
But setting it too low could lead to long cold starts.
I was hoping that this scaled worker wouldn’t receive any requests as long as there are “active” (non-scaled) workers available.
Scaling up works perfectly, but scaling down doesn’t — at least in my case.
Imagine there’s a spike in workload, and a couple of additional workers are spun up.
After the spike ends, those workers can’t scale down because they’re still getting requests — so they never go idle.
That’s exactly what’s happening on my side.
As a result, I end up paying extra for workers I no longer need to handle the current workload.
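A toy simulation of what I think is happening, under loudly stated assumptions: one active worker and one scaled-up worker, a light post-spike load of one request every 10 seconds, jobs that finish instantly, and a 30-second idle timeout. The `round_robin` policy is only my guess at the current behaviour, and `active_first` is what I expected. All names are hypothetical.

```python
IDLE_TIMEOUT_S = 30  # assumed idle timeout, for illustration only


def scaled_worker_max_idle(policy: str, n_requests: int = 12, interval_s: int = 10) -> int:
    """Longest idle gap (in seconds) the scaled-up worker sees after the spike ends at t=0."""
    last_job_at = 0
    max_gap = 0
    for i in range(n_requests):
        t = (i + 1) * interval_s  # arrival time of request i
        if policy == "round_robin":
            goes_to_scaled = i % 2 == 1   # every second request lands on the scaled-up worker
        elif policy == "active_first":
            goes_to_scaled = False        # the active worker is always free at this load
        else:
            raise ValueError(f"unknown policy: {policy}")
        if goes_to_scaled:
            max_gap = max(max_gap, t - last_job_at)
            last_job_at = t
    # Account for the idle time between the last job and the end of the simulation.
    end = n_requests * interval_s
    return max(max_gap, end - last_job_at)


def would_scale_down(policy: str) -> bool:
    """True if the scaled-up worker is ever idle for at least the idle timeout."""
    return scaled_worker_max_idle(policy) >= IDLE_TIMEOUT_S
```

Under these assumptions the alternating policy never leaves the scaled-up worker idle for longer than 20 seconds, so it never reaches the 30-second timeout, while the active-first policy leaves it idle for the whole run and lets it scale down.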
@Eugene_Swanley
The thread has been escalated to Zendesk!
Ticket ID: #18610
A worker is probably not idle but still able to handle additional requests (I use concurrent workers), and I'm not sure the balancer respects that.
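As a rough sketch of what "respecting" concurrency could look like: a busy active worker with spare slots would be preferred over an idle scaled-up one. `ConcurrentWorker`, `max_concurrency`, and `pick_with_capacity` are hypothetical names for illustration, not the platform's API.

```python
from dataclasses import dataclass


@dataclass
class ConcurrentWorker:
    name: str
    is_active: bool           # True for always-on workers, False for scaled-up ones
    in_flight: int = 0        # requests currently being processed
    max_concurrency: int = 4  # assumed per-worker concurrency limit

    def has_capacity(self) -> bool:
        return self.in_flight < self.max_concurrency


def pick_with_capacity(workers):
    """Prefer active workers with spare slots; among equals, pick the least loaded."""
    candidates = [w for w in workers if w.has_capacity()]
    if not candidates:
        return None
    # Tuples sort by is_active first (False sorts before True), then by current load.
    return min(candidates, key=lambda w: (not w.is_active, w.in_flight))
```

With this rule, the scaled-up worker only gets traffic once every active worker has reached its concurrency limit.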