Load Balancer Least Outstanding Requests Steering: How does Cloudflare know when a request is finished?

I have 3 endpoints, and traffic steering is set to Least Outstanding Requests. Each request takes around 2 minutes to process after being accepted by the server (generative AI). But I often see that 1 endpoint has 3 requests (1 active, 2 in queue), while the other 2 endpoints don't have any active requests.

How exactly are outstanding requests measured?

I see this from the documentation:

"LORS uses the number of unanswered HTTP requests to influence steering"

Does this mean that as soon as the server accepts the request, it's considered "complete", even if the server hasn't finished processing it?
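
To make the question concrete, here is a toy model of the two interpretations I'm asking about. This is only a sketch of my mental model, not Cloudflare's actual implementation: the `Endpoint` class, `pick`, `simulate`, and the `release_on_accept` flag are all names I made up, and it ignores weights, health checks, and anything else LORS may take into account.

```python
class Endpoint:
    def __init__(self, name):
        self.name = name
        self.outstanding = 0  # requests sent but not yet counted as answered


def pick(endpoints):
    # Least outstanding requests: choose the endpoint with the smallest counter.
    # On a tie, min() returns the first endpoint in the list.
    return min(endpoints, key=lambda e: e.outstanding)


def simulate(num_requests, release_on_accept):
    """Send a burst of requests that all arrive before any response completes."""
    endpoints = [Endpoint("a"), Endpoint("b"), Endpoint("c")]
    assigned = []
    for _ in range(num_requests):
        ep = pick(endpoints)
        ep.outstanding += 1
        assigned.append(ep.name)
        if release_on_accept:
            # Hypothetical interpretation 1: the request stops counting as
            # outstanding as soon as the server accepts it.
            ep.outstanding -= 1
        # Hypothetical interpretation 2: the counter stays raised until the
        # response completes (about 2 minutes later), so nothing is released
        # during the burst.
    return assigned


print(simulate(3, release_on_accept=True))
print(simulate(3, release_on_accept=False))
```

In this toy model, the first interpretation sends all three requests to the same endpoint, which looks like what I'm observing, while the second spreads them across the three endpoints, which is what I expected.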

https://developers.cloudflare.com/reference-architecture/architectures/load-balancing/#least-outstanding-requests-steering-lors