The problem is that a newly added worker becomes available (green) before A1111 has finished booting. As a result, new requests are sent to the new worker immediately, while older workers are shut down if they receive no requests for 5 seconds. This usually ends with all active workers shut down and a long queue building up, because none of the newly added workers has finished booting A1111 yet.
I tried increasing the idle timeout, e.g. to 180 seconds, but in that case the workers never scale down.
Questions:
1. How can I make a worker available (green) only once A1111 has booted?
2. Is it possible to also remove workers based on the queue delay setting? E.g. if requests wait in the queue for less than 10 seconds, 1 worker is removed.
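
For question 1, the kind of thing I have in mind is a readiness gate in the worker's entry script: poll the local A1111 API until it answers, and only then start the serverless handler. Below is a minimal sketch of that idea, not a confirmed fix. The helper name `wait_for_service`, the port, and the probed endpoint are my assumptions (the endpoint requires A1111 to be launched with `--api`), and I am also assuming the worker is only marked available once the handler loop starts.

```python
import time
import urllib.error
import urllib.request


def wait_for_service(url, timeout_s=300.0, interval_s=1.0):
    """Poll `url` until it answers with HTTP 200 or `timeout_s` elapses.

    Hypothetical helper: returns True once the service responds,
    False if the deadline passes first.
    """
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=5) as resp:
                if resp.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            # Service is not listening yet (or returned an error); retry.
            pass
        time.sleep(interval_s)
    return False


# Assumed wiring in the worker's entry script (port and path are guesses
# for a typical A1111 setup, adjust to the actual image):
#
#   if wait_for_service("http://127.0.0.1:3000/sdapi/v1/sd-models"):
#       runpod.serverless.start({"handler": handler})
```

If the timeout is reached, the script could exit with a non-zero status instead of starting the handler, so that a worker that never finishes booting does not take requests.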