Generation when increasing the number of workers from 5 to 10
Hello everyone, I have a question: if I increase the number of workers on an endpoint, will generation become more expensive or only faster?
6 Replies
Little bit of both :)
But why would it become more expensive if the number of generations stays the same?
I would say neither 😀
If your requests are stuck in queue for a while, adding more workers will allow us to help you get through them faster. But it is more GPU time in the end. I think for some people it's maybe no net difference, but I guess it depends? Some kind of correlation here could be fun to calculate.
And in technical terms: you get more cold starts (loading the model, etc.) instead of requests waiting for a worker that is already warm and currently processing.
Assuming your scaling type in the endpoint uses those new workers too
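To put rough numbers on the cold-start point, here is a minimal back-of-the-envelope sketch. It assumes billing is per second of worker time and that cold-start time is billed too; it ignores idle time before a worker shuts down. The function name and all the numbers are hypothetical, not measurements from any real endpoint.

```python
def billed_gpu_seconds(num_requests, exec_seconds, cold_starts, cold_start_seconds):
    """Total billed worker time under the assumed model:
    execution time for every request plus time spent on each cold start."""
    return num_requests * exec_seconds + cold_starts * cold_start_seconds


# Hypothetical workload: 100 requests at 10 s each, 30 s per cold start.
# Assumption: every worker cold-starts exactly once, so 5 vs 10 cold starts.
few_workers = billed_gpu_seconds(100, 10, cold_starts=5, cold_start_seconds=30)
more_workers = billed_gpu_seconds(100, 10, cold_starts=10, cold_start_seconds=30)

print(few_workers, more_workers)  # 1150 1300
```

The execution part (1000 s) is the same either way; only the cold-start part grows. Whether that extra matters depends on how long your cold start is compared to a single generation.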
Thanks!