It takes a long time because you're not using streaming: the server waits for the LLM to generate the entire text before returning any of it, so the client sees nothing until generation finishes. With streaming enabled, tokens are forwarded to the client as soon as they are produced, so the first output appears almost immediately even though total generation time is the same.
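A minimal sketch of the difference, using a hypothetical `generate_tokens` generator as a stand-in for the model backend (not any specific LLM API): the blocking path joins every token before returning, while the streaming path yields each token as it is produced.

```python
def generate_tokens(prompt):
    # Hypothetical stand-in for an LLM backend that produces tokens one at a time.
    for token in ["Hello", ",", " world", "!"]:
        yield token

def respond_blocking(prompt):
    # Non-streaming: nothing reaches the caller until every token exists.
    return "".join(generate_tokens(prompt))

def respond_streaming(prompt):
    # Streaming: forward each token the moment it is generated,
    # so the caller can start displaying output right away.
    for token in generate_tokens(prompt):
        yield token

full = respond_blocking("hi")
streamed = "".join(respond_streaming("hi"))
```

The final text is identical either way; streaming only changes the latency profile, cutting time-to-first-token from the full generation time down to roughly one token's worth.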