Runpod
7mo ago
Foopop

Serverless vLLM batching

Hey, so every hour I have about 10k prompts that I want to send to my serverless instance. I'm using vLLM, and my question is: does the batching that vLLM does out of the box still work on the serverless instance? I send each prompt as its own request, not all of them together in one request. I could not find anything about this in the docs or in this chat. Any help would be really appreciated, thanks.
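To make the two options concrete, here is a rough sketch of the difference between "one request per prompt" and "everything in one request". It only builds the request bodies and sends nothing. The field names `input`, `prompt`, and `prompts` are placeholders assumed for illustration, not taken from the Runpod or vLLM docs.

```python
import json
from typing import Dict, List


def one_request_per_prompt(prompts: List[str]) -> List[Dict]:
    """Build one request body per prompt (the approach described in the post).

    The "input"/"prompt" keys are hypothetical placeholders.
    """
    return [{"input": {"prompt": p}} for p in prompts]


def single_request(prompts: List[str]) -> Dict:
    """Build a single request body carrying every prompt (the alternative).

    The "input"/"prompts" keys are hypothetical placeholders.
    """
    return {"input": {"prompts": list(prompts)}}


if __name__ == "__main__":
    prompts = [f"prompt {i}" for i in range(3)]
    # Three separate bodies, one per prompt.
    print(json.dumps(one_request_per_prompt(prompts), indent=2))
    # One body that contains all three prompts.
    print(json.dumps(single_request(prompts), indent=2))
```

With 10k prompts the first function yields 10k separate bodies, so the question is whether those separate requests still end up batched together on the worker.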