worker-vllm list of strings

Hey,
I have a fine-tuned model that I want to deploy on serverless. I tried the vLLM approach of passing prompts as a list of strings (as attached) on a T4 in Colab and it works really well: responses come back in 0.5 secs. So here is my question: do I need to create my own worker to post the input as a list of strings, or does your vLLM worker already handle this? -> https://github.com/runpod-workers/worker-vllm
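
For context, the list-of-strings approach looks roughly like the sketch below. It assumes vLLM's offline `LLM.generate` API; the model path, the sampling values, and the helper names `normalize_prompts` and `generate_batch` are placeholders for illustration, not the exact code from the attachment.

```python
from typing import List, Union


def normalize_prompts(prompts: Union[str, List[str]]) -> List[str]:
    """Accept one string or a list of strings and return a clean list of prompts."""
    if isinstance(prompts, str):
        prompts = [prompts]
    cleaned = []
    for p in prompts:
        if not isinstance(p, str):
            raise TypeError(f"prompt must be a string, got {type(p).__name__}")
        if p.strip():  # skip empty or whitespace-only entries
            cleaned.append(p.strip())
    if not cleaned:
        raise ValueError("at least one non-empty prompt is required")
    return cleaned


def generate_batch(prompts: Union[str, List[str]], model_path: str,
                   max_tokens: int = 128) -> List[str]:
    """Send all prompts to vLLM in a single batched generate call."""
    # Imported lazily so normalize_prompts can be used without vLLM or a GPU.
    from vllm import LLM, SamplingParams

    llm = LLM(model=model_path)  # model_path is a placeholder for the fine-tuned model
    params = SamplingParams(temperature=0.0, max_tokens=max_tokens)  # assumed values
    outputs = llm.generate(normalize_prompts(prompts), params)
    # Each result can hold several completions; keep the first one per prompt.
    return [out.outputs[0].text for out in outputs]
```

Something like `generate_batch(["prompt one", "prompt two"], "path/to/model")` returns one completion per prompt, which is the behaviour I would like to get from the serverless endpoint.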

Thanks for your reply 😉

Sorry, I accidentally posted this in a different channel than #⚡|serverless
image.png