worker-vllm list of strings
Hey,
I have a fine-tuned model that I want to deploy on serverless. I tried the vLLM prompts approach with a list of strings (as attached) on a T4 in Colab and it works really well - a response in 0.5 seconds. So here is my question - do I need to create my own worker to post input as a list of strings, or do you handle this in your vllm-worker? -> https://github.com/runpod-workers/worker-vllm
Thanks for your reply
(sorry, I accidentally posted this on a different channel than #serverless)
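For context, the question above boils down to input normalization: accepting either a single prompt string or a list of strings in the handler. Below is a minimal sketch of that step, assuming a RunPod-style event payload with an `input` dict containing a `prompt` field; the function names and payload shape are illustrative assumptions, not worker-vllm's actual API.

```python
# Hypothetical sketch of a serverless handler that accepts either a single
# prompt string or a list of strings. The "input"/"prompt" field names are
# assumptions for illustration, not worker-vllm's confirmed schema.

def normalize_prompts(event):
    """Return the prompts from an event as a list of strings."""
    prompts = event.get("input", {}).get("prompt", [])
    if isinstance(prompts, str):
        prompts = [prompts]  # wrap a single prompt in a one-element list
    if not all(isinstance(p, str) for p in prompts):
        raise ValueError("prompt must be a string or a list of strings")
    return prompts


def handler(event):
    prompts = normalize_prompts(event)
    # In a real worker the list would be passed to vLLM's batched
    # generate(); here we just echo the normalized prompts back.
    return {"prompts": prompts}
```

With this shape, a request body of `{"input": {"prompt": ["q1", "q2"]}}` and one of `{"input": {"prompt": "q1"}}` both reach the engine as a list, which is what lets vLLM batch them in a single call.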
