RunPod · 3mo ago
jonny9f

Serverless custom routes

Hi there. I'd like to implement my own streaming custom routes, like the vLLM worker does (https://github.com/runpod-workers/worker-vllm). That worker supports routes like https://api.runpod.ai/v2/<YOUR ENDPOINT ID>/openai/v1. How is this done? Looking at the source code, the worker gets special keys passed to it in the rp handler, such as job_input.openai_route. Where does this key come from?
Thanks. Jon.
9 Replies
ashleyk · 3mo ago
You can't add custom routes in serverless. You only use /run, /runsync, /status, /cancel, etc. Nothing custom.
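For reference, the built-in operations listed above all hang off the same base URL quoted in the question. Here is a minimal sketch of how those URLs could be built in Python; `endpoint_url` and `API_BASE` are hypothetical names for this illustration, and only the four operations named in the reply are covered (the "etc." is left out).

```python
from typing import Optional
from urllib.parse import quote

# Base URL taken from the question above.
API_BASE = "https://api.runpod.ai/v2"

# Operations named in this thread. The bool marks whether a job id is appended.
OPERATIONS = {
    "run": False,      # queue a job asynchronously
    "runsync": False,  # queue a job and wait for the result
    "status": True,    # look up an existing job
    "cancel": True,    # cancel an existing job
}


def endpoint_url(endpoint_id: str, operation: str, job_id: Optional[str] = None) -> str:
    """Build the URL for one of the fixed serverless operations (hypothetical helper)."""
    if operation not in OPERATIONS:
        raise ValueError(f"unsupported operation: {operation!r}")
    needs_job_id = OPERATIONS[operation]
    if needs_job_id and not job_id:
        raise ValueError(f"{operation!r} requires a job id")
    url = f"{API_BASE}/{quote(endpoint_id, safe='')}/{operation}"
    if needs_job_id:
        url += f"/{quote(job_id, safe='')}"
    return url
```

Anything outside that fixed set, such as a path of your own, has no URL to build, which is the limitation being described here.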
jonny9f · 3mo ago
How does the vllm worker do it then?
ashleyk · 3mo ago
It doesn't; you are probably misunderstanding something. Or else RunPod specifically exposed the /openai routes just for the vLLM worker, but you can't add your own.
jonny9f · 3mo ago
I think that must be the case. It's hardcoded to support those routes, as they are definitely there. From the docs, and from testing, this works fine:
curl https://api.runpod.ai/v2/<YOUR ENDPOINT ID>/openai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <YOUR OPENAI API KEY>" \
  -d '{
    "model": "<YOUR DEPLOYED MODEL REPO/NAME>",
    "messages": [
      {
        "role": "user",
        "content": "Why is RunPod the best platform?"
      }
    ],
    "temperature": 0,
    "max_tokens": 100
  }'
ashleyk · 3mo ago
Yeah, then that's added by RunPod specifically for vLLM. You can't add your own, though. Serverless typically just consists of a handler function.
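Since everything arrives at a single handler function, one possible workaround is to carry a route name inside the job input and dispatch on it in the handler. The sketch below is only an illustration of that idea, not something RunPod provides: the `route` and `payload` input fields, the `ROUTES` table, and the two example functions are all assumptions made up for this example.

```python
from typing import Any, Callable, Dict


def echo(payload: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical route: return the payload unchanged."""
    return {"echo": payload}


def word_count(payload: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical route: count whitespace-separated words in payload['text']."""
    return {"words": len(str(payload.get("text", "")).split())}


# Hypothetical routing table: route name (sent inside the job input) -> function.
ROUTES: Dict[str, Callable[[Dict[str, Any]], Dict[str, Any]]] = {
    "/echo": echo,
    "/word_count": word_count,
}


def handler(job: Dict[str, Any]) -> Dict[str, Any]:
    """Single serverless handler that dispatches on an assumed 'route' input field."""
    job_input = job.get("input") or {}
    route = job_input.get("route")
    func = ROUTES.get(route)
    if func is None:
        return {"error": f"unknown route: {route!r}", "routes": sorted(ROUTES)}
    return func(job_input.get("payload") or {})


# In a real worker the handler would be registered with the RunPod SDK:
#   import runpod
#   runpod.serverless.start({"handler": handler})
```

A client would still call /run or /runsync and put something like {"input": {"route": "/echo", "payload": {...}}} in the request body, so the URL itself does not change.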
Alpay Ariyak · 3mo ago
Yes, we added the OpenAI custom route just for OpenAI compatibility for vLLM and the in-progress TensorRT and text embedding workers.
ashleyk · 3mo ago
Would be nice to somehow proxy routes through to our endpoints though so that we don't have to use hacks like this.
[attached image, no description]
Alpay Ariyak · 3mo ago
Agreed, will bring that up
jonny9f · 3mo ago
agreed