openai/v1 and open-webui

Hey Team,

Looking at your docs, specifically the question "How to respond to the requests at https://api.runpod.ai/v2/<YOUR ENDPOINT ID>/openai/v1", I've run into a weird gotcha. When I do a GET:
curl -X GET https://api.runpod.ai/v2/<endpoint here>/openai/v1 \
     -H 'Content-Type: application/json' \
     -H 'Authorization: Bearer <token>'
it gives me:

{"error":"Error processing the request"}
Most applications (like open-webui) that use the OpenAI spec expect this to be a GET (see the OpenAI docs -- https://platform.openai.com/docs/api-reference/models), and your docs imply that it is -- https://github.com/runpod-workers/worker-vllm/tree/main#modifying-your-openai-codebase-to-use-your-deployed-vllm-worker. Am I missing something? How is this supposed to work?
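For reference, here's a minimal sketch of how I understand the URLs are supposed to fit together, per the OpenAI API reference: the base URL ends at /openai/v1, and the model-listing route is GET on /models under that base. The endpoint ID and token below are hypothetical placeholders, not real values.

```python
# Hypothetical placeholders -- substitute your own endpoint ID and API key.
ENDPOINT_ID = "abc123"
API_KEY = "rp_example_key"

# Base URL as described in the worker-vllm README.
base_url = f"https://api.runpod.ai/v2/{ENDPOINT_ID}/openai/v1"

# Per the OpenAI spec, models are listed at GET {base}/models,
# not at GET {base} itself.
models_url = f"{base_url}/models"
headers = {"Authorization": f"Bearer {API_KEY}"}

print(models_url)
```

So (if I have this right) open-webui would issue its GET against {base}/models, which is what I'd expect the worker to answer.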

Thanks,
Paul