RunPod · 2mo ago

API not properly spinning up?

Hi, I'm new to RunPod. I deployed a fine-tuned LLM using the vLLM template on RunPod, and I'm having problems with the API: when I fire requests using the OpenAI chat completions API, the endpoint gets stuck processing the request for a couple of minutes and then returns a 500. But if I first hit the API through the RunPod endpoint console and then retry the same request that previously 500'd, it works as expected in about 8 seconds. I am doing this without using a Handler; am I doing something wrong?
1 Reply
justin · 2mo ago
Do you have the client-side request you're sending that you could share? @Alpay Ariyak can probably comment on the situation better.
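For reference, a client-side request of the kind asked about above might look like the sketch below. This is only an illustration, not the original poster's actual code: the endpoint ID, API key, and model name are placeholders, and the `/openai/v1/chat/completions` route is assumed to follow the OpenAI-compatible URL pattern exposed by RunPod's serverless vLLM worker.

```python
import json

# Placeholders -- substitute your own serverless endpoint ID and API key.
ENDPOINT_ID = "your-endpoint-id"
API_KEY = "your-runpod-api-key"

# Assumed OpenAI-compatible route for a RunPod serverless vLLM endpoint.
url = f"https://api.runpod.ai/v2/{ENDPOINT_ID}/openai/v1/chat/completions"

# Standard OpenAI chat completions payload; the model name is whatever
# the vLLM worker was configured to serve (placeholder here).
payload = {
    "model": "your-org/your-finetuned-model",
    "messages": [{"role": "user", "content": "Hello, who are you?"}],
    "max_tokens": 128,
}

headers = {
    "Authorization": f"Bearer {API_KEY}",
    "Content-Type": "application/json",
}

# Actually sending it would be something like:
#   import requests
#   resp = requests.post(url, headers=headers, json=payload, timeout=300)
# A generous timeout matters here, since a cold worker may take minutes to start.
print(json.dumps(payload, indent=2))
```

The couple-of-minutes hang followed by fast responses is consistent with a cold start: the first request has to wait for a worker to boot and load the model, while later requests hit an already-warm worker.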