vLLM worker OpenAI stream
Hi everyone,
I followed the RunPod documentation to write a simple OpenAI client against a serverless endpoint running the LLaVA model (llava-hf/llava-1.5-7b-hf). However, I encountered the following error:
Has anyone experienced this issue? Any suggestions for resolving it?
Code:
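For reference, a minimal streaming client following the RunPod vLLM worker's OpenAI-compatibility docs looks roughly like this; the endpoint ID, API key environment variable, and prompt below are placeholders, not my exact values:

```python
import os

from openai import OpenAI

# The vLLM worker exposes an OpenAI-compatible API under the endpoint's
# /openai/v1 route. <ENDPOINT_ID> is a placeholder for the serverless
# endpoint ID; RUNPOD_API_KEY is assumed to be set in the environment.
client = OpenAI(
    api_key=os.environ["RUNPOD_API_KEY"],
    base_url="https://api.runpod.ai/v2/<ENDPOINT_ID>/openai/v1",
)

# Request a streamed chat completion from the LLaVA model on the endpoint.
stream = client.chat.completions.create(
    model="llava-hf/llava-1.5-7b-hf",
    messages=[{"role": "user", "content": "Describe this model's abilities."}],
    max_tokens=100,
    stream=True,
)

# Each chunk carries an incremental delta; content may be None on some chunks.
for chunk in stream:
    print(chunk.choices[0].delta.content or "", end="")
print()
```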
