Runpod · 9mo ago
Inez

HTTP 502 on VLLM pod

I'm getting a 502 when trying to connect to the deployed service. I'm using the vllm-latest image with these arguments:

--host 0.0.0.0 --port 8000 --model mistralai/Mistral-Small-24B-Instruct-2501 --dtype auto --enforce-eager --gpu-memory-utilization 0.95 --tensor-parallel-size 2

The ollama service works without any issues. Any ideas?
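For reference, here is the same set of flags assembled into a launch command in Python, as they would be passed to vLLM's OpenAI-compatible server entrypoint. This is a sketch: the `vllm.entrypoints.openai.api_server` module path reflects the standard vllm image, and is an assumption not stated in the original post.

```python
# Sketch: the launch command corresponding to the flags in the question.
# The entrypoint module is an assumption based on the standard vllm image.
cmd = [
    "python", "-m", "vllm.entrypoints.openai.api_server",
    "--host", "0.0.0.0",
    "--port", "8000",
    "--model", "mistralai/Mistral-Small-24B-Instruct-2501",
    "--dtype", "auto",
    "--enforce-eager",
    "--gpu-memory-utilization", "0.95",
    "--tensor-parallel-size", "2",
]
print(" ".join(cmd))
```

Note that `--tensor-parallel-size 2` requires two visible GPUs; if the pod exposes only one, the server will fail to start and the proxy will return a 502 even though the container itself is running.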
1 Reply
