I'm using the official Runpod vLLM image for a serverless endpoint with all default settings (apart from the model), and my workers are all stuck initializing. I've had a job in the queue (just hitting /v1/models) for 20+ minutes now, and all five workers (3 regular + 2 extra) are just sitting at "initializing". I don't see anything in the worker logs. What am I doing wrong, and more to the point, how do I figure out what I'm doing wrong? Just seeing logs would be nice. (Endpoint id: 99ky9cmdcjj2hm)