serverless endpoint stuck on Throttled or Initializing
Hello! Hope all is well. I'm trying to follow the load-balancing serverless endpoint guide with vLLM. With H100s selected, the workers are either throttled or initializing. Can you please help? Here is the endpoint ID: vllm-kcj166ha6f00m1
one of the workers is running
it is healthy
but requests are hanging
tried PING and /v1/chat/completions
Looking into this, but if it's healthy it should just be receiving traffic. The logs are different for me, so I'll need a minute.
thanks so much!
By different I meant worse. It looks like it took us some time to download your custom image, but you should see things start coming up now?

(Image pull -> pod start)


worker has been up but still unreachable externally
should I try deleting that worker specifically?
I would give it a try. If it's passing health checks, then there's no reason I can think of that we wouldn't be sending it traffic.
@Dj just tried deleting and starting a new worker, also hanging
curling without the token returns 401
curling with the token hangs

also shows failed requests
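
For anyone trying to reproduce the symptom described in this thread (401 without a token, a hang with one), a small probe with an explicit timeout can make the two cases easy to tell apart. This is only a minimal sketch under assumptions, not an official client: the `ENDPOINT_URL` and `API_TOKEN` environment variable names and the `build_request` and `probe` helpers are hypothetical, and the `Authorization: Bearer` scheme is assumed rather than confirmed by the thread.

```python
import os
import socket
import urllib.error
import urllib.request


def build_request(url, token=None):
    """Build a GET request, adding a Bearer token only when one is given.

    The Bearer scheme is an assumption; adjust it to whatever the endpoint expects.
    """
    headers = {}
    if token:
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(url, headers=headers, method="GET")


def probe(url, token=None, timeout=10.0):
    """Return a short label describing how the endpoint responded."""
    try:
        with urllib.request.urlopen(build_request(url, token), timeout=timeout) as resp:
            return f"http {resp.status}"
    except urllib.error.HTTPError as exc:
        # Non-2xx responses such as 401 arrive here; the server did answer.
        return f"http {exc.code}"
    except (TimeoutError, socket.timeout):
        # The connection opened but no response arrived in time (a "hang").
        return "timeout"
    except urllib.error.URLError as exc:
        # Connect-phase timeouts are wrapped in URLError.
        if isinstance(exc.reason, (TimeoutError, socket.timeout)):
            return "timeout"
        return f"error: {exc.reason}"


if __name__ == "__main__":
    # Hypothetical environment variable names, used only for this sketch.
    url = os.environ.get("ENDPOINT_URL")
    token = os.environ.get("API_TOKEN")
    if not url:
        print("Set ENDPOINT_URL (and optionally API_TOKEN) before running.")
    else:
        print("without token:", probe(url))
        print("with token:   ", probe(url, token))
```

If the run without a token reports `http 401` and the run with a token reports `timeout`, that matches what was seen here: authentication is being checked, but the authenticated request never gets a response back.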