We deployed a LLaVA-v1.6-34B model on 2x A100 SXM infra as a serverless endpoint. When we send a request, we never get a response, and the request stays in the IN_QUEUE status indefinitely. Any suggestions for what we should look at to start debugging this?
We've previously deployed LLaVA-v1.5-13b successfully. Again, grateful for any suggestions.