Serverless vLLM concurrency issue

Hello everyone, I deployed a serverless vLLM endpoint (Gemma 12B model) through the RunPod UI, with 2 workers, each on an A100 with 80 GB of VRAM.

If I send two requests at the same time, they both show as IN_PROGRESS, but I receive the output stream of only one at first; the second always waits for the first to finish before I start receiving its token stream. Why is it behaving like this?
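
For context, here is a minimal sketch of roughly how I'm sending the two concurrent requests. The endpoint ID, API key, and exact input payload shape are placeholders, and I'm assuming the standard RunPod serverless `/run` and `/stream/{job_id}` REST endpoints:

```python
import threading
import time

import requests

# Placeholders: substitute your own endpoint ID and API key.
ENDPOINT_ID = "your_endpoint_id"
API_KEY = "your_runpod_api_key"
BASE = f"https://api.runpod.ai/v2/{ENDPOINT_ID}"
HEADERS = {"Authorization": f"Bearer {API_KEY}"}


def stream_request(label: str, prompt: str) -> None:
    # Submit the job asynchronously; the input shape here ("prompt"/"stream")
    # is an assumption about the vLLM worker's expected payload.
    job = requests.post(
        f"{BASE}/run",
        headers=HEADERS,
        json={"input": {"prompt": prompt, "stream": True}},
        timeout=30,
    ).json()
    job_id = job["id"]

    # Poll the /stream endpoint, printing timestamped chunks as they arrive,
    # until the job reaches a terminal status.
    while True:
        chunk = requests.get(
            f"{BASE}/stream/{job_id}", headers=HEADERS, timeout=30
        ).json()
        for part in chunk.get("stream", []):
            print(f"[{label}] {time.strftime('%H:%M:%S')}: {part['output']}")
        if chunk.get("status") in ("COMPLETED", "FAILED", "CANCELLED"):
            break


# Fire both requests at (nearly) the same time from separate threads.
t1 = threading.Thread(target=stream_request, args=("req-1", "Explain transformers."))
t2 = threading.Thread(target=stream_request, args=("req-2", "Explain attention."))
t1.start()
t2.start()
t1.join()
t2.join()
```

With this, the timestamps show all of req-1's chunks arriving before any of req-2's, even though both jobs report IN_PROGRESS almost immediately.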