Not getting hundreds of req/sec when serving Llama 3 70B with the default vLLM serverless template