Worker configuration for Serverless vLLM endpoints: 1-hour lecture with 50 students
Hey there, I need to show 50 students how to do RAG with open-source LLMs (e.g., Llama 3). During the 1-hour session they could all be sending requests to the endpoint at roughly the same time, so what worker configuration would you suggest? I want to make sure they have a smooth experience. Thanks!
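For context, here's roughly what each student's request would look like, a minimal sketch assuming the vLLM worker's OpenAI-compatible route is enabled; the endpoint ID, API key, model name, and retrieved context below are placeholders, and the retrieval step itself (vector store lookup) is out of scope here:

```python
# Sketch of a single student's RAG call against a RunPod Serverless
# vLLM endpoint via its OpenAI-compatible API. All IDs/keys are
# placeholders; the retrieval step is stubbed out.
from openai import OpenAI

client = OpenAI(
    # Hypothetical endpoint ID; the OpenAI-compatible route is assumed enabled.
    base_url="https://api.runpod.ai/v2/<ENDPOINT_ID>/openai/v1",
    api_key="<RUNPOD_API_KEY>",
)

question = "What does the course syllabus say about grading?"
# Placeholder: in the real exercise this comes from the students' vector store.
retrieved_context = "..."

response = client.chat.completions.create(
    # Assumed model; any Llama 3 instruct variant served by the worker would do.
    model="meta-llama/Meta-Llama-3-8B-Instruct",
    messages=[
        {"role": "system",
         "content": f"Answer using only this context:\n{retrieved_context}"},
        {"role": "user", "content": question},
    ],
    max_tokens=256,
)
print(response.choices[0].message.content)
```

So the load to plan for is up to 50 of these chat-completion calls arriving in a short window, each with a prompt inflated by the retrieved context.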

