Text-generation-inference on serverless endpoints
Hi, I don't have much experience with either LLMs or Python, so I always just use the image 'ghcr.io/huggingface/text-generation-inference:latest' and run my models on Pods. Now I want to try serverless endpoints, but I don't know how to launch text-generation-inference there. Can someone give me some tips, or point me to docs that could help?
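
For context, this is roughly how I start it on a Pod today (the flags and model name are just placeholders based on the TGI README, not my exact setup):

```
# Example TGI launch on a Pod; swap in your own model and volume path
docker run --gpus all --shm-size 1g -p 8080:80 \
  -v $PWD/data:/data \
  ghcr.io/huggingface/text-generation-inference:latest \
  --model-id HuggingFaceH4/zephyr-7b-beta
```

Once it is running, I just send requests to port 8080. I'm not sure how (or whether) this same container and launch command translate to a serverless endpoint, which is what I'm asking about.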



