Hi, I don't have much experience with LLMs or with Python, so I always just use the image `ghcr.io/huggingface/text-generation-inference:latest` and run my models on Pods. Now I want to try serverless endpoints, but I don't know how to launch text-generation-inference on them. Can someone give me some tips, or point me to docs that could help?