Best way to deploy a new LLM serverless without building large Docker images?
In the serverless infrastructure I have come across at Runpod, there is not much support for quickly copying model weights over from a local data centre. Can I get some suggestions on how I should plan my deployment? Building a large Docker image, pushing it to a registry, and then having the server download it at cold start takes a massive amount of time. Help will be appreciated.
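To make it concrete, the pattern I'm considering instead is keeping the image small and loading the weights from an attached network volume inside the worker, caching them across warm invocations. This is only a sketch under assumptions: the mount path `/runpod-volume/models/my-llm` and the handler shape are illustrative, and the actual model load is a placeholder, not a real inference call.

```python
import os
import functools

# Assumption for illustration: weights live on a network volume
# mounted into the container, so the Docker image stays small.
MODEL_DIR = os.environ.get("MODEL_DIR", "/runpod-volume/models/my-llm")

@functools.lru_cache(maxsize=1)
def get_model(model_dir: str = MODEL_DIR):
    # Runs once per container; warm invocations reuse the cached result.
    # A real worker would load weights here, e.g. with a framework's
    # from_pretrained(model_dir) equivalent (placeholder below).
    return {"weights_path": model_dir, "loaded": True}

def handler(event):
    # Typical serverless entry point: read input, run the cached model.
    model = get_model()
    prompt = event.get("input", {}).get("prompt", "")
    # Placeholder "inference" step that just echoes the prompt.
    return {"model": model["weights_path"], "echo": prompt}
```

Is something like this viable on Runpod serverless, or is there a better-supported way to get weights to the worker quickly?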