Runpod · 3d ago
crown

Pre-cached model selection doesn't appear to exist when creating a new serverless endpoint

The docs (https://docs.runpod.io/serverless/endpoints/manage-endpoints) say:

"""
Model (optional): Select a model from Hugging Face to optimize worker startup times. When you specify a model, Runpod attempts to place your workers on host machines that already have the model cached locally, resulting in faster cold starts and cost savings (since you won’t be charged while the model is downloading). You can either select from the dropdown list of pre-cached models or enter a custom Hugging Face model URL.
"""

However, I don't see a dropdown of "pre-cached" models for that input when, for example, selecting vLLM on the new-endpoint page (https://www.console.runpod.io/serverless/new-endpoint). Am I missing something here? 🤔
2 Replies
Tirth · 3d ago
Thank you for bringing this to our attention. This feature is currently in beta, so we are still making changes to it. At the moment, we haven't enabled pre-cached models for Model Store. We'll update the docs to avoid confusion. We will soon be pre-caching smaller, most-used models, but I don't have an ETA for that yet. I will keep you updated.
crown (OP) · 3d ago
Understood, thanks for the quick reply 👍
