Is it possible to cache a subset of a Hugging Face repository? The documentation for cached models seems to assume that you'll always want the entire branch of a repository. That's not helpful for cases such as https://huggingface.co/unsloth/Qwen3.5-122B-A10B-GGUF/tree/main, which contains multiple quantised versions of the same model: you would typically want a single file from there, since the full branch holds nearly 2TB of data.
Also, there doesn't seem to be any feedback during the caching process. It would be helpful to show how much data it's trying to download; at the moment there's no way to tell whether caching is still in progress or how long it's going to take (compare this to the OCI step, which at least shows the `docker pull` progress in the logs).
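For reference, this is roughly the behaviour I'm after. Outside of the cached models feature, the huggingface_hub Python library can already fetch a subset of a repository through the allow_patterns argument of snapshot_download. Below is a minimal sketch of that idea; the helper names, the example file names, and the "*Q4_K_M*" pattern are my own assumptions for illustration, and I don't know whether the cached models feature exposes anything equivalent.

```python
from fnmatch import fnmatchcase


def select_files(filenames, patterns):
    """Return the filenames that match at least one glob pattern.

    This only illustrates the kind of glob filtering that allow_patterns
    performs; it is not the library's own implementation.
    """
    return [
        name
        for name in filenames
        if any(fnmatchcase(name, pattern) for pattern in patterns)
    ]


def download_subset(repo_id, patterns):
    """Download only the files of a repo that match the given patterns.

    Hypothetical wrapper: huggingface_hub is imported lazily so that
    select_files can be tried without the library installed.
    """
    from huggingface_hub import snapshot_download

    # allow_patterns restricts the download to matching files
    # instead of pulling the whole branch.
    return snapshot_download(repo_id=repo_id, allow_patterns=patterns)


# Example call (needs network access and enough disk space; the pattern
# is an assumption about how the files in the repo are named):
# path = download_subset("unsloth/Qwen3.5-122B-A10B-GGUF", ["*Q4_K_M*"])
```

When run from a terminal, snapshot_download also shows progress bars by default, which is the sort of feedback that's missing from the caching step.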