Trying to load a huge model into serverless - Runpod
Runpod • 2y ago • 15 replies
blabbercrab
Trying to load a huge model into serverless
https://huggingface.co/cognitivecomputations/dolphin-2.9.2-qwen2-72b
Anyone have any idea how to do this in vLLM? I've deployed using two 80 GB GPUs and have had no luck.
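For context on why this setup is tight: a rough weight-memory estimate (weights only, ignoring KV cache and activations) shows a 72B model in fp16 already needs about 144 GB, so it cannot fit on a single 80 GB GPU and the weights must be sharded across both GPUs via tensor parallelism. The sketch below is an assumption about the approach, not a confirmed fix; the `tensor_parallel_size` and `gpu_memory_utilization` arguments are standard vLLM engine options, with illustrative values.

```python
def weight_memory_gb(n_params_billion: float, bytes_per_param: float) -> float:
    """Lower bound on GPU memory for model weights alone (no KV cache)."""
    return n_params_billion * bytes_per_param  # 1e9 params * bytes / 1e9 bytes-per-GB

# Qwen2-72B in fp16/bf16 (2 bytes per parameter):
fp16_gb = weight_memory_gb(72, 2.0)
print(f"fp16 weights: ~{fp16_gb:.0f} GB")  # ~144 GB > 80 GB, so one GPU is not enough

# Hence the engine must shard across both GPUs, e.g. (illustrative values):
# from vllm import LLM
# llm = LLM(
#     model="cognitivecomputations/dolphin-2.9.2-qwen2-72b",
#     tensor_parallel_size=2,        # shard weights across the two 80 GB GPUs
#     gpu_memory_utilization=0.95,   # leave little headroom; 160 GB total is tight
#     max_model_len=4096,            # cap context to keep KV cache small
# )
```

Even with both GPUs, only ~16 GB remains for KV cache after the weights, so a reduced `max_model_len` (or a quantized variant of the model) may be necessary for the worker to start.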