GGUF Text Model Deployment on Serverless with Streaming Response - Runpod
Runpod • 2mo ago • 6 replies
Aksm
I am trying to deploy a GGUF text model (https://huggingface.co/bartowski/cognitivecomputations_Dolphin-Mistral-24B-Venice-Edition-GGUF). I tried using llama.cpp, but it's not working as expected: inference is very slow, and a new worker is started for each new request. How should I deploy this effectively? Thanks in advance.
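For reference, the usual pattern for streaming from a serverless worker is a generator handler: Runpod's Python SDK treats each yielded value as one stream chunk, and loading the model at module scope lets warm workers reuse it instead of reloading the GGUF file per request. Below is a minimal sketch using llama-cpp-python; the model path, context size, and generation parameters are illustrative assumptions, not values from this thread.

```python
# handler.py — sketch of a Runpod serverless worker streaming tokens
# from a GGUF model via llama-cpp-python. Paths/params are assumptions.
import runpod
from llama_cpp import Llama

# Load once at import time so a warm worker serves many requests
# without reloading the model weights each time.
llm = Llama(
    model_path="/models/model.gguf",  # assumed path baked into the image
    n_gpu_layers=-1,                  # offload all layers to the GPU
    n_ctx=8192,
)

def handler(job):
    """Generator handler: each yielded string is one stream chunk."""
    prompt = job["input"]["prompt"]
    for chunk in llm.create_completion(prompt, max_tokens=512, stream=True):
        yield chunk["choices"][0]["text"]

runpod.serverless.start({
    "handler": handler,
    "return_aggregate_stream": True,  # non-stream calls get the full text
})
```

Two things worth checking given the symptoms described: very slow inference with llama.cpp usually means the layers were not actually offloaded to the GPU (verify `n_gpu_layers` and that the image has CUDA-enabled llama-cpp-python), and "new worker per request" is governed by the endpoint's scaling settings (active/max workers), not by the handler code itself.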