GGUF Text Model Deployment on Serverless with Streaming Response - Runpod
Runpod • 4mo ago • 6 replies
Aksm
GGUF Text Model Deployment on Serverless with Streaming Response
I am trying to deploy the GGUF text model (https://huggingface.co/bartowski/cognitivecomputations_Dolphin-Mistral-24B-Venice-Edition-GGUF). I tried using llama.cpp, but it's not working as expected: it runs very slowly and starts a new worker with each new request. How should I deploy it effectively? Thanks in advance.
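The replies are not shown here, but the symptom described (a new worker per request) typically points at the model being loaded inside the request path rather than once per worker. A minimal sketch of the generator-handler pattern used for streaming on Runpod serverless, with a hypothetical `fake_token_stream` standing in for the real llama.cpp inference call (the `prompt` input field and the stand-in are illustrative assumptions, not from the thread):

```python
# Sketch: load the model once at module import so warm workers reuse it,
# and write the handler as a generator so tokens stream back per request.
# `fake_token_stream` is a placeholder for a real llama.cpp call.

def fake_token_stream(prompt):
    # Stand-in for e.g. llama-cpp-python's Llama(...)(prompt, stream=True).
    for tok in ["Hello", " ", "world"]:
        yield tok

def handler(job):
    # Runpod passes the request payload under job["input"].
    prompt = job["input"]["prompt"]
    for token in fake_token_stream(prompt):
        # Yielding (instead of returning) makes this a streaming handler.
        yield {"token": token}

# In a real worker this would be registered with the Runpod SDK, e.g.:
# import runpod
# runpod.serverless.start({"handler": handler})
```

The key design point is that anything expensive (loading the GGUF weights) lives at module scope, while the handler itself only runs inference, so consecutive requests to a warm worker skip the cold start.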
Similar Threads
- GGUF in serverless vLLM • Runpod / ⚡|serverless • 2y ago
- Strange model response in Serverless • Runpod / ⚡|serverless • 11mo ago
- Serverless Streaming Documentation • Runpod / ⚡|serverless • 2y ago
- Serverless Endpoint Streaming • Runpod / ⚡|serverless • 3y ago