Does VLLM support quantized models? - Runpod
Runpod • 16mo ago • 1 reply
StandingFuture
Does VLLM support quantized models?
Trying to figure out how to deploy this, but I didn't see an option for selecting which quantization I wanted to run.
https://huggingface.co/bartowski/DarkIdol-Llama-3.1-8B-Instruct-1.2-Uncensored-GGUF
Thanks!
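For context, vLLM does expose a quantization option when the engine is created; the sketch below shows roughly how it is selected from Python. The model repo and settings here are illustrative assumptions, not something given in this thread, and GGUF checkpoints like the one linked above are handled by a separate, more experimental code path in vLLM.

```python
# Minimal sketch (not from this thread): selecting a quantization method
# when constructing a vLLM engine directly in Python. The model repo below
# is an example AWQ checkpoint chosen for illustration, not the GGUF repo
# linked in the question.
from vllm import LLM, SamplingParams

llm = LLM(
    model="TheBloke/Llama-2-7B-Chat-AWQ",  # assumed example AWQ checkpoint
    quantization="awq",                    # e.g. "awq" or "gptq"
)
# The OpenAI-compatible server accepts the same option via --quantization.

params = SamplingParams(temperature=0.7, max_tokens=64)
outputs = llm.generate(["Does vLLM support quantized models?"], params)
print(outputs[0].outputs[0].text)
```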
Similar Threads
vLLM Endpoint - Gemma3 27b quantized • Runpod / ⚡|serverless • 9mo ago
Settings to reduce delay time using sglang for 4bit quantized models? • Runpod / ⚡|serverless • 14mo ago
Deploying bitsandbytes-quantized Models on RunPod Serverless using Custom Docker Image • Runpod / ⚡|serverless • 16mo ago
How does the vLLM serverless worker to support OpenAI API contract? • Runpod / ⚡|serverless • 2y ago