Runpod · 12mo ago
9 replies
Bj9000

Serverless quants

Hi, how do you specify a particular GGUF quant file from a Hugging Face repo when configuring a vLLM serverless endpoint? The configuration only seems to let you specify the model at the repo level.