Runpod
16mo ago
15 replies
artbred

GGUF vllm

It seems that the newest version of vLLM supports GGUF models. Has anyone figured out how to make this work in Runpod serverless? It seems like some custom ENV vars need to be set. Or does anyone know a way to convert GGUF back to safetensors?