Runpod14mo ago
artbred

GGUF vllm

It seems that the newest version of vllm supports GGUF models. Has anyone figured out how to make this work in Runpod serverless? It seems like you need to set some custom ENV vars. Or maybe someone knows a way to convert GGUF back to safetensors?
9 Replies
Unknown User
Unknown User14mo ago
Message Not Public
Misterion
Misterion11mo ago
hi, is there any solution to this?
Unknown User
Unknown User11mo ago
Message Not Public
Misterion
Misterion11mo ago
The problem is that you have to specify the GGUF file name, and I believe there is no such env var for the vllm worker. We could download the model and pack it in the container, but I was just looking for an out-of-the-box solution.
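For reference, a rough sketch of what the "pack it in the container" workaround could look like on the launch side. This is not how the Runpod vllm worker is configured; it assumes a custom image where the `.gguf` file is already baked in. The env var names `GGUF_MODEL_PATH` and `GGUF_TOKENIZER` and the helper `build_vllm_gguf_command` are made up for illustration. The only vllm-specific parts are `vllm serve <path-to-file.gguf>` and the `--tokenizer` flag, which is typically pointed at the original (non-GGUF) model repo.

```python
import os
import shlex


def build_vllm_gguf_command(env=None):
    """Build a `vllm serve` argument list for a GGUF file packed into the image.

    GGUF_MODEL_PATH and GGUF_TOKENIZER are hypothetical env var names used
    only for this sketch; they are not defined by vllm or by the Runpod worker.
    """
    env = os.environ if env is None else env

    gguf_path = env.get("GGUF_MODEL_PATH")
    if not gguf_path:
        raise ValueError("GGUF_MODEL_PATH must point to a .gguf file in the container")
    if not gguf_path.endswith(".gguf"):
        raise ValueError(f"expected a .gguf file, got: {gguf_path}")

    # vllm takes the path to the single GGUF file as the model argument.
    cmd = ["vllm", "serve", gguf_path]

    # Optional: use the tokenizer from the base model repo instead of the GGUF file.
    tokenizer = env.get("GGUF_TOKENIZER")
    if tokenizer:
        cmd += ["--tokenizer", tokenizer]

    return cmd


if __name__ == "__main__":
    # Placeholder values; replace with the real file path and base model repo.
    example_env = {
        "GGUF_MODEL_PATH": "/models/model.Q4_K_M.gguf",
        "GGUF_TOKENIZER": "some-org/base-model",
    }
    print(shlex.join(build_vllm_gguf_command(example_env)))
```

The returned list can be handed to `subprocess.run(...)` from the container entrypoint. The file name still has to be known when the image is built, which is the limitation described above.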
Unknown User
Unknown User11mo ago
Message Not Public
Misterion
Misterion11mo ago
Will do that as a workaround, but it would be nice to have this supported natively.
Unknown User
Unknown User11mo ago
Message Not Public
wiki
wiki11mo ago
I will add native support for this.
Unknown User
Unknown User11mo ago
Message Not Public
