generation-config vllm
Hey! Need help with vLLM Quick Deploy setup.
I'm getting this warning and can't override sampling parameters in API requests:
WARNING 08-18 15:40:11 [config.py:1528] Default sampling parameters have been overridden by the model's Hugging Face generation config recommended from the model creator. If this is not intended, please relaunch vLLM instance with --generation-config vllm.
How do I add the --generation-config vllm flag when using Quick Deploy? I want to be able to set custom top_k, top_p, and temperature in my requests instead of being stuck with the model defaults.
Thanks!

1 Reply
You can change the startup params through the environment variables in the Edit popup. I can grab you a screenshot if you'd like! @slxnxl
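
For reference, once the instance has been relaunched with --generation-config vllm, per-request sampling settings can be sent in the request body. Below is a rough sketch against vLLM's OpenAI-compatible /v1/completions route, which accepts top_k as an extra sampling parameter alongside temperature and top_p. The base URL, model name, API key, and the helper names build_completion_request and send_completion are placeholders made up for illustration, not anything specific to Quick Deploy.

```python
import json
import urllib.request


def build_completion_request(prompt, model, temperature=0.7, top_p=0.9,
                             top_k=40, max_tokens=64):
    """Build a JSON body for POST /v1/completions with explicit sampling values.

    Hypothetical helper: the default values here are arbitrary examples.
    """
    return {
        "model": model,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
        "top_p": top_p,
        # top_k is not part of the OpenAI spec; vLLM reads it as an extra parameter.
        "top_k": top_k,
    }


def send_completion(base_url, payload, api_key="EMPTY"):
    """POST the payload to the server and return the decoded JSON response.

    base_url and api_key are assumptions; use whatever your endpoint exposes.
    """
    request = urllib.request.Request(
        f"{base_url}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Example usage (placeholder URL and model name, not executed here):
# payload = build_completion_request("Hello", "my-model",
#                                    temperature=0.2, top_p=0.8, top_k=20)
# result = send_completion("http://localhost:8000", payload)
```

If you use the official openai Python client instead, top_k is not a named argument there, so it would go through the client's extra_body option.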