RunPod · 3w ago
Ale

Serverless vLLM changing engine arguments

Hi, I got the vLLM Serverless worker up and running, but I want to change one engine argument that is not overridable through environment variables, specifically --limit-mm-per-prompt. How could I do that with your custom image runpod/worker-v1-vllm:v2.3.0stable-cuda12.1.0 that endpoints use? Thanks
8 Replies
Jason · 3w ago
Use the Configure button to see if it's there.
Jason · 3w ago
GitHub
GitHub - runpod-workers/worker-vllm: The RunPod worker template for...
The RunPod worker template for serving our large language model endpoints. Powered by vLLM. - runpod-workers/worker-vllm
Ale (OP) · 3w ago
Thanks, I'll look into it and report back.
Ale (OP) · 2w ago
@Jason https://github.com/runpod-workers/worker-vllm/issues/155 It looks like support for it was implemented, and then someone accidentally removed it, lol. Would it be possible to add it back again?
GitHub
setting limit_mm_per_prompt · Issue #155 · runpod-workers/worker-...
Is it possible to set limit_mm_per_prompt? There was a PR about it, but this feature is not in main.
Ale (OP) · 2w ago
GitHub
fix: added back limit_mm_per_prompt to engine args by aleksandar-ba...
Added back the engine argument limit_mm_per_prompt that was accidentally removed as per #155
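For readers unfamiliar with what "adding an engine argument back" involves, here is a minimal sketch of the general idea, not the code from the PR above. It assumes the worker reads settings from environment variables and forwards them to vLLM as keyword arguments, and that limit_mm_per_prompt is passed as a mapping from modality name to a maximum count. The variable name LIMIT_MM_PER_PROMPT, the "image=2,video=1" string format, and both helper functions are hypothetical and chosen only for illustration.

```python
import os


def parse_limit_mm_per_prompt(raw: str) -> dict[str, int]:
    """Parse a string such as 'image=2,video=1' into {'image': 2, 'video': 1}.

    The input format is an assumption made for this sketch.
    """
    limits: dict[str, int] = {}
    for item in raw.split(","):
        item = item.strip()
        if not item:
            continue  # tolerate trailing commas and empty segments
        modality, sep, count = item.partition("=")
        if not sep:
            raise ValueError(f"expected modality=count, got {item!r}")
        limits[modality.strip()] = int(count)
    return limits


def engine_kwargs_from_env(env=os.environ) -> dict:
    """Collect engine keyword arguments from environment variables.

    LIMIT_MM_PER_PROMPT is a hypothetical variable name, not a documented one.
    """
    kwargs = {}
    raw = env.get("LIMIT_MM_PER_PROMPT", "")
    if raw:
        kwargs["limit_mm_per_prompt"] = parse_limit_mm_per_prompt(raw)
    return kwargs
```

With this sketch, calling engine_kwargs_from_env({"LIMIT_MM_PER_PROMPT": "image=4"}) yields a dictionary that could be merged into the other engine arguments. Leaving the variable unset adds nothing, so the engine's own default would apply.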
Jason · 2w ago
Hi, thanks for the PR. I currently don't have permission to merge PRs, but sure, I'll try to notify staff. @wiki
Ale (OP) · 2w ago
@Jason Thanks!
Jason · 2w ago
No problem
