vLLM: overriding the OpenAI served model name
Overriding the served model name on the vLLM serverless pod doesn't seem to take effect. Configuring a new endpoint through the Explore page in RunPod's interface creates a worker with the env variable
OPENAI_SERVED_MODEL_NAME_OVERRIDE, but the model name exposed on the OpenAI endpoint is still the HF repo/model name.
The logs show: engine.py: AsyncEngineArgs(model='hugging-quants/Meta-Llama-3.1-70B-Instruct-AWQ-INT4', served_model_name=None... and the endpoint returns an error with object='error' message='The model 'model_name' does not exist.' type='NotFoundError' param=None code=404.
Setting the env variable SERVED_MODEL_NAME instead shows in the logs: engine.py: Engine args: AsyncEngineArgs(model='hugging-quants/Meta-Llama-3.1-70B-Instruct-AWQ-INT4', served_model_name='model_name'..., yet the endpoint still returns the same error as above.
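For anyone reproducing this, a minimal sketch of the kind of call that hits the 404 (it assumes RunPod's usual /openai/v1 route for the OpenAI-compatible endpoint; the endpoint ID and API key are placeholders):

# Repro sketch using the official openai Python client.
# <ENDPOINT_ID> and <RUNPOD_API_KEY> are placeholders.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.runpod.ai/v2/<ENDPOINT_ID>/openai/v1",
    api_key="<RUNPOD_API_KEY>",
)

# Using the overridden name currently fails with the 404 NotFoundError above;
# only the full HF repo name is accepted.
resp = client.chat.completions.create(
    model="model_name",
    messages=[{"role": "user", "content": "Hello"}],
)
print(resp.choices[0].message.content)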
Have the same problem 😦
I will look into this and fix it.
In the engine.py file, in the _initialize_engines method, simply change line 201 to:
self.base_model_paths = [
    BaseModelPath(name=name, model_path=self.engine_args.model)
    for name in self.served_model_name.split(" ")
]
😃
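One caveat with that change (not verified against the repo): if no override is set, self.served_model_name may still be None, as in the first log above, so .split(" ") would throw; a fallback to self.engine_args.model might be needed there.

To check whether the override actually takes effect, a quick sketch that lists the models the endpoint exposes (same placeholder URL/key assumptions as the snippet earlier in the thread):

# The OpenAI-compatible server exposes GET /v1/models, so whichever name the
# worker registered (the override or the HF repo path) should show up here.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.runpod.ai/v2/<ENDPOINT_ID>/openai/v1",
    api_key="<RUNPOD_API_KEY>",
)

for m in client.models.list().data:
    print(m.id)  # expect "model_name" once the patch is applied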