Can I use LoRA in vLLM serverless with OpenAI API?
I need both LoRA and Structured Outputs, but it seems like LoRA is only supported through the Runpod API, while Structured Outputs are only (poorly) supported through the OpenAI-compatible API?
LoRA adapters should be loadable via the endpoint's environment variables, if I remember correctly. And what's the problem with Structured Outputs in the OpenAI API?
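For reference, here's a rough sketch of how the two could combine in a single request, assuming you're on the Runpod worker-vllm image: the adapter gets registered at startup through an environment variable (the `LORA_MODULES` name and JSON format in the comment below are my best guess from the worker docs, so check the README for your version), and then the OpenAI-compatible route selects the adapter by model name while passing vLLM's `guided_json` extension through `extra_body`. The endpoint ID, API key, and adapter name are placeholders:

```python
from openai import OpenAI

# Assumes the endpoint was configured with something like this env var
# (exact variable name/format may differ by worker version -- check the
# worker-vllm README):
#   LORA_MODULES=[{"name": "my-adapter", "path": "org/my-lora-repo"}]

# Point the client at the Runpod serverless OpenAI-compatible route.
client = OpenAI(
    base_url="https://api.runpod.ai/v2/<ENDPOINT_ID>/openai/v1",
    api_key="<RUNPOD_API_KEY>",
)

# Toy schema; vLLM's guided decoding constrains output to match it.
schema = {
    "type": "object",
    "properties": {
        "sentiment": {"type": "string", "enum": ["positive", "negative"]},
        "confidence": {"type": "number"},
    },
    "required": ["sentiment", "confidence"],
}

completion = client.chat.completions.create(
    # vLLM serves each registered LoRA adapter under its own model name,
    # so passing the adapter name here selects it. "my-adapter" is
    # whatever name you registered via LORA_MODULES.
    model="my-adapter",
    messages=[{"role": "user", "content": "Classify: 'Great product!'"}],
    # guided_json is a vLLM extension to the OpenAI schema, so it goes
    # through extra_body (the stock OpenAI client doesn't know about it).
    extra_body={"guided_json": schema},
)

print(completion.choices[0].message.content)
```

If that works, adapter selection happens purely through the `model` field, so you don't need the Runpod-specific request format at all to use LoRA and Structured Outputs together.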