vLLM docker image: Disable reasoning

Hello,

I’m trying to disable reasoning in serverless vLLM with GPT-OSS-20B for streaming use cases. I don’t want any reasoning content in the responses, and I don’t need this feature at all.

I’ve tried using environment variables, but without success.

I also tried forking the RunPod vLLM repository and modifying src/handler.py and src/engine.py, but that didn’t work either.
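In case it helps to show what I mean, this is the kind of client-side filtering I've been attempting in the streaming path (a minimal sketch; the `reasoning_content` field name is an assumption based on vLLM's OpenAI-compatible streaming output, where reasoning tokens may arrive in a separate delta field):

```python
def visible_text(delta: dict) -> str:
    """Return only the user-visible text from a streamed delta dict,
    silently dropping any reasoning output.

    Assumes vLLM's OpenAI-compatible format, where a chunk's delta may
    carry `content` (visible text) and/or `reasoning_content`
    (reasoning tokens) -- field names are an assumption, not confirmed
    for the RunPod serverless worker.
    """
    # Ignore delta["reasoning_content"] entirely; forward only `content`.
    return delta.get("content") or ""


# Example: simulated stream chunks, reasoning interleaved with the answer.
chunks = [
    {"reasoning_content": "Let me think...", "content": None},
    {"content": "Hello"},
    {"content": ", world!"},
]
answer = "".join(visible_text(d) for d in chunks)
print(answer)  # only the visible text, no reasoning
```

This hides reasoning from the client, but the model still generates those tokens server-side (so they still cost latency), which is why I'd prefer to disable the feature in the worker itself.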

I'm stuck.
Has anyone managed to disable reasoning in serverless mode, or can point me to a working repo? Thanks in advance.