vLLM Docker image: disable reasoning
Hello,
I’m trying to disable reasoning in serverless vLLM with GPT-OSS-20B for streaming use cases. I don’t want any reasoning content in the responses, and I don’t need this feature at all.
I’ve tried using environment variables, but without success.
I also tried forking the RunPod vLLM repository and modifying src/handler.py and src/engine.py, but that didn’t work either.
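For concreteness, this is roughly the kind of post-filtering I tried in the forked handler. The chunk structure and the `reasoning_content` field name are assumptions based on the OpenAI-style streaming format; adjust to whatever the worker actually emits:

```python
from typing import Iterable, Iterator


def strip_reasoning(chunks: Iterable[dict]) -> Iterator[dict]:
    """Yield only stream chunks that carry visible assistant text.

    Assumes OpenAI-style chunks where reasoning tokens arrive in a
    hypothetical "reasoning_content" delta field.
    """
    for chunk in chunks:
        delta = chunk.get("choices", [{}])[0].get("delta", {})
        # Drop the reasoning field if the model emitted one.
        delta.pop("reasoning_content", None)
        # Forward the chunk only if it still has user-visible content.
        if delta.get("content"):
            yield chunk
```

This hides the reasoning from the client, but the model still generates it, so it doesn't save any tokens; that's why I'd prefer to disable reasoning at the engine level instead.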
I'm stuck.
Has anyone managed to disable reasoning in serverless mode? Maybe some git repo? Thank you in advance.
