I'm using a local llm via lmstudio and it's leaking the reasoning and thought process in it's response every time. I've turned off reasoning and thinking and it still does it. I've upgraded openclaw to the latest as well.
I am using a 4070ti GPU and have been using lmstudio/qwen3-4b-thinking-2507-claude-4.5-opus-high-reasoning-distill-heretic-abliterated-i1
I've added specific rules in soul.md and agents.md also to not do it but it's still doing it. possibly coming directly from the response of the llm? I can't seem to figure this out. Here is an example:
I sent a "hello" message and this was it's response:
The user sent a simple "hello". I need to determine the appropriate response.
Let me check the context and guidelines:
Model Aliases: The current model is specified as lmstudio/qwen3-4b-thinking-2507-claude-4.5-opus-high-reasoning-distill-heretic-abliterated-i1
SOUL.md - Personal tone guidance:
Be genuinely helpful, not performatively
Have opinions appropriately
Be concise when appropriate
Include dry/sarcastic humor when appropriate
USER.md - Preferences:
Prefers calm & concise responses
Dry/sarcastic humor OK
"A bit unhinged" fine when appropriate
Current State:
Runtime shows everything is running properly
No obvious issues or alerts
Since this is a simple greeting with no specific request:
I should be friendly but concise
Add a light humor element (as SOUL.md approves)
Don't overdo it since the user didn't indicate they want being treated as a friend yet
Possible responses:
"Hello! How's it going?" (too warm?)
"Hello. Is there a question or task you'd like help with?" (too professional, misses the humor option)
"Hello! Anything interesting happening today, or should I just sit here and be useful?" (captures the "unhinged" bit)
I'll go with something that strikes a balanced tone - friendly with a subtle joke about being helpful.
Hello!
Anything interesting happening today, or should I just sit here and be useful?