From what I'm seeing with gpt-5.4, the API uses the service tier for "fast mode", and context length controls pricing: regular gpt-5.4 pricing up to 272k tokens of context, then 2x the price beyond that.
So for a clean implementation, you'd need to expose service tier as a setting in the TUI so you can switch tiers, which looks like a real pain in the butt. Or you can special-case a custom model name like gpt-5.4-fast that gets the fast service tier.
Then the question arises whether Pi wants separate model names for the 1M context length or not. It could be very easy to hit that 2x pricing if you're not paying attention. To handle that, you'd probably have two different model names, gpt-5.4 and gpt-5.4-1m. Combine that with fast mode and you end up with four different model names, or you expose service tier in the TUI as a separate setting, like how thinking level is exposed.
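For what it's worth, the special-casing route could be sketched roughly like this (TypeScript). The suffixes, the `"priority"` tier string, and the `ResolvedModel` shape are all my assumptions for illustration, not confirmed API values:

```typescript
// Hypothetical sketch: derive request parameters from a suffixed model
// name, so the TUI only ever deals with plain model strings and never
// needs a separate service-tier setting.

interface ResolvedModel {
  model: string;        // actual model name sent to the API
  serviceTier?: string; // assumed tier value for fast mode
  maxContext: number;   // context window the client should enforce
}

const STANDARD_CONTEXT = 272_000;   // regular-priced window (per the post)
const EXTENDED_CONTEXT = 1_000_000; // 2x-priced 1M window

function resolveModel(name: string): ResolvedModel {
  let base = name;
  let serviceTier: string | undefined;
  let maxContext = STANDARD_CONTEXT;

  // Strip "-fast" first so combined names like "gpt-5.4-1m-fast" work.
  if (base.endsWith("-fast")) {
    base = base.slice(0, -"-fast".length);
    serviceTier = "priority"; // assumed name for the fast service tier
  }
  if (base.endsWith("-1m")) {
    base = base.slice(0, -"-1m".length);
    maxContext = EXTENDED_CONTEXT; // explicitly opt in to 2x pricing
  }
  return { model: base, serviceTier, maxContext };
}
```

The upside is that `gpt-5.4-1m` makes the 2x pricing an explicit opt-in rather than something you drift into mid-session; the downside is exactly the 4-name combinatorial explosion described above.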
This is super annoying. @badlogic not sure how you're thinking about it?