How does GPT prompt caching work on OpenRouter?

Caching works great with every provider that supports it, except OpenAI. For example, I have a system prompt that is 1500 tokens long in the o200k_base encoding. I send a request to gpt-5 through OpenRouter and get this usage back: CompletionUsage(completion_tokens=27, prompt_tokens=1574, total_tokens=1601, completion_tokens_details=None, prompt_tokens_details=None), which indicates that none of the tokens were cached. What could be the issue? Doc: https://openrouter.ai/docs/features/prompt-caching
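For reference, here's roughly what I'm doing (a minimal sketch, assuming the OpenAI Python SDK pointed at OpenRouter's OpenAI-compatible endpoint; the env var name, the SYSTEM_PROMPT placeholder, and the ask helper are just illustrative). OpenAI's automatic caching only applies to prompt prefixes of 1024+ tokens, which my 1500-token prompt clears, and hits should show up under usage.prompt_tokens_details.cached_tokens:

```python
import os

from openai import OpenAI

# OpenRouter exposes an OpenAI-compatible endpoint; the env var name is my own.
client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key=os.environ["OPENROUTER_API_KEY"],
)

SYSTEM_PROMPT = "..."  # stand-in for my ~1500-token system prompt

def ask(question: str) -> None:
    resp = client.chat.completions.create(
        model="openai/gpt-5",
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": question},
        ],
    )
    usage = resp.usage
    # OpenAI reports cache hits under prompt_tokens_details.cached_tokens;
    # in my output that whole field comes back as None.
    details = getattr(usage, "prompt_tokens_details", None)
    cached = getattr(details, "cached_tokens", 0) if details else 0
    print(f"prompt_tokens={usage.prompt_tokens} cached_tokens={cached}")

# A hit should only appear from the second request onward, since the
# first identical request has to write the cache.
ask("first call, writes the cache")
ask("second call, should read the cache")
```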
YoGoUrT (OP) · 3mo ago
Please ping on reply
Bump
Just found out the issue is with OpenRouter; ChatGPT on its own does cache the messages. Actually no, the issue is not OpenRouter: gpt-5 does not want to cache at all, while 4o works, and 4o only caches when the max_tokens param is set, which gpt-5 does not support.
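Rough sketch of how I compared the two (same OpenRouter setup as above; the prompt padding and helper are mine, and gpt-5 refusing max_tokens matches how OpenAI's reasoning models take max_completion_tokens instead):

```python
import os

from openai import OpenAI

client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key=os.environ["OPENROUTER_API_KEY"],
)

# Padding to get the shared prefix well past OpenAI's 1024-token minimum.
LONG_SYSTEM_PROMPT = "You are a helpful assistant. " * 200

def cached_tokens(model: str, **params) -> int:
    """Send the same long prompt twice and return the cached-token count, if any."""
    resp = client.chat.completions.create(
        model=model,
        messages=[
            {"role": "system", "content": LONG_SYSTEM_PROMPT},
            {"role": "user", "content": "Say hi."},
        ],
        **params,
    )
    details = getattr(resp.usage, "prompt_tokens_details", None)
    return getattr(details, "cached_tokens", 0) if details else 0

# 4o: caches, but in my testing only when max_tokens is set.
for _ in range(2):
    hits_4o = cached_tokens("openai/gpt-4o", max_tokens=50)
print("gpt-4o cached tokens on 2nd call:", hits_4o)

# gpt-5: rejects max_tokens (reasoning models take max_completion_tokens
# instead) and reports no cache hits for me either way.
for _ in range(2):
    hits_5 = cached_tokens("openai/gpt-5", max_completion_tokens=50)
print("gpt-5 cached tokens on 2nd call:", hits_5)
```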
