How does GPT prompt caching work on OpenRouter?

Caching works great with every provider that supports it, except OpenAI. For example, I have a system prompt that is 1500 tokens long in the o200k_base encoding. I send a request to gpt-5 through OpenRouter and get this usage back: CompletionUsage(completion_tokens=27, prompt_tokens=1574, total_tokens=1601, completion_tokens_details=None, prompt_tokens_details=None), which indicates that none of the tokens were cached. What could be the issue? Doc: https://openrouter.ai/docs/features/prompt-caching
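For reference, here's roughly what I'm doing (a minimal sketch, assuming the OpenAI Python SDK pointed at OpenRouter's OpenAI-compatible endpoint; the env var name, the SYSTEM_PROMPT placeholder, and the ask helper are just illustrative). OpenAI's automatic caching only applies to prompt prefixes of 1024+ tokens, which my 1500-token prompt clears, and hits should show up under usage.prompt_tokens_details.cached_tokens:

```python
import os

from openai import OpenAI

# OpenRouter exposes an OpenAI-compatible endpoint; the env var name is my own.
client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key=os.environ["OPENROUTER_API_KEY"],
)

SYSTEM_PROMPT = "..."  # stand-in for my ~1500-token system prompt

def ask(question: str) -> None:
    resp = client.chat.completions.create(
        model="openai/gpt-5",
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": question},
        ],
    )
    usage = resp.usage
    # OpenAI reports cache hits under prompt_tokens_details.cached_tokens;
    # in my output that whole field comes back as None.
    details = getattr(usage, "prompt_tokens_details", None)
    cached = getattr(details, "cached_tokens", 0) if details else 0
    print(f"prompt_tokens={usage.prompt_tokens} cached_tokens={cached}")

# A hit should only appear from the second request onward, since the
# first identical request has to write the cache.
ask("first call, writes the cache")
ask("second call, should read the cache")
```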
YoGoUrT (OP) · 3mo ago
Please ping on reply
Bump
Just found out the issue is with OpenRouter; ChatGPT on its own does cache the messages. Actually no, the issue is not OpenRouter: gpt-5 does not want to cache at all, while 4o works, and 4o only caches when the max_tokens param is set, which gpt-5 does not support.
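Rough sketch of how I compared the two (same OpenRouter setup as above; the prompt padding and helper are mine, and gpt-5 refusing max_tokens matches how OpenAI's reasoning models take max_completion_tokens instead):

```python
import os

from openai import OpenAI

client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key=os.environ["OPENROUTER_API_KEY"],
)

# Padding to get the shared prefix well past OpenAI's 1024-token minimum.
LONG_SYSTEM_PROMPT = "You are a helpful assistant. " * 200

def cached_tokens(model: str, **params) -> int:
    """Send the same long prompt twice and return the cached-token count, if any."""
    resp = client.chat.completions.create(
        model=model,
        messages=[
            {"role": "system", "content": LONG_SYSTEM_PROMPT},
            {"role": "user", "content": "Say hi."},
        ],
        **params,
    )
    details = getattr(resp.usage, "prompt_tokens_details", None)
    return getattr(details, "cached_tokens", 0) if details else 0

# 4o: caches, but in my testing only when max_tokens is set.
for _ in range(2):
    hits_4o = cached_tokens("openai/gpt-4o", max_tokens=50)
print("gpt-4o cached tokens on 2nd call:", hits_4o)

# gpt-5: rejects max_tokens (reasoning models take max_completion_tokens
# instead) and reports no cache hits for me either way.
for _ in range(2):
    hits_5 = cached_tokens("openai/gpt-5", max_completion_tokens=50)
print("gpt-5 cached tokens on 2nd call:", hits_5)
```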
