Mastra · 3w ago
ldp

Prompt caching with dynamic context

Where is the best place to supply fully dynamic context while still taking full advantage of prompt caching (particularly with OpenAI)? OpenAI recommends placing dynamic content "at the end" of your prompt. Here's an approximation of our current setup:
export const chatAgent: Agent = new Agent({
  name: "chat-agent",
  tools: { }, // very large static toolset
  instructions: ({ runtimeContext }) => {
    // an example of a truly dynamic var; differs on every request
    const currentTime = Date.now();

    return `${BASE_STATIC_SYSTEM_PROMPT}

- the user's name is ${runtimeContext.get('username')}
- the current time is ${currentTime}`;
  },
});
When I test the above, every new message caches 0 tokens (verified via the stream's onFinish callback). When I remove the dynamism and pass only BASE_STATIC_SYSTEM_PROMPT, effectively the full token count gets cached, minus the last few tokens.

What I believe is happening: at some point OpenAI performs the cache check by concatenating SYSTEM_PROMPT + TOOLS_AS_STRING + MESSAGES_ARR, and since our system prompt is dynamic, it causes a cache miss every time.

Is there a better place to supply this dynamic data that wouldn't cause a cache miss? As a system message in the messages array? And is there any way to always keep this information in context (i.e. not have it dropped when it falls outside the lastMessages window) while still optimizing for prompt caching?
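To make that hypothesis concrete, here is a minimal TypeScript sketch of prefix matching with placeholder names; it models the intuition only, not OpenAI or Mastra internals:

// Sketch only: approximates the serialized prefix the cache check would be
// performed against under the hypothesis above. All names are placeholders.
const STATIC_PROMPT = "You are a helpful assistant."; // stands in for BASE_STATIC_SYSTEM_PROMPT
const TOOLS_AS_STRING = JSON.stringify({}); // stands in for the very large static toolset

const buildPrefix = (systemPrompt: string): string =>
  systemPrompt + TOOLS_AS_STRING; // + serialized messages, per the hypothesis

const reqA = buildPrefix(`${STATIC_PROMPT}\n- the current time is ${Date.now()}`);
const reqB = buildPrefix(`${STATIC_PROMPT}\n- the current time is ${Date.now() + 1}`);

// The shared prefix ends as soon as the timestamp appears, so the tool
// definitions that follow it can never be reused from the cache.
let shared = 0;
while (shared < Math.min(reqA.length, reqB.length) && reqA[shared] === reqB[shared]) shared++;
console.log(`shared prefix: ${shared} of ${reqA.length} chars`);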
6 Replies
Mastra Triager
📝 Created GitHub issue: https://github.com/mastra-ai/mastra/issues/10381
🔍 If you're experiencing an error, please provide a minimal reproducible example whenever possible to help us resolve it quickly.
🙏 Thank you for helping us improve Mastra!
Abhi Aiyer · 3w ago
Hey @ldp! We call getInstructions on the agent before passing it to the model. You can return a SystemMessage | SystemMessage[] as instructions as well, and there you can add providerOptions:
instructions: {
  content: `
You are Michel, a practical and experienced home chef who helps people cook great meals with whatever
ingredients they have available. Your first priority is understanding what ingredients and equipment the user has access to, then suggesting achievable recipes.
You explain cooking steps clearly and offer substitutions when needed, maintaining a friendly and encouraging tone throughout.
`,
  role: 'system',
  providerOptions: {
    openai: {
      promptCacheKey: 'some-key',
    },
  },
},
The relevant OpenAI provider options here are promptCacheKey and promptCacheRetention.
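For completeness, a sketch of how both options might be set together, assuming your @ai-sdk/openai version exposes promptCacheRetention under providerOptions the same way it exposes promptCacheKey (check your provider's types before relying on it):

providerOptions: {
  openai: {
    promptCacheKey: 'some-key',
    // Assumption: maps to OpenAI's prompt_cache_retention and controls how
    // long the cached prefix is retained; confirm the key name and accepted
    // values for your provider version.
    promptCacheRetention: '24h',
  },
},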
ldp (OP) · 3w ago
Ty so much abhi! I’ll try this out this weekend
_roamin_ · 2w ago
Hey @ldp! Just wondering if you had a chance to test the solution Abhi mentioned above? Let us know if you run into any issues setting it up 😉
ldp (OP) · 2w ago
Hey @roamin, yep! I commented on the linked GitHub issue here. Abhi's approach seemed to cache way more tokens (not sure I totally get why it works though 😆):
// approximation of our setup
const providerOptions = { openai: { promptCacheKey: "chat-agent-cache" } };

export const chatAgent: Agent = new Agent({
  name: "chat-agent",
  tools: { }, // very large static toolset
  instructions: () => [
    {
      role: "system",
      providerOptions,
      content: CHAT_SYSTEM_PROMPT,
    },
    {
      role: "system",
      providerOptions,
      content: `Current Time: ${Date.now()}`,
    },
  ],
});
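A rough sketch of the onFinish check mentioned earlier for reading cached-token counts follows; the exact stream options and the field that reports cached tokens vary by Mastra / AI SDK version, so the property names below are assumptions to verify against your own types:

// Sketch: consume a stream from the agent and log usage / provider metadata in
// onFinish. Fields like providerMetadata.openai.cachedPromptTokens (or
// usage.cachedInputTokens on newer AI SDK versions) are assumptions to check.
const result = await chatAgent.stream(
  [{ role: "user", content: "hello" }],
  {
    // `any` keeps this sketch version-agnostic; the real event type comes from
    // the installed AI SDK.
    onFinish: (event: any) => {
      console.log("usage:", JSON.stringify(event.usage ?? {}, null, 2));
      console.log("providerMetadata:", JSON.stringify(event.providerMetadata ?? {}, null, 2));
    },
  },
);

// onFinish only fires once the stream has actually been consumed.
for await (const chunk of result.textStream) {
  void chunk;
}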
Feel free to close / mark as resolved (not sure if you have that on Discord).
_roamin_ · 2w ago
Hey Joe! Thanks for testing it out! It works because when you use providerOptions, you're basically explicitly telling OpenAI to cache these instructions. If you only provide plain text, then it's OpenAI that decides what gets cached 😉
