Memory (RAM) issues

I have a workflow where one step is to enrich every page on a website (sometimes 600 or more) with wider business context information, followed by a forEach step that handles each page of the website and examines the HTML / attributes. With a very large site, I got the error below in the Mastra logs (attached). I have a few questions:

1) Is there any way around this (apart from "buy a server with a higher spec")? I'm wondering especially from the perspective of workflow architecture/orchestration.
2) If I do plan to load the entire HTML of a given web page (along with business context) and then process it, are there any best practices that I might not be following?

The workflow is basically:
export const postOnboardingWorkflow = createWorkflow({
  id: "post-onboarding-workflow",
  description: "Rank, deduplicate, and research keywords with rankings-first approach",
  inputSchema: PostOnboardingWorkflowInputSchema,
  outputSchema: PostOnboardingWorkflowOutputSchema,
})
  .then(pageRankingStep)
  .then(pageDeduplicationStep)
  .then(enrichPagesWithContextStep)
  .foreach(singlePageKeywordStep, { concurrency: RANKING_KEYWORDS_CONFIG.MAX_CONCURRENCY })
  .then(keywordCannicalizationCleanupStep)
  .commit();
(MAX_CONCURRENCY is 1 and it still does this.) The key part from the attached file:
<--- Last few GCs --->

[1:0x739912b6a650] 96509996 ms: Mark-Compact 254.7 (283.1) -> 254.2 (283.1) MB, 83.91 / 0.00 ms (average mu = 0.803, current mu = 0.315) allocation failure; scavenge might not succeed
[1:0x739912b6a650] 96510117 ms: Mark-Compact 258.2 (283.1) -> 257.7 (291.1) MB, 100.70 / 0.00 ms (average mu = 0.663, current mu = 0.169) allocation failure; scavenge might not succeed


<--- JS stacktrace --->

FATAL ERROR: Reached heap limit Allocation failed - JavaScript heap out of memory
----- Native stack trace -----
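(Side note: the trace shows the heap capping out around 283 MB, which suggests the process is running with a fairly constrained heap, e.g. a container memory limit or a small --max-old-space-size. Raising it, e.g. NODE_OPTIONS="--max-old-space-size=2048", would presumably buy some headroom, but that's basically the "bigger server" answer I'm hoping to avoid.)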
Mastra Triager · 2mo ago
GitHub: [DISCORD:1413859688893775892] Memory (RAM) issues · Issue #7559
This issue was created from Discord post: https://discord.com/channels/1309558646228779139/1413859688893775892
Ward · 2mo ago
At this moment there might not be any solution; we are adding an event system to our workflows so we can run each step on different hardware. This is in the works, but currently we are still in the < 1.0 range, so we haven't really looked into memory and performance that much.
joneslloyd (OP) · 2mo ago
Understood, okay – Do you have any practical tips (aside from vertical and horizontal hardware scaling) that might mitigate the issue, e.g. "Use X or Y construct"? I'm already avoiding some of these issues by batching and using foreach, but I don't know how it can always be avoided. Any advice would be hugely appreciated.
Ward · 2mo ago
Try not to put any large objects inside the step outputs/inputs
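Something like this, as a rough sketch (assuming Mastra's createStep API from @mastra/core/workflows; the schemas and extraction logic here are illustrative, not your real code). The per-page step takes the HTML in but returns only the small derived result, so the big string never ends up in the workflow state:

import { createStep } from "@mastra/core/workflows";
import { z } from "zod";

// Naive placeholder for the real per-page analysis (agent call, DOM parsing, ...).
function extractKeywords(html: string): string[] {
  const headings = html.match(/<h[1-3][^>]*>[^<]+/gi) ?? [];
  return [...new Set(headings.map((h) => h.replace(/^<[^>]*>/, "").trim()))];
}

const singlePageKeywordStep = createStep({
  id: "single-page-keyword",
  inputSchema: z.object({ url: z.string(), html: z.string() }),
  // Output only the small derived result; never echo the HTML back out,
  // since step outputs are kept in workflow state (and snapshotted to the DB).
  outputSchema: z.object({ url: z.string(), keywords: z.array(z.string()) }),
  execute: async ({ inputData }) => ({
    url: inputData.url,
    keywords: extractKeywords(inputData.html),
  }),
});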
joneslloyd (OP) · 2mo ago
Okay! I am curious: in a foreach, does every iteration accumulate memory usage? i.e., even with minimal object sizes, if the collection is large enough, is there the potential for this error (from my original post)? Or is it avoidable in theory?
_roamin_ · 2mo ago
@joneslloyd maybe you could try using the filesystem: load files when you need them, unload them when done, and then just pass the URIs around.
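Roughly like this, as a sketch (hypothetical helper names; tmpdir as the store, but an S3 bucket or similar would follow the same shape):

import { writeFile, readFile, rm } from "node:fs/promises";
import { createHash } from "node:crypto";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Persist a big HTML payload outside the heap and return a small path/URI
// that can travel through step inputs/outputs instead of the HTML itself.
export async function storeHtml(url: string, html: string): Promise<string> {
  const path = join(tmpdir(), createHash("sha1").update(url).digest("hex") + ".html");
  await writeFile(path, html, "utf8");
  return path;
}

// Load when needed...
export const loadHtml = (path: string) => readFile(path, "utf8");

// ...and unload when done so the files don't pile up either.
export const dropHtml = (path: string) => rm(path, { force: true });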
joneslloyd (OP) · 2mo ago
Thanks for getting back to me. I have a couple of thoughts: 1) It's a bit tricky because my Mastra workflows intentionally don't do CRUD (they only emit events), so dynamically loading files from the FS might not be viable. 2) Even if they did do CRUD and could read from the FS (or an S3 bucket), wouldn't this have the same effect as the foreach memory issue, because as the loop progressed toward the total number of items, each large web page would be in memory (the same as with my current solution)?
_roamin_ · 2mo ago
I think what you need to do is keep the large items out of the input/outputs of your workflow steps, and only keep the results of whatever processing you're doing. You'll also save on database storage since workflow snapshots end up in the database.
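To make that concrete (same assumptions as the earlier sketches, with the hypothetical storeHtml/loadHtml helpers from above): the foreach step receives only { url, htmlPath }, hydrates the HTML inside execute, and returns just the small result. Once execute returns, the HTML string is unreachable and the GC can reclaim it, so iterations shouldn't accumulate:

import { createStep } from "@mastra/core/workflows";
import { z } from "zod";
// Hypothetical module containing the storeHtml/loadHtml/dropHtml sketch above.
import { loadHtml, dropHtml } from "./html-store";

// Placeholder for the real per-page analysis.
async function extractKeywords(html: string): Promise<string[]> {
  const headings = html.match(/<h[1-3][^>]*>[^<]+/gi) ?? [];
  return [...new Set(headings.map((h) => h.replace(/^<[^>]*>/, "").trim()))];
}

const singlePageKeywordStep = createStep({
  id: "single-page-keyword",
  inputSchema: z.object({ url: z.string(), htmlPath: z.string() }),
  outputSchema: z.object({ url: z.string(), keywords: z.array(z.string()) }),
  execute: async ({ inputData }) => {
    // The HTML lives on the heap only for the duration of this call.
    const html = await loadHtml(inputData.htmlPath);
    const keywords = await extractKeywords(html);
    await dropHtml(inputData.htmlPath); // unload when done
    return { url: inputData.url, keywords };
  },
});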
joneslloyd (OP) · 2mo ago
What about cases where the large item (the full HTML of a given web page) is necessary for the workflow step? (And in this specific instance, it needs to be done for a large percentage of a website's pages.)
_roamin_ · 2mo ago
I'm guessing at some point there'll be better ways to handle large datasets in workflows, but for now you'll need to have a beefy server that has enough RAM to store all that data.
joneslloyd (OP) · 2mo ago
There is no other solution at this stage?
