Memory (RAM) issues

I have a workflow where one step is to enrich every page on a website (sometimes 600 or more) with wider business context information, followed by a forEach step that handles each page of the website and examines the HTML / attributes. With a very large site, I got the error below in the Mastra logs (attached). I have a few questions:

1) Is there any way around this (apart from "buy a server with a higher spec")? I'm wondering especially from the perspective of workflow architecture/orchestration.
2) If I do plan to load the entire HTML of a given web page (along with business context) and then process it, are there any best practices that I might not be following?

The workflow is basically:
export const postOnboardingWorkflow = createWorkflow({
  id: "post-onboarding-workflow",
  description: "Rank, deduplicate, and research keywords with rankings-first approach",
  inputSchema: PostOnboardingWorkflowInputSchema,
  outputSchema: PostOnboardingWorkflowOutputSchema,
})
  .then(pageRankingStep)
  .then(pageDeduplicationStep)
  .then(enrichPagesWithContextStep)
  .foreach(singlePageKeywordStep, { concurrency: RANKING_KEYWORDS_CONFIG.MAX_CONCURRENCY })
  .then(keywordCannicalizationCleanupStep)
  .commit();
(MAX_CONCURRENCY is 1 and it still does this.) The key part from the attached file:
<--- Last few GCs --->

[1:0x739912b6a650] 96509996 ms: Mark-Compact 254.7 (283.1) -> 254.2 (283.1) MB, 83.91 / 0.00 ms (average mu = 0.803, current mu = 0.315) allocation failure; scavenge might not succeed
[1:0x739912b6a650] 96510117 ms: Mark-Compact 258.2 (283.1) -> 257.7 (291.1) MB, 100.70 / 0.00 ms (average mu = 0.663, current mu = 0.169) allocation failure; scavenge might not succeed


<--- JS stacktrace --->

FATAL ERROR: Reached heap limit Allocation failed - JavaScript heap out of memory
----- Native stack trace -----
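(Side note: the trace shows the heap capping out around 283 MB, which suggests the process is running with a fairly constrained heap, e.g. a container memory limit or a small --max-old-space-size. Raising it, e.g. NODE_OPTIONS="--max-old-space-size=2048", would presumably buy some headroom, but that's basically the "bigger server" answer I'm hoping to avoid.)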
Mastra Triager · 2mo ago
GitHub: [DISCORD:1413859688893775892] Memory (RAM) issues · Issue #7559
This issue was created from Discord post: https://discord.com/channels/1309558646228779139/1413859688893775892
Ward · 2mo ago
At this moment there might not be any solution; we are adding an event system to our workflows so we can run each step on different hardware. This is in the works, but currently we are still in the < 1.0 range, so we haven't really looked into memory and performance that much.
joneslloyd (OP) · 2mo ago
Understood, okay – Do you have any practical tips (aside from vertical and horizontal hardware scaling) that might mitigate the issue, e.g. "Use X or Y construct"? I'm already avoiding some of these issues by batching and using foreach, but I don't know how it can always be avoided. Any advice would be hugely appreciated.
Ward · 2mo ago
Try not to put any large objects inside the step outputs/inputs
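Something like this, as a rough sketch (assuming Mastra's createStep API from @mastra/core/workflows; the schemas and extraction logic here are illustrative, not your real code). The per-page step takes the HTML in but returns only the small derived result, so the big string never ends up in the workflow state:

import { createStep } from "@mastra/core/workflows";
import { z } from "zod";

// Naive placeholder for the real per-page analysis (agent call, DOM parsing, ...).
function extractKeywords(html: string): string[] {
  const headings = html.match(/<h[1-3][^>]*>[^<]+/gi) ?? [];
  return [...new Set(headings.map((h) => h.replace(/^<[^>]*>/, "").trim()))];
}

const singlePageKeywordStep = createStep({
  id: "single-page-keyword",
  inputSchema: z.object({ url: z.string(), html: z.string() }),
  // Output only the small derived result; never echo the HTML back out,
  // since step outputs are kept in workflow state (and snapshotted to the DB).
  outputSchema: z.object({ url: z.string(), keywords: z.array(z.string()) }),
  execute: async ({ inputData }) => ({
    url: inputData.url,
    keywords: extractKeywords(inputData.html),
  }),
});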
joneslloyd (OP) · 2mo ago
Okay! I am curious: in a foreach, does every iteration accumulate memory usage? i.e., even with minimal object sizes, if the collection is large enough, is there the potential for this error (from my original post)? Or is it avoidable in theory?
_roamin_ · 2mo ago
@joneslloyd maybe you could try using the filesystem: load files when you need them, unload them when done, and then just pass the URIs around.
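Roughly like this, as a sketch (hypothetical helper names; tmpdir as the store, but an S3 bucket or similar would follow the same shape):

import { writeFile, readFile, rm } from "node:fs/promises";
import { createHash } from "node:crypto";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Persist a big HTML payload outside the heap and return a small path/URI
// that can travel through step inputs/outputs instead of the HTML itself.
export async function storeHtml(url: string, html: string): Promise<string> {
  const path = join(tmpdir(), createHash("sha1").update(url).digest("hex") + ".html");
  await writeFile(path, html, "utf8");
  return path;
}

// Load when needed...
export const loadHtml = (path: string) => readFile(path, "utf8");

// ...and unload when done so the files don't pile up either.
export const dropHtml = (path: string) => rm(path, { force: true });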
joneslloyd (OP) · 2mo ago
Thanks for getting back to me. I have a couple of thoughts: 1) It's a bit tricky because my Mastra workflows intentionally don't do CRUD (they only emit events), so dynamically loading files from the FS might not be viable. 2) Even if they did do CRUD and could read from the FS (or an S3 bucket), wouldn't this have the same effect as the foreach memory issue, because as the loop progressed toward the total number of items, each large web page would be in memory (the same as with my current solution)?
_roamin_ · 2mo ago
I think what you need to do is keep the large items out of the input/outputs of your workflow steps, and only keep the results of whatever processing you're doing. You'll also save on database storage since workflow snapshots end up in the database.
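To make that concrete (same assumptions as the earlier sketches, with the hypothetical storeHtml/loadHtml helpers from above): the foreach step receives only { url, htmlPath }, hydrates the HTML inside execute, and returns just the small result. Once execute returns, the HTML string is unreachable and the GC can reclaim it, so iterations shouldn't accumulate:

import { createStep } from "@mastra/core/workflows";
import { z } from "zod";
// Hypothetical module containing the storeHtml/loadHtml/dropHtml sketch above.
import { loadHtml, dropHtml } from "./html-store";

// Placeholder for the real per-page analysis.
async function extractKeywords(html: string): Promise<string[]> {
  const headings = html.match(/<h[1-3][^>]*>[^<]+/gi) ?? [];
  return [...new Set(headings.map((h) => h.replace(/^<[^>]*>/, "").trim()))];
}

const singlePageKeywordStep = createStep({
  id: "single-page-keyword",
  inputSchema: z.object({ url: z.string(), htmlPath: z.string() }),
  outputSchema: z.object({ url: z.string(), keywords: z.array(z.string()) }),
  execute: async ({ inputData }) => {
    // The HTML lives on the heap only for the duration of this call.
    const html = await loadHtml(inputData.htmlPath);
    const keywords = await extractKeywords(html);
    await dropHtml(inputData.htmlPath); // unload when done
    return { url: inputData.url, keywords };
  },
});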
joneslloyd (OP) · 2mo ago
What about cases where the large item (the full HTML of a given web page) is necessary for the workflow step? (And in this specific instance, it needs to be done for a large percentage of a website's pages.)
_roamin_ · 2mo ago
I'm guessing at some point there'll be better ways to handle large datasets in workflows, but for now you'll need to have a beefy server that has enough RAM to store all that data.
joneslloyd (OP) · 2mo ago
There is no other solution at this stage?
