MastraAI•5d ago

Structured output validation failed (Line break in my json)

Hi, I'm experiencing a recurring issue when generating JSON. Line breaks aren't being properly escaped: instead of getting \n in my strings, I'm getting actual newline characters that break the JSON object structure. This makes the JSON invalid and impossible to parse. Here is my agent prompt:

6 Replies

RobinOP•5d ago

You are an AI assistant specialized in analyzing interview transcriptions and structuring information into normalized JSON format.

Your personality:
- Methodical and rigorous in information extraction
- Detail-oriented and precise in analysis
- Objective and factual, without subjective interpretation
- Organized in data structuring
- Thorough while remaining faithful to source content
- Multilingual and adaptable to template language

Your capabilities:
- Analyze complete recruitment interview transcriptions
- Extract relevant information according to a structured template
- Identify and map information to correct field identifiers
- Handle different field types (short text, paragraph, single choice, multiple choice)
- Structure data in perfectly formatted JSON
- Adapt extraction to numeric or alphanumeric identifiers
- Preserve consistency across different template sections
- Handle cases where certain information is missing
- Work with templates in any language

Guidelines:
- Use EXACTLY the field identifiers provided in the template as JSON keys
- For multiple choice: return an array of values
- For single choice or text: return a simple value
- Extract only information actually present in the transcription
- Never invent or interpret information
- Fill all template fields if information is available
- For empty fields: use empty array [] for multiple choice, or empty string
- For paragraphs with line breaks: use \\n escape sequences (not actual newline characters)
- CRITICAL: All output must be valid, parseable JSON - no actual line breaks in string values
- Use JSON escape sequences: \\n for newlines, \\t for tabs, \\" for quotes
- Do NOT use markdown formatting marks (**, *, _, #, etc.) in extracted text - use plain text only
- Return ONLY valid JSON - no explanatory text, no markdown formatting
- Never include template metadata in your output
- Generate field values in the SAME LANGUAGE as the template fields

I have this in my generate call:

      structuredOutput: {
        schema: noteFillerOutputSchema,
        jsonPromptInjection: true,
      },
      modelSettings: {
        temperature: 0.2,
        maxOutputTokens: 20480,
      },

If anyone has any tips, or can see something I'm doing wrong, please let me know.
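
For reference, the failure described above can be reproduced without any model: JSON does not allow raw control characters inside string literals, so `JSON.parse` rejects a string value that contains an actual newline. Below is a minimal workaround sketch, assuming you can get at the raw text the model returned before it is parsed. `escapeRawControlChars` is a hypothetical helper name, not part of Mastra, and it only handles raw newlines, carriage returns, and tabs.

```typescript
// Hypothetical helper (not a Mastra API): escapes raw newline, carriage
// return, and tab characters that appear INSIDE JSON string literals.
// Whitespace outside of strings (pretty-printing) is left untouched.
function escapeRawControlChars(jsonText: string): string {
  const escapes = new Map<string, string>([
    ["\n", "\\n"],
    ["\r", "\\r"],
    ["\t", "\\t"],
  ]);
  let out = "";
  let inString = false; // true between an opening and a closing quote
  let afterBackslash = false; // true right after a backslash inside a string

  for (let i = 0; i < jsonText.length; i++) {
    const ch = jsonText.charAt(i);
    if (!inString) {
      if (ch === '"') inString = true;
      out += ch;
    } else if (afterBackslash) {
      // Second character of an existing escape sequence: copy it unchanged.
      out += ch;
      afterBackslash = false;
    } else if (ch === "\\") {
      out += ch;
      afterBackslash = true;
    } else if (ch === '"') {
      out += ch;
      inString = false;
    } else {
      // Replace a raw control character with its escape, else copy as-is.
      out += escapes.get(ch) ?? ch;
    }
  }
  return out;
}

// The single-quoted literal below contains an actual newline character
// inside the JSON string value, which is what breaks JSON.parse.
const brokenExample = '{"notes": "line one\nline two"}';
const repairedExample = JSON.parse(escapeRawControlChars(brokenExample));
console.log(repairedExample.notes);
```

This only papers over the symptom. It does not explain why the raw newline was emitted in the first place, and it will not rescue output that is malformed in other ways.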

Mastra Triager•5d ago

📝 Created GitHub issue: https://github.com/mastra-ai/mastra/issues/8973

caleb•4d ago

Hey @Robin, what version of @mastra/core are you on and what model are you using?

Daniel Lew•4d ago

It would also be helpful to know the prompt, not just the system instructions, so we can reproduce exactly!

Tyler•4d ago

It'll help to know which model you're using too - this could be a problem with the model not being great at what you're trying to do, or it could be a bug

RobinOP•2d ago

Hey, I'm on 0.21.0 and I do this with mistral-medium latest:

  const noteFillerAgent = mastra.getAgent('noteFiller')
  const response = await noteFillerAgent.generate(
    [
      {
        role: 'user',
        content: `INTERVIEW CONTEXT:
Recruiter name: ${agentInput.fullName}
Event: ${agentInput.event}

IMPORTANT: The recruiter is "${agentInput.fullName}". You must extract ONLY the candidate's responses, NEVER the recruiter's questions or comments.`,
      },
      {
        role: 'user',
        content: `TEMPLATE TO FILL:
${agentInput.template}

Expected output format:
${expectedFormat}

FIELD IDENTIFIERS TO USE EXACTLY:
${fieldIds.map((id: string) => `- "${id}"`).join('\n')}

Use ONLY these identifiers as keys in your JSON response. Do not modify them.`,
      },
      {
        role: 'user',
        content: `TRANSCRIPT TO ANALYZE:
${agentInput.transcript}

EXTRACTION INSTRUCTIONS:
1. Identify the recruiter ("${agentInput.fullName}") in the transcript
2. Extract ONLY what the CANDIDATE said (not the recruiter)
3. Fill each template field with relevant information found in the candidate's responses
4. NEVER invent information - use only what is explicitly stated by the candidate
5. NEVER repeat the field title in the extracted value
6. Extract values in the SAME LANGUAGE as the transcript (not the template language)
7. Return valid JSON with all field identifiers as keys`,
      },
    ],
    {
      structuredOutput: {
        schema: noteFillerOutputSchema,
      },
      modelSettings: {
        temperature: 0.1,
        maxOutputTokens: 20480,
      },
    }
  )
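
The constraints stated in the prompts above (exactly the given field identifiers as keys, each value a string or an array of strings) can also be checked in plain code once the output has been parsed. This is a minimal sketch under those assumptions. `checkFilledNote` is a hypothetical name, and it is independent of whatever `noteFillerOutputSchema` actually contains, which is not shown in this thread.

```typescript
// Hypothetical post-parse check (not a Mastra API): returns a list of
// human-readable problems; an empty list means the object has exactly the
// expected field identifiers and every value is a string or string[].
function checkFilledNote(value: unknown, fieldIds: string[]): string[] {
  if (typeof value !== "object" || value === null || Array.isArray(value)) {
    return ["output is not a JSON object"];
  }
  const record = value as Record<string, unknown>;
  const expected = new Set(fieldIds);
  const problems: string[] = [];

  for (const id of fieldIds) {
    if (!Object.prototype.hasOwnProperty.call(record, id)) {
      problems.push(`missing field "${id}"`);
      continue;
    }
    const v = record[id];
    const isText = typeof v === "string";
    const isChoiceList =
      Array.isArray(v) && v.every((item) => typeof item === "string");
    if (!isText && !isChoiceList) {
      problems.push(`field "${id}" is neither a string nor an array of strings`);
    }
  }

  // Keys the template did not ask for are reported as well.
  for (const key of Object.keys(record)) {
    if (!expected.has(key)) problems.push(`unexpected field "${key}"`);
  }
  return problems;
}

// Example with made-up identifiers: both values have an allowed shape.
console.log(checkFilledNote({ "1": "Paris", "2": ["remote", "hybrid"] }, ["1", "2"]));
```

Logging the returned problems next to the raw model text may help separate a model that ignores the template from a parsing issue in the pipeline.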
