Firecrawl Crawl Issue - n8n Integration

Issue: A Firecrawl crawl via the n8n node (@mendable/n8n-nodes-firecrawl v1) has three problems:

1. Initial API error: The first 2 attempts fail with a scrapeOptions.formats validation error (expected array, received object), even though my config only has default empty headers. After 2-3 retries it starts.
2. Rate limiting: I hit "too many requests" on the free tier despite delay: 1000, maxConcurrency: 5, and batching.
3. Crawl never completes: Once started, a crawl with limit: 5 stays in "running" indefinitely. The status never reaches "completed" and I have to stop it manually in the dashboard. No pages are returned.

Config: Using the crawl operation with a prompt to generate paths, excludePaths: ["data/*"], limit: 5, and default scrapeOptions (empty headers only).

Main blocker: The crawl starts but never finishes, even for small limits.

Detailed description with full config and response examples attached below.
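For the "never completes" symptom, a bounded polling loop at least turns an indefinite "running" state into an explicit timeout. The following is a minimal sketch, not the n8n node's behaviour: `wait_for_crawl` and `get_status` are hypothetical names, the status lookup is injected as a callable so no real API call is assumed, and treating "failed" as a terminal status is an assumption ("completed" is the only terminal status named in this thread).

```python
import time
from typing import Callable


def wait_for_crawl(
    get_status: Callable[[], str],
    timeout_s: float = 300.0,
    interval_s: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll get_status() until a terminal status or until timeout_s elapses.

    get_status is a hypothetical callable that returns the crawl's current
    status string; how it talks to the API is left to the caller.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        status = get_status()
        # Assumption: "completed" and "failed" are the terminal statuses.
        if status in ("completed", "failed"):
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"crawl still '{status}' after {timeout_s}s")
        sleep(interval_s)


# Simulated status sequence standing in for real status checks.
simulated = iter(["running", "running", "completed"])
final_status = wait_for_crawl(
    lambda: next(simulated), timeout_s=10, interval_s=0, sleep=lambda _s: None
)
```

In a workflow, the timeout branch is where a stuck crawl could be cancelled or reported instead of being stopped by hand in the dashboard.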
3 Replies
Gaurav Chadha, 4w ago
@pjraven the free plan only supports a concurrency of 2 - https://docs.firecrawl.dev/rate-limits#rate-limits
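One way to act on this is to cap maxConcurrency before the request is sent. A minimal sketch, assuming the free-plan limit of 2 quoted above (other plans differ) and using a hypothetical helper name, `clamp_concurrency`:

```python
# Assumption: 2 is the free-plan concurrency limit stated in the reply above.
FREE_PLAN_MAX_CONCURRENCY = 2


def clamp_concurrency(body: dict, plan_limit: int = FREE_PLAN_MAX_CONCURRENCY) -> dict:
    """Return a copy of the crawl body with maxConcurrency capped at plan_limit."""
    capped = dict(body)  # shallow copy so the caller's dict is not modified
    requested = capped.get("maxConcurrency")
    if requested is not None:
        capped["maxConcurrency"] = min(requested, plan_limit)
    return capped


# Values taken from the original post: delay 1000, maxConcurrency 5, limit 5.
original = {"limit": 5, "delay": 1000, "maxConcurrency": 5}
capped = clamp_concurrency(original)
```

A body without maxConcurrency is returned unchanged, so the plan's own default applies.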
pjraven (OP), 4w ago
Yes, the issue also occurs when I don't define concurrency. It should not return a 400 because of that.

It now works with the body below.

Explanation: The v2 API doesn't accept nested crawlOptions or scrapeOptions.options. Flatten them:
- Remove crawlOptions and move its properties to the root.
- Remove scrapeOptions.options and keep only direct properties in scrapeOptions.

Corrected body:
{
  "url": "={{ $json.companyWebsite }}",
  "sitemap": "include",
  "crawlEntireDomain": false,
  "limit": 5,
  "allowSubdomains": true,
  "excludePaths": [
    "privacy/*",
    "data/*",
    "impressum/*",
    "products/*"
  ],
  "prompt": "Extract content related to recent changes at the company that can be used as a topic in an outreach message. Do not extract information that is not of current concern or is not reflecting a unique characteristic. It can be things mentioned on the home page, blog, news, press, about, why etc.",
  "scrapeOptions": {
    "formats": [
      "markdown"
    ],
    "onlyMainContent": true,
    "maxAge": 172800000,
    "parsers": [],
    "waitFor": 5000,
    "headers": {}
  }
}
Changes:
1. Removed the crawlOptions wrapper and moved allowSubdomains to the root level.
2. Removed scrapeOptions.options and moved headers directly into scrapeOptions.

The v2 API expects a flat structure, not nested option objects.
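The two changes can also be applied programmatically. This is a minimal sketch, assuming the rejected body looked the way the explanation describes (a crawlOptions wrapper at the root and an options object inside scrapeOptions); the original nested body is not shown in this thread, so `nested_body` is illustrative and `flatten_crawl_body` is a hypothetical helper name.

```python
import copy


def flatten_crawl_body(body: dict) -> dict:
    """Hoist crawlOptions to the root and scrapeOptions.options into scrapeOptions."""
    flat = copy.deepcopy(body)  # leave the caller's body untouched

    # Change 1: remove the crawlOptions wrapper, move its properties to the root.
    for key, value in (flat.pop("crawlOptions", None) or {}).items():
        flat.setdefault(key, value)  # an existing root-level value wins

    # Change 2: remove scrapeOptions.options, keep only direct properties.
    scrape = flat.get("scrapeOptions")
    if isinstance(scrape, dict):
        for key, value in (scrape.pop("options", None) or {}).items():
            scrape.setdefault(key, value)

    return flat


# Illustrative nested input (assumed shape, not copied from the thread).
nested_body = {
    "url": "https://example.com",
    "limit": 5,
    "crawlOptions": {"allowSubdomains": True},
    "scrapeOptions": {"formats": ["markdown"], "options": {"headers": {}}},
}
flat_body = flatten_crawl_body(nested_body)
```

Here `setdefault` is a design choice: if a key appears both at the target level and inside the removed wrapper, the value already at the target level is kept.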
Gaurav Chadha, 4w ago
You can refer to this for the supported v2 scrape options: https://docs.firecrawl.dev/api-reference/endpoint/scrape
