Firecrawl

Join the community to ask questions about Firecrawl and get answers from other members.


Help with scraping a paginated website using Firecrawl Actions

Hey all! I'm new to Firecrawl. I have a website I'm looking to scrape that is paginated (think several "next" and "previous" buttons that JavaScript-render different content without changing the URL). How would I do it with actions? I'm familiar with "wait" and "click", but how would I actually aggregate the scraped content for each paginated page? The "scrape" action isn't working, but I'm probably using it wrong: "actions": [...
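A minimal sketch of what this could look like, assuming the v2 Python SDK and a hypothetical "next" selector: each "scrape" action should capture a snapshot of the page at that point in the action chain, and those snapshots come back alongside the final document (typically under the response's action results; confirm the exact shape against your SDK version).

```python
from firecrawl import Firecrawl

firecrawl = Firecrawl(api_key="fc-YOUR_KEY")

NEXT_BUTTON = "button.next"  # hypothetical selector for the site's "next" button

doc = firecrawl.scrape(
    "https://example.com/listing",                     # placeholder URL
    formats=["markdown"],
    actions=[
        {"type": "wait", "milliseconds": 2000},
        {"type": "scrape"},                            # snapshot of page 1
        {"type": "click", "selector": NEXT_BUTTON},
        {"type": "wait", "milliseconds": 2000},
        {"type": "scrape"},                            # snapshot of page 2
    ],
)

# The top-level markdown is the final page state; the intermediate
# "scrape" snapshots are returned separately in the action results.
print(doc.markdown)
```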

Simple hello world on Search returning an error about the "integration" parameter in the payload

I'm getting the following error from the simplest possible search code:
firecrawl.v2.utils.error_handler.BadRequestError: Bad Request: Failed to search. Invalid request body - [{'code': 'custom', 'message': "Invalid enum value. Expected 'dify' | 'zapier' | 'pipedream' | 'raycast' | 'langchain' | 'crewai' | 'llamaindex' | 'n8n' | 'camelai' | 'make' | 'flowise' | 'metagpt' | 'relevanceai'", 'path': ['integration']}]
The code is: ```python ...
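For comparison, a minimal v2 search call that sends no "integration" field at all (a sketch, assuming the v2 Python SDK). If this still trips the enum check above, the value is being injected by the installed firecrawl-py release rather than by your code, so checking and pinning the SDK version would be the next step.

```python
from firecrawl import Firecrawl

firecrawl = Firecrawl(api_key="fc-YOUR_KEY")

# Plain query with no extra parameters, so nothing should populate
# the "integration" field that the error complains about.
results = firecrawl.search("firecrawl hello world", limit=3)
print(results)
```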

Extract playground returns expected results on web. API extract returns empty.

I used the web-based extract playground to figure out prompt + URL and it is returning the expected results. However, when I copy the generated code from the playground to a python file, each API call returns an empty set. ```from firecrawl import Firecrawl...
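A minimal sketch of the API-side call for comparison, assuming the v2 Python SDK; the prompt and URL below are placeholders standing in for the playground values. If this also comes back empty, comparing the raw response (status, warnings, credit usage) against the playground run is usually the quickest way to spot the difference.

```python
from firecrawl import Firecrawl

firecrawl = Firecrawl(api_key="fc-YOUR_KEY")

# Placeholder prompt/URL; substitute the ones that worked in the playground.
result = firecrawl.extract(
    urls=["https://example.com/pricing"],
    prompt="Extract the plan names and their monthly prices.",
)
print(result)
```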

Help http request firecrawl v2 in N8N

Could someone help me with cURL for HTTP requests within n8n for "map," "extract," "search," and "batch scrape"? I only managed to get "scrape" and "crawl" to work....
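A sketch of the raw HTTP shape for "map" and "search", assuming the v2 REST endpoints; the same JSON bodies and Bearer header translate directly into an n8n HTTP Request node or a cURL command, and "extract" and "batch scrape" follow the same POST-plus-JSON pattern.

```python
import requests

API_KEY = "fc-YOUR_KEY"
HEADERS = {"Authorization": f"Bearer {API_KEY}", "Content-Type": "application/json"}
BASE = "https://api.firecrawl.dev/v2"   # assumed v2 base URL

# /map : discover the URLs on a site
map_resp = requests.post(f"{BASE}/map", headers=HEADERS,
                         json={"url": "https://example.com"})

# /search : run a web search
search_resp = requests.post(f"{BASE}/search", headers=HEADERS,
                            json={"query": "firecrawl", "limit": 3})

print(map_resp.json())
print(search_resp.json())
```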

FirecrawlApp or Firecrawl class help

Why does it still say that the Firecrawl class doesn't exist and make me use FirecrawlApp? I am on version 4.3.1 and the docs say there is no FirecrawlApp. ImportError: cannot import name 'Firecrawl' from 'firecrawl'...
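A quick sanity check worth running in the same environment the script uses (a sketch; the distribution name matches the firecrawl-py package mentioned elsewhere in this list): if the printed version is older than the docs assume, the Firecrawl class simply isn't there yet and only FirecrawlApp is exported.

```python
# Confirm which firecrawl-py is actually installed where the script runs.
from importlib.metadata import version
print(version("firecrawl-py"))

# On recent releases the v2 client is exported as Firecrawl;
# an ImportError here points at an older install being picked up.
from firecrawl import Firecrawl
```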

Detecting Redirects

I'm using the scrape endpoint; is it possible to determine if you've been redirected from the original URL you supplied? Would love to extract the final URL I land on in this case.
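A sketch of one way to check, assuming the v2 Python SDK: the scrape response carries metadata that includes the URL you supplied (sourceURL) and, in my experience, information about the page Firecrawl actually landed on; dumping the metadata and comparing it against the requested URL is the simplest way to confirm which fields your version exposes.

```python
from firecrawl import Firecrawl

firecrawl = Firecrawl(api_key="fc-YOUR_KEY")

requested = "http://example.com/old-page"   # placeholder URL
doc = firecrawl.scrape(requested, formats=["markdown"])

# Field names vary by version; inspect the metadata and compare the
# reported/final URL against the one you requested to detect a redirect.
print(requested)
print(doc.metadata)
```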

Got a pydantic error when importing firecrawl Python v2

Hi, I've just imported Firecrawl / AsyncFirecrawl (from firecrawl import Firecrawl, AsyncFirecrawl) in a FastAPI application and got an error: pydantic.errors.PydanticUserError: Decorators defined with incorrect fields: firecrawl.v2.types.SearchRequest:4709679808.validate_parsers (use check_fields=False if you're inheriting from the model and intended this) Version: - Pydantic: 2.11.7 - firecrawl-py: 4.1.0...

ID change between extract and get extract status

Hi, I have a weird thing happening in n8n. When I trigger an extract I receive an ID. Normal. When I use "get extract status" with this ID, the status is "processing", but when I check on the playground I see that the extract is done and the ID is not the same as the one I got. When I manually copy the ID from the playground into my n8n "get extract status" node, it works and I receive the data. Anyone have a clue what's happening?...

Excluding hidden elements from HTML or Markdown

Hey all... I'm crawling a page that has display: none elements, and they are being included in both the HTML and the Markdown. Is there any way to exclude these? Thanks!...
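I'm not aware of a built-in switch for this, so as a workaround sketch: request the HTML format and strip hidden elements yourself before converting or passing it on. Note this only catches inline style attributes, not elements hidden via CSS classes or stylesheets.

```python
from bs4 import BeautifulSoup  # pip install beautifulsoup4

def strip_hidden(html: str) -> str:
    """Remove elements whose inline style hides them."""
    soup = BeautifulSoup(html, "html.parser")
    for el in soup.find_all(style=True):
        style = el["style"].replace(" ", "").lower()
        if "display:none" in style or "visibility:hidden" in style:
            el.decompose()
    return str(soup)

# Usage: feed it the "html" format returned by a Firecrawl scrape.
cleaned = strip_hidden("<div>visible</div><div style='display: none'>hidden</div>")
print(cleaned)  # -> <div>visible</div>
```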

Firecrawl seems to ignore URL hash fragments for pagination.

- Site: https://europa.provincia.bz.it/it/bandi-e-avvisi (pagination via #start=N)
- Expectation (Playwright): total=15, page 0 -> 10 items, page 1 -> 5 items (selector div.result.lv_faq).
- Firecrawl scrape_url results: page 1 returns the same content as page 0 or misses p.result_status entirely.
- Tried: formats=["html"], formats=["rawHtml"], wait_for=12000–20000, max_age=0 (fresh fetch), correct base URL from container to host. ...
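One workaround to try (a sketch, assuming the v2 Python SDK and that the actions API supports running JavaScript in the page): since the #start=N fragment is evidently not reaching the rendered page, change the hash inside the browser session instead of in the request URL, then wait for the list to re-render.

```python
from firecrawl import Firecrawl

firecrawl = Firecrawl(api_key="fc-YOUR_KEY")

doc = firecrawl.scrape(
    "https://europa.provincia.bz.it/it/bandi-e-avvisi",
    formats=["html"],
    actions=[
        {"type": "wait", "milliseconds": 3000},
        # Drive the pagination client-side; "#start=10" should be page 2.
        {"type": "executeJavascript", "script": "window.location.hash = '#start=10';"},
        {"type": "wait", "milliseconds": 3000},
    ],
)
print(doc.html)
```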

Building Chrome Extension

Hi, can someone from the team, or anyone who has tried this, tell me whether we can build a Chrome extension? If so, please point me to any video, guide, or documentation.

Search w/ JSON

Hey everyone! I am trying to test Firecrawl against our app's current Perplexity integration, doing a bit of a head-to-head comparison between sonar-pro with structured JSON and Firecrawl's /search + JSON schema extraction. First, I am struggling to get search to return ANY JSON from the search SDK/endpoint. Is this working as of v2? I'm just trying to get any search or scrape result to return JSON according to my simple testing schema, and neither is working for search OR scrape. I'm using the v2 SDK....
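For the scrape half of the comparison, this is roughly the shape that should return structured output on v2 (a sketch; older SDK releases spell it as formats=["json"] plus a separate JSON-options argument, so the exact spelling depends on your version). Getting scrape to emit JSON first usually makes it easier to tell whether the search side is the part that's failing.

```python
from firecrawl import Firecrawl

firecrawl = Firecrawl(api_key="fc-YOUR_KEY")

# Simple placeholder schema for testing.
schema = {
    "type": "object",
    "properties": {
        "title": {"type": "string"},
        "summary": {"type": "string"},
    },
    "required": ["title"],
}

doc = firecrawl.scrape(
    "https://example.com/article",   # placeholder URL
    formats=[{
        "type": "json",
        "prompt": "Extract the title and a one-sentence summary.",
        "schema": schema,
    }],
)
print(doc.json)  # structured output; attribute name may differ by SDK version
```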

Trouble extracting URLs from a JavaScript-based paginated search

Hi everyone, I'm struggling with scraping a list of URLs from a search results page. The site uses JavaScript to load 20 results at a time, and to get the next 20, I have to click a "Next" button. There’s no change in the URL or any query parameters that I can use to trigger pagination directly. I was thinking about using actions in Firecrawl to simulate clicking through the pages and extracting the links, but I’m not sure if that’s the right approach or the most efficient one....
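Actions are a reasonable fit here. One way to structure it (a sketch, assuming the v2 Python SDK, the "links" format, and a hypothetical "Next" selector) is one scrape per page, where page i is reached by clicking "Next" i times, and the link lists are merged afterwards.

```python
from firecrawl import Firecrawl

firecrawl = Firecrawl(api_key="fc-YOUR_KEY")

NEXT_BUTTON = "button.next"                        # hypothetical selector
SEARCH_URL = "https://example.com/search?q=term"   # placeholder URL
all_links = set()

# One request per page: page i is reached by clicking "Next" i times.
for page in range(3):
    actions = []
    for _ in range(page):
        actions += [
            {"type": "click", "selector": NEXT_BUTTON},
            {"type": "wait", "milliseconds": 2000},
        ]
    doc = firecrawl.scrape(SEARCH_URL, formats=["links"], actions=actions)
    all_links.update(doc.links or [])

print(sorted(all_links))
```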

Have a problem working with Firecrawl (I'm a beginner)

In all of the tutorials on YouTube for Firecrawl, the first step of extracting is to "generate parameters". The problem is that I can't seem to find this "generate parameters" option. I know it's probably a really noob question, but any help is appreciated 🙂

Glitch or Bug in n8n+Firecrawl Extract

Hey all. Really weird one here. My n8n workflow using Firecrawl Extract was working perfectly 2 days ago. Now, after the Firecrawl EXTRACT node runs (and I have tested this with an HTTP Request node as well), when the workflow executes the next node, the Check Job Status Firecrawl node, it bizarrely changes the job ID that was passed as input from the EXTRACT node. It turns out it's weirdly triggering a new extract job in the previous node, and then using that new extract job ID in the Check Job...

Firecrawl API being consumed nonstop

Can someone help me? My Firecrawl API is being consumed but I am not using it!? I am not using it and it is still being consumed :((((((((((((...

Output like Apify

Hi, I'm using Firecrawl in conjunction with n8n to crawl some news websites. I was using Apify but gave up, as their support was non-existent. I was crawling news articles, and the Apify node managed to give me the article body as a text field. It had all of the other links, widgets, etc. in separate fields, and just the main article text of each site in a field named 'text'. When I attempt to get the same output with Firecrawl, I'm getting differently named fields for different sites, and a large amount of unrelated info in them. While I can use it, there's way more data to parse, as well as the added complexity of differently named fields, and my AI agents will use 2-3x more tokens per site because of that. Is there any way I can get Firecrawl to put just the article bodies in a field named the same way for each different website I scrape?...
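One way to get a stable field name across sites is to stop relying on whatever per-site markdown comes back and instead request JSON against a single fixed schema reused for every URL (a sketch, assuming the v2 Python SDK; the json-format spelling differs between SDK versions). Every site then yields the same "text" field for the agents to consume.

```python
from firecrawl import Firecrawl

firecrawl = Firecrawl(api_key="fc-YOUR_KEY")

# One schema reused for every news site, so the output field is always "text".
article_schema = {
    "type": "object",
    "properties": {
        "text": {"type": "string", "description": "Main article body only, as plain text"},
    },
    "required": ["text"],
}

doc = firecrawl.scrape(
    "https://example-news-site.com/some-article",   # placeholder URL
    formats=[{
        "type": "json",
        "prompt": "Return only the main article body as plain text.",
        "schema": article_schema,
    }],
)
print(doc.json["text"])  # same field name for every site; shape may vary by SDK version
```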

Struggling with paginated sources - Firecrawl amazing for details, painful for lists

Hey team! Love Firecrawl for single page extraction (the JSON schema feature is amazing), but I'm hitting a wall with paginated list pages. My use case: Monitoring Italian government tender dashboards that update daily. Need to track new tenders across multiple paginated pages. The problem:...

Why can't I crawl YouTube even when using the proxy?

payload = {"url": url, "onlyMainContent": False, "formats": ["markdown"],"proxy": "stealth", "location": {"country": "US"}}
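For reference, a sketch of that same payload sent against the v2 scrape endpoint over plain HTTP (the endpoint path and video URL below are assumptions); inspecting the status code and error body of the response is usually the quickest way to see whether the stealth proxy is being applied or the target is simply refusing the request.

```python
import requests

resp = requests.post(
    "https://api.firecrawl.dev/v2/scrape",           # assumed v2 endpoint
    headers={"Authorization": "Bearer fc-YOUR_KEY"},
    json={
        "url": "https://www.youtube.com/watch?v=VIDEO_ID",   # placeholder
        "onlyMainContent": False,
        "formats": ["markdown"],
        "proxy": "stealth",
        "location": {"country": "US"},
    },
)
print(resp.status_code)
print(resp.json())
```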