Firecrawl


Firecrawl Pro, n8n, Claude Code - upgraded but unsure how to continue.

I have subscriptions to each of those, and I have a list of 7,000 URLs I need to scrape from nrd.gov. Could someone suggest the most efficient way to scrape them all so I can add them to my Supabase backend?
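
Not an official answer, but a minimal sketch of one way to wire this up, assuming the v1 /scrape REST endpoint and a Supabase table named `pages` (table and column names here are hypothetical):

```python
import os
import requests
from supabase import create_client  # pip install supabase

FIRECRAWL_KEY = os.environ["FIRECRAWL_API_KEY"]
supabase = create_client(os.environ["SUPABASE_URL"], os.environ["SUPABASE_KEY"])

def scrape_to_supabase(urls):
    for url in urls:
        # Scrape one page as markdown via the Firecrawl REST API.
        resp = requests.post(
            "https://api.firecrawl.dev/v1/scrape",
            headers={"Authorization": f"Bearer {FIRECRAWL_KEY}"},
            json={"url": url, "formats": ["markdown"]},
            timeout=120,
        )
        data = resp.json().get("data", {})
        # Insert into a hypothetical "pages" table with url + markdown columns.
        supabase.table("pages").insert(
            {"url": url, "markdown": data.get("markdown", "")}
        ).execute()
```

If your plan supports it, a batch-scrape job (submitting many URLs at once) would likely be more efficient than 7,000 sequential calls, but the loop above is the simplest thing that works.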

Search Query Operators

I am using the /search endpoint, and we're making pretty heavy use of the query operators described here: https://docs.firecrawl.dev/api-reference/endpoint/search#supported-query-operators. I just had a quick question about recommended usage and limits. Is there a maximum number of operators we can include in a single query? Is there a recommended limit we should stay under for performance reasons, or is it really a free-for-all? Our use case is to exclude a fair number of URLs from our Firecrawl queries, and I couldn't find any guidance in the documentation on whether it matters if we use 10 query operators, 50 query operators, etc.
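
For reference, this is the kind of request being described: a minimal sketch assuming the v1 /search REST endpoint and the `-site:` exclusion operator from the linked docs page (whether stacking dozens of operators degrades performance is exactly the open question):

```python
import os
import requests

# One /search call whose query string stacks several exclusion operators.
query = "firecrawl tutorial -site:reddit.com -site:pinterest.com -site:quora.com"

resp = requests.post(
    "https://api.firecrawl.dev/v1/search",
    headers={"Authorization": f"Bearer {os.environ['FIRECRAWL_API_KEY']}"},
    json={"query": query, "limit": 10},
    timeout=60,
)
print(resp.json())
```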

New to scraping. Have over 100k pages I need to scrape and maintain.

I've been doing some AI research and I'm stuck between Browserbase and Firecrawl for efficiency and cost. We're a bootstrapped startup, so we're worried about taking the wrong steps. Any advice? Are there any services that do this for you?

How to execute an API fetch inside a public web page

When I use var aResult = await app.scrapeUrl( aURL, { ... } ), I get a nice table with many rows. The web page has a form for filtering rows. The form's key-value pairs get sent to the web server via an API call.
In DevTools, I see this.ajax( aURL, "GET", { data: { key1: value, key2: value, ... } ) ...
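
One possible direction (a sketch, not a confirmed answer): since the form's key/value pairs travel as GET query parameters, you can often rebuild the filtered URL yourself and pass it to the scrape endpoint. The endpoint and parameter names below are placeholders for whatever DevTools shows:

```python
import os
from urllib.parse import urlencode
import requests

# Rebuild the filtered URL from the key/value pairs seen in DevTools.
base_url = "https://example.com/table-endpoint"   # hypothetical endpoint
params = {"key1": "value1", "key2": "value2"}      # placeholders
filtered_url = f"{base_url}?{urlencode(params)}"

# Then scrape the filtered URL like any other page.
resp = requests.post(
    "https://api.firecrawl.dev/v1/scrape",
    headers={"Authorization": f"Bearer {os.environ['FIRECRAWL_API_KEY']}"},
    json={"url": filtered_url, "formats": ["markdown"]},
    timeout=120,
)
print(resp.json().get("data", {}).get("markdown", ""))
```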

maxDiscoveryDepth

Hello, I would like to ask how maxDiscoveryDepth works. Right now I am setting the depth to two and the limit to 10 for https://books.toscrape.com/ to test these parameters, but I somehow don't get it. The results were like this: ``` "data": [ { "links": [ ... },...
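
For anyone testing the same thing, here is a rough sketch of the request being described. As I read the docs, `maxDiscoveryDepth` bounds how many link-hops away from the start URL the crawler will discover pages, while `limit` caps the total number of pages scraped, so with depth 2 and limit 10 the limit can be hit before the depth is exhausted. The endpoint path and the async polling flow are my reading of the docs, not confirmed:

```python
import os
import time
import requests

headers = {"Authorization": f"Bearer {os.environ['FIRECRAWL_API_KEY']}"}

# Start a crawl with the settings from the question: depth 2, limit 10.
start = requests.post(
    "https://api.firecrawl.dev/v2/crawl",   # assuming the v2 crawl endpoint
    headers=headers,
    json={
        "url": "https://books.toscrape.com/",
        "maxDiscoveryDepth": 2,   # link-hops from the start URL to follow
        "limit": 10,              # hard cap on pages scraped across all depths
    },
    timeout=60,
).json()

# Crawls are asynchronous: poll the job id until it finishes.
job_id = start["id"]
while True:
    status = requests.get(
        f"https://api.firecrawl.dev/v2/crawl/{job_id}", headers=headers, timeout=60
    ).json()
    if status.get("status") in ("completed", "failed"):
        break
    time.sleep(2)

print(status.get("status"), len(status.get("data", [])), "pages")
```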

Facing an issue while scraping a blog website with Firecrawl

Hey, I am facing an issue here. While scraping one of the blog websites, Firecrawl is still generating the main website's content. For example, I am scraping (www.main-brand.com/blog/any-blog-content), but it still gives me scraped content for (www.main-brand.com) only. It seems scraping is not working for any /blog subsection of the domain. Is there any prerequisite for scraping blog pages? Am I missing something here? I have tried all the suggested corrections, like zero maxAge, stealth mode, main content: false, etc., but still get no correct response. ...
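
For comparison, here is how those options look as a single request (a sketch; the parameter spellings are my reading of the docs, and the URL is the placeholder from the post). Checking the returned metadata can show whether the site is redirecting /blog paths back to the homepage:

```python
import os
import requests

resp = requests.post(
    "https://api.firecrawl.dev/v1/scrape",
    headers={"Authorization": f"Bearer {os.environ['FIRECRAWL_API_KEY']}"},
    json={
        "url": "https://www.main-brand.com/blog/any-blog-content",  # placeholder
        "formats": ["markdown"],
        "maxAge": 0,              # bypass any cached result
        "proxy": "stealth",       # stealth proxy mode
        "onlyMainContent": False, # return the full page, not just the main column
    },
    timeout=120,
)
data = resp.json().get("data", {})
# If /blog requests are being redirected to the homepage, it should show here.
print(data.get("metadata", {}).get("sourceURL"),
      data.get("metadata", {}).get("statusCode"))
```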

Disappointing quality of PDF page scraping

We have several documents that contain tables with a lot of relevant information for AI tools. How can I create good markdown files from these PDFs?
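
As a starting point, a PDF URL can be passed to the scrape endpoint like any other URL and markdown requested back; whether the table structure survives well enough is the real question here. A minimal sketch (the URL and output filename are placeholders):

```python
import os
import requests

resp = requests.post(
    "https://api.firecrawl.dev/v1/scrape",
    headers={"Authorization": f"Bearer {os.environ['FIRECRAWL_API_KEY']}"},
    json={"url": "https://example.com/report.pdf", "formats": ["markdown"]},
    timeout=180,
)
markdown = resp.json().get("data", {}).get("markdown", "")

# Save the converted document for review.
with open("report.md", "w", encoding="utf-8") as f:
    f.write(markdown)
```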

Extract (with fire-agent) taking a long time.

Why is extract (with fire-agent) taking so long?

SCRAPE_SITE_ERROR : ERR_TUNNEL_CONNECTION_FAILED

Hello, I have this error on two different websites: {"success":false,"code":"SCRAPE_SITE_ERROR","error":"Specified URL is failing to load in the browser. Error code: ERR_TUNNEL_CONNECTION_FAILED"}. It happens 50% of the time, and for the same URL it sometimes works and sometimes doesn't. A few months ago, it worked fine....
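
Not a root-cause fix, but since the failure is intermittent, a retry loop usually papers over it while the proxy issue gets investigated. A sketch (function name and backoff values are arbitrary):

```python
import os
import time
import requests

def scrape_with_retry(url, attempts=4):
    """Retry a flaky scrape with exponential backoff (workaround sketch only)."""
    for i in range(attempts):
        resp = requests.post(
            "https://api.firecrawl.dev/v1/scrape",
            headers={"Authorization": f"Bearer {os.environ['FIRECRAWL_API_KEY']}"},
            json={"url": url, "formats": ["markdown"]},
            timeout=120,
        )
        body = resp.json()
        if body.get("success"):
            return body["data"]
        time.sleep(2 ** i)  # 1s, 2s, 4s between attempts
    raise RuntimeError(f"Scrape kept failing for {url}: {body}")
```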

/crawl robots.txt

Does /crawl respect the robots.txt Crawl-delay?

Help Request: /map + search not returning product URLs

I'm using the /v2/map endpoint with a search query to discover product detail page URLs, but I've run into an issue where some PDPs aren't being returned even though they're live and accessible on the site. Example Site: https://www.bstgroup.eu Query: "PRF Mouse"...
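
For debugging, it can help to separate the two steps: run the map with the search term (as described), then run it again without `search` and filter locally, to see whether the missing PDPs are being discovered at all or only dropped by the search ranking. A sketch, assuming the response carries a `links` array:

```python
import os
import requests

headers = {"Authorization": f"Bearer {os.environ['FIRECRAWL_API_KEY']}"}

# The request as described in the post: search narrows the mapped URLs.
mapped = requests.post(
    "https://api.firecrawl.dev/v2/map",
    headers=headers,
    json={"url": "https://www.bstgroup.eu", "search": "PRF Mouse"},
    timeout=120,
).json()

# Map the whole site without `search`, then filter locally to check discovery.
full_map = requests.post(
    "https://api.firecrawl.dev/v2/map",
    headers=headers,
    json={"url": "https://www.bstgroup.eu"},
    timeout=120,
).json()
links = full_map.get("links", [])
print([l for l in links if "prf" in str(l).lower()])
```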

How can I use Firecrawl to crawl all the content from my Twitter search results?

Previously, I would search for specific keywords on Twitter and read through the posts one by one to understand people's current opinions and announcements. Now, how can I use Firecrawl to crawl this information directly and extract frequently mentioned keywords?...

Firecrawl image on arm64?

Does the Firecrawl image work on an arm64 VPS?

Take a long screenshot

Hi, I'm having issues taking screenshots of long pages. The screenshot cuts off halfway, returning the rest of the page in white. I thought it might be a size issue, so I had the idea of capturing multiple screenshots at different spots and stitching them together, but I haven't been able to get this to work (it just takes multiple screenshots of the initial viewport)...
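
In case it helps, the docs list a full-page variant of the screenshot format; a sketch of that request is below (the exact format string is my reading of the v1 docs, and the URL is a placeholder). If even the full-page format truncates on very long pages, that seems worth reporting as a separate bug:

```python
import os
import requests

# Request the full-page screenshot format instead of the default viewport capture.
resp = requests.post(
    "https://api.firecrawl.dev/v1/scrape",
    headers={"Authorization": f"Bearer {os.environ['FIRECRAWL_API_KEY']}"},
    json={"url": "https://example.com/very-long-page",
          "formats": ["screenshot@fullPage"]},
    timeout=180,
)
screenshot = resp.json().get("data", {}).get("screenshot")
print(screenshot)  # typically a link (or base64 payload) to the captured image
```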

Scraping not returning with FIRE-1

Hello, I have a scraping task (all in Python). My approach is to use the default scraper, review the schema results, and if they aren't great, try again using FIRE-1. I'm using async in Python to wrap a timeout around this, and with a 120-second timeout the FIRE-1 results are still not returning. In the playground, however, I do see it work. ...
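
One thing worth checking is whether the 120-second ceiling is simply too tight for FIRE-1, which drives a browser agent and often runs well past two minutes. A rough sketch of the default-then-FIRE-1 flow with a larger budget for the agent pass (the `agent` parameter spelling is my reading of the docs, not confirmed):

```python
import asyncio
import os
import requests

HEADERS = {"Authorization": f"Bearer {os.environ['FIRECRAWL_API_KEY']}"}

def scrape(url, use_fire1=False):
    # Blocking request; "agent" is my reading of how FIRE-1 is requested.
    body = {"url": url, "formats": ["markdown"]}
    if use_fire1:
        body["agent"] = {"model": "FIRE-1"}
    # The HTTP timeout should be at least as long as the asyncio budget.
    return requests.post(
        "https://api.firecrawl.dev/v1/scrape",
        headers=HEADERS, json=body, timeout=300,
    ).json()

async def scrape_with_fallback(url):
    # First pass with the default scraper, bounded to 120 s.
    result = await asyncio.wait_for(asyncio.to_thread(scrape, url), timeout=120)
    if result.get("success"):
        return result
    # FIRE-1 routinely needs more time, so give the fallback a larger budget.
    return await asyncio.wait_for(asyncio.to_thread(scrape, url, True), timeout=300)

# asyncio.run(scrape_with_fallback("https://example.com"))
```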

Unable to scrape and extract data. Starts to parallelise, then nothing happens

URL is this: https://am-arrowmax.com/collections/* Prompt:
Follow every product link to ensure all variants, including sold-out ones, are included. Capture the title, all variants (size, color, etc.), images per product and variant, price, and product/variant URLs. Ensure a clear mapping of product to variants to images to price to URLs.
The Schema was auto generated...
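
One way to narrow this down is to retry the job with a much smaller, hand-written schema instead of the auto-generated one, to see whether the extract completes at all with less work to do. A sketch, assuming the extract endpoint and its async polling flow as I read them in the docs (the schema field names here are hypothetical):

```python
import os
import time
import requests

headers = {"Authorization": f"Bearer {os.environ['FIRECRAWL_API_KEY']}"}

# Deliberately small schema with hypothetical field names.
schema = {
    "type": "object",
    "properties": {
        "products": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "title": {"type": "string"},
                    "url": {"type": "string"},
                    "price": {"type": "string"},
                },
            },
        }
    },
}

start = requests.post(
    "https://api.firecrawl.dev/v1/extract",   # assuming the v1 extract endpoint
    headers=headers,
    json={
        "urls": ["https://am-arrowmax.com/collections/*"],
        "prompt": "Capture the title, all variants, images, price, and product URLs.",
        "schema": schema,
    },
    timeout=60,
).json()

# Extract jobs run asynchronously, so poll the job id until it resolves.
while True:
    status = requests.get(
        f"https://api.firecrawl.dev/v1/extract/{start['id']}",
        headers=headers, timeout=60,
    ).json()
    if status.get("status") in ("completed", "failed", "cancelled"):
        break
    time.sleep(5)
print(status.get("status"))
```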

Country not supported in proxies

Is there a way to scrape a site whose country is not supported in the proxy list? For example, I am from Greece, and some specific sites I want to scrape are only accessible from my country!...
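
For reference, the scrape options include a `location` hint with an ISO 3166-1 country code (my reading of the docs; the URL below is a placeholder). Whether the requested country is actually available in the proxy pool, or the request silently falls back to another region, is presumably the question here:

```python
import os
import requests

resp = requests.post(
    "https://api.firecrawl.dev/v1/scrape",
    headers={"Authorization": f"Bearer {os.environ['FIRECRAWL_API_KEY']}"},
    json={
        "url": "https://example.gr/some-page",   # placeholder for the geo-blocked site
        "formats": ["markdown"],
        "location": {"country": "GR", "languages": ["el"]},
    },
    timeout=120,
)
# A non-200 status in the metadata would suggest the geo-restriction still applied.
print(resp.json().get("data", {}).get("metadata", {}).get("statusCode"))
```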