Firecrawl


Join the community to ask questions about Firecrawl and get answers from other members.


I'm trying to use it in TRAE and I'm getting this error. What could it be?

'env' is not recognized as an internal or external command, operable program or batch file....

Crawl endpoint retrieving wrong sourceURL

I've found a mismatch in the crawl endpoint. When I scrape a specific website, the sourceURL is returned as if it were the currentURL. However, when I run the same URL in scrape mode, the response looks correct and the sourceURL is fine. The URL I'm testing with is: https://forum.tufin.com/support/kc/ext/tm/ On the crawl I receive this sourceURL: https://forum.tufin.com/support/kc/ext/tm/Content/ext/tm/intro.htm On the scrape I receive this sourceURL: https://forum.tufin.com/support/kc/ext/tm/...

Scraping site for logged-in users

Can it scrape sites that require users to be logged in? Specifically, does Firecrawl support cookie-based sessions (or other session/auth replay methods)? If so, please describe how to provide cookies/session tokens, handle CSRF, and any limits or best practices (session rotation, rate limits, anti-bot handling). Thanks!
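One commonly used approach (a sketch, not an official recommendation): Firecrawl's scrape request accepts a `headers` object, so a session cookie captured from a logged-in browser can be forwarded with each request. The URL and cookie values below are hypothetical:

```python
import json

def build_scrape_payload(url: str, cookie: str) -> dict:
    """Build a /v1/scrape request body that replays a logged-in session
    by forwarding a Cookie header with the request."""
    return {
        "url": url,
        "formats": ["markdown"],
        # Cookie string captured from a logged-in browser session (hypothetical values).
        "headers": {"Cookie": cookie},
    }

payload = build_scrape_payload(
    "https://example.com/account",          # hypothetical logged-in page
    "sessionid=abc123; csrftoken=xyz789",   # hypothetical cookie values
)
print(json.dumps(payload, indent=2))
```

The payload would then be POSTed to the scrape endpoint with your API key. CSRF tokens and session expiry still have to be managed on your side, and replayed sessions should respect the target site's terms.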

Getting Empty Response with Extract Node

Hi, can someone help with the extract node in n8n? It gives a success response with an ID, but when I use that ID to run "get extract data", the response is empty. I tried hitting it with curl in Postman as well and got the same issue. Even with the curl that works in the playground, I get the same issue. Kindly help...

JSON schema not working like earlier

I'm on a hobby plan and was using the JSON schema (v2) method to scrape market research websites, extracting information with the payload attached here. Now I'm getting almost no data for the table of contents from these websites. Has something changed in the API? If not, can someone help me with a JSON schema that Firecrawl will respect? I'm facing pressure from stakeholders, as this affects a production app.

Getting intermittent error page from sites ONLY when using Firecrawl

Hello there! I've successfully loaded the following site through other scrapers, including Playwright, which I know Firecrawl uses under the hood. https://www.capptegy.com/events ...

Why is the markdown always empty when I scrape a page with a pre tag / code block?

I want to get the code blocks from a documentation site, but they always come back empty.

...
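If the markdown output drops `<pre>` content, one workaround is to also request the raw HTML from the scrape (Firecrawl exposes a raw-HTML format alongside markdown) and pull the code blocks out yourself. A stdlib-only sketch:

```python
from html.parser import HTMLParser

class PreExtractor(HTMLParser):
    """Collect the text inside <pre> blocks from raw HTML."""
    def __init__(self):
        super().__init__()
        self.depth = 0      # nesting level of <pre> tags
        self.blocks = []    # completed code blocks
        self._buf = []      # text accumulated inside the current <pre>

    def handle_starttag(self, tag, attrs):
        if tag == "pre":
            self.depth += 1

    def handle_endtag(self, tag):
        if tag == "pre" and self.depth:
            self.depth -= 1
            if self.depth == 0:
                self.blocks.append("".join(self._buf))
                self._buf = []

    def handle_data(self, data):
        if self.depth:
            self._buf.append(data)

def extract_code_blocks(raw_html: str) -> list[str]:
    parser = PreExtractor()
    parser.feed(raw_html)
    return parser.blocks

html_doc = '<h1>Docs</h1><pre><code>pip install firecrawl-py</code></pre>'
print(extract_code_blocks(html_doc))  # ['pip install firecrawl-py']
```

You would feed in the raw-HTML field of the scrape response; the markdown field can still be used for the surrounding prose.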

Clarification on Firecrawl Data Servers, Retention, and Deletion

I had a quick question regarding Firecrawl's data handling. Could you clarify where your servers are located (region-wise), how long crawl data is stored by default, and whether users can fully delete their data (including logs/backups) if needed? Lastly, does Firecrawl comply with European data protection laws (like the GDPR)?

User Segmentation and pincode level support

I'm using a Dify workflow. I need to pull selling prices based on locality (like pincode) and on user type (premium, normal, new user). Does Firecrawl support extracting prices this way? Specifically, can it handle geo-based pricing and differentiate by user type or session? If not, any tips on how to achieve this would be awesome...
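Segment-specific prices generally depend on what the site serves for a given session, so one approach is to capture a session cookie per user type and encode the pincode the way the site stores it (often also a cookie). A sketch that fans out one scrape payload per segment; the cookie names and the `meta` bookkeeping field are hypothetical (strip `meta` before sending the payload):

```python
from itertools import product

def build_segment_payloads(url: str, pincodes: list, user_cookies: dict) -> list:
    """One scrape payload per (pincode, user-type) combination.
    Assumes the target site reads the pincode and session from cookies;
    the cookie names here are hypothetical and site-specific."""
    payloads = []
    for pin, (utype, cookie) in product(pincodes, user_cookies.items()):
        payloads.append({
            "url": url,
            "formats": ["markdown"],
            "headers": {"Cookie": f"pincode={pin}; {cookie}"},
            # Local bookkeeping only -- pop this key before POSTing,
            # so results can be mapped back to their segment.
            "meta": {"pincode": pin, "user_type": utype},
        })
    return payloads

payloads = build_segment_payloads(
    "https://shop.example.com/item/42",      # hypothetical product page
    ["560001", "110001"],
    {"premium": "session=aaa", "normal": "session=bbb"},
)
print(len(payloads))  # 4
```

Each payload is then a separate scrape call, and the extracted price is tagged with its segment on your side.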

How can I make sure my self-hosted Firecrawl can handle a large number of requests?

When requests come in at high volume, how can I handle them all?

Crawling isn't getting accurate data from site

When I try to crawl my site using the playground, I am unable to retrieve all the numerical data from the site. The extract feature also returns different results on re-runs, even with the same configuration and prompt. Link: https://softr.biglook.ai/...

Extract all PDF URLs from webpage

Hi, guys! I've been testing the /scrape endpoint with the JSON format to extract the PDF files from webpages. For pages with a small number of files it works very well, but for pages with lots of files it sometimes returns only a few and other times returns the full set. Since it's AI generated some variance can be expected, but is there a way to pass more instructions to the AI during the API call? I'm using the Python SDK.
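One way to avoid the nondeterminism entirely, assuming the PDFs are linked directly on the page: request the `links` format from the scrape instead of (or alongside) the LLM extraction, and filter for `.pdf` paths in code. A minimal sketch:

```python
from urllib.parse import urlparse

def pdf_links(links: list[str]) -> list[str]:
    """Filter a scraped link list down to PDF URLs deterministically,
    so the result does not depend on an LLM extraction."""
    out = []
    for link in links:
        # Compare the URL path only, so query strings don't break the match.
        path = urlparse(link).path.lower()
        if path.endswith(".pdf"):
            out.append(link)
    return out

scraped = [
    "https://example.com/reports/q1.pdf",
    "https://example.com/about",
    "https://example.com/files/guide.PDF?download=1",
]
print(pdf_links(scraped))  # both PDF links; the /about page is filtered out
```

The LLM route can still be kept for pages where PDFs are behind JavaScript or redirects, but for plain anchor tags this is exact and repeatable.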

How to set concurrency for Firecrawl self-hosted

I'm setting max_concurrency in my crawl options, but it doesn't seem to do anything; the Docker service is still running scrapers serially.
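On self-hosted deployments, crawl concurrency is typically governed server-side by the worker configuration (check the self-hosting docs for the relevant environment variables and worker count). Independently of that, individual scrape calls can be parallelized client-side. A sketch with a thread pool; `scrape_one` is a placeholder for the real HTTP call to your instance:

```python
from concurrent.futures import ThreadPoolExecutor

def scrape_one(url: str) -> str:
    # Placeholder for a real POST to your self-hosted /v1/scrape endpoint.
    return f"scraped:{url}"

def scrape_many(urls: list[str], max_workers: int = 5) -> list[str]:
    """Issue scrape calls in parallel from the client side,
    capped at max_workers requests in flight at once."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order in the results.
        return list(pool.map(scrape_one, urls))

results = scrape_many([f"https://example.com/p/{i}" for i in range(8)])
print(results[0])  # scraped:https://example.com/p/0
```

Client-side fan-out only helps for independent scrape calls; a single crawl job's internal concurrency still has to be raised on the server.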

Does the self-hosted have a rate-limiter?

How could we rate limit crawling/scraping on the self-hosted Firecrawl?
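If the self-hosted stack doesn't expose the limiting knob you need, a client-side token bucket in front of your own requests is one simple option. A sketch (the rate and capacity values are illustrative):

```python
import time

class TokenBucket:
    """Client-side limiter: at most `rate` requests per second,
    with bursts of up to `capacity` tokens."""
    def __init__(self, rate: float, capacity: int = 1):
        self.rate = rate
        self.capacity = capacity
        self.tokens = float(capacity)
        self.last = time.monotonic()

    def acquire(self) -> None:
        # Refill tokens based on elapsed time; sleep until one is available.
        while True:
            now = time.monotonic()
            self.tokens = min(self.capacity,
                              self.tokens + (now - self.last) * self.rate)
            self.last = now
            if self.tokens >= 1:
                self.tokens -= 1
                return
            time.sleep((1 - self.tokens) / self.rate)

bucket = TokenBucket(rate=5)  # ~5 scrape requests per second
start = time.monotonic()
for _ in range(3):
    bucket.acquire()  # call this before every request to the API
print(f"3 acquires took {time.monotonic() - start:.2f}s")
```

Wrapping every outbound scrape/crawl call with `bucket.acquire()` caps your request rate regardless of how many workers or loops are issuing them (share one bucket across threads with a lock if needed).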

100% of URLs Domain?

Is there any way to make MAP return 100% of a domain's public URLs? I ask because certain domains return fewer than half of their URLs.

JSON SCHEMA NOT WORKING

Hi team, I am using the scrape endpoint for JSON schema output, but after 50 or more URLs (never more than 80) the API request just hangs. I have been facing this issue for the past 2 days. Can someone help me here? Is this a rate-limit issue? If it is, why am I still facing it after 2 days? Thanks...

n8n - looping through 300 records to search each result - 500 error

I am new to n8n + Firecrawl. I'm building a workflow to iterate through 300 entries from JSON and run firecrawl/search on each. From n8n I am using a generic HTTP POST with this payload: { "query": "{{ $json.name }}, {{$json.city}}, {{$json.state}}", "sources": ["news"],...
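Intermittent 500s across a 300-item loop are usually best absorbed with retries plus exponential backoff rather than letting one failure kill the run. A sketch of the retry wrapper; the fake sender stands in for the real HTTP call and exists only to make the example self-contained:

```python
import time
import random

def post_with_retry(send, payload, max_attempts=4, base_delay=1.0):
    """Call `send(payload)`; on a 5xx-style failure, retry with
    exponential backoff plus jitter instead of failing the whole loop."""
    for attempt in range(max_attempts):
        status, body = send(payload)
        if status < 500:
            return status, body
        if attempt < max_attempts - 1:
            # 1s, 2s, 4s, ... plus a little jitter to avoid thundering herds.
            time.sleep(base_delay * (2 ** attempt) + random.uniform(0, 0.1))
    return status, body

# Fake sender that fails twice, then succeeds (stands in for the real request).
calls = {"n": 0}
def fake_send(payload):
    calls["n"] += 1
    return (500, "error") if calls["n"] < 3 else (200, "ok")

print(post_with_retry(fake_send, {"query": "Acme, Austin, TX"}, base_delay=0.01))
# (200, 'ok')
```

Inside n8n, the same effect can be approximated with the Loop Over Items (Split in Batches) node plus the HTTP Request node's built-in retry settings, so each of the 300 searches is throttled and retried individually.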

invalid JSON schema with python client

I have been unable to send a request with both a prompt and a schema. I get OK results when I write a simplified schema within the prompt; however, passing the class itself would be ideal per the documentation. Code is as follows: ``` class SecondaryAsset(BaseModel):...
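A frequent cause of "invalid JSON schema" from Python clients is passing the Pydantic class (or a model instance) where the request expects a plain JSON-schema dict. Pydantic v2's `model_json_schema()` produces that dict. The model names below mirror the snippet; the fields are hypothetical since the original code is truncated:

```python
from pydantic import BaseModel

class SecondaryAsset(BaseModel):
    name: str       # hypothetical field
    value: float    # hypothetical field

class Listing(BaseModel):
    title: str
    assets: list[SecondaryAsset]

# Convert the model to a plain JSON-schema dict before putting it in the payload.
schema = Listing.model_json_schema()
print(sorted(schema["properties"]))  # ['assets', 'title']
```

The resulting `schema` dict is JSON-serializable, so it can go straight into the request body (or into the SDK call) alongside the prompt; if an SDK version accepts the class directly, that is a convenience layered on top of this same conversion.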

Best Firecrawl methods for large recurring tasks

I have another question if you @Gaurav Chadha don't mind.
Without addressing the appropriateness of such tasks, I'd like to do the following using Firecrawl:
1. do a search for all bid publishing sites for one service
2. do a scrape of each site, extracting bids with a "Current" or "Open" criteria (dealing with pagination if present)
3. do a scrape of each bid, extracting a known schema for each bid...

Firecrawl Pro, n8n, Claude Code: upgraded but unsure how to continue.

I have subscriptions to each of those, and I have a list of 7,000 URLs I need to scrape from nrd.gov. Could someone suggest the most efficient way to scrape them all, so I can add them to my Supabase backend?
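Whichever tool issues the requests, 7,000 URLs will need batching so each request stays within payload and rate limits. A minimal chunking sketch (the URLs are placeholders, not real nrd.gov paths):

```python
def chunk(items: list, size: int) -> list:
    """Split a long URL list into fixed-size batches."""
    return [items[i:i + size] for i in range(0, len(items), size)]

urls = [f"https://nrd.gov/page/{i}" for i in range(7000)]  # placeholder URLs
batches = chunk(urls, 100)
print(len(batches), len(batches[0]))  # 70 100
```

Each batch could then be sent to Firecrawl's batch scrape endpoint (if available on your plan) or looped through n8n one batch at a time, with the results upserted into Supabase after each batch so a failure mid-run doesn't lose earlier work.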