Firecrawl

Is there a limit on screenshots per scrape?

I am trying to take multiple screenshots, and my scrape keeps erroring out, as shown in the screenshot.

Error in scrapeController: Error: Job wait

I recently switched over to v1; everything was working just fine before the switch. I added a console log to the scrape controller for reference. I have tested this on different URLs, all of which worked with v0. Any insight would be greatly appreciated. Thank you!...

My webhook url is too long to be set on the interface

Hello, I'm trying to set up a webhook for crawling, but my URL is too long, so I can't use it. What can I do?...

Crawler stopped after encountering 404

I am using start_crawl_and_watch. When the crawler encounters a broken page (a 404 error), it stops with the error websockets.exceptions.ConnectionClosedError: received 3000 (registered) {"type":"error"}; then sent 3000 (registered) {"type":"error"} and doesn't call back to the on_error hook...

Trying to get a list of YC startups by tags (i.e. hardware startups).

Trying to get a list of hardware startups in YC from the public directory, but it doesn't seem to be getting the companies in the list after I apply tags. Also, hey @Caleb, long time no chat, and love seeing the evolution of y'all's ideas over time :) This product looks great. https://www.ycombinator.com/companies?batch=S24&tags=Hard%20Tech&tags=Robotics&tags=Manufacturing&tags=Industrial&tags=Aerospace...

Issues and Inconsistencies During Firecrawl Testing

Hey everyone! We’ve been testing Firecrawl for a bit at our company, and we’ve come across some issues that I wanted to get your thoughts on:
1. Crashes when viewing logs: We keep seeing crashes when trying to view crawl results on the logs page, with the error: "Application error: a client-side exception has occurred (see the browser console for more info)." It looks like a 504 Gateway Timeout. Is this something you guys are aware of? Any idea when it might be fixed?...

Map does not seem to return all the URLs

How can I be sure that the /map endpoint is returning all the URLs on the website? I also need to extract external links if they point to PDF files.

Discrepancy in links returned by /map

I ran the map for the same URL twice, and I received two different numbers of links returned. I believe the inputs were the same, if I'm not mistaken. 1) map • id: 54de5ef3-9119-4d1b-9e5d-ba32ed81838d • 14.504s • success 2) map • id: 41b9f9f3-1d25-47ba-a6d7-76ae689e9922 • 2.233s • success...
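One way to see what actually differs between the two runs is to compare the returned link lists as sets. This is only a sketch: it assumes the `links` arrays from both /map responses have been saved locally, and `diff_link_sets` plus the example URLs are hypothetical names made up for illustration.

```python
def diff_link_sets(first_run, second_run):
    """Return the links unique to each run, ignoring order and duplicates."""
    first, second = set(first_run), set(second_run)
    return {
        "only_in_first": sorted(first - second),
        "only_in_second": sorted(second - first),
    }


# Hypothetical link lists standing in for the two saved /map responses.
run_a = ["https://example.com/", "https://example.com/a", "https://example.com/b"]
run_b = ["https://example.com/", "https://example.com/a"]

result = diff_link_sets(run_a, run_b)
print(result)
```

Knowing which URLs appear in only one run makes it easier to tell whether the difference is a handful of pages or a whole section of the site.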

Crawl "Include Only Paths" not working?

I'm trying to scrape the products that exist on a catalog page. To do so, I'm setting up a crawl with an Include Only Paths (includesPath) setting, but the crawl only returns the original catalog URL.
The catalog page / main crawl URL: https://www.ssense.com/en-us/men/sale/clothing. Ex. Product Page 1: https://www.ssense.com/en-us/men/product/essentials/black-patch-hoodie/14616841 Ex. Product Page 2: https://www.ssense.com/en-us/men/product/auralee/brown-pleated-trousers/14085441 ...
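A quick local check can help rule out a pattern mistake before looking at the crawler itself. The sketch below assumes the include patterns are regular expressions tested against the URL path; that matching rule, the `matches_include_paths` helper, and the example pattern are all assumptions for illustration, not confirmed Firecrawl behavior.

```python
import re
from urllib.parse import urlparse


def matches_include_paths(url, patterns):
    """Return True if the URL's path matches at least one regex pattern."""
    path = urlparse(url).path
    return any(re.search(pattern, path) for pattern in patterns)


# Hypothetical pattern meant to match product pages only.
patterns = [r"^/en-us/men/product/.+"]

urls = [
    "https://www.ssense.com/en-us/men/sale/clothing",
    "https://www.ssense.com/en-us/men/product/essentials/black-patch-hoodie/14616841",
    "https://www.ssense.com/en-us/men/product/auralee/brown-pleated-trousers/14085441",
]

for url in urls:
    print(matches_include_paths(url, patterns), url)
```

With this pattern the two product URLs match and the catalog URL does not, so if the crawl still returns only the catalog page, the pattern itself is probably not the only factor.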

Crawl Status pagination not working

@Adobe.Flash, the pagination for crawl doesn't seem to be working. Job ID: 39e8951e-f0a9-434d-b869-c2b6fbc99437. It has crawled 490 pages, but the next link is not giving any data...
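For reference, following a next link usually looks something like the loop below. This is a sketch under stated assumptions: each status response is taken to be a JSON object with a `data` list and an optional `next` URL (field names assumed here), and `fetch_json` is a hypothetical callable you would supply to perform the authenticated GET request.

```python
def collect_crawl_pages(first_url, fetch_json, max_requests=100):
    """Follow `next` links until none is returned, collecting every `data` item."""
    pages = []
    url = first_url
    requests_made = 0
    # The request cap guards against a `next` link that never terminates.
    while url and requests_made < max_requests:
        body = fetch_json(url)
        requests_made += 1
        pages.extend(body.get("data") or [])
        url = body.get("next")
    return pages


# Fake two-page response, used only to demonstrate the loop without network access.
fake_responses = {
    "page-1": {"data": [{"id": 1}, {"id": 2}], "next": "page-2"},
    "page-2": {"data": [{"id": 3}], "next": None},
}

pages = collect_crawl_pages("page-1", fake_responses.__getitem__)
print(len(pages))  # prints 3
```

Logging the length of `data` on each request makes it easy to spot the exact page where the next link starts returning nothing.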

Failed to Scrape some websites

Hey! Any idea why some pages like: https://foringdental.com/invisalign seem to return 500?

Make API Error

Hello, I have created a Prompt Extraction, and it works everywhere else except for Make. I am not sure why. I have troubleshot it multiple times, and it just does not seem to work for some reason.

HTML -> Markdown (Missing Info)

Is there a way to make the Markdown more accurate compared to the HTML? For example, it misses social links from the headers and footers. The website https://lcqualitydental.com does not seem to include any of the three social links....

`format: ["links"]` doesn't respect `excludeTags`

format:["links"] does not appear to respect excludeTags. for example: ```ts...

INTERNAL SERVER ERROR

Hi team, I have been getting this error more frequently when trying to scrape webpages using the /scrape endpoint. Error information - Internal Server Error: Failed to scrape URL. All scraping methods failed for URL: https://www.atroposhealth.com/research-informatics/ - ['Request failed with status code 404', 'INTERNAL SERVER ERROR'] Some other error reasons observed: [WebSocket is not open: readyState 3 (CLOSED), Timed out]...

Playground works on the site I'm trying to scrape, but the API SDK returns the captcha

As per the title: in the playground, the site returns the scrape as I expect, but when using the Node SDK, the scrape does not get past the captcha.

Does Markdown include image alt text?

I encountered a case where the extracted HTML included the correct images and alt text. However, the markdown version missed both the images and the alt text. Do you have any idea what led to this?...

Is there telemetry and can it be disabled?

Hi, I'm using firecrawl for a project that involves sensitive data. I'd like to know if there is any telemetry associated with firecrawl and if I can disable it. Thanks!

LLM Extract Does Not Do Whole Page?

@Caleb Trying to extract structured data from a website, but not all the data is being scraped. Only the first entries at the top of the page are being scraped. Any suggestions?

Does the extract functionality allow picking different models?

Hi! Was just playing around with the API and really like it. Was wondering if it's possible to select whether I want to use Claude/OpenAI/others for the extractions/missions.