How to transfer all data from Squarespace site to a Lovable Site?
Someone said I can use Firecrawl to do this, but I'm lost.
Crawling website to get all data including pagination data
There's a website with company data and pagination. Initially it displays page 1, which lists 15 company names. I used the `https://api.firecrawl.dev/v1/scrape` endpoint with a prompt to get all company names, but it only gives me the 15 names listed on page 1. How do I fetch the company names on page 2, page 3, and so on?
I also tried the `https://api.firecrawl.dev/v1/crawl` endpoint. The response is passed to my webhook URL and I do receive data, but it is the complete page data; I only want the company names.
This endpoint also does not support a prompt, which I could have used to get only the company names.
Please suggest how to approach this problem.
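One workable approach, assuming the site exposes each page at a predictable query-string URL (the `?page=N` parameter below is an assumption; check the real pagination scheme in your browser first): build the per-page URLs yourself and call `/v1/scrape` once per page. A minimal stdlib-only Python sketch:

```python
import json
import urllib.request

API_KEY = "fc-..."  # your Firecrawl API key
SCRAPE_ENDPOINT = "https://api.firecrawl.dev/v1/scrape"

def page_urls(base_url, pages, param="page"):
    """One URL per results page. Assumes the site paginates with a
    ?page=N query parameter -- verify the real scheme first."""
    return [f"{base_url}?{param}={n}" for n in range(1, pages + 1)]

def scrape_markdown(url):
    """POST a single page to /v1/scrape and return its markdown."""
    req = urllib.request.Request(
        SCRAPE_ENDPOINT,
        data=json.dumps({"url": url, "formats": ["markdown"]}).encode(),
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["data"]["markdown"]

# Scrape every page, then parse the 15 names per page out of each blob:
# pages_md = [scrape_markdown(u) for u in page_urls("https://example.com/companies", 5)]
```

If the site paginates with infinite scroll rather than distinct page URLs, URL construction won't work and you would need to drive the page (scroll/click actions) instead.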
Searching beyond the 100 page limit
If I want to search beyond the 100-result limit per request, how can I do this? I'm looking to retrieve the top 5,000 results for my search.
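If the endpoint you are calling supports an offset or page parameter (check the docs for that specific endpoint; this is an assumption here), you can page through in 100-result batches; otherwise the usual workaround is to split one broad query into narrower queries that each stay under the cap. A small sketch of the batching arithmetic:

```python
import math

def batch_offsets(total, per_request):
    """Offsets for paging through `total` results, `per_request` at a time."""
    return [i * per_request for i in range(math.ceil(total / per_request))]

offsets = batch_offsets(5000, 100)  # 50 requests: offsets 0, 100, ..., 4900
```

Mind your plan's rate limits when firing 50 requests in a row; add a small delay between them if needed.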
Downloading activity log
Hello, if I want to download all of the markdown/JSON outputs of my scraped websites (I have over 120,000), is there a way to do this from my activity log without downloading each one individually?
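Rather than downloading 120,000 files through the dashboard, it is usually easier to persist results at scrape time (for example, from your webhook handler) or to re-fetch them via the API and write them to disk yourself. A minimal sketch of the save-to-disk side; the filename scheme here is just one possible choice:

```python
import os
import re

def filename_for(url, ext="md"):
    """Map a scraped URL to a safe, flat filename."""
    stem = re.sub(r"[^A-Za-z0-9]+", "_", url).strip("_")
    return f"{stem[:200]}.{ext}"

def save_markdown(url, markdown, out_dir="scrapes"):
    """Write one scrape result to disk and return its path."""
    os.makedirs(out_dir, exist_ok=True)
    path = os.path.join(out_dir, filename_for(url))
    with open(path, "w", encoding="utf-8") as f:
        f.write(markdown)
    return path
```

At 120,000 documents, writing each result as it arrives at your webhook is far cheaper than any after-the-fact bulk export.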
Error: onlyMainContent parameter not working in scraping
Hey, I'm trying to scrape these URLs to get the main content from the pages:
https://www.northeastern.edu/research
https://www.northeastern.edu/graduate
But I get back messy, polluted data full of HTML and UI/navigation residue that the onlyMainContent parameter should prevent. What do I need to do to scrape only the main content from these URLs? PS: I'm building a large dataset, so using the extract function would be too time-consuming.
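One thing worth double-checking: the v1 option is camelCase `onlyMainContent` with a boolean value, and JSON keys are case-sensitive, so a payload sent with `"OnlyMainContent"` would not enable it. A sketch of a request body, with `excludeTags` as an optional extra pruning step (the tag list here is illustrative):

```python
# v1 /scrape options relevant to trimming navigation chrome.
# Note the camelCase key and the boolean (not string) value.
payload = {
    "url": "https://www.northeastern.edu/research",
    "formats": ["markdown"],
    "onlyMainContent": True,
    "excludeTags": ["nav", "header", "footer"],  # extra pruning if needed
}
```

If the content still comes back polluted with the correct key, the site's markup may simply not let main-content detection work well, and `excludeTags` with site-specific selectors is the next lever to try.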
ERROR: Specified URL is failing to load in the browser
I'm trying to scrape this URL
https://www.mnir.ro/wp-content/uploads/2023/11/024-IMG_8490-683x1024-1-1.webp
And I get this error...
Error: The requested URL could not be retrieved!
Hey guys, we're getting this error on most of our scraped data.
"Markdown": "# ERROR\n\n## The requested URL could not be retrieved\n\n* * \n\nThe following error was encountered while trying to retrieve the URL: http://superdry.com.au/products/vintage-b-boy-cap-eclipse-navy\n\n> Access Denied.\n\nAccess control configuration prevents your request from being allowed at this time. Please contact your service provider if you feel this is incorrect.\n\nYour cache administrator is webmaster%20Chrome%2F137.0.0.0%20Safari%2F537.36%0D%0AAccept-Language%3A%20en-AU,%20en-US%3Bq%3D0.7%3Bq%3D0.9%0D%0AAccept%3A%20text%2Fhtml,application%2Fxhtml+xml,application%2Fxml%3Bq%3D0.9,image%2Favif,image%2Fwebp,image%2Fapng,%2F%3Bq%3D0.8,application%2Fsigned-exchange%3Bv%3Db3%3Bq%3D0.7%0D%0AAccept-Encoding%3A%20gzip,%20deflate%0D%0AHost%3A%20superdry.com.au%0D%0A%0D%0A%0D%0A).\n\n * *"
Can you advise on what to do? This is the first time I've seen this error. Thank you...
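"Access Denied" here is the target site (or its CDN) blocking the fetch, not a scraper-side crash. A common mitigation is to retry just the failed URLs with the stealth proxy mode described at docs.firecrawl.dev/features/proxies, with some backoff between attempts. A sketch, where `scrape` is a stand-in for your own request helper (not a Firecrawl SDK call):

```python
import time

def backoff_delays(retries, base=2.0):
    """Exponential delays in seconds: 2, 4, 8, ..."""
    return [base ** (n + 1) for n in range(retries)]

def scrape_with_fallback(url, scrape):
    """Try the default proxy first, then retry on stealth mode with backoff.
    `scrape(url, proxy=...)` is your own request function; it should return
    a dict with a "markdown" key."""
    result = {}
    for attempt, delay in enumerate([0.0] + backoff_delays(2)):
        time.sleep(delay)
        result = scrape(url, proxy="basic" if attempt == 0 else "stealth")
        if "Access Denied" not in result.get("markdown", ""):
            return result
    return result
```

Detecting the block by inspecting the returned markdown is a heuristic; if the API reports the failure in a status field instead, key the retry off that.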
Excluding SVG data URI from the markdown output
I'm getting very long `data:image/svg+xml` data URIs in the markdown outputs from the bulk scrape endpoint. I'm trying to exclude all media using the parameters below:
```json
{
  "removeBase64Images": true, ...
```

Proxy issue
May I know about your proxies?
1. Do you only support 11 countries, as listed in this doc?
https://docs.firecrawl.dev/features/proxies
2. And can I only use the stealth-mode proxy in BR and US?...
Can't access dashboard
After signing in, when I click Dashboard the page quickly gets redirected through the following URLs:
1. firecrawl.dev/app
2. firecrawl.dev/signin
3. firecrawl.dev/password_signin...
Scraping data from linked page
Hi everyone, I'm new to Firecrawl and trying to get used to the Crawl functionality. There's a page I'm interested in (https://www.ussportscamps.com/soccer/nike) that has a number of soccer camps listed with hyperlinks. Ideally I want to open each link and retrieve information on each camp like address, cost, etc. Here is an example camp: https://www.ussportscamps.com/soccer/nike/nike-soccer-camp-pima-county-surf
Is this possible?...
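Yes, this is a standard two-step pattern: first collect the camp detail URLs (from `/map`, or from the links on the listing page), then scrape or extract each one individually. A sketch of the filtering step, using the URL prefix from the example above:

```python
def camp_links(links, prefix="https://www.ussportscamps.com/soccer/nike/"):
    """Keep only individual camp detail pages from a list of discovered
    links. The trailing slash in `prefix` excludes the listing page itself."""
    return sorted({u for u in links if u.startswith(prefix)})
```

Each filtered URL can then be sent to `/v1/scrape` (or to extract with a prompt or schema asking for address, cost, dates, and so on).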
n8n Integration - Extract + FIRE-1
I am trying to build a DB of available rows on a site that is behind auth; I pass through an HTTPS request, so that's not the issue. I'm using a community node in n8n (n8n-nodes-firecrawl-scraper). The issue is that when using the same prompt in n8n to perform the extract, the results are only from the first page of the table, whereas when I execute the extract on the site in the playground (with the agent enabled, which might be the difference) it extracts the entire table successfully. Additionally, the...
self-hosted /scrape doesn't populate the JSON schema
Running `batch/scrape` on a local setup using Docker, I'm getting all the metadata but no actual content.
See an example of my results:
```
[{...
```

`/map`: sitemapOnly is FALSE and still only sitemap.xml is used
I'm running the same `/map` request both on Firecrawl Cloud and locally (self-hosted Docker). Cloud returns 149 links, while my local setup returns 117 links, exactly the number of links in the sitemap.xml of the website.
This is my curl payload: ...

Getting error in self host using extract
Getting this error when self-hosting:
aiohttp.client_exceptions.ClientError: Failed to parse Firecrawl error response as JSON. Status code: 404...

<style> replaced with <link>
Hey all... I'm having an issue with `<style>` tags being replaced with `<link>` tags. I'm using `formats: ['rawHtml']` and `onlyMainContent: false`, and I've tried a whole bunch of other stuff, but no joy.
This is what I'm getting in place of the `<style>` tags in the output:
<link rel="stylesheet" type="text/css" href="cid:css-6f98369c-94e5-4096-a6f6-755b9e5c5aff@mhtml.blink">
...

Error when using Firecrawl MCP
Hey guys! How are you? I am Clara from Darwin's Product team. I am trying the Firecrawl MCP and I'm having an error that maybe you could help me with. When I scrape the web, the MCP's output says it was unable to extract information (image), but when I check the activity logs in Firecrawl I see that the URL was scraped correctly. Can you help me understand what the issue is, please?

Source URL is not allowed by includePaths/excludePaths
I am trying to crawl this website, but I noticed I get the same error on other websites whenever I try to crawl a URL with many subpaths. The error is: "Source URL is not allowed by includePaths/excludePaths".
https://www.dhl.com/se-sv/home/frakt/hjalpcenter-for-europeisk-vag-och-jarnvag/anvandbar-information-och-hamtningsbara-filer.html
I get the same error with this URL:...