Firecrawl

micah.stairs

11/20/2024

Not getting back the favicon icon as expected

https://www.firecrawl.dev/app/playground?url=https%3A%2F%2Fdocs.emplifi.io%2Fplatform%2Flatest%2Fhome%2F&mode=scrape&limit=10&excludes=&includes=&formats=markdown&onlyMainContent=true&excludeTags=&includeTags=&includeSubdomains=true&mapSearch=&uniqueKey=1732131665066 The response includes a rectangular SVG, but I don't see the favicon icon anywhere (which is what I see next to a tab's title in a browser). ``` "ogImage": "https://docs.emplifi.io/__assets-853f5193-e996-4d6b-8e42-47a63d0c2dd9/image/logo-svg.svg",...

Chushogen | Abstract Artist

11/19/2024

Exclude subdomain from being scraped

Is it possible to exclude a subdomain when scraping a website? I have a client site I would like to scrape, but they recently added a subdomain which lists specific products/services not directly related to their main service. e.g...
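Whatever the crawler itself supports, one fallback is to filter the discovered URLs on the client side before scraping them. The following is a minimal sketch of that idea using only the standard library; the function names and the `example.com` hosts are hypothetical and not part of Firecrawl.

```python
from typing import Iterable, List
from urllib.parse import urlsplit


def is_excluded(url: str, excluded_hosts: Iterable[str]) -> bool:
    """Return True if the URL's host is an excluded host or a subdomain of one."""
    host = (urlsplit(url).hostname or "").lower()
    for excluded in excluded_hosts:
        excluded = excluded.lower()
        if host == excluded or host.endswith("." + excluded):
            return True
    return False


def filter_urls(urls: Iterable[str], excluded_hosts: Iterable[str]) -> List[str]:
    """Keep only the URLs whose host is not excluded."""
    excluded = list(excluded_hosts)
    return [url for url in urls if not is_excluded(url, excluded)]


if __name__ == "__main__":
    # Hypothetical URLs: drop everything under shop.example.com.
    candidates = [
        "https://example.com/about",
        "https://shop.example.com/products/1",
        "https://www.example.com/services",
    ]
    print(filter_urls(candidates, ["shop.example.com"]))
```

Matching on the parsed hostname rather than on a substring of the URL avoids accidentally dropping pages whose path merely contains the subdomain's name.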

Gunter

11/18/2024

Running locally: "The engine used does not support the following features: waitFor"

I run the latest version from GitHub on localhost. I can't use the v1 scrape with "waitFor". I get the warning:

The engine used does not support the following features: waitFor -- your scrape may be partial.

The log says "scraping via fetch". Which engines are at play here and how to use them? I can't find a parameter for that....

YoanG

11/17/2024

Webhook V0 doesn't work

Hello, I had been using the V0 webhook for weeks and it was working well, but I noticed yesterday that it isn't working anymore: I'm not receiving any requests to the endpoint specified in the website settings...

dhruvs

11/15/2024

Markdown includes ASCII, Unicode, and unreadable HTML

See Crawl data attached for reference. Thank you!

Firecrawl_Documents....

muellali

11/14/2024

only cookies in scrape results

I'm encountering challenges with cookies while scraping and crawling various websites. For many websites, instead of retrieving the actual content, I am only able to extract cookie-related information or consent texts, without any meaningful content. This significantly affects the effectiveness of the scraping process. Following previous advice (as discussed in this Discord thread https://discord.com/channels/1226707384710332458/1226707384710332465/1251760606504288388), I have excluded only the most apparent tags associated with cookie prompts. Unfortunately, this approach has not been entirely effective, as cookie prompts still obscure much of the main content. Interestingly, when I test the scraping on Playground, I achieve excellent results on the same websites. If there’s a more recent solution or method for effectively handling these cookie prompts, I’d be eager to learn more about it....

scrape_output.json

d.abaev

11/11/2024

We keep getting error message "All scraping engines failed"?

pls help

ikristoph007

11/6/2024

account upgraded but limit still 10/min

I just upgraded my account just so I can run more tests but I still seem to be stuck at 10/min. Does it take some time to update/do I need to recreate the key?

micah.stairs

11/6/2024

Python API doesn't provide nice way to iterate over crawled data

Checking the crawling status (https://docs.firecrawl.dev/sdks/python#checking-crawl-status) gives a "next" link (e.g. https://api.firecrawl.dev/v1/crawl/789e6a93-81b6-44f5-9f0e-67f6263059e8?skip=0), but there doesn't appear to be a way to use the FirecrawlApp to fetch this data and instead I need to use a separate Python library like "requests" to make the request. Am I missing something?...
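The workaround described in this question, following the "next" link by hand, can be wrapped in a small generator. This is a rough sketch, not part of the Firecrawl SDK: it assumes each status response is a JSON object with a `data` list and an optional `next` URL, and that the API accepts a bearer token. It uses `urllib` from the standard library instead of "requests" so it has no dependencies; `fetch_json` and `iter_crawl_documents` are hypothetical names.

```python
import json
import urllib.request
from typing import Any, Callable, Dict, Iterator


def fetch_json(url: str, api_key: str) -> Dict[str, Any]:
    """GET a URL with a bearer token and decode the JSON response body."""
    request = urllib.request.Request(
        url, headers={"Authorization": f"Bearer {api_key}"}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def iter_crawl_documents(
    first_url: str, fetch: Callable[[str], Dict[str, Any]]
) -> Iterator[Dict[str, Any]]:
    """Yield every document, following 'next' links until none is left.

    Assumes each page looks like {"data": [...], "next": "<url>"} and that
    the last page has no "next" key (or a null one).
    """
    seen = set()
    url = first_url
    while url and url not in seen:  # guard against a link that loops back
        seen.add(url)
        page = fetch(url)
        yield from page.get("data", [])
        url = page.get("next")
```

Usage would look something like `for doc in iter_crawl_documents(next_url, lambda u: fetch_json(u, api_key)): ...`, where `next_url` is the link returned by the status check. Passing the fetch function in keeps the paging logic testable without network access.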

ikristoph007

11/4/2024

batch scrape repeating results

I shared this over X this morning but batch scrape seems to be failing for me today. I send three URLs and I get three results, but they're all for the same URL (usually the first). I am using the cloud API. There is a gist here: https://gist.github.com/kristoph/ee658b7d7fe0ea16a1d435a069be8295 ...

11/3/2024

Bug with auto-recharge credits

Hi, it looks to me like auto-recharge credits are getting consumed over "normal" account credits, meaning that the account never even gets within 1000 credits of the limit before getting charged again. If the auto-recharge credits roll over, it should consume the monthly credits first and then the recharge credits... or you should be allowed to wait to recharge until the account is completely out of credits. It is not helpful to have such a huge buffer.

whoisit1118

11/1/2024

Params Ignored on Batch Scrape API

Hi! Tried the hosted batch scrape API for the first time today and ran into some issues. The onlyMainContent and timeout params are seemingly ignored when I use the batch scrape method. Are there any other variables that I need to include in the payload to enable these options?

alex

11/1/2024

Does `Map` return all the URLs in order of relevance even when the `search` parameter is not used?

https://docs.firecrawl.dev/features/map

JasonV

10/30/2024

Self-hosting Questions

Before I jump on the hosted product, I like to test locally. Does self-hosted Firecrawl support: * JavaScript / SPAs? In my testing, it seems no. The results are empty. * Reading files from arbitrary URIs, including file:///path/to/localFile.pdf? I tried something similar, but clearly wget Thanks....

micah.stairs

10/30/2024

Getting "Invalid cookie fields" error

I'm using the Python FireCrawl API and am trying to crawl a website behind authentication. It keeps failing when trying to scrape the first page. Am I doing something wrong? ``` app.async_crawl_url( "https://portal2.vcinity.io/s/",...

xasdasdas

10/29/2024

Unable to scrape indeed url

It was working fine till yesterday. { "url": "https://ca.indeed.com/viewjob?jk=89d7f1cc3054c126&tk=1ibcemdv6jn2a801&from=hp&vjs=3&advn=7919383835287656&adid=384927630&ad=-6NYlbfkN0BkvSqPB7txKGhOQSuBkqljSXIPNNOywyQc03G4_L-y5zqtSmULOhCauyfaSLqGHXS23a8CzPu7re4alPi6E8hMqw5s3cEHYxZjG8OGoxz_BF_IfxuAlHg56GYAzmMjceZuvoV8s44gl01LNpYCVEX0lfWxuYaDVo3pJssRZzQSxfWsFAi6s0OlPQNbJoJL1MEAw2Rix8gt6acdUakJm8Cb0N32fGTt9nkyGSaA3MyCbNpeFkh2XE89A_4O0WuK1Asr16YTRYtcAEd_yVqLyWtfDxuC39rPOBIO6sQ8AOV0jNp0tfQWykAG2WWxIhHrZAH4tP0t4k4BrEY5RaMDpA7OsFoPq19tvgZiMhFgdPAGAiVdgcQZhkViA9PNKav9b90hoBzfQUf2q_rZTSfXoQ_BkAEG284hVf4tq6zXXXYoI5mA0b-r30rjZHnYBt3D6Vr7wM8ENQ9QdeBwaQo3JpCJZRnPHr0UOTJJG5lVd4zWww7UpxAEhloOUNGxz2EKnbtzkDpu8wKctrz5_Y0iNN0qURMfG8FA7F8PFaedc9sdnp8H6QX8xr0so5AhDuGDWWJnM2zjh_Y0LDCbaSswRO3fQNw7FsQV0OXw5RdiImZ3Nw==&xkcb=SoD76_M36sVWvywB5J0LbzkdCdPP&xpse=SoA06_I36sVWZGR7n50IbzkdCdPP&sjdu=o4-SOnWFj7zDQa1x_oNfXdq7ED1XT5Bb9w9Crk2BBM1TaV54WyBRVunvknJ4haBtDHorEu9E3Ggx0ZUSwOza2kECv-r3eSkK-iwMwl6MMJSNRL2n3bCrle9sRAzZvag-2_iCukWB1z7cU3HtF2xwj_O6G10EUUKLg5TyLOIYngRXNZI9_2MVj3m9UlTtZSnr9M4di0luNfVKRxCQMLM8_l_jZMIX13PdEexXVjH7xBrh6ncAorSjBiJBgA5ceMaQJJIeA7u8IlESiGla7gJyNA", "type": "scrape",...

Alexander

10/29/2024

How to secure FireCrawl Webhooks?

I am a newbie trying to implement crawling with Webhooks. I've set up a simple FastAPI endpoint to receive webhooks from FireCrawl, but I don't understand: how do I make sure that the webhook endpoint is secure?
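One generic technique, independent of anything FireCrawl may or may not provide, is to register a webhook URL that carries an unguessable token and reject any request that lacks it. The sketch below shows only the check itself, using the standard library; the `token` query parameter and the function names are assumptions chosen for illustration. In a FastAPI handler the same check could run against the incoming request URL before any processing.

```python
import hmac
from urllib.parse import parse_qs, urlsplit


def token_from_url(url: str, param: str = "token") -> str:
    """Extract the first value of the given query parameter, or '' if absent."""
    values = parse_qs(urlsplit(url).query).get(param, [])
    return values[0] if values else ""


def is_authorized(request_url: str, expected_token: str) -> bool:
    """Compare the token in the URL with the expected one in constant time."""
    if not expected_token:
        return False  # never accept requests when no secret is configured
    provided = token_from_url(request_url)
    return hmac.compare_digest(provided.encode(), expected_token.encode())


if __name__ == "__main__":
    # Hypothetical secret; in practice generate it with secrets.token_urlsafe()
    # and load it from configuration rather than hard-coding it.
    secret = "s3cret-value"
    print(is_authorized("https://example.com/hook?token=s3cret-value", secret))
    print(is_authorized("https://example.com/hook?token=wrong", secret))
```

Serving the endpoint over HTTPS matters here, since the token travels in the URL. `hmac.compare_digest` is used instead of `==` so the comparison time does not leak how many leading characters matched.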

micah.stairs

10/28/2024

Invalid PDF structure

I'm using the crawl endpoint and one of the URLs it discovered is https://www.gamweb.com/assets/files/lsk.pdf, however, I get an "Invalid PDF structure" error when the page is scraped by FireCrawl. I can see why, since it's a webpage with an embedded PDF instead of just a raw PDF as the URL implies. However, I do think that FireCrawl should be able to gracefully handle this.
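A caller can guard against this case by checking what the response body actually is rather than trusting the ".pdf" extension. This is a minimal sketch of such a check, not a description of how FireCrawl works internally: it relies on the fact that PDF files begin with the bytes `%PDF-`, and uses a loose heuristic for HTML. `classify_body` is a hypothetical helper name.

```python
def classify_body(body: bytes) -> str:
    """Guess whether a downloaded body is a PDF, an HTML page, or something else."""
    if body.startswith(b"%PDF-"):
        return "pdf"
    # Heuristic only: look at the start of the body, ignoring leading whitespace.
    head = body.lstrip()[:64].lower()
    if head.startswith(b"<!doctype html") or head.startswith(b"<html"):
        return "html"
    return "unknown"


if __name__ == "__main__":
    print(classify_body(b"%PDF-1.7\n..."))           # a raw PDF
    print(classify_body(b"<!DOCTYPE html><html>"))   # a page wrapping a PDF viewer
```

With a check like this, a URL ending in ".pdf" that returns HTML can be routed to the normal page scraper instead of the PDF parser.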

gmoney

10/28/2024

Is there a way I can screenshot the mobile view of the site?

shepshep7

10/28/2024

Cannot access bull dashboard when running worker

I'm self-hosting and have built a Docker image. When running without the FLY_PROCESS_GROUP env variable (set to worker), I can access the dashboard; however, jobs do not process because there is no worker. When I set the FLY_PROCESS_GROUP variable, the jobs process but the dashboard is not accessible.
