Firecrawl

Join the Firecrawl server to ask questions!

Join

Firecrawl

Join the community to ask questions about Firecrawl and get answers from other members.

Join

❓┃community-help

💬┃general

🛠️┃self-hosting

Julie Grace

8/12/2024

Help Needed with Scraping Website Behind Anti-Bot Protection!

I've been trying to scrape this website: https://de.pandora.net/de/charms-armbander/charms/charms-mit-anhanger/bicolor-fahrrad-mit-drehenden-radern-charm-anhanger/763354C01.html The script works perfectly on my local playground setup, but when I move it to a self-hosted environment, it fails to scrape. I’ve also added a proxy server to bypass any unauthorized access issues, but I still can't get it to work. Here's the error message I'm encountering:...

rhyswynnastro

8/11/2024

Incomplete scraping

I'm trying to capture links from a page's table of contents. I can use pyppeteer with chromium and it returns the entire page contents, but when I use the scrape and crawl, it doesn't. I've tried both the firecrawl API and the self-hosted, neither one provides a complete response. it doesn't seem like playwright is even being used, but the logging doesn't work very well once everything is running, so I'm having a hard time troubleshooting. I know my playwright service is up, but it doesn't even seem to be used. I have the waitFor page option enabled in my scrape query, but it still doesn't seem to use playwright. https://docs.venafi.com/Docs/currentSDK/TopNav/Content/SDK/WebSDK/r-SDK-CertificatesModuleProgramming-Interfaces.php?tocpath=Web%20SDK%20REST%7CCertificate%20endpoints%20for%20TLS%7CCertificates%20API%7C_____0...

Julie Grace

8/11/2024

Error: Supabase Client Not Configured in Self-Hosted Firecrawl

I'm currently self-hosting Firecrawl and running into an issue with configuring the Supabase client. I'm using the example code from the documentation: import uuid from firecrawl.firecrawl import FirecrawlApp ...

dvmrp

8/10/2024

Support dynamic content?

Hi, is it possible in the moment? Thanks.

stan

8/10/2024

(YC W24) Inconsistent crawl results between prod and local

I'm trying to test out crawling https://fanfiction.net with a script locally before I switch to the Firecrawl API, however I'm getting different results. Locally I am just running Firecrawl with the docker setup: docker compose up from the SELF_HOST.md instructions and default .env variables with no DB. I am initiating the crawl sequence with the following command and a 200 response is returned with the jobId:...

p3nnywh1stl3

8/9/2024

Scraping openai . com

Can't seem to scrape any data from openai urls for example: https://platform.openai.com/docs/models any work arounds?...

agi

8/9/2024

SSL url3 lib

I'm using the following example to test the API but I get the shown error. Any ideas? https://docs.firecrawl.dev/api-reference/endpoint/crawl...

Julia from Storia

8/8/2024

Firecrawl doesn't seem to crawl everything

I'm trying to run Firecrawl on pytorch's documentation, and I merely get ~15 results with these URLs: ``` https://pytorch.org/docs/stable https://pytorch.org/docs/stable/distributed.pipelining.html https://pytorch.org/docs/stable/dynamo/index.html...

shivdeep

8/7/2024

Unauthorized - even after adding the valid token

Has anyone else experienced this, I am getting a 401 error even after adding a valid token. The same token was working a while ago.

Rae

8/7/2024

How can I use LLM Extract?

Any docs I can look into?

dhruvs

8/6/2024

is FireEngine live?

Just saw the blog post about the new system. Is this live today or has it already been deployed?

smon

8/6/2024

api updates

hi, we are developing a system with your api and we want auto update our requests to the api when the documentation is updated, is it possible?

kevinandthecats

8/6/2024

Cloudflare blocking scrape

I'm running into an issue with the /scrape endpoint. It's been running into Cloudflare verification. Can Firecrawl get passed this?

ハグリ

8/6/2024

Can I Retrieve Data from an SPA Website?

I am trying to retrieve information from a website built with SPA. However, my attempts have resulted in empty responses. Does your current service support data extraction from SPA websites? If so, could you provide guidance on how to achieve this?

Cody

8/6/2024

Support

Shit, i thought this meant 100,000 scrapes = 100,000 front page websites scraped 🤦‍♂️ 🤦‍♂️ 🤦‍♂️ I'm using an http api integration to Clay. Any solution to this? Was just planning on uploading a video tutorial on firecrawl on my linkedin https://www.linkedin.com/in/codycarnes, but if it cant scrape one page only, big waste of creds on my end imo...

Sachin

8/5/2024

Extraction of data from PDF files

Hi team @Adobe.Flash @rafaelmiller , Do we have the data extraction feature from PDF files (.pdf) using the FireCrawl Service in place at the moment?...

shivdeep

8/2/2024

Crawl job failed or was stopped. Status: Stuck

Is anyone seeing this now?

File "C:\Python312\Lib\site-packages\firecrawl\firecrawl.py", line 288, in _monitor_job_status
    raise Exception(f'Crawl job failed or was stopped. Status: {status_data["status"]}')
Exception: Crawl job failed or was stopped. Status: stuck

...

patrick_z_369

7/31/2024

Can i llm_extract from already scraped markdown seperately?

Is there a a way to seperately use the llm_extract method?

Antman

7/31/2024

Use Azure Openai for llm extraction

Hi I am using a local instance of firecrawl and tried the extraction function using an Openai model and it worked. I tried to use azure openai by setting up the base url and api key of my azure openai function but I am getting an error(Internal Server Error: Failed to scrape URL. Invalid status code: undefined). How do I resolve this? I want to use the azure instance as my employer pays for it.

Ray

7/30/2024

Scraping yelp.com

Hey, just wondering how proficient is Firecrawl at scraping yelp.com? I’m developing a product that relies on scraping businesses profile details from ‘yep.com/biz/{x}’ pages. Firecrawl works when I test it on a few pages, but I’m worried about it getting blocked by Captcha when deployed at scale. I’m not so worried about being IP blocked, as it seems that Firecrawl handles this....

Previous Next

Gaming

Programming

Firecrawl

Join the Firecrawl server to ask questions!

Firecrawl

Join the community to ask questions about Firecrawl and get answers from other members.