Facing an issue while scraping a blog website with Firecrawl
Hey, I am facing an issue here. While scraping a blog website, Firecrawl still returns the main website's content. For example, I am scraping (www.main-brand.com/blog/any-blog-content), but it still gives me the scraped content for (www.main-brand.com) only. It seems scraping is not working for any page under the /blog path.
Is there any prerequisite for scraping a blog website? Am I missing something here?
I have tried all the option changes I could think of, like setting maxAge to zero, stealth mode, main content: false, etc., but I still don't get the correct response.
Scraping works fine for other paths like /about us, though.
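For reference, here is a minimal sketch of the kind of scrape request described above, plus a small check for whether the URL reported in the response still matches the one requested (a mismatch would suggest a redirect to the homepage). The endpoint path and the option names (`onlyMainContent`, `maxAge`, `proxy`) are assumptions based on my reading of Firecrawl's REST docs and may differ by API version or on a self-hosted instance. `build_payload`, `same_page` and `scrape` are hypothetical helper names, not part of any SDK.

```python
import json
import urllib.request
from urllib.parse import urlsplit

# Assumption: cloud endpoint for the scrape API; a self-hosted instance
# would use its own base URL instead.
API_URL = "https://api.firecrawl.dev/v1/scrape"


def build_payload(url: str) -> dict:
    """Request body with the options mentioned in this post (names assumed)."""
    return {
        "url": url,
        "formats": ["markdown"],
        "onlyMainContent": False,  # "main content: false"
        "maxAge": 0,               # assumed to bypass cached results
        "proxy": "stealth",        # assumed value for stealth mode
    }


def same_page(requested: str, returned: str) -> bool:
    """True if both URLs share host and path, ignoring scheme, 'www.' and a trailing slash."""
    def key(u: str):
        if "://" not in u:
            u = "https://" + u
        parts = urlsplit(u)
        host = (parts.hostname or "").lower()
        if host.startswith("www."):
            host = host[len("www."):]
        return host, parts.path.rstrip("/")
    return key(requested) == key(returned)


def scrape(url: str, api_key: str) -> dict:
    """POST the payload and return the parsed JSON response."""
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(build_payload(url)).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=60) as resp:
        return json.load(resp)
```

To use it, call `scrape(blog_url, api_key)` and compare `blog_url` against whatever final URL the response metadata reports, using `same_page`. Which metadata field holds that URL is something I have not confirmed, so inspect the raw response first.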
6 Replies
Can you share the URLs you are having trouble with?
https://www.gabit.com/blog/understanding-heart-rate-zones
The same happens with other blog posts on this website.
Mastering Your Workouts: Understanding Heart Rate Zones
HR zones are a way to understand exercise intensity and optimise workouts to meet specific fitness goals.
Did we find the issue?
Are you using scrape or crawl?
Please share your job ID and I'll take a look
Using scrape.
job ID:
1bffab95-255e-409c-9fc9-6d23159aa092
other job ID:
4c97ce64-5a32-4b24-b688-4479108ebe39
Let me know if you find the issue. Thanks.
Are you self-hosting? I can't find those job IDs in our database.