Help with scraping data inside linked pages
When I used Firecrawl to scrape data from a job site, it only scraped the initial page. The actual data is on the page behind each job title link, and I want to extract that data too. How can I achieve this? Here is a sample screenshot of the page.

9 Replies
Hey @Thaslu, it might be wise to try a crawl with the allowBackwardLinks option set to true. That's because it's very likely that the job pages are not children (by URL) of the page you are starting the crawl on.
It's still not working. Any help?
Ccing @thomas here to take a deeper look
@Thaslu can you share the url with us too?
Thanks @Thaslu! Forwarded that to our web engineer to see what's going on.
Hey @Thaslu, scrape will only return the data that is visible on the page. If I understand correctly that you need the content of all the links, you probably need crawl.
import logging
import os

from firecrawl import FirecrawlApp

logger = logging.getLogger(__name__)

def scrape_data(url):
    try:
        app = FirecrawlApp(api_key=os.getenv('FIRECRAWL_API_KEY'))
        # Scrapes only the single page at `url`; linked pages are not followed.
        scraped_data = app.scrape_url(url, {'pageOptions': {'onlyMainContent': False}})
        if 'markdown' in scraped_data:
            return scraped_data['markdown']
        else:
            raise KeyError("The key 'markdown' does not exist in the scraped data.")
    except Exception as e:
        logger.error(f"Error scraping data: {e}")
        return None
This is the setup I have right now. What changes do I need to make?
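One way to reach the data behind each job title link without switching to crawl is to scrape the listing page first, pull the links out of the returned markdown, and then scrape each link in turn. The sketch below is only an illustration under a few assumptions: the listing markdown contains the job links in standard [text](url) form, and `scrape` is any function that takes a URL and returns markdown or None (for example the scrape_data function above). The names extract_links, scrape_listing_and_details and link_filter are hypothetical helpers, not part of Firecrawl.

```python
import re
from typing import Callable, Dict, List, Optional
from urllib.parse import urljoin

# Matches markdown links like [text](url) or [text](url "title").
# Image links (![alt](src)) are skipped by the negative lookbehind.
MD_LINK = re.compile(r'(?<!!)\[([^\[\]]+)\]\(([^)\s]+)(?:\s+"[^"]*")?\)')


def extract_links(markdown: str, base_url: str) -> List[str]:
    """Return absolute, de-duplicated http(s) link targets, in page order."""
    seen = set()
    links = []
    for _text, href in MD_LINK.findall(markdown):
        absolute = urljoin(base_url, href)  # resolves relative hrefs
        if absolute.startswith(("http://", "https://")) and absolute not in seen:
            seen.add(absolute)
            links.append(absolute)
    return links


def scrape_listing_and_details(
    listing_url: str,
    scrape: Callable[[str], Optional[str]],
    link_filter: Callable[[str], bool] = lambda url: True,
) -> Dict[str, str]:
    """Scrape the listing page, then every linked page accepted by link_filter."""
    listing_md = scrape(listing_url)
    if not listing_md:
        return {}
    details = {}
    for link in extract_links(listing_md, listing_url):
        if link_filter(link):
            page_md = scrape(link)
            if page_md:  # skip pages that failed to scrape
                details[link] = page_md
    return details
```

A call might look like scrape_listing_and_details(url, scrape_data, lambda u: "/jobs/" in u), where the "/jobs/" filter is a made-up pattern that would need to match how the target site actually structures its job URLs. Each linked page costs one extra scrape request, so a tight filter keeps usage down.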
Check here: https://docs.firecrawl.dev/features/crawl
Firecrawl Docs
Crawl | Firecrawl
Firecrawl can recursively search through a urls subdomains, and gather the content
What are the best parameters for this type of website?