Cannot fully crawl https://ordwaylabs.stoplight.io/

Using the map endpoint on the playground returns only 9 results (despite first showing a "This job contains over 500 documents" warning). I also tried this website from Firecrawl's Python API, and it returns only 4 results. I'm expecting many, many more.
13 Replies
micah.stairs (OP), 12mo ago
@Caleb @Adobe.Flash This is fairly time-sensitive on my end. If you don't think it's a quick fix, please let me know so I can try an alternative crawling service for this one.
Caleb, 12mo ago
What is the site? We will look into it ASAP.
Ah, I see it in the title. @rafaelmiller can you create a ticket?
rafaelmiller, 12mo ago
taking a look right now
micah.stairs (OP), 12mo ago
Thanks @rafaelmiller, I really appreciate it!
With the crawl endpoint, I seem to be getting more results by adding a wait action.
rafaelmiller, 12mo ago
Yeah, I got 19 results with wait 5000.
Checking with other parameters now.
micah.stairs (OP), 12mo ago
Same, but there are a lot more pages than that.
I also have maxDepth set to 10 (instead of the default 2).
rafaelmiller, 12mo ago
I figured out why Firecrawl is not finding the links. The links inside the docs page (navigation bar) are loaded only when you click on it.
Checking with our scraping engineer whether there's an option to click every item in the navbar so the crawler can see the links it has to follow.
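To illustrate the diagnosis above, here is a minimal sketch of why a crawler that reads only the initial markup misses links that appear after a click. The two HTML snippets and the paths in them are hypothetical stand-ins, not the actual markup of the site in question, and the extractor is a plain standard-library parser rather than Firecrawl's own link discovery.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects href values from <a> tags in the order they appear."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def extract_links(html):
    collector = LinkCollector()
    collector.feed(html)
    return collector.links


# Hypothetical markup: the navbar group is collapsed on first load.
INITIAL_HTML = (
    '<nav><button class="group">API Reference</button></nav>'
    '<a href="/docs/intro">Intro</a>'
)

# Hypothetical markup after the navbar group has been clicked open.
EXPANDED_HTML = (
    '<nav><button class="group">API Reference</button>'
    '<a href="/docs/api/customers">Customers</a>'
    '<a href="/docs/api/invoices">Invoices</a></nav>'
    '<a href="/docs/intro">Intro</a>'
)

links_before = extract_links(INITIAL_HTML)
links_after = extract_links(EXPANDED_HTML)
print(len(links_before), "link(s) before click,", len(links_after), "after")
```

Waiting longer does not help in a case like this, because the extra anchors only exist in the DOM once the click has happened.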
micah.stairs (OP), 12mo ago
Okay, thanks for the update!
@rafaelmiller do you have any further updates, or should I expect an update on Monday? Thanks!
rafaelmiller, 12mo ago
Hey @micah.stairs sorry for the delay in getting back to you. To resolve this issue, we’ll need to implement a "click all" feature within actions. I’ve added a GitHub issue for prioritization: https://github.com/mendableai/firecrawl/issues/854.
micah.stairs (OP), 12mo ago
Okay! And to clarify, does the action get performed before Firecrawl looks for links to traverse as part of the crawl? I was under the impression it only affected what data was scraped from that page.
rafaelmiller, 12mo ago
Yes, actions are performed before Firecrawl retrieves the links on a page. This means any elements clicked or interacted with during the action phase can impact which links Firecrawl detects and traverses.
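Since actions run before link discovery, the settings tried in this thread (a 5000 wait action and maxDepth of 10) can be combined in one crawl request. The sketch below only builds the request body and sends nothing. The field names (`maxDepth`, `scrapeOptions`, `actions`, `milliseconds`) reflect an assumption about the v1 crawl request shape at the time of this thread and should be checked against the current API reference; `build_crawl_payload` is a hypothetical helper, not part of any SDK.

```python
import json


def build_crawl_payload(url, wait_ms=5000, max_depth=10):
    """Build a crawl request body (hypothetical helper; field names are assumed).

    The wait action is placed under scrapeOptions so it applies to each
    page visited during the crawl, before links are collected from it.
    """
    return {
        "url": url,
        "maxDepth": max_depth,  # the thread mentions a default of 2
        "scrapeOptions": {
            "formats": ["markdown"],
            "actions": [
                {"type": "wait", "milliseconds": wait_ms},
            ],
        },
    }


payload = build_crawl_payload("https://ordwaylabs.stoplight.io/")
print(json.dumps(payload, indent=2))
```

As noted earlier in the thread, a wait alone surfaced only 19 pages here, because the remaining navbar links need a click rather than more time.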
micah.stairs (OP), 12mo ago
Okay, good to know! Is that properly communicated in the documentation? I don't remember seeing anything about that.
rafaelmiller, 12mo ago
I'm not sure either. I'll add it to the docs to make the behavior clearer.
