linksOnPage
I was wondering whether the response format of the crawl API changed. As I remember it, the response used to return all the URLs crawled from a page, plus the markdown content, in separate JSON objects. Now it doesn't, and I get linksOnPage instead.
def crawl_ir_url(url: str) -> List[Dict[str, Any]]:
    print(f"Initiating the crawl on Firecrawl for URL: {url}")
    crawl_json = app.crawl_url(url, params={
        'crawlerOptions': {
            'limit': 200
        },
        'pageOptions': {
            'onlyMainContent': True,
            'screenshot': True,
            'waitFor': 10000,
        }
    }, wait_until_done=True)
Is there a way to restore the old behaviour, or do I now have to crawl each of the linksOnPage URLs myself in order to crawl a site?
1 Reply
Hey Kingchoo.
It should return the linksOnPage object by default, which we added to the API yesterday. In the past I don't believe we ever returned a list of links on the page in a separate object.
For the crawl, we did have a URL-only mode that would find all of the URLs on the entire website, but that wouldn't include external links.
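For anyone who wants a single list of URLs out of the new field, here is a minimal sketch. It assumes the crawl result is a list of page dicts that each carry a linksOnPage list of URL strings; that shape is an assumption based on this thread, not confirmed API documentation. The helper name split_links and the sample data are made up for illustration, and no Firecrawl call is made.

```python
from typing import Any, Dict, List
from urllib.parse import urlparse


def split_links(pages: List[Dict[str, Any]], base_url: str) -> Dict[str, List[str]]:
    """Merge linksOnPage across pages and split the links by host.

    Assumes each page dict may hold a 'linksOnPage' list of URL strings
    (hypothetical shape; check it against your actual crawl response).
    """
    base_host = urlparse(base_url).netloc
    internal, external = set(), set()
    for page in pages:
        # Tolerate pages where the key is missing or None.
        for link in page.get("linksOnPage") or []:
            host = urlparse(link).netloc
            # Links with no host (e.g. relative paths) are treated as internal.
            if host == "" or host == base_host:
                internal.add(link)
            else:
                external.add(link)
    return {"internal": sorted(internal), "external": sorted(external)}


# Made-up sample standing in for a crawl result.
sample_pages = [
    {"markdown": "# Home", "linksOnPage": ["https://example.com/about", "https://other.org/x"]},
    {"markdown": "# About", "linksOnPage": ["https://example.com/", "https://example.com/about"]},
]
result = split_links(sample_pages, "https://example.com")
print(result)
```

Splitting by host keeps the external links that the URL-only mode mentioned above would have left out, while still deduplicating links that appear on more than one page.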