How to retry only failed requests after the crawler has finished?
I finished a crawl of around 1.7M requests and ended up with around 100k failed ones. Is there a way to retry just the failed requests?
5 Replies
eager-peach•4mo ago
hey, that's not currently supported. I would recommend creating a dataset/KV store for failed requests and pushing to it from the failed request handler
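A rough, untested sketch of that idea (the dataset name `failed-requests` and the handler body are just placeholders):
```ts
import { PlaywrightCrawler, Dataset } from 'crawlee';

// Arbitrary named dataset to collect requests that exhausted all retries.
const failedDataset = await Dataset.open('failed-requests');

const crawler = new PlaywrightCrawler({
    requestHandler: async ({ request, page }) => {
        // ... your existing route handlers go here
    },
    // Called after a request has used up all of its retries.
    failedRequestHandler: async ({ request }) => {
        await failedDataset.pushData({
            url: request.url,
            label: request.label,
            errorMessages: request.errorMessages,
        });
    },
});
```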
genetic-orangeOP•4mo ago
Retrying would necessitate writing a new scraper, since the crawler consists of multiple route handlers...
I thought I could just run the scraper again with an increased retry count.
eager-peach•4mo ago
you can use the failedRequestHandler to handle all failed requests in one place: https://crawlee.dev/api/playwright-crawler/interface/PlaywrightCrawlerOptions#failedRequestHandler
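A second run can then read the collected failures back and feed them to the same router, so no new scraper is needed. Another rough sketch, assuming the `failed-requests` dataset from above and a hypothetical `./routes.js` module exporting your existing router:
```ts
import { PlaywrightCrawler, Dataset } from 'crawlee';
import { router } from './routes.js'; // hypothetical module with your existing route handlers

// Load the failed requests captured by the previous run.
const failedDataset = await Dataset.open('failed-requests');
const { items } = await failedDataset.getData();

const crawler = new PlaywrightCrawler({ requestHandler: router });

// Re-enqueue only the failures, keeping each request's label so the
// router dispatches it to the same handler as in the original run.
await crawler.run(items.map(({ url, label }) => ({ url, label })));
```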