Max Depth option

Hello! Just wondering whether it is possible to set max depth for the crawl? Previous posts (2023) seems to make use of 'userData' to track the depth. Thank you.
4 Replies
Hall
Hall4mo ago
Someone will reply to you shortly. In the meantime, this might help:
inland-turquoise
inland-turquoise4mo ago
Hi, this can be set. You can add a maxdepth parameter in the INPUT_SCHEMA.json file, then retrieve it from the code and implement the specified depth functionality.
fascinating-indigo
fascinating-indigo4mo ago
@mjh Hey, when using python, you can use max_crawl_depth (https://crawlee.dev/python/api/class/BasicCrawlerOptions#max_crawl_depth) . Otherwise, you can track the current depth using request.userData. When adding new links in the request handler, just increment the depth by one and use that for each newly enqueued request (https://crawlee.dev/api/core/class/Request)
Request | API | Crawlee · Build reliable crawlers. Fast.
Crawlee helps you build and maintain your crawlers. It's open source, but built by developers who scrape millions of pages every day for a living.
BasicCrawlerOptions | API | Crawlee for Python · Fast, reliable Pyt...
Crawlee helps you build and maintain your Python crawlers. It's open source and modern, with type hints for Python to help you catch bugs early.
fascinating-indigo
fascinating-indigoOP4mo ago
Got this. thank you

Did you find this page helpful?