Crawlee & Apify · 8mo ago
rival-black

How to set concurrency/CPUs/memory correctly

Hello, I would like to use PlaywrightCrawler for scraping, but it is not clear from the documentation how to correctly set up concurrency, memory, CPUs, etc. Can someone help me out? What is the best practice for configuring this crawler to scrape in parallel? Thanks in advance!
2 Replies
Hall · 8mo ago
This post has been pushed to the community knowledgebase. Any replies in this thread will be synced to the community site.
Marco · 8mo ago
Hello! The best concurrency settings really depend on the context: the available resources, the use case, and the website being scraped. You can set the crawling options when creating the PlaywrightCrawler: see https://crawlee.dev/python/api/class/PlaywrightCrawler#__init__ and https://crawlee.dev/python/api/class/BasicCrawler#__init__. For instance, you can pass concurrency_settings: https://crawlee.dev/python/api/class/ConcurrencySettings.
