`maxRequestsPerMinute`, but per session

Hey! First, I just want to thank you for creating such an amazing product ❤️! The question itself: according to the documentation (https://crawlee.dev/docs/guides/scaling-crawlers), we can set a `maxRequestsPerMinute` limit for the global crawler process. But in some cases, `maxRequestsPerMinute` should be set per session. For example: the website iammastrongwebsite.com has a request limit per session (5 rpm). I have a bunch of proxies, and setting `maxRequestsPerMinute` to 5 is not ideal, since all of my hypothetical hundreds of proxies would sit waiting without doing any work.
Of course, we can do the simple math and set `maxRequestsPerMinute = [numberOfSessions] * 5`, but that could actually be worse, since this approach completely defeats the purpose of autoscaling.
Suggestion: add the ability to set `maxRequestsPerMinute` on sessions themselves. Thanks again 😊
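Until something like this exists in Crawlee itself, the idea can be sketched as a small standalone helper: a sliding one-minute window tracked per session id, so one throttled session never blocks the others. Note this is a hypothetical sketch, not a real Crawlee API; the class and method names are made up for illustration.

```typescript
// Hypothetical per-session rate limiter (not part of Crawlee's API).
// Each session id gets its own sliding one-minute window, so a single
// global maxRequestsPerMinute is no longer needed.
class SessionRateLimiter {
  private timestamps = new Map<string, number[]>();

  constructor(private maxPerMinute: number) {}

  // Returns true if the given session may send a request now, and
  // records the request; returns false if the session hit its limit.
  tryAcquire(sessionId: string, now: number = Date.now()): boolean {
    const windowStart = now - 60_000;
    // Keep only requests made within the last minute for this session.
    const recent = (this.timestamps.get(sessionId) ?? []).filter(
      (t) => t > windowStart,
    );
    if (recent.length >= this.maxPerMinute) {
      this.timestamps.set(sessionId, recent);
      return false; // this session is saturated; other sessions keep working
    }
    recent.push(now);
    this.timestamps.set(sessionId, recent);
    return true;
  }
}
```

In a request handler, a request whose session fails `tryAcquire` could be re-enqueued or delayed instead of stalling the whole crawler, which is the behavior the suggestion is asking for natively.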
ambitious-aqua · 6mo ago
Hi, you can try to use proxy rotation
other-emerald (OP) · 5mo ago
@Oleg V. Hey! Thanks for your reaction! The question is not resolved yet, can you please unmark it? And thanks for your answer. That is one possible workaround... but again, it will work poorly if you use a session-based approach to scraping resources
