Parsel Crawler way too fast with request speed

Hi everyone! I am building a crawler with Crawlee for Python. I noticed that the Parsel crawler sends requests at a much higher frequency than the BeautifulSoup crawler. Is there a way to slow the Parsel crawler down so we are less likely to get blocked? Thanks!
5 Replies
Hall, 5mo ago
Someone will reply to you shortly. This post was marked as solved by Rigos.
variable-lime, 5mo ago
Hi, did you try adding a random delay before each request is handled? Something like:
await asyncio.sleep(random.uniform(1, 3))
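For illustration, here is a minimal, library-independent sketch of that idea: a random pause before each request is processed. The names `polite_pause` and `handle_request` are made up for the example; in a real Crawlee handler, the `await asyncio.sleep(...)` line would simply go at the top of your own request handler.

```python
import asyncio
import random


async def polite_pause(min_s: float = 1.0, max_s: float = 3.0) -> float:
    """Sleep for a random duration in [min_s, max_s] seconds and return it."""
    delay = random.uniform(min_s, max_s)
    await asyncio.sleep(delay)
    return delay


async def handle_request(url: str, min_s: float = 1.0, max_s: float = 3.0) -> str:
    # Hypothetical handler: pause first, then do the actual work.
    await polite_pause(min_s, max_s)
    return f"processed {url}"
```

Keep in mind that the pause applies per task, so if several tasks run concurrently the overall request rate can still be higher than one request per pause.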
genetic-orange (OP), 5mo ago
Hi, I did not. But I was also curious why it is happening. I would expect the actors to behave the same way.
correct-apricot, 5mo ago
Hi @Rigos, the difference in speed is probably due to the parsing library itself, since parsing is a CPU-bound task. If Parsel parses the page faster (which it does), the crawler as a whole runs faster. To slow the crawler down, I would recommend using:
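from crawlee import ConcurrencySettings
# Import path assumed for recent releases; older ones use crawlee.parsel_crawler
from crawlee.crawlers import ParselCrawler

# Cap the crawler at 100 tasks per minute
crawler = ParselCrawler(concurrency_settings=ConcurrencySettings(max_tasks_per_minute=100))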
Using max_tasks_per_minute gives you better control over the crawl rate, especially if you run into HTTP 429 responses (too many requests sent to the server).
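To get a feel for what a per-minute cap means: 100 tasks per minute averages out to one task start every 0.6 seconds. The sketch below is a generic fixed-interval throttle written with plain asyncio. It is not how Crawlee implements max_tasks_per_minute internally, and the class name `PerMinuteThrottle` is made up for the example.

```python
import asyncio
import time


class PerMinuteThrottle:
    """Space out task starts so that at most `max_per_minute` begin per minute."""

    def __init__(self, max_per_minute: float) -> None:
        # Minimum number of seconds between two consecutive task starts.
        self.interval = 60.0 / max_per_minute
        self._next_start = 0.0

    async def wait(self) -> None:
        # Reserve the next free slot, then sleep until that slot begins.
        # There is no await between reading and updating the slot, so
        # concurrent asyncio tasks cannot reserve the same one.
        now = time.monotonic()
        start_at = max(now, self._next_start)
        self._next_start = start_at + self.interval
        await asyncio.sleep(start_at - now)
```

Calling `await throttle.wait()` at the top of each task would hold the average start rate at or below the configured limit, regardless of how fast the parsing itself is.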
genetic-orange (OP), 5mo ago
Thanks a lot!
