Parsel Crawler way too fast with request speed
Hi everyone! I am creating a crawler using Crawlee for Python. I noticed that the Parsel crawler makes requests at a much higher frequency than the BeautifulSoup crawler. Is there a way to slow the Parsel crawler down so that we are less likely to get blocked? Thanks!
5 Replies
This post was marked as solved by Rigos.
variable-lime•5mo ago
Hi, did you try adding a delay between requests, like this?
await asyncio.sleep(random.uniform(1, 3))
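For context, here is a minimal sketch of how that line could be wrapped into a reusable helper. The name polite_delay and its parameters are hypothetical, not part of Crawlee; the idea is that you would await it at the start of your request handler. Keep in mind that if several handlers run concurrently, this only slows each one down individually.

```python
import asyncio
import random


async def polite_delay(min_seconds: float = 1.0, max_seconds: float = 3.0) -> float:
    """Sleep for a random duration between min_seconds and max_seconds.

    Hypothetical helper for illustration; returns the delay that was used.
    """
    delay = random.uniform(min_seconds, max_seconds)
    await asyncio.sleep(delay)
    return delay


async def main() -> None:
    # Short bounds here so the demo finishes quickly.
    waited = await polite_delay(0.1, 0.3)
    print(f"Waited {waited:.2f} seconds before the next request")


if __name__ == "__main__":
    asyncio.run(main())
```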
genetic-orangeOP•5mo ago
Hi, I did not. But I was also curious why this is happening. I would expect the actors to behave the same way.
correct-apricot•5mo ago
Hi, @Rigos
The difference in speed is probably due to the parsing library itself, since parsing is a CPU-bound task. If Parsel parses the page faster (which it does), then the overall speed of the crawler will be higher.
In order to slow down the crawler, I would recommend using:
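A rough sketch of what this might look like, assuming a recent Crawlee for Python release where ConcurrencySettings is importable from crawlee and ParselCrawler from crawlee.crawlers (import paths have changed between versions, so check yours). The limit of 30 and the example.com URL are placeholders, not recommendations.

```python
import asyncio

from crawlee import ConcurrencySettings
from crawlee.crawlers import ParselCrawler, ParselCrawlingContext


async def main() -> None:
    # Cap the crawler at about 30 tasks per minute (placeholder value).
    concurrency_settings = ConcurrencySettings(max_tasks_per_minute=30)

    crawler = ParselCrawler(concurrency_settings=concurrency_settings)

    @crawler.router.default_handler
    async def handler(context: ParselCrawlingContext) -> None:
        # Minimal handler: just log the URL being processed.
        context.log.info(f"Processing {context.request.url}")

    # Placeholder start URL.
    await crawler.run(["https://example.com"])


if __name__ == "__main__":
    asyncio.run(main())
```

The same concurrency_settings object should also work with the BeautifulSoup crawler, which makes it easier to compare the two at the same request rate.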
Using max_tasks_per_minute will give you better control over your crawling speed, especially if you encounter 429 blocking (too many requests to the server).
genetic-orangeOP•5mo ago
Thanks a lot!