How do I delay requests with HttpCrawler?
I am working with an API that has rate limiting in place. The API tells me, in seconds, when the current rate limit will expire. I need to delay my next request by that many seconds, which is usually around 15 minutes.
I tried adding a delay with setTimeout and a Promise, like this, and awaiting it (this delay happens inside my requestHandler).
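The snippet itself is not preserved in this thread. A minimal sketch of that kind of delay could look like the following. The names sleep and msUntilReset are hypothetical, and the sketch assumes the API reports the reset time as a Unix timestamp in seconds; if it reports the remaining seconds instead, multiplying that value by 1000 is enough.

```typescript
// Resolve after `ms` milliseconds; awaiting this pauses an async function.
function sleep(ms: number): Promise<void> {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

// Assumption: `resetEpochSeconds` is a Unix timestamp in seconds.
// Returns how long to wait in milliseconds, never a negative number.
function msUntilReset(resetEpochSeconds: number, nowMs: number = Date.now()): number {
  return Math.max(0, resetEpochSeconds * 1000 - nowMs);
}

// Hypothetical usage inside an async requestHandler:
//   await sleep(msUntilReset(resetEpochSeconds));
```

Note that the handler stays busy for the whole wait, so the crawler's handler timeout has to be longer than the delay.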
But when it finishes, I get this error on the next request:
RequestError: read ECONNRESET
I am not well-versed in networking, but I think this error is related to it. My understanding is that Crawlee tries to reuse the connection it opened ~15 minutes ago, but that connection has since been closed on the API's side, hence this error.
Any help or suggestions on how to achieve this would be appreciated!
5 Replies
extended-salmon•3y ago
Perhaps add delay in router.
See createHttpRouter (https://crawlee.dev/api/http-crawler/function/createHttpRouter)
rival-blackOP•3y ago
Can you elaborate on how to do this, please? Thanks!
I mean the part about adding the delay to the router. I looked at the documentation and couldn't find a way to do it.
other-emerald•3y ago
1. How do they recognize that you are the one making the same request? Is it based on authentication or on IP address?
2. I would simply set maxConcurrency to 1, set the timeout to 999999, sleep for the required amount in the requestHandler, and then enqueue the next request.
rival-blackOP•3y ago
Sorry for the late response,
1. Yes, it's authenticated using a bearer token.
2. This was exactly my initial solution, and it is also what led me to discover the ECONNRESET error when the requestHandler waited for more than ~15 minutes.
I ended up using BasicCrawler and Axios to make requests and that seems to have fixed the problem.