Crawlee & Apify•3y ago

Resume crawler based on request queues from previous run, locally and on Apify

Is it possible to stop a crawler and resume it from the previous run's request queue? I have a crawler that has been running locally for a couple of hours, and I would like to add proxies to speed it up because I am getting throttled on a single IP. I don't want to start from scratch, since that would be unnecessary and a waste of time; I want to reuse my existing request queue. Is this possible? And is it also possible on Apify?

4 Replies

deep-jade•3y ago

Use a named request queue instead of an unnamed one; a named queue is persisted. The default request queue is unnamed and is tied to the actor's run.
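
A minimal sketch of the named-queue approach, assuming Crawlee's CheerioCrawler. The queue name 'my-resumable-queue', the function name resumeCrawl, and the proxy URLs are hypothetical, chosen only for illustration. Requests already handled in an earlier run should be skipped when the same named queue is reopened, because the queue deduplicates by unique key.

```typescript
import { CheerioCrawler, ProxyConfiguration, RequestQueue } from 'crawlee';

// Hypothetical helper: reopens the same named queue on every run so that
// pending requests from a previous run are picked up again.
export async function resumeCrawl(
    startUrls: string[],
    proxyUrls: string[] = [],
): Promise<void> {
    // Opening a queue by name creates it on first use and reopens it later.
    // 'my-resumable-queue' is an assumed name, not one from the thread.
    const requestQueue = await RequestQueue.open('my-resumable-queue');

    // Proxies are optional; they are only configured when URLs are supplied.
    const proxyConfiguration =
        proxyUrls.length > 0 ? new ProxyConfiguration({ proxyUrls }) : undefined;

    const crawler = new CheerioCrawler({
        requestQueue,
        proxyConfiguration,
        async requestHandler({ request, enqueueLinks, log }) {
            log.info(`Processing ${request.url}`);
            // Newly discovered links go into the same named queue.
            await enqueueLinks();
        },
    });

    // Start URLs that are already in the queue are deduplicated.
    await crawler.run(startUrls);
}
```

Calling resumeCrawl(['https://example.com'], ['http://user:pass@proxy.example:8000']) after a stopped run would then continue with whatever is still pending in the queue, this time through the proxy.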

national-goldOP•3y ago

Thanks

foreign-sapphire•3y ago

You can also do it by stopping the process and then starting it again without the storage purge: crawlee run --no-purge. We are also working on graceful abort: https://github.com/apify/crawlee/issues/1531
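
If the crawler is started directly rather than through the CLI, the same effect can probably be had in code. The sketch below assumes Crawlee's Configuration class and its purgeOnStart option; the function name createResumableCrawler is hypothetical.

```typescript
import { CheerioCrawler, Configuration } from 'crawlee';

// Hypothetical factory: builds a crawler that keeps the default storages
// (including the default request queue) from the previous local run.
export function createResumableCrawler(): CheerioCrawler {
    // purgeOnStart: false asks Crawlee not to wipe default storages on start.
    const config = new Configuration({ purgeOnStart: false });

    return new CheerioCrawler(
        {
            async requestHandler({ request, enqueueLinks, log }) {
                log.info(`Processing ${request.url}`);
                await enqueueLinks();
            },
        },
        config,
    );
}
```

Running createResumableCrawler().run() would then continue from the requests left in local storage instead of starting with an empty queue.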


national-goldOP•3y ago

Thanks
