Resume a crawler from the previous run's request queues, locally and on Apify
Is it possible to stop a crawler and resume it from the previous run's request queues?
I have a crawler that has been running locally for a couple of hours. I am getting throttled because I use a single IP, so I would like to add proxies to speed it up. Starting from scratch would be a waste of time, so I want to reuse my existing request queues. Is this possible?
Also, is this possible on Apify?
4 Replies
deep-jade•3y ago
Use a named request queue instead of an unnamed one. A named queue is persisted. The default request queue is unnamed and is tied to the actor's run.
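A minimal sketch of the named-queue approach, assuming Crawlee v3 and a CheerioCrawler (the thread does not say which crawler class is used). The queue name, the proxy URL, and the function name `resumeFromNamedQueue` are placeholders for illustration. Requests that were already handled stay marked as handled in the named queue, so re-adding the start URLs should not cause them to be crawled again.

```typescript
import { CheerioCrawler, ProxyConfiguration, RequestQueue } from 'crawlee';

// Hypothetical queue name; any stable name works as long as every run reuses it.
const QUEUE_NAME = 'my-crawl-queue';

async function resumeFromNamedQueue(startUrls: string[]): Promise<void> {
    // Opens the named queue, or creates it on the first run.
    // Named queues are kept between runs instead of being purged on start.
    const requestQueue = await RequestQueue.open(QUEUE_NAME);

    // Placeholder proxy URL (assumption); replace with your own proxies.
    const proxyConfiguration = new ProxyConfiguration({
        proxyUrls: ['http://user:pass@proxy.example.com:8000'],
    });

    const crawler = new CheerioCrawler({
        requestQueue,
        proxyConfiguration,
        async requestHandler({ request, enqueueLinks, log }) {
            log.info(`Processing ${request.url}`);
            // New links go into the same named queue.
            await enqueueLinks();
        },
    });

    // Start URLs already present in the queue are deduplicated by their unique key.
    await crawler.run(startUrls);
}
```

Calling `await resumeFromNamedQueue(['https://example.com'])` on each run should then continue from whatever is still pending in the queue. On the Apify platform, the Apify SDK's `Actor.openRequestQueue('my-crawl-queue')` is the likely equivalent for opening a named queue, though that part is not covered in this thread.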
national-goldOP•3y ago
Thanks
foreign-sapphire•3y ago
You can also do it by canceling the process and then starting it again without the storage purge:
crawlee run --no-purge
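If the crawler is started with a plain `node` or `npm start` command rather than through the Crawlee CLI, the same effect can probably be achieved in code. This is a rough sketch, assuming Crawlee v3 and its `purgeOnStart` configuration option; the function name `resumeDefaultQueue` is made up for the example.

```typescript
import { CheerioCrawler, Configuration } from 'crawlee';

async function resumeDefaultQueue(startUrls: string[]): Promise<void> {
    // Keep the default storages (including the default request queue)
    // from the previous local run instead of purging them on start.
    Configuration.getGlobalConfig().set('purgeOnStart', false);

    const crawler = new CheerioCrawler({
        async requestHandler({ request, enqueueLinks, log }) {
            log.info(`Processing ${request.url}`);
            await enqueueLinks();
        },
    });

    // Pending requests from the previous run are still in the default queue.
    await crawler.run(startUrls);
}
```

Unlike the named-queue approach, this relies on the local storage directory from the previous run still being present.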
We are also figuring out graceful abort: https://github.com/apify/crawlee/issues/1531
Graceful abort of the runtime process (emulation of Apify platform ...
Motivation Apify platform provides a nice feature to gracefully abort an actor run. Instead of exiting the process right away, a user can choose to abort gracefully which makes the Apify platform: ...
national-goldOP•3y ago
Thanks