Crawlee & Apify•3y ago

About "Requests Queue"

What is the purpose of Request Queue storages? Does a queue automatically fetch requests, or is it purely for storing URLs? Thank you in advance (sorry for the noob question).

3 Replies

optimistic-gold•3y ago

On the Apify platform, requests should be stored, e.g. in case of a migration event: this way, after the migration, the actor can pretty much continue where it left off. Locally it's pretty much the same: you can abort the actor and restart it where it left off. Or, with the latest Crawlee, you could use memoryStorage for the queue: https://github.com/apify/crawlee/pull/1901, but that's definitely not a good idea for running on the platform.
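To illustrate the "continue where it left off" idea, here is a minimal toy sketch in plain TypeScript. It is not Crawlee's RequestQueue; `ToyRequestQueue`, `drain`, and the example URLs are made-up names for illustration. The queue itself only stores and deduplicates URLs and tracks which ones are handled; the crawler loop is what pulls requests out and does the fetching.

```typescript
// Toy model only: this is NOT Crawlee's RequestQueue implementation.
type QueuedRequest = { url: string; handled: boolean };

class ToyRequestQueue {
    private requests: QueuedRequest[] = [];

    // Store a URL once; duplicates are ignored (the URL acts as the unique key).
    addRequest(url: string): boolean {
        if (this.requests.some((r) => r.url === url)) return false;
        this.requests.push({ url, handled: false });
        return true;
    }

    // Return the next pending request, or undefined when none remain.
    fetchNextRequest(): QueuedRequest | undefined {
        return this.requests.filter((r) => !r.handled)[0];
    }

    markRequestHandled(url: string): void {
        for (const r of this.requests) {
            if (r.url === url) r.handled = true;
        }
    }

    // Serialize the state so that a restarted run can resume from it.
    serialize(): string {
        return JSON.stringify(this.requests);
    }

    static restore(snapshot: string): ToyRequestQueue {
        const queue = new ToyRequestQueue();
        queue.requests = JSON.parse(snapshot) as QueuedRequest[];
        return queue;
    }
}

// The "crawler" side: it asks the queue for work; the queue never fetches anything itself.
function drain(queue: ToyRequestQueue, limit: number): string[] {
    const visited: string[] = [];
    while (visited.length < limit) {
        const next = queue.fetchNextRequest();
        if (!next) break;
        visited.push(next.url); // a real crawler would download and parse the page here
        queue.markRequestHandled(next.url);
    }
    return visited;
}
```

If the run is interrupted after one request, restoring from the serialized snapshot hands out only the URLs that were not yet handled. A purely in-memory queue has no such snapshot to come back to after a restart.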

harsh-harlequin•17mo ago

Why is it not good for running on the platform, @Andrey Bykov? I found out that the request queue is expensive for my 95 scrapers. It takes almost 60% of my budget. Do you think we can use something cheaper?

optimistic-gold•17mo ago

That depends. If you are scraping a static list of URLs, meaning you have a set of URLs that you visit and extract data from without adding more, then you could use RequestList (you will have to explicitly specify it in the crawler options). If you are adding more URLs during the run, then RequestQueue is the way to go. More on RequestList here: https://crawlee.dev/api/core/class/RequestList

RequestList | API | Crawlee

Represents a static list of URLs to crawl. The URLs can be provided either in code or parsed from a text file hosted on the web. RequestList is used by BasicCrawler, CheerioCrawler, PuppeteerCrawler and PlaywrightCrawler as a source of URLs to crawl. Each URL is represented using an instance of the ...
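The static-versus-dynamic distinction above can be sketched with a toy example in plain TypeScript. This only mimics the idea and does not use the Crawlee API; `fakeSite`, `crawlStaticList`, and `crawlWithQueue` are hypothetical names, and the "site" is an in-memory map rather than real pages.

```typescript
// Toy sketch only: it mimics the idea, not Crawlee's RequestList/RequestQueue API.
// Hypothetical in-memory "site": each page maps to the links found on it.
const fakeSite: Record<string, string[]> = {
    'https://example.com/': ['https://example.com/a', 'https://example.com/b'],
    'https://example.com/a': ['https://example.com/b'],
    'https://example.com/b': [],
};

// Static list: visit exactly the URLs given up front; nothing is added during the run.
function crawlStaticList(urls: string[]): string[] {
    const visited: string[] = [];
    for (const url of urls) {
        visited.push(url); // a real crawler would fetch and extract data here
    }
    return visited;
}

// Dynamic queue: links discovered along the way are enqueued, skipping duplicates.
function crawlWithQueue(startUrls: string[]): string[] {
    const seen: Record<string, boolean> = {};
    const pending: string[] = [];
    const enqueue = (url: string): void => {
        if (!seen[url]) {
            seen[url] = true;
            pending.push(url);
        }
    };
    startUrls.forEach(enqueue);

    const visited: string[] = [];
    while (pending.length > 0) {
        const url = pending.shift() as string;
        visited.push(url);
        (fakeSite[url] || []).forEach(enqueue);
    }
    return visited;
}
```

With the same single start URL, the static version visits one page, while the queue version also reaches the two linked pages. If every URL is known before the run starts, the queue's extra bookkeeping buys nothing.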
