Example of manually adding requests to requestQueue
Hi
I have HTML like:
..
<a href="subpage.php?id=1">Title 1</a>
<a href="subpage.php?id=2">Title 2</a>
..
and I want to add to the request queue:
- the URL from href
- the title from the <a> text
But I want to fill the request queue manually.
I don't want to use enqueueLinks() because the <a> elements are tangled inside the HTML and are not easy to isolate.
BTW, I'm still learning TypeScript, so sorry for possibly basic questions ;]
absent-sapphire•3y ago
Hey there! You could use requestQueue.addRequest() - https://crawlee.dev/api/core/class/RequestQueue#addRequests or crawler.addRequests([]) - https://crawlee.dev/api/core/class/RequestQueue#addRequests.
You would also need to extract the URL and title first. Then you could add the extracted title to the Request userData object - https://crawlee.dev/api/core/class/Request
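For illustration, here is a minimal sketch of that idea in TypeScript, not a definitive implementation. The helper name extractLinkRequests, the LinkRequest type and the https://example.com/list.php base URL are made up for this example, and the regex is only there to keep the snippet dependency-free; inside a CheerioCrawler handler you would more likely pick the anchors with `$`. The objects it builds have the { url, userData } shape that addRequest() / addRequests() accept.

```typescript
// Plain object with a URL plus free-form userData (name is hypothetical).
interface LinkRequest {
    url: string;
    userData: { title: string };
}

// Hypothetical helper: pulls the href and the link text out of raw HTML
// and resolves relative hrefs against the page they were found on.
function extractLinkRequests(html: string, baseUrl: string): LinkRequest[] {
    const anchor = /<a\s[^>]*href="([^"]*)"[^>]*>([\s\S]*?)<\/a>/gi;
    const requests: LinkRequest[] = [];
    let match: RegExpExecArray | null;
    while ((match = anchor.exec(html)) !== null) {
        requests.push({
            url: new URL(match[1], baseUrl).href,
            userData: { title: match[2].trim() },
        });
    }
    return requests;
}

// Sample markup from the question; the base URL is an assumption.
const html = `
<a href="subpage.php?id=1">Title 1</a>
<a href="subpage.php?id=2">Title 2</a>
`;
const requests = extractLinkRequests(html, 'https://example.com/list.php');
console.log(requests);

// Inside a Crawlee requestHandler (assumed setup) the result could be queued with:
//   await crawler.addRequests(requests);
// and the title read back in the subpage handler via request.userData.title.
```

Because the title travels in userData, the handler for each subpage can read it without having to parse the listing page again.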
other-emeraldOP•3y ago
Thank you. I will code it that way ;]