Trying to use enqueueLinksByClickingElements
The page and requestQueue parameters are required for this function, but I don't know what I should pass in.
This is the doc: https://crawlee.dev/api/playwright-crawler/namespace/playwrightClickElements#enqueueLinksByClickingElements
Thanks for the help
optimistic-gold•3y ago
You could use https://crawlee.dev/api/playwright-crawler/interface/PlaywrightCrawlingContext#enqueueLinksByClickingElements instead. It's the same function, but it's context-aware, so you don't need to provide the request queue and page; it's part of the PlaywrightCrawlingContext.
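A minimal sketch of that context-aware usage (the start URL and selector below are placeholders, not taken from the conversation):

```typescript
import { PlaywrightCrawler } from 'crawlee';

const crawler = new PlaywrightCrawler({
    async requestHandler({ request, enqueueLinksByClickingElements, log }) {
        log.info(`Processing ${request.url}`);
        // Context-aware variant: no page or requestQueue arguments needed,
        // the crawler supplies both from the current crawling context.
        await enqueueLinksByClickingElements({
            selector: 'button.load-more', // placeholder selector
        });
    },
});

await crawler.run(['https://example.com']); // placeholder start URL
```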
quickest-silverOP•3y ago
I am a little bit lost, to be fair
This is an example of what I'm trying to do
optimistic-gold•3y ago
Then it should pretty much work. The function is context-aware, so you don't need to provide the page or requestQueue params there.
quickest-silverOP•3y ago
OK thanks, but it's actually not working
optimistic-gold•3y ago
Just to clarify: there are two ways you could use this function. The one you linked above can be imported separately, and then you would need to provide page/requestQueue. When used inside the crawler, it's part of the context, and the function already knows about the page where it's being called and the requestQueue being used. Note that your link goes to the playwrightClickElements namespace, while the second link (the one I sent) goes to PlaywrightCrawlingContext.
Basically, you could use it outside of a crawler for some edge case, with your own Playwright instance and a separate request queue. But when it's used inside the crawler, that's not needed.
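For completeness, a hedged sketch of that standalone variant, where you supply `page` and `requestQueue` yourself. The import path, target URL, and selector here are assumptions; verify the import against the `playwrightClickElements` namespace docs for your Crawlee version:

```typescript
import { chromium } from 'playwright';
import { RequestQueue, playwrightClickElements } from 'crawlee';

// Launch Playwright yourself instead of letting a crawler manage it.
const browser = await chromium.launch();
const page = await browser.newPage();
await page.goto('https://example.com'); // placeholder target URL

// Open (or create) a request queue to receive the enqueued links.
const requestQueue = await RequestQueue.open();

// Standalone call: page and requestQueue must be passed explicitly.
await playwrightClickElements.enqueueLinksByClickingElements({
    page,
    requestQueue,
    selector: 'button.load-more', // placeholder selector
});

await browser.close();
```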
quickest-silverOP•3y ago
ok i got it
quickest-silverOP•3y ago
But nothing happens and I don't get an error, so I might be doing something wrong elsewhere
optimistic-gold•3y ago
Note these warnings in the docs:
IMPORTANT: To be able to do this, this function uses various mutations on the page, such as changing the Z-index of elements being clicked and their visibility. Therefore, it is recommended to only use this function as the last operation in the page.
USING HEADFUL BROWSER: When using a headful browser, this function will only be able to click elements in the focused tab, effectively limiting concurrency to 1. In headless mode, full concurrency can be achieved.
PERFORMANCE: Clicking elements with a mouse and intercepting requests is not a low level operation that takes nanoseconds. It’s not very CPU intensive, but it takes time. We strongly recommend limiting the scope of the clicking as much as possible by using a specific selector that targets only the elements that you assume or know will produce a navigation. You can certainly click everything by using the * selector, but be prepared to wait minutes to get results on a large and complex page.
Also, make sure that the selector is correct.
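Following the performance warning above, a short sketch of scoping the selector (the selector name is a placeholder):

```typescript
import { PlaywrightCrawler } from 'crawlee';

const crawler = new PlaywrightCrawler({
    async requestHandler({ enqueueLinksByClickingElements }) {
        // Narrow selector: click only the element expected to navigate.
        await enqueueLinksByClickingElements({ selector: 'a.next-page' });

        // Avoid the wildcard on large pages: it clicks every element
        // and can take minutes, per the performance warning quoted above.
        // await enqueueLinksByClickingElements({ selector: '*' });
    },
});
```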
quickest-silverOP•3y ago
OK thanks, I will check
Looks like a selector problem
After this: await page.waitForSelector("#next");
the locator seems to be hidden
locator resolved to hidden <button id="next" type="button" class="btn pagingBtn hid…>…</button>
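One way to diagnose this with plain Playwright (the target URL is a placeholder): `page.waitForSelector` waits for the `visible` state by default, so a permanently hidden button will time out with exactly this kind of "resolved to hidden" message. Inspecting the locator can show why:

```typescript
import { chromium } from 'playwright';

const browser = await chromium.launch();
const page = await browser.newPage();
await page.goto('https://example.com'); // placeholder for the real page

const next = page.locator('#next');
console.log('in DOM:', (await next.count()) > 0);         // attached at all?
console.log('visible:', await next.isVisible());          // rendered and shown?
console.log('class:', await next.getAttribute('class'));  // e.g. a hiding class

// If the button only appears after some interaction (accepting a prompt,
// scrolling, etc.), perform that first, then wait for visibility:
// await page.waitForSelector('#next', { state: 'visible' });

await browser.close();
```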
optimistic-gold•3y ago
could you share the URL?
quickest-silverOP•3y ago
optimistic-gold•3y ago
I meant the URL on which you're trying to enqueue more pages 🙂 On this one I don't see a
#next
selector at all
quickest-silverOP•3y ago
That's this one, you need to reload after accepting the prompt
optimistic-gold•3y ago
ah, I see it now 👍
Well, the button is indeed hidden, so it can't really be clicked. I don't know why it's hidden; something website-specific, apparently..
quickest-silverOP•3y ago
oh ok
optimistic-gold•3y ago
It might be easier to replicate the XHR requests. As it's a web app, it's not really reloading the page; it sends POST requests which only differ in the currentPage number
(at least that's what I saw, maybe there's more)
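To sketch that idea: replay the POST request directly, varying only `currentPage`. The endpoint URL and payload shape below are placeholders; copy the real ones from the browser's network tab:

```typescript
// Placeholder endpoint; take the real one from the network tab.
const API_URL = 'https://example.com/api/list';

// Build the POST body for a given page; only currentPage changes.
function buildPayload(currentPage: number) {
    return { currentPage, pageSize: 20 }; // assumed payload shape
}

async function fetchPage(currentPage: number): Promise<unknown> {
    const res = await fetch(API_URL, {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify(buildPayload(currentPage)),
    });
    if (!res.ok) throw new Error(`HTTP ${res.status}`);
    return res.json();
}

// Usage sketch: walk pages until the API stops returning results.
// for (let p = 1; ; p++) {
//     const data = await fetchPage(p);
//     // ...stop when `data` is empty
// }
```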
quickest-silverOP•3y ago
How could I do this?
optimistic-gold•3y ago
I guess you could start from here: https://docs.apify.com/academy/api-scraping
API scraping | Apify Documentation
Learn all about how the professionals scrape various types of APIs with various configurations, parameters, and requirements.
quickest-silverOP•3y ago
OK thanks, I will take a look