infinite scrolling of pages
I have a crawler that goes through the collection pages of stores,
scrapes their product links,
and then visits those product page links to get the product data.
When getting the product links on the collection pages,
many sites use infinite scrolling to render all the products.
How do I add infinite scrolling to the crawler's route handler below,
so that all the products are rendered before I scrape the product page URLs?
6 Replies
xenophobic-harlequinOP•2y ago
(Playwright crawler, btw)
eastern-cyan•2y ago
hey, what about using the `infiniteScroll` function: https://crawlee.dev/api/playwright-crawler/namespace/playwrightUtils#infiniteScroll
playwrightUtils | API | Crawlee
A namespace that contains various utilities for Playwright - the headless Chrome Node API.
Example usage:
```javascript
import { launchPlaywright, playwrightUtils } from 'crawlee';
// Navigate to https://www.example.com in Playwright with a POST request
const browser = await launchPlaywright();
c...
```
xenophobic-harlequinOP•2y ago
I’m not sure how to use playwrightUtils properly to keep scrolling incrementally inside my router. Sorry, I’m not very experienced with the utils.
optimistic-gold•2y ago
Here is an example of how to use it. It uses Puppeteer, but it works exactly the same with Playwright. Scroll to the `infiniteScroll` example:
https://docs.apify.com/academy/node-js/dealing-with-dynamic-pages#scraping-dynamic-content
How to scrape from dynamic pages | Academy | Apify Documentation
Learn about dynamic pages and dynamic content. How can we find out if a page is dynamic? How do we programmatically scrape dynamic content?
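For the router case asked about above, here is a minimal sketch of what a collection-page handler could look like. It assumes the scroll function passed in is `playwrightUtils.infiniteScroll` from `crawlee`. The `makeCollectionHandler` helper, the `a[href*="/products/"]` selector, and the `COLLECTION` / `PRODUCT` labels are made up for illustration and would need to match the actual crawler.

```javascript
// Sketch only: `scrollFn` is expected to be playwrightUtils.infiniteScroll from 'crawlee'.
// The default selector and the 'PRODUCT' label are hypothetical placeholders.
function makeCollectionHandler(scrollFn, productSelector = 'a[href*="/products/"]') {
    return async ({ page, enqueueLinks, log }) => {
        // Scroll first, so lazily loaded products get rendered into the DOM.
        await scrollFn(page);

        // Count the product links that are present once scrolling has finished.
        const count = await page.locator(productSelector).count();
        log.info(`Found ${count} product links after scrolling`);

        // Enqueue the product pages for the (hypothetical) 'PRODUCT' handler.
        await enqueueLinks({ selector: productSelector, label: 'PRODUCT' });
    };
}

// Wiring it into a router in a real project (needs `crawlee` and `playwright` installed):
//   import { createPlaywrightRouter, playwrightUtils } from 'crawlee';
//   const router = createPlaywrightRouter();
//   router.addHandler('COLLECTION', makeCollectionHandler(playwrightUtils.infiniteScroll));
```

The point is the order: scroll, then read or enqueue the links. Passing the scroll function in as a parameter is just a choice for this sketch; calling `playwrightUtils.infiniteScroll(page)` directly inside the handler should work the same way.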
xenophobic-harlequinOP•2y ago
Thanks!
Does this implement the scroll and pause too?
optimistic-gold•2y ago
Here are the options you can pass to it to control it:
https://crawlee.dev/api/playwright-crawler/namespace/playwrightUtils#InfiniteScrollOptions
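On the "scroll and pause" question, a rough sketch of how those options might be put together. The option names (`timeoutSecs`, `waitForSecs`, `scrollDownAndUp`, `stopScrollCallback`) are from `InfiniteScrollOptions`. The values, the `buildScrollOptions` helper, and the selector in the usage comment are assumptions to tune per site.

```javascript
// Sketch only: builds an options object for playwrightUtils.infiniteScroll.
// `buildScrollOptions` is a hypothetical helper; the numbers are illustrative.
function buildScrollOptions(page, productSelector, maxProducts) {
    return {
        // Give up scrolling after this many seconds (0 would mean no time limit).
        timeoutSecs: 120,
        // How long to wait for new content to appear before treating the page as finished.
        waitForSecs: 5,
        // Scroll up a little after each scroll down; some sites need this to trigger loading.
        scrollDownAndUp: true,
        // Called after every scroll; returning true stops the scrolling early.
        stopScrollCallback: async () => {
            const count = await page.locator(productSelector).count();
            return count >= maxProducts;
        },
    };
}

// Usage inside a handler (requires `crawlee`):
//   await playwrightUtils.infiniteScroll(page, buildScrollOptions(page, 'a.product-link', 500));
```

`waitForSecs` is the closest thing to the pause: it controls how long the helper waits for more content before it stops. The `stopScrollCallback` is optional and only useful if there is a known upper bound on how many products are wanted.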