conscious-sapphire
How to determine whether dynamic content has loaded (PuppeteerCrawler)
In the requestHandler I'm trying to click the pagination "next" button, but I cannot determine whether the content has changed.
How can I do that? waitForNetworkIdle does not seem to work here. Any ideas? See the GIF.
Here's my code so far. Currently the button is clicked fine and the data is fetched fine, but it hangs at the end. I guess waitForNetworkIdle never resolves.

4 Replies
unwilling-turquoise•13mo ago
Hi @4unkur
You might want to try Page.waitForFunction() method
https://pptr.dev/api/puppeteer.page.waitforfunction
Or you could wait for a specific selector that is loaded when the request is resolved
https://pptr.dev/api/puppeteer.page.waitforselector
Or you could wait for the request that fetches the data with Page.waitForResponse() method
https://pptr.dev/api/puppeteer.page.waitforresponse
It depends on what works best for you 🙂. Hope this helps.
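For the waitForResponse option, here is a rough sketch. It starts waiting for the response before clicking, so a fast response is not missed. The helper name, the `button.next` selector and the `/api/items` URL fragment are placeholders I made up, not taken from your site.

```javascript
// Click a pagination button and wait for the XHR/fetch response it triggers.
// `buttonSelector` and `urlPart` are hypothetical placeholders for your site.
async function clickAndWaitForResponse(page, { buttonSelector, urlPart, timeout = 10000 }) {
  // Register the wait first, then click; both promises run concurrently.
  const [response] = await Promise.all([
    page.waitForResponse(
      // Match only successful responses whose URL contains the given fragment.
      (res) => res.url().includes(urlPart) && res.ok(),
      { timeout },
    ),
    page.click(buttonSelector),
  ]);
  return response;
}

// Possible use inside a requestHandler:
// const response = await clickAndWaitForResponse(page, {
//   buttonSelector: 'button.next',
//   urlPart: '/api/items',
// });
```

Keep in mind this only tells you that the data arrived; the page may update the DOM slightly later.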
conscious-sapphireOP•13mo ago
@Lukas Celnar Thank you for your response.
waitForSelector does not seem to work, as the HTML is already there; it just updates once the AJAX request is complete.
waitForResponse fires before the DOM is changed, so this does not work either. I was able to make it work by adding a 3-second timer after the response is ready, but I guess that's not the right way.
waitForFunction: I am not sure how I can use it in my case.
Anyway, I was able to implement the scraper by adding the next page URL to the request queue instead. So the task is complete, but the question I asked is still open for me.
unwilling-turquoise•13mo ago
Waiting for some time with page.waitForTimeout is another way, but if the website responds slowly it will cause trouble, so I would use it only as a last resort if nothing else works.
With waitForFunction you could save the initial content:
const initialContent = await page.evaluate(() => document.querySelector('[data-testid="content-element"]').textContent);
and then wait for changes
await page.waitForFunction(
    (initialContent) => {
        const newContent = document.querySelector('[data-testid="content-element"]').textContent;
        return newContent !== initialContent;
    },
    { timeout: 10000 },
    initialContent
);
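Putting the two steps together with the click, a hedged sketch of a reusable helper could look like this. The helper name and its option names are my own, not part of Puppeteer; it passes the selector in as an argument and tolerates the element being briefly missing while the page re-renders.

```javascript
// Click a button and wait until the text of a content element changes.
// `buttonSelector` and `contentSelector` are hypothetical; use your own selectors.
async function clickAndWaitForChange(page, { buttonSelector, contentSelector, timeout = 10000 }) {
  // Snapshot the current text; page.$eval throws if nothing matches the selector.
  const before = await page.$eval(contentSelector, (el) => el.textContent);

  await page.click(buttonSelector);

  // Runs in the browser context and is polled until it returns a truthy value.
  await page.waitForFunction(
    (selector, previous) => {
      const el = document.querySelector(selector);
      return el !== null && el.textContent !== previous;
    },
    { timeout },
    contentSelector,
    before,
  );

  // Return the updated text so the caller can parse it.
  return page.$eval(contentSelector, (el) => el.textContent);
}

// Possible use inside a requestHandler:
// const text = await clickAndWaitForChange(page, {
//   buttonSelector: 'button.next',
//   contentSelector: '[data-testid="content-element"]',
// });
```

If the next page happens to render identical text, the wait will time out and throw, so wrap the call in try/catch if that can happen on your site.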
But if you can just add the URLs to the request queue, then I would go with that approach.
conscious-sapphireOP•13mo ago
Thank you @Lukas Celnar !