Navigation timed out after 60 seconds.

I'm scraping a website. If I run it in headless mode I get this error; if I run it headed, the page loads completely (although the loading wheel keeps spinning somehow). These are my routes:
router.addDefaultHandler(async ({ enqueueLinks, log, page, request }) => {
    if (request.url.includes('/ayuda/')) {
        await page.close();
        return;
    }
    if (request.url.includes('sitemaps.org')) {
        await page.close();
        return;
    }
    try {
        await page.waitForSelector('#cookies-agree');
        await page.click('#cookies-agree');
        log.info('cookies button clicked');
    } catch (error) {
        log.error('cookies button not available');
    }
    await page.setViewport({
        width: 1920,
        height: 6000,
    });

    const content = await page.content();
    const securityCheck = await checkForChallengeValidation(page, request, content, log);
    if (!securityCheck) {
        // throwing from the handler makes Crawlee retry the request
        throw new RetryRequestError();
    }

    await enqueueLinks({
        // regexps: [/(\/supermercado\/[a-zA-Z]+)/],
        selector: 'li._next > a._pagination_link',
        forefront: true,
    });
    await enqueueLinks({
        // regexps: [/(\/supermercado\/[0-9]+)/],
        selector: 'a.product_link',
        label: 'product',
        forefront: true,
    });

    await page.close();
    page = null;
});
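checkForChallengeValidation and RetryRequestError aren't shown above; a minimal sketch of what they might look like (the names and the detection logic here are assumptions, not the actual implementation) — throwing any error from the handler is enough for Crawlee to retry the request up to maxRequestRetries:

    // Hypothetical error type: throwing it (or any Error) from the requestHandler
    // tells Crawlee the request failed, so it gets retried.
    class RetryRequestError extends Error {
        constructor() {
            super('Challenge page detected, retrying request');
        }
    }

    // Hypothetical helper: returns false when the response looks like a
    // bot-protection / challenge page rather than the real content.
    async function checkForChallengeValidation(page, request, content, log) {
        const looksLikeChallenge = /captcha|challenge-form|cf-browser-verification/i.test(content);
        if (looksLikeChallenge) {
            log.warning(`Challenge detected on ${request.url}`);
            return false;
        }
        return true;
    }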
5 Replies
NeoNomade (OP) • 3y ago
Crawler config:
const crawler = new PuppeteerCrawler({
    proxyConfiguration,
    requestHandler: router,
    maxConcurrency: 16,
    maxRequestRetries: 15,
    maxRequestsPerMinute: 5,
    headless: false,
    useSessionPool: true,
    failedRequestHandler({ request }) {
        log.debug(`Request ${request.url} failed 15 times.`);
    },
});
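For reference, the 60 seconds in the error matches Crawlee's default navigation timeout. If the target pages genuinely need longer to finish navigating, raising navigationTimeoutSecs on the crawler is one option — a sketch, not something suggested in the thread:

    const crawler = new PuppeteerCrawler({
        proxyConfiguration,
        requestHandler: router,
        // give slow pages more time before "Navigation timed out" is thrown
        navigationTimeoutSecs: 120,
        maxRequestRetries: 15,
        headless: false,
    });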
Pepa J • 3y ago
@NeoNomade What is the last log line that you see, and what does the error look like?
equal-aqua • 3y ago
Was also dealing with this issue for a few sites today. Never really fixed it, but I tried changing the default confirmation of page load from "load" to "domcontentloaded" (the first event to fire), and that seemed to help for some pages:

    preNavigationHooks: [
        async (context, gotoOptions) => {
            gotoOptions.waitUntil = 'domcontentloaded';
        },
    ],

I then also included this in my router:

    let errorLoading = false;
    try {
        // try waiting until the network is idle, for a max of 10 seconds, before just moving on
        // to pull the text etc. -- this happens far more often than I would have expected
        await utils.puppeteer.gotoExtended(page, request, { waitUntil: 'networkidle2', timeout: 10000 });
    } catch {
        errorLoading = true;
        log.error(`Waited 10 seconds for network to be more idle on ${request.loadedUrl} but never happened, moving on anyway with pulling html etc.`);
    }
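For context, that hook goes into the crawler options; a minimal sketch of how it would plug into the PuppeteerCrawler setup from earlier in the thread:

    const crawler = new PuppeteerCrawler({
        proxyConfiguration,
        requestHandler: router,
        // consider navigation done once the DOM is parsed,
        // instead of waiting for the full "load" event
        preNavigationHooks: [
            async (context, gotoOptions) => {
                gotoOptions.waitUntil = 'domcontentloaded';
            },
        ],
    });

Since "domcontentloaded" fires as soon as the HTML is parsed, the handler may still need its own waits (e.g. waitForSelector) for content that arrives later.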
equal-aqua • 3y ago
An example of one of the URLs that was giving me trouble: https://pezeshkanekhoob.com
