Crawlee & Apify•3y ago
fascinating-indigo

Finishing requestHandler() request early

I'm using Puppeteer with requestHandler(). Is there a way to end the request early, instead of waiting for the whole "script" to finish, so I can move on to the next URL in the queue?
3 Replies
Pepa J•3y ago
Hello @liutc, I'm not sure exactly what you mean. You can end processing of the request from inside the handler simply by returning early.
router.addHandler('START', async ({ $, crawler, request }) => {
    // ...

    if (nothingToScrape) {
        return; // successfully ends this request; the crawler continues with the next one
    }
    // ...
});
fascinating-indigoOP•3y ago
Understood. I was calling page.close() and browser.close(). Basically, the page has a setInterval function that checks for page changes every second; if the change I'm looking for doesn't appear within 10 seconds, I just want to close the current session and move on to the next request in the queue 🙂
Pepa J•3y ago
If that is everything the route handler does, then the implementation could be something simple like:
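router.addDefaultHandler(async ({ page, log }) => {
    // Check the page up to 10 times, roughly once per second
    for (let i = 0; i < 10; i++) {
        const content = await page.content();
        // Check the page content for the expected change
        if (/foobar/.test(content)) {
            log.info(`Found foobar!`);
            break;
        }
        await utils.sleep(1000);
    }
    // When the handler returns, the crawler moves on to the next request
});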
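The polling loop above can also be pulled out into a small reusable helper. The following is only a minimal sketch: `pollUntil`, its `attempts` and `intervalMs` options, and the local `sleep` function are hypothetical names made up for illustration, not part of Crawlee or Puppeteer.

```javascript
// Hypothetical helper, not a Crawlee or Puppeteer API.
// Resolves after `ms` milliseconds.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Calls `check` up to `attempts` times, waiting `intervalMs` between tries.
// Resolves to true as soon as `check` returns a truthy value, otherwise false.
async function pollUntil(check, { attempts = 10, intervalMs = 1000 } = {}) {
    for (let i = 0; i < attempts; i++) {
        if (await check()) return true;
        // No need to wait again after the final attempt
        if (i < attempts - 1) await sleep(intervalMs);
    }
    return false;
}
```

Inside a handler, the check could be something like `async () => /foobar/.test(await page.content())`. If `pollUntil` resolves to false, the handler can simply return, and the crawler picks up the next request without any manual page.close() or browser.close().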
