JSDOMCrawler, website breaks crawlee

Hey, after I get a warning the whole process stops. Is it possible to catch it?

WARN JSDOMCrawler: Reclaiming failed request back to the list or queue. ReferenceError: request is not defined
    at JSDOMCrawler.requestHandler (/home/vue/repo/test/fofo.js:14:31)
    at /home/vue/repo/test/node_modules/@crawlee/http/internals/http-crawler.js:336:81
    at wrap (/home/vue/repo/test/node_modules/@apify/timeout/index.js:52:27)
    at /home/vue/repo/test/node_modules/@apify/timeout/index.js:66:7
    at AsyncLocalStorage.run (node:async_hooks:319:14)
    at /home/vue/repo/test/node_modules/@apify/timeout/index.js:65:13
    at new Promise (<anonymous>)
    at addTimeoutToPromise (/home/vue/repo/test/node_modules/@apify/timeout/index.js:59:10)
    at JSDOMCrawler._runRequestHandler (/home/vue/repo/test/node_modules/@crawlee/http/internals/http-crawler.js:336:53)
    at runMicrotasks (<anonymous>)
{"id":"AbgRUNVvKFwQD3K","url":"https://mmmiyama.com","retryCount":1}
2 Replies
like-gold (OP), 2y ago
An example would be:
const { JSDOMCrawler, log } = require("crawlee");

(async () => {
    const crawler = new JSDOMCrawler({
        runScripts: true,
        requestHandler: async ({ window }) => {
            const { document } = window;
            log.debug(`Processing ${request.url}...`);
            // Extract data from the page
            const title = document.title;
            log.debug(`title: ${title}`);
        },
    });
    await crawler.run(["https://klaviyo.com"]);
    log.debug("Crawler finished.");
})();
I never get the "Crawler finished." message.
absent-sapphire, 2y ago
The error says request is not defined. You need to destructure it from the CrawlingContext (https://crawlee.dev/api/next/core/interface/CrawlingContext#request):
requestHandler: async ({ request }) => {
...
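For reference, here is a rough sketch of how the example from the question might look with request destructured alongside window. The helper name extractPageInfo is made up for illustration. The failedRequestHandler option is my suggestion for the "can I catch it" part, not something confirmed in this thread, and log.info is used on the assumption that debug messages are hidden at the default log level.

```javascript
// Hypothetical helper: pulls the URL and title out of the crawling context.
// The key fix is destructuring `request` together with `window`.
function extractPageInfo({ request, window }) {
    const { document } = window;
    return { url: request.url, title: document.title };
}

// Wires the helper into a JSDOMCrawler. Requires the `crawlee` package.
async function main() {
    const { JSDOMCrawler, log } = require("crawlee");

    const crawler = new JSDOMCrawler({
        runScripts: true,
        requestHandler: async (context) => {
            const { url, title } = extractPageInfo(context);
            // Assumption: debug-level messages are hidden by default, so use info.
            log.info(`Processed ${url}, title: ${title}`);
        },
        // Called once a request has failed and its retries are used up,
        // so the error can be logged or handled here.
        failedRequestHandler: async ({ request }, error) => {
            log.error(`Request ${request.url} failed: ${error.message}`);
        },
    });

    await crawler.run(["https://klaviyo.com"]);
    log.info("Crawler finished.");
}
```

Call main() to start the crawl. Keeping the extraction in a small separate function makes it easy to check against a fake context without touching the network.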