Ignore URLs that match an already-crawled URL but have different query params

I do not want to crawl a URL that has already been crawled but with different query params. How can I do this?
optimistic-gold (OP), 2y ago
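import { EnqueueStrategy } from 'crawlee';

// Base URLs (query string removed) that were already enqueued.
const visitedUrls = new Set();

// Inside the crawler's requestHandler:
await enqueueLinks({
    selector: 'a[href]',
    strategy: EnqueueStrategy.SameHostname,
    transformRequestFunction: (link) => {
        const urlWithoutQuery = link.url.split('?')[0];
        // Returning a falsy value skips the link.
        if (visitedUrls.has(urlWithoutQuery)) return false;
        visitedUrls.add(urlWithoutQuery);
        return { url: urlWithoutQuery };
    },
});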
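The snippet above strips the query with `split('?')`, which leaves a `#fragment` in place when a link has no query string. A minimal sketch of a variant that uses the standard `URL` class instead is shown here; `stripQuery` and `createQueryDedupTransform` are hypothetical helper names, not part of Crawlee, and the sketch assumes every link is an absolute URL.

```javascript
// Hypothetical helper: removes the query string and fragment from an absolute URL.
function stripQuery(rawUrl) {
    const parsed = new URL(rawUrl);
    parsed.search = '';
    parsed.hash = '';
    return parsed.href;
}

// Hypothetical factory: returns a function shaped like a transformRequestFunction,
// backed by its own set of already-seen base URLs.
function createQueryDedupTransform(seen = new Set()) {
    return (request) => {
        const baseUrl = stripQuery(request.url);
        // A falsy return value tells enqueueLinks to skip this link.
        if (seen.has(baseUrl)) return false;
        seen.add(baseUrl);
        // Keep any other request options, but enqueue the query-less URL.
        return { ...request, url: baseUrl };
    };
}
```

It could then be passed as `transformRequestFunction: createQueryDedupTransform()`. The set lives in memory only, so this sketch does not remember URLs across separate crawler runs.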
