Efficient CSS selectors

Hey, I'm looking for some help picking more efficient CSS selectors. I've looked into a few tools but never had much luck speeding anything up. Currently some of my textContent variables are timing out at 10 seconds, and a request takes anywhere from 20 to 30 seconds. The data is stored and written to the dataset, and in total there are 17 selectors using textContent() and 6 using count(). I'm using a proxy, currently set in the launchContext for the Chromium launcher, so that accounts for some of the latency, but I wasn't expecting 20-30 seconds 😅
6 Replies
Pepa J (2y ago)
Hi @Infernoman, in my experience, using
const texts = await page.evaluate(() => {
    return [...document.querySelectorAll('...')].map((el) => el.textContent);
});
is generally much faster than using Puppeteer/Playwright methods, especially in loops. Or do you have a specific use case?
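As a rough sketch of how that idea could extend to the 17 textContent() and 6 count() lookups mentioned above: everything can be collected in a single page.evaluate() call, so there is one round trip to the browser instead of one per selector. The helper name extractAll and the selector maps are hypothetical, not something from this thread. It assumes a Puppeteer or Playwright page object, and that the elements are already present once the page has loaded, since nothing here waits for them.

```javascript
// Hypothetical helper: gathers text and counts for many selectors
// in ONE page.evaluate() call instead of one call per selector.
async function extractAll(page, textSelectors, countSelectors) {
    return page.evaluate(({ textSelectors, countSelectors }) => {
        // This callback runs inside the browser, so `document` is the page DOM.
        const texts = {};
        for (const [name, selector] of Object.entries(textSelectors)) {
            const el = document.querySelector(selector);
            // A missing element yields null right away instead of waiting for a timeout.
            texts[name] = el ? el.textContent.trim() : null;
        }
        const counts = {};
        for (const [name, selector] of Object.entries(countSelectors)) {
            counts[name] = document.querySelectorAll(selector).length;
        }
        return { texts, counts };
    }, { textSelectors, countSelectors });
}
```

It could be called as `await extractAll(page, { title: 'h1' }, { items: 'li.item' })`, where both selectors are placeholders. A null in the result points to a selector that matched nothing, which may be what was behind the 10-second timeouts.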
equal-jade (OP, 2y ago)
No, I don't have a specific use case at all. I can switch to another method if it's faster. I'll give that a shot shortly and let you know how it goes. Thank you! I noticed a few references in the manual saying you could block image loading. I'm just not sure where to put it. 🤦‍♂️
Pepa J (2y ago)
It should be as easy as:
const crawler = new PuppeteerCrawler({
    // ...
    preNavigationHooks: [
        async ({ blockRequests }) => {
            // Block common static resources before each navigation.
            await blockRequests();
        },
    ],
});
It should even have some decent default blocking values. Otherwise, you can pass more specific filters as options to blockRequests.
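A minimal sketch of passing more specific filters, assuming Crawlee's blockRequests options, where extraUrlPatterns is added on top of the default block list. The hook name blockHeavyResources and the patterns themselves are hypothetical examples and should be adjusted to whatever the target site actually loads.

```javascript
// Hypothetical extra patterns: a font format and two third-party hosts.
const EXTRA_BLOCKED_PATTERNS = ['.woff2', 'googletagmanager.com', 'doubleclick.net'];

// Pre-navigation hook: keeps the default block list and appends the extra patterns.
async function blockHeavyResources({ blockRequests }) {
    await blockRequests({ extraUrlPatterns: EXTRA_BLOCKED_PATTERNS });
}

// Usage (assumes PuppeteerCrawler is already imported):
// const crawler = new PuppeteerCrawler({
//     // ...
//     preNavigationHooks: [blockHeavyResources],
// });
```

Keeping the hook as a named function means the same block list can be reused across crawlers.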
equal-jade (OP, 2y ago)
amazing lol
equal-jade (OP, 2y ago)
Just the block requests, plus some third-party blocking, and we're at 7 seconds lol