CheerioCrawler headerGenerator help

Hello! I kept reading the docs but couldn't find clear information about this. With Puppeteer or Playwright we can tweak the fingerprintGenerator in browserPool. For Cheerio we have the headerGenerator from got. How can we adjust it inside the CheerioCrawler?
3 Replies
Hall, 3mo ago
Someone will reply to you shortly. In the meantime, this might help:
Louis Deconinck, 3mo ago
Here's an example of how to work with headerGeneratorOptions using the BasicCrawler. I would assume it works the same way for the CheerioCrawler: https://crawlee.dev/docs/next/guides/got-scraping#useheadergenerator
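To make the suggestion above concrete, here is a minimal sketch of a hook that sets headerGeneratorOptions on the request options. It assumes CheerioCrawler forwards the options object from preNavigationHooks to got-scraping the same way BasicCrawler's sendRequest does, which this thread does not confirm. The interfaces are local stand-ins for got-scraping's OptionsInit, and applyHeaderGeneratorOptions and headerGeneratorHook are hypothetical names; the browser, device, locale and OS values are only examples.

```typescript
// Local stand-ins for the relevant part of got-scraping's `OptionsInit`,
// so the sketch compiles without installing anything.
interface HeaderGeneratorOptions {
    browsers?: Array<string | { name: string; minVersion?: number; maxVersion?: number }>;
    devices?: string[];
    locales?: string[];
    operatingSystems?: string[];
}

interface RequestOptions {
    headers?: Record<string, string | undefined>;
    useHeaderGenerator?: boolean;
    headerGeneratorOptions?: HeaderGeneratorOptions;
}

// Hypothetical helper: asks for headers resembling desktop Chrome on Windows.
// The concrete values below are illustrative, not recommendations.
function applyHeaderGeneratorOptions(opts: RequestOptions): RequestOptions {
    opts.useHeaderGenerator = true;
    opts.headerGeneratorOptions = {
        browsers: [{ name: 'chrome', minVersion: 100 }],
        devices: ['desktop'],
        locales: ['en-US'],
        operatingSystems: ['windows'],
    };
    return opts;
}

// Hook with the (crawlingContext, options) shape used by preNavigationHooks.
const headerGeneratorHook = async (_crawlingContext: unknown, opts: RequestOptions): Promise<void> => {
    applyHeaderGeneratorOptions(opts);
};
```

With Crawlee installed, the hook would presumably be passed as preNavigationHooks: [headerGeneratorHook] in the CheerioCrawler options. Since the forwarding behaviour is an assumption here, it is worth checking the outgoing request headers to confirm the options take effect.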
eastern-cyan, 3mo ago
Hi! You could also try adding the following option to CheerioCrawler:
preNavigationHooks: [
    async (crawlingContext, opts: OptionsInit) => {
        opts.headers = {
            ...opts.headers,
            // your headers
        };
    },
]
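The header-merging hook above can be sketched as a small standalone helper. This is only an illustration of the spread order: keys listed after ...opts.headers replace existing keys with the same name, and everything else is kept. HookOptions is a local stand-in for OptionsInit, mergeHeaders and customHeadersHook are hypothetical names, and the accept-language value is just an example.

```typescript
// Local stand-in for the part of `OptionsInit` that the hook touches.
interface HookOptions {
    headers?: Record<string, string | undefined>;
}

// Hypothetical helper: merges extra headers into the request options.
// Existing headers are kept unless `extra` defines the same key.
function mergeHeaders(opts: HookOptions, extra: Record<string, string>): HookOptions {
    opts.headers = {
        ...opts.headers, // spreading `undefined` is allowed and adds nothing
        ...extra,
    };
    return opts;
}

// Hook with the same (crawlingContext, options) shape as in the reply above.
const customHeadersHook = async (_crawlingContext: unknown, opts: HookOptions): Promise<void> => {
    mergeHeaders(opts, { 'accept-language': 'en-US,en;q=0.9' }); // example value
};
```

The hook would then take the place of the inline function in the preNavigationHooks array.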
