custom CheerioCrawler User-Agent?

Hi, could anyone show me an example of setting a custom User-Agent in CheerioCrawler? I tried preNavigationHooks but got a strange error:
preNavigationHooks: [
    function customUserAgent(_ctx, opts = {}) {
        opts.headers = opts.headers || {};
        opts.useHeaderGenerator = true;
        opts.headerGeneratorOptions = {
            browsers: ['chrome', 'safari'],
            devices: ['mobile'],
            operatingSystems: ['android', 'ios'],
        };
    },
]
But I got this error: RequestError: No headers based on this input can be generated. I tried the same options with header-generator directly and they work, but they don't work in the CheerioCrawler hooks.
7 Replies
foreign-sapphire•3y ago
How about using the search function in the forum? https://discord.com/channels/801163717915574323/1022116407053394011
continuing-cyan•3y ago
Like this:
preNavigationHooks: [
    (crawlingContext) => {
        const { request } = crawlingContext;
        request.headers = webClientHeaders;
    },
],
ambitious-aquaOP•3y ago
Thanks, I had searched before. My question is a little different.
1. I want to use useHeaderGenerator so the User-Agent changes on each request, to avoid being blocked.
2. It seems this is a bug in headerGeneratorOptions?
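For the first point, one option that avoids the header generator altogether is to rotate through a fixed list of User-Agent strings in a preNavigationHook. The following is only a minimal sketch: the `USER_AGENTS` pool and the `createRotatingUserAgentHook` helper are hypothetical names made up for illustration, and it assumes the hook receives `(crawlingContext, gotOptions)` as in the other snippets in this thread.

```javascript
// Hypothetical pool of User-Agent strings; replace with the values you want to send.
const USER_AGENTS = [
    'Mozilla/5.0 (Linux; Android 13; Pixel 7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Mobile Safari/537.36',
    'Mozilla/5.0 (iPhone; CPU iPhone OS 17_0 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/17.0 Mobile/15E148 Safari/604.1',
];

// Hypothetical helper: returns a hook that cycles through the given strings,
// one per call, wrapping around at the end of the list.
function createRotatingUserAgentHook(userAgents) {
    let index = 0;
    return (_crawlingContext, gotOptions) => {
        const userAgent = userAgents[index % userAgents.length];
        index += 1;
        // Keep headers that are already set and override only the User-Agent.
        gotOptions.headers = { ...gotOptions.headers, 'user-agent': userAgent };
    };
}

const rotatingUserAgentHook = createRotatingUserAgentHook(USER_AGENTS);

// Usage (assumes CheerioCrawler is imported from 'crawlee'):
// const crawler = new CheerioCrawler({
//     preNavigationHooks: [rotatingUserAgentHook],
//     requestHandler: async ({ $ }) => { /* ... */ },
// });
```

This only changes the User-Agent header. It does not produce a full, consistent set of browser headers the way a header generator would, so treat it as a fallback rather than a replacement.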
foreign-sapphire•3y ago
You can do this with fingerprint-suite (https://github.com/apify/fingerprint-suite/). Like this:
// https://crawlee.dev/api/cheerio-crawler/interface/CheerioCrawlerOptions#preNavigationHooks
preNavigationHooks: [
    async (_, gotOptions) => {
        // https://github.com/apify/fingerprint-suite/blob/master/docs/guides/fingerprint-generator.md
        let fingerprintGenerator = new FingerprintGenerator({
            browsers: ['chrome', 'edge', 'firefox', 'safari'],
            devices: ['desktop'],
            operatingSystems: ['windows'],
        });
        let { headers } = fingerprintGenerator.getFingerprint();
        gotOptions.headers = headers;
    },
]
Attached POC: cheerio_crawler with this.
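The hook above constructs a new FingerprintGenerator on every request. A possible variation, sketched below under assumptions, is to build the generator once and pass it into a small hook factory. The name `createFingerprintHeadersHook` is hypothetical. The only thing assumed about the generator is that `getFingerprint()` returns an object with a `headers` property, as used in the snippet above.

```javascript
// Hypothetical helper: wraps an already-constructed generator in a preNavigationHook.
// `generator` is assumed to expose getFingerprint() returning { headers, ... }.
function createFingerprintHeadersHook(generator) {
    return (_crawlingContext, gotOptions) => {
        // Ask the shared generator for a fresh set of headers on each request.
        const { headers } = generator.getFingerprint();
        // Merge instead of overwrite, so headers set by earlier hooks survive
        // unless the generated set contains the same key.
        gotOptions.headers = { ...gotOptions.headers, ...headers };
    };
}

// Usage (assumes FingerprintGenerator from 'fingerprint-generator'
// and CheerioCrawler from 'crawlee'):
// const generator = new FingerprintGenerator({
//     browsers: ['chrome', 'edge', 'firefox', 'safari'],
//     devices: ['desktop'],
//     operatingSystems: ['windows'],
// });
// const crawler = new CheerioCrawler({
//     preNavigationHooks: [createFingerprintHeadersHook(generator)],
//     requestHandler: async ({ $ }) => { /* ... */ },
// });
```

Passing the generator in also makes the hook easy to try out with a stub object before wiring it into a crawler.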
ambitious-aquaOP•3y ago
@LeMoussel thanks, I used header-generator and it works. But my question is why the built-in headerGeneratorOptions fails. Did I miss something?
metropolitan-bronze•3y ago
@petrpatek, please check why the header generator fails to generate headers here.