itchy-amethyst

Dynamic Proxies by Request?

Currently I set a single proxy configuration on my crawler:
import { Actor } from 'apify';
import { PuppeteerCrawler } from 'crawlee';

const proxyConfiguration = await Actor.createProxyConfiguration({
    groups: ['RESIDENTIAL', 'BUYPROXIES1234'],
    countryCode: 'US',
});

const myCrawler = new PuppeteerCrawler({
    // To use the proxy IP session rotation logic, you must turn the proxy usage on.
    proxyConfiguration,
    // ......
});
The crawler then uses this proxy configuration for all requests. Is there a way to intercept each request and change the proxy dynamically? For example, with something like this pseudocode:
crawler.on('request', (req) => {
    if (req.url === 'https://myweb.com') {
        req.proxy.disable();
        // or disable certain group
        req.proxy.disableIfGroup('RESIDENTIAL');
    }
    return req.continue();
});
4 Replies
Alexey Udovydchenko
See https://sdk.apify.com/api/apify/class/ProxyConfiguration#newUrl, and consider using BasicCrawler; I think the solution will be clearer that way.
other-emerald, 3y ago
We are thinking about introducing this as a class. Right now you have two options:
1. Use BasicCrawler, create multiple proxy configurations, and assign them as needed.
2. Create a local server and your own proxy configuration wrapper that forwards the traffic, though that requires some internal knowledge.
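A minimal sketch of the first option, assuming you handle the HTTP call yourself inside a BasicCrawler request handler. `createProxyRouter` and the rule shape are hypothetical helpers, not part of Crawlee or the Apify SDK; the only library method relied on is `newUrl()` of a proxy configuration, which may return a promise depending on the SDK version.

```javascript
// Hypothetical helper, not part of Crawlee or the Apify SDK.
// `rules` is an ordered list of { test, proxyConfiguration } entries;
// a null proxyConfiguration means "send this request without a proxy".
// `fallback` is the proxy configuration used when no rule matches.
function createProxyRouter(rules, fallback = null) {
    return async function pickProxyUrl(requestUrl, sessionId) {
        const parsed = new URL(requestUrl);
        const rule = rules.find(({ test }) => test(parsed));
        const config = rule ? rule.proxyConfiguration : fallback;
        // Relies only on newUrl(); await covers both sync and async variants.
        return config ? await config.newUrl(sessionId) : undefined;
    };
}
```

In a request handler you would call `pickProxyUrl(request.url, session.id)` and pass the result to whatever HTTP client you use; an `undefined` result means the request goes out directly.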
itchy-amethyst (OP), 3y ago
Excellent! It will help us reduce our proxy usage. Our immediate need is to whitelist the URLs or domains that should go through a proxy. If a release in the near future could help with that, it would be much appreciated.
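For illustration, the whitelist check itself could look roughly like the sketch below. `shouldUseProxy` is a hypothetical helper and `example.com` is a placeholder domain; neither comes from Crawlee or the Apify SDK.

```javascript
// Hypothetical helper: returns true when the URL's host is on the whitelist
// (exact match or a subdomain), meaning the request should use a proxy.
function shouldUseProxy(requestUrl, proxiedDomains) {
    const { hostname } = new URL(requestUrl);
    return proxiedDomains.some(
        (domain) => hostname === domain || hostname.endsWith(`.${domain}`),
    );
}

const proxiedDomains = ['example.com']; // placeholder domain
console.log(shouldUseProxy('https://shop.example.com/item/1', proxiedDomains)); // true
console.log(shouldUseProxy('https://myweb.com', proxiedDomains)); // false
```

Matching on the hostname rather than the full URL keeps the list short, and the leading dot in the suffix check avoids treating `notexample.com` as a match for `example.com`.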
other-emerald, 3y ago
Yeah, this is a common problem, so we want to find a good long-term solution. You can follow the discussion here: https://github.com/apify/crawlee/issues/1662
GitHub: Smart proxy configuration rotator · Issue #1662 · apify/crawlee
Currently, we only allow single ProxyConfiguration per Crawler. This is fine for most cases ...
