PlaywrightCrawler proxy issue

My crawler built with PlaywrightCrawler works fine, but I run into a problem as soon as I add a proxy. This is the code:
import { PlaywrightCrawler, ProxyConfiguration } from "crawlee";

const startUrls = ['http://quotes.toscrape.com/js/'];

const crawler = new PlaywrightCrawler({
    requestHandler: async ({ page, parseWithCheerio }) => {
        await page.waitForSelector("div.quote span.text", { timeout: 60000 });
        const $ = await parseWithCheerio();

        const quotes = $("div.quote span.text");
        quotes.each((_, element) => { console.log($(element).text()); });
    },
});

await crawler.run(startUrls);
However, when I add my proxy, I always get timeout errors:
const proxyConfiguration = new ProxyConfiguration({
    proxyUrls: ["url-to-proxy-port-im-using"],
});

// and then add it to the crawler
const crawler = new PlaywrightCrawler({
    proxyConfiguration,
    ...
The same code with the same proxy configuration works with CheerioCrawler. Can anyone help with this issue?
flat-fuchsia (OP), 7mo ago
Error:
WARN PlaywrightCrawler: Reclaiming failed request back to the list or queue. requestHandler timed out after 60 seconds.
Pepa J, 7mo ago
Hi @mktr, if you set up your proxy in your regular browser, does it work? I suspect the delays are caused by the combination of the website and the proxy you are using.
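One way to run that check without touching browser settings is to launch plain Playwright through the same proxy, outside Crawlee. The snippet below is only a rough sketch: `toPlaywrightProxy` and `checkProxyInBrowser` are hypothetical helper names, and it assumes the proxy URL has the usual `scheme://user:pass@host:port` shape and that the `playwright` package is installed.

```javascript
// Hypothetical helper: convert a proxy URL into the object shape that
// Playwright's `proxy` launch option accepts ({ server, username, password }).
function toPlaywrightProxy(proxyUrl) {
    const url = new URL(proxyUrl);
    const proxy = { server: `${url.protocol}//${url.host}` };
    // Credentials are optional; URL keeps them percent-encoded, so decode them.
    if (url.username) proxy.username = decodeURIComponent(url.username);
    if (url.password) proxy.password = decodeURIComponent(url.password);
    return proxy;
}

// Hypothetical helper: open the target page through the proxy with plain
// Playwright and report how long the quotes take to appear.
async function checkProxyInBrowser(proxyUrl, targetUrl) {
    // Loaded lazily so the helper above can be used without Playwright installed.
    const { chromium } = await import("playwright");
    const browser = await chromium.launch({ proxy: toPlaywrightProxy(proxyUrl) });
    try {
        const page = await browser.newPage();
        const started = Date.now();
        await page.goto(targetUrl, { timeout: 60000 });
        await page.waitForSelector("div.quote span.text", { timeout: 60000 });
        console.log(`Quotes rendered through the proxy after ${Date.now() - started} ms`);
    } finally {
        await browser.close();
    }
}
```

Calling `checkProxyInBrowser("url-to-proxy-port-im-using", "http://quotes.toscrape.com/js/")` with the real proxy URL should show whether the page loads through the proxy at all, and roughly how slow it is.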
flat-fuchsia (OP), 7mo ago
What do you mean by setting up the proxy in a regular browser? The proxies normally work fine, and I can also collect data from the website without a proxy.
Pepa J, 7mo ago
@mktr Unfortunately, I cannot reproduce the issue. I've just tried your code with the most common datacenter proxies I have available, and it worked for me.
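If the proxy is simply slow, one thing that may be worth trying is giving the crawler more time. The log shows the request handler being cut off at 60 seconds, which is the same budget the `waitForSelector` call uses, so the handler can expire before the selector wait does. The sketch below is an assumption about what might help, not a confirmed fix: `buildCrawlerOptions` and `main` are hypothetical names, and the timeout values are arbitrary.

```javascript
// Arbitrary example value; must stay below the handler timeout set below.
const SELECTOR_TIMEOUT_MS = 90_000;

// Hypothetical helper: crawler options with more generous timeouts for a slow proxy.
function buildCrawlerOptions(proxyConfiguration) {
    return {
        proxyConfiguration,
        // Time allowed for the page navigation itself.
        navigationTimeoutSecs: 120,
        // Time allowed for the whole requestHandler, longer than the selector wait.
        requestHandlerTimeoutSecs: 180,
        requestHandler: async ({ page, parseWithCheerio }) => {
            await page.waitForSelector("div.quote span.text", { timeout: SELECTOR_TIMEOUT_MS });
            const $ = await parseWithCheerio();
            $("div.quote span.text").each((_, element) => {
                console.log($(element).text());
            });
        },
    };
}

// Hypothetical entry point; crawlee is loaded lazily when main() is called.
async function main(proxyUrl) {
    const { PlaywrightCrawler, ProxyConfiguration } = await import("crawlee");
    const proxyConfiguration = new ProxyConfiguration({ proxyUrls: [proxyUrl] });
    const crawler = new PlaywrightCrawler(buildCrawlerOptions(proxyConfiguration));
    await crawler.run(["http://quotes.toscrape.com/js/"]);
}
```

If the requests still time out with these limits, the proxy is more likely failing in the browser than just being slow.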
