HttpCrawler proxy not working
I am trying to add a proxy to my HttpCrawler, but it doesn't seem to work. I am following the same code structure as in the docs, but I keep getting this error:
ERROR HttpCrawler: Request failed and reached maximum retries. RequestError: Client network socket disconnected before secure TLS connection was established
It works with my PuppeteerCrawler, and I have both crawlers set up the same way.
Here's my code:
import { Dataset, HttpCrawler, log, ProxyConfiguration } from "crawlee";
import fs from "fs/promises";

const proxyConfiguration = new ProxyConfiguration({
    proxyUrls: ["http://127.0.0.1:3128"],
});

const crawler = new HttpCrawler({
    // ignoreSslErrors: true,
    proxyConfiguration,
    // minConcurrency: 10,
    // maxConcurrency: 50,
    // maxRequestRetries: 5,
    // requestHandlerTimeoutSecs: 20,
    async requestHandler({ request, body, proxyInfo }) {
        // ....
    },
});
await crawler.run([
"https://www.amazon.com/Spandex-Superhero-Adults-Cosplay-Zentai/dp/B09T9DCDRW/ref=sr_1_3_sspa?crid=X39M1CTW3HPQ&keywords=spiderman+costumes&qid=1680718302&sprefix=spiderman+costumes%2Caps%2C137&sr=8-3-spons&psc=1&spLa=ZW5jcnlwdGVkUXVhbGlmaWVyPUFQUzNIQklTSVBONFYmZW5jcnlwdGVkSWQ9QTA2MzE4MzkzOVM3OVk1VDhBU1hHJmVuY3J5cHRlZEFkSWQ9QTAxNzc4NDQyQVpONVhSN1UyQjFHJndpZGdldE5hbWU9c3BfYXRmJmFjdGlvbj1jbGlja1JlZGlyZWN0JmRvTm90TG9nQ2xpY2s9dHJ1ZQ=="
]);
1 Reply
xenial-black•3y ago
The code looks good.
Maybe the issue is with the proxy itself?
Did you try using a different proxy group or different proxy URLs?
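One way to check the proxy on its own, without Crawlee involved, is to ask it to open an HTTPS tunnel directly. The sketch below uses only Node's built-in `http` module and sends a `CONNECT` request, which is how HTTPS traffic normally goes through an HTTP proxy. `checkConnectTunnel` is a hypothetical helper name, and the 5 second timeout is an arbitrary assumption. This is a rough diagnostic, not a description of what HttpCrawler does internally.

```javascript
import http from "node:http";

// Hypothetical helper (not part of Crawlee): asks an HTTP proxy to open a
// CONNECT tunnel to targetHost:targetPort and resolves with the status code
// the proxy answers with. A 200 usually means the tunnel was established.
function checkConnectTunnel(proxyUrl, targetHost, targetPort = 443, timeoutMs = 5000) {
    const { hostname, port } = new URL(proxyUrl);
    return new Promise((resolve, reject) => {
        const req = http.request({
            host: hostname,
            port: Number(port) || 80, // assume port 80 if the proxy URL has none
            method: "CONNECT",
            path: `${targetHost}:${targetPort}`,
            timeout: timeoutMs,
        });
        // Emitted when the proxy responds to the CONNECT request.
        req.on("connect", (res, socket) => {
            socket.destroy(); // only the status is needed, so close the tunnel
            resolve(res.statusCode);
        });
        // The timeout event alone does not abort the request, so destroy it.
        req.on("timeout", () => req.destroy(new Error("Proxy did not answer in time")));
        // Connection refused, reset, or closed before a response arrives.
        req.on("error", reject);
        req.end();
    });
}
```

For example, `await checkConnectTunnel("http://127.0.0.1:3128", "www.amazon.com")` should resolve with 200 if the proxy allows HTTPS tunnels to that host. A rejection or a status such as 403 or 407 would point at the proxy configuration rather than at the crawler code.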