Help Needed with Scraping Website Behind Anti-Bot Protection!

I've been trying to scrape this page: https://de.pandora.net/de/charms-armbander/charms/charms-mit-anhanger/bicolor-fahrrad-mit-drehenden-radern-charm-anhanger/763354C01.html

The script works perfectly on my local playground setup, but when I move it to a self-hosted environment, the scrape fails. I've also added a proxy server to get around the unauthorized-access error, but I still can't get it to work. Here's the response I'm getting:

{ "content": "", "markdown": "", "html": "", "linksOnPage": [], "metadata": { "sourceURL": "https://de.pandora.net/de/charms-armbander/charms/charms-mit-anhanger/bicolor-fahrrad-mit-drehenden-radern-charm-anhanger/763354C01.html", "pageStatusCode": 401, "pageError": "UNAUTHORIZED" } }

Has anyone dealt with similar issues, or does anyone have ideas on how to scrape websites behind anti-bot measures effectively? Any advice or tips would be greatly appreciated!
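For anyone hitting the same response, here is a minimal sketch of how a caller might tell "blocked by the site" apart from a genuinely empty page, using only the fields visible in the result above. The `looksBlocked` helper and the choice of status codes (401, 403, 429) are assumptions for illustration, not part of any scraper's API.

```typescript
// Shape of the scrape result shown in the post (only the fields used here).
interface ScrapeResult {
  content: string;
  markdown: string;
  html: string;
  linksOnPage: string[];
  metadata: { sourceURL: string; pageStatusCode?: number; pageError?: string };
}

// Assumption: status codes that commonly indicate an anti-bot block.
const BLOCK_STATUS_CODES = new Set<number>([401, 403, 429]);

// Hypothetical helper: true when the page came back empty AND with a
// status code from the list above.
function looksBlocked(result: ScrapeResult): boolean {
  const status = result.metadata.pageStatusCode;
  const isEmpty =
    result.content === "" && result.markdown === "" && result.html === "";
  return isEmpty && status !== undefined && BLOCK_STATUS_CODES.has(status);
}

// The result from the post, with the long URL shortened to a placeholder.
const sample: ScrapeResult = {
  content: "",
  markdown: "",
  html: "",
  linksOnPage: [],
  metadata: {
    sourceURL: "https://example.com/product.html",
    pageStatusCode: 401,
    pageError: "UNAUTHORIZED",
  },
};

console.log(looksBlocked(sample)); // true
```

A check like this only detects the block; getting past it still depends on the proxy setup discussed in the replies.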
8 Replies
Caleb (14mo ago)
Hey there Julie. To do this, you'll have to set up your own proxy network. Alternatively, you can use the cloud service, where we handle all of this for you!
Julie Grace (OP, 14mo ago)
I am running the Playwright service (TS) with the env keys set, but I'm confused: do I need to include the port, or just the host address? For example: PROXY_SERVER=host_address:port @Caleb
Caleb (14mo ago)
@Adobe.Flash Bringing you into the convo; I'm not familiar with setting up proxies here.
Julie Grace (OP, 14mo ago)
I am using Geonode Site Unblocker proxies.
Adobe.Flash (14mo ago)
Hey @Julie Grace, I believe you need to include both the host and the port: PROXY_SERVER=http://PROXY_SERVER:PROXY_PORT
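To make the format concrete, here is a rough sketch of turning a `PROXY_SERVER` value into an object with the same shape as the `proxy` option of Playwright's `launch()` (`server`, `username`, `password`). The `buildProxyConfig` helper is hypothetical, and the `PROXY_USERNAME` / `PROXY_PASSWORD` variable names are assumptions that do not appear in this thread. It accepts both `host:port` and `http://host:port`, and rejects a value with no port.

```typescript
// Same shape as the `proxy` option accepted by Playwright's launch().
interface ProxyConfig {
  server: string;
  username?: string;
  password?: string;
}

// Hypothetical helper: build a proxy config from environment variables.
// PROXY_USERNAME and PROXY_PASSWORD are assumed names, not from the thread.
function buildProxyConfig(
  env: Record<string, string | undefined>
): ProxyConfig | undefined {
  const raw = env.PROXY_SERVER?.trim();
  if (!raw) return undefined; // no proxy configured

  // Optional scheme, then host, then a mandatory numeric port.
  // (IPv6 literals and embedded credentials are not handled in this sketch.)
  const match = /^(?:([a-z][a-z0-9+.-]*):\/\/)?([^\s:\/@]+):(\d{1,5})$/i.exec(raw);
  if (!match) {
    throw new Error(`PROXY_SERVER must look like [scheme://]host:port, got "${raw}"`);
  }
  const [, scheme = "http", host, port] = match;

  const config: ProxyConfig = {
    server: `${scheme.toLowerCase()}://${host}:${port}`,
  };
  if (env.PROXY_USERNAME) config.username = env.PROXY_USERNAME;
  if (env.PROXY_PASSWORD) config.password = env.PROXY_PASSWORD;
  return config;
}

// "host:port" is normalized to include the http:// scheme.
const example = buildProxyConfig({ PROXY_SERVER: "proxy.example.com:9000" });
console.log(example?.server); // http://proxy.example.com:9000

// With Playwright installed, the result could be passed along like this:
//   const browser = await chromium.launch({ proxy: buildProxyConfig(process.env) });
```

Failing fast on a missing port is a design choice here: a silently ignored proxy setting is harder to debug than a startup error.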
Julie Grace (OP, 14mo ago)
Hi @Adobe.Flash, I did set PROXY_SERVER=http://PROXY_SERVER:PROXY_PORT, and I also tried three different providers. Somehow I don't see any traffic being routed through the proxy. The console says the server is running on port 3000, but my proxy server's port is 9000. Sorry, my knowledge of proxies is limited.
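On the 3000 vs. 9000 point: those two ports are not expected to match. Port 3000 is where the local service listens for incoming scrape requests, while 9000 is the port on the proxy that outbound browser traffic is sent to. A small sketch that prints both at startup can make this visible; `summarizeEndpoints` is a hypothetical helper, and the `PORT` variable name (with 3000 as the default, taken from the console message above) is an assumption.

```typescript
interface EndpointSummary {
  listenPort: number;         // inbound: where the scraping service accepts requests
  proxyTarget: string | null; // outbound: where the browser sends its traffic
}

// Hypothetical helper: report the inbound and outbound endpoints separately.
// Assumption: the service reads its listening port from PORT, defaulting to 3000.
function summarizeEndpoints(
  env: Record<string, string | undefined>
): EndpointSummary {
  const parsed = Number.parseInt(env.PORT ?? "3000", 10);
  const listenPort = Number.isNaN(parsed) ? 3000 : parsed;

  const proxy = env.PROXY_SERVER?.trim();
  // Strip any "user:password@" part so credentials never reach the logs.
  const proxyTarget = proxy ? proxy.replace(/\/\/[^@\/]*@/, "//") : null;

  return { listenPort, proxyTarget };
}

const summary = summarizeEndpoints({
  PROXY_SERVER: "http://user:secret@proxy.example.com:9000",
});
console.log(`Listening for scrape requests on port ${summary.listenPort}`);
console.log(`Outbound browser traffic via ${summary.proxyTarget ?? "no proxy (direct)"}`);
```

If the second line reports "no proxy (direct)", the environment variable is not reaching the process, which would also explain seeing no traffic on the provider's side.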
Adobe.Flash (14mo ago)
Hey @Julie Grace, gotcha. CCing @thomas here, who can provide better help with proxies.
Julie Grace (OP, 14mo ago)
Hi @thomas, any ideas or updates on this? Hi @Adobe.Flash, can we use a headless browser service like Browserbase, or could this be a feature request?
