Site can detect headless mode

I have a Crawlee Playwright bot that logs into a website and performs some actions on a schedule. I made a public version (without the site or actions) at https://github.com/raywalz/web-automation-starter and documented the setup in its README. For some reason, the website can detect headless mode despite the stealth plugin, though it works fine in headed mode. Any ideas? I may give up and run headed under Xvfb all the time, as a previous post here mentioned, but I’d like to stay headless if I can.
GitHub - raywalz/web-automation-starter: My starter project for automatically interacting with web apps that require user login.
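One way to keep headless as the default while leaving headed mode (for example under Xvfb) as a fallback is to drive the flag from the environment. The sketch below is only an illustration: the `HEADLESS` variable name and both helper functions are assumptions, not something the starter project or Crawlee defines. The returned object has the shape that would be passed as `launchContext.launchOptions` to a Crawlee browser crawler.

```typescript
// Hypothetical helper: the HEADLESS variable name is an assumption, not
// something the starter project or Crawlee defines.
function resolveHeadless(env: Record<string, string | undefined>): boolean {
  const raw = (env.HEADLESS ?? 'true').trim().toLowerCase();
  // Anything other than an explicit opt-out keeps the browser headless.
  return !(raw === 'false' || raw === '0' || raw === 'no');
}

// Only the headless flag is modelled here; a real launchOptions object
// would usually carry more settings.
function buildLaunchOptions(
  env: Record<string, string | undefined>,
): { headless: boolean } {
  return { headless: resolveHeadless(env) };
}
```

On a machine without a display, the headed case could then be started with something like `HEADLESS=false xvfb-run -a node main.js`, where `main.js` stands in for whatever the bot's entry point is.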
3 Replies
Hall 5mo ago
Someone will reply to you shortly. This post was marked as solved by foxt141.
inland-turquoise 5mo ago
Hi! As explained here, I would recommend trying something other than the Puppeteer stealth plugin, for example Crawlee's PlaywrightCrawler. If that doesn't work, I would try PuppeteerCrawler: some websites can detect Playwright but fail to detect Puppeteer. Also, refer to this guide. If that still doesn't help, disabling headless mode might be necessary. In my experience, some websites with advanced anti-scraping protection do have scripts that can detect headless mode.
Reddit
From the webscraping community on Reddit: Is puppeteer-extra-plugin...
Explore this post and more from the webscraping community
Playwright crawler | Crawlee · Build reliable crawlers. Fast.
Crawlee helps you build and maintain your crawlers. It's open source, but built by developers who scrape millions of pages every day for a living.
Avoid getting blocked | Crawlee · Build reliable crawlers. Fast.
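The escalation suggested in the reply (Playwright first, then Puppeteer, then headed mode) can be sketched as an ordered list of attempts that is walked until the site stops flagging the bot. Everything named here (`Attempt`, `Probe`, `firstUndetected`) is hypothetical and not part of Crawlee's API, and the probe is left to the caller because how a site signals detection is site-specific.

```typescript
// Hypothetical names throughout: Attempt, Probe, and firstUndetected are
// illustrative and not part of Crawlee's API.
type Attempt = {
  label: string;
  crawler: 'playwright' | 'puppeteer';
  headless: boolean;
};

// A probe runs one attempt against the target site and resolves to true
// when the session was not flagged as a bot.
type Probe = (attempt: Attempt) => Promise<boolean>;

// Order taken from the reply. Using Playwright for the headed fallback is
// an assumption, based on the original post saying headed mode works.
const attempts: Attempt[] = [
  { label: 'PlaywrightCrawler, headless', crawler: 'playwright', headless: true },
  { label: 'PuppeteerCrawler, headless', crawler: 'puppeteer', headless: true },
  { label: 'PlaywrightCrawler, headed', crawler: 'playwright', headless: false },
];

// Try each attempt in order and return the first one the site accepts,
// or undefined if every attempt was detected.
async function firstUndetected(
  candidates: Attempt[],
  probe: Probe,
): Promise<Attempt | undefined> {
  for (const candidate of candidates) {
    if (await probe(candidate)) {
      return candidate;
    }
  }
  return undefined;
}
```

In practice the probe would launch the corresponding crawler against the login page and check for whatever marker the site shows when it blocks a session. Keeping that behind a function means the ordering logic stays the same if more variants are added later.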
ambitious-aqua (OP) 5mo ago
Thanks, I’ll look into those. To be clear, I am using Playwright, not Puppeteer; I’m just using the puppeteer-extra-plugin-stealth plugin with it. Though it has puppeteer in the name, it works with both, since Playwright has a very similar API (it was started by former Puppeteer developers rather than being a literal fork).
