How to authenticate PlaywrightCrawler
I see that there is a Session object, but I can't find any examples of how to instantiate it with user credentials.
I have a TypeScript Node.js application that I trigger with an HTTP call and run locally (all works nicely).
I'm trying to crawl an internal CMS but can't get past the front door.
Has anyone had success in doing this? I'm also looking to see if express-session might help.
4 Replies
other-emeraldOP•2y ago
FYI to all, I managed it like this:
await page.getByLabel('Username').fill('not-my-real-username');
await page.waitForTimeout(1000);
await page.getByLabel('Password').fill('not-my-real-password');
await page.waitForTimeout(1000);
await page.getByRole('button').click();
await page.waitForTimeout(3000);
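For anyone adapting the snippet above, here is a minimal sketch of the same login steps wrapped in a reusable function. It assumes the form fields are labelled 'Username' and 'Password' as in the post. The `LoginPage` and `Credentials` types, the `logIn` name, and the `loggedInUrl` parameter are all made up for illustration; `LoginPage` only lists the Playwright `Page` methods the function calls, so a real `Page` should be usable in its place. The fixed `waitForTimeout` pauses are swapped for `page.waitForURL`, which may or may not suit a given site.

```typescript
// Hypothetical structural type: only the Playwright Page methods used below.
// A real Playwright `Page` is expected to satisfy this shape.
interface LoginPage {
    getByLabel(text: string): { fill(value: string): Promise<void> };
    getByRole(role: 'button', options?: { name?: string }): { click(): Promise<void> };
    waitForURL(url: string | RegExp): Promise<void>;
}

// Hypothetical container for the credentials, so they are not hard-coded.
interface Credentials {
    username: string;
    password: string;
}

// Fills in the login form, submits it, and waits for the post-login page.
// `loggedInUrl` is an assumption: a URL (or pattern) the site shows after login.
async function logIn(
    page: LoginPage,
    creds: Credentials,
    loggedInUrl: string | RegExp,
): Promise<void> {
    await page.getByLabel('Username').fill(creds.username);
    await page.getByLabel('Password').fill(creds.password);
    await page.getByRole('button').click();
    // Wait for the navigation that follows the click instead of sleeping.
    await page.waitForURL(loggedInUrl);
}
```

Playwright locator actions such as `fill` and `click` wait for the element to be actionable, so the one-second pauses between steps are often unnecessary. If the login page has more than one button, `getByRole('button', { name: '...' })` with the button's visible text narrows the match.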
correct-apricot•2y ago
@Projle do you have a full example?
Is it a configuration method?
rare-sapphire•2y ago
Hey there! Please, take a look at our academy for more info and code examples: https://docs.apify.com/academy/puppeteer-playwright/common-use-cases/logging-into-a-website
Logging into a website | Apify Documentation
Understand the "login flow" - logging into a website, then maintaining a logged in status within different browser contexts for an efficient automation process.
other-emeraldOP•2y ago
@lemurio Yes, I saw this and tried it, but it didn't work. I took the same approach, but used the auto-complete on the page object to achieve the same result. I also looked at the page source of the URL I wanted to scrape to determine which tags to look for. There was a bit of trial and error.
@Guillaume C
try {
  // Chromium enters username/password and clicks on the 'Log in' button
  const url = 'https://the-url-of-the-redirect-login-page';
  console.log('Redirect to login');
  await page.goto(url);
  await page.content();
  await page.getByLabel('Username').fill('not-my-real-username');
  await page.waitForTimeout(1000);
  await page.getByLabel('Password').fill('not-my-real-password');
  await page.waitForTimeout(1000);
  await page.getByRole('button').click();
  await page.waitForTimeout(3000);
  console.log(`Redirect to request url: ${parentUrl}`);
  if (parentUrl) {
    await page.goto(parentUrl);
    await page.content();
  }
} catch (error) {
  console.log('error', error);
}
@Guillaume C This is what I have in my handleRequest() method. It's a proof of concept, so there isn't any other code in there yet (e.g. page scraping), and enqueueLinks() is also not called yet.
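The handler above logs in on every request. A possible refinement, sketched below, is to log in only when the site actually redirects to the login page. This assumes the CMS sends unauthenticated visitors to a URL starting with a known prefix; the `NavPage` type, the `gotoAuthenticated` name, and the `loginUrlPrefix` and `performLogin` parameters are hypothetical and not part of Crawlee or Playwright.

```typescript
// Hypothetical structural type: only the Playwright Page methods used below.
interface NavPage {
    goto(url: string): Promise<unknown>;
    url(): string;
}

// Opens `targetUrl`. If the site bounced the browser to the login page,
// runs `performLogin` and then retries `targetUrl` once.
// Returns true when a login was performed, false when none was needed.
async function gotoAuthenticated(
    page: NavPage,
    targetUrl: string,
    loginUrlPrefix: string, // assumption: login page URLs start with this
    performLogin: () => Promise<void>,
): Promise<boolean> {
    await page.goto(targetUrl);
    if (!page.url().startsWith(loginUrlPrefix)) {
        return false; // already authenticated, nothing to do
    }
    await performLogin();
    await page.goto(targetUrl);
    if (page.url().startsWith(loginUrlPrefix)) {
        throw new Error(`Still on the login page after logging in: ${page.url()}`);
    }
    return true;
}
```

Inside a PlaywrightCrawler request handler, the `page` and `request.url` values from the handler context could be passed in, with `performLogin` wrapping the form-filling steps from the post.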