How do we handle authenticated scrapers with the Apify CLI locally?

I am using the Playwright + Crawlee TypeScript template. Should I handle login, session saving, and session injection myself, or is there an Apify tool that can help with that?
6 Replies
Hall (4d ago)
Someone will reply to you shortly. This post was marked as solved by IrshaiD'.
Pepa J (4d ago)
Hi @IrshaiD', I am not sure I understand. Do you mean the apify login command of the Apify CLI? Or can you describe in a bit more detail what you are trying to achieve?
IrshaiD' (OP, 3d ago)
What I mean is: I need to scrape a government contracts website where the data is only accessible after logging in. It has multiple contract listings and around 150,000 contract detail pages, so it's a large-scale operation. I want every request to carry the authenticated session. How would you approach this with Apify? Let me know if you have any suggestions.

My current plan is to set up two separate Actors with an integrated flow: one Actor logs in once and scrapes the list pages, and the other scrapes the detail pages and sends the data to an S3 bucket, following a divide-and-conquer approach. However, I want to avoid logging in again in the details Actor; I'd prefer to log in once and keep the browser session across both Actors.
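For illustration, a minimal sketch of how the cross-Actor handoff could look, assuming the session is persisted to a named key-value store that both Actors can read. The store name 'contract-session', the Actor ID 'myuser/contract-details', the login URL, and the form selectors are all placeholders, not anything confirmed in this thread:

```ts
// List Actor: log in once, persist the session cookies, then trigger the detail Actor.
import { Actor } from 'apify';
import { chromium } from 'playwright';

await Actor.init();

const browser = await chromium.launch();
const context = await browser.newContext();
const page = await context.newPage();

// Placeholder login flow; replace the URL and selectors with the real ones.
await page.goto('https://contracts.example.gov/login');
await page.fill('input[name="username"]', process.env.SITE_USER!);
await page.fill('input[name="password"]', process.env.SITE_PASS!);
await page.click('button[type="submit"]');
await page.waitForLoadState('networkidle');

// A *named* key-value store, so the second Actor (or a later run) can read the same session.
const sessionStore = await Actor.openKeyValueStore('contract-session');
await sessionStore.setValue('SESSION_COOKIES', await context.cookies());
await browser.close();

// ... scrape the list pages and collect the detail URLs here ...
const detailUrls: string[] = [];

// Hand the detail URLs over to the second Actor.
await Actor.call('myuser/contract-details', { detailUrls });

await Actor.exit();
```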
memo23 (3d ago)
Just save the cookies after login and reuse them in your request(s). Also, they probably have an API, so you may not need Playwright at all.
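To sketch that cookie-reuse idea: once the cookies from the Playwright login are saved, the detail pages could be fetched with plain HTTP requests instead of a full browser. Whether the site actually exposes a JSON endpoint is an assumption, and the URL and store name below are the same placeholders as above (requires Node 18+ for the built-in fetch):

```ts
import { Actor } from 'apify';

await Actor.init();

// Cookies previously saved after the Playwright login (placeholder named store).
const store = await Actor.openKeyValueStore('contract-session');
const cookies =
    (await store.getValue<{ name: string; value: string }[]>('SESSION_COOKIES')) ?? [];

// Turn the saved cookies into a plain Cookie header.
const cookieHeader = cookies.map((c) => `${c.name}=${c.value}`).join('; ');

// Placeholder URL; if the site serves JSON, no browser is needed for the 150k detail pages.
const res = await fetch('https://contracts.example.gov/api/contracts/12345', {
    headers: { cookie: cookieHeader },
});
console.log(await res.json());

await Actor.exit();
```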
Pepa J (3d ago)
@IrshaiD' Yes, in most cases you need to handle the initial login yourself and then persist the browser cookies (for example to the key-value store). In a pre-navigation hook you can load the cookies from the key-value store and set them on the page context. You may also need to handle expiration of the cookies.
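A rough sketch of how that could be wired in the Crawlee + Playwright TypeScript template. This is not an official Apify helper, just one way to write the pre-navigation hook; the store name, record key, and start URL are the same placeholders as above:

```ts
import { Actor } from 'apify';
import { PlaywrightCrawler } from 'crawlee';

await Actor.init();

// Load the cookies that the login step persisted to the (placeholder) named store.
// The value is whatever context.cookies() returned, hence the loose any[] typing.
const sessionStore = await Actor.openKeyValueStore('contract-session');
const cookies = (await sessionStore.getValue<any[]>('SESSION_COOKIES')) ?? [];

const crawler = new PlaywrightCrawler({
    preNavigationHooks: [
        async ({ page }) => {
            // Inject the saved session into the browser context before each navigation.
            if (cookies.length > 0) {
                await page.context().addCookies(cookies);
            }
        },
    ],
    requestHandler: async ({ request, page, log }) => {
        log.info(`Scraping ${request.url}`);
        // ... extract the contract details and push them to the dataset / S3 here ...
    },
});

// Placeholder start URL.
await crawler.run(['https://contracts.example.gov/contracts/12345']);

await Actor.exit();
```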
IrshaiD' (OP, 3d ago)
Yep, got it. Thanks, guys.
