How do we handle authenticated scrapers with the Apify CLI locally?
I'm using the Playwright + Crawlee TypeScript template. Should I handle login, session saving, and session injection myself, or is there an Apify tool that can help with that?
6 Replies
-# This post was marked as solved by IrshaiD'.
What I mean is: I need to scrape a government contracts website where the data is only accessible after logging in. It contains multiple contract listings and around 150,000 contract detail pages, so it's a large-scale operation. I want every request to carry the authenticated session.
How would you approach this using Apify? Let me know if you have any suggestions.
My approach is to set up two separate actors with an integrated flow:
One actor logs in once and handles scraping the list pages.
The other actor scrapes the detail pages and sends the data to an S3 bucket.
This follows a divide-and-conquer approach. However, I want to avoid logging in every time in the details actor; I'd prefer to log in once and share the authenticated session across both actors.
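One way to wire the two actors together is to have the list actor collect the detail URLs and then start the detail actor with `Actor.call` from the Apify SDK. This is only a sketch: the actor ID, input shape, and store name below are assumptions for illustration, not anything the thread specifies.

```typescript
import { Actor } from 'apify';

await Actor.init();

// ...list-page scraping happens here, collecting detail URLs...
// Placeholder data for illustration:
const detailUrls = ['https://example.gov/contracts/1'];

// Hand the collected URLs to the detail actor. The actor ID and
// input shape are hypothetical.
await Actor.call('my-org/contract-detail-scraper', {
    startUrls: detailUrls,
    // Name of a shared named key-value store holding the login cookies,
    // so the detail actor can reuse the session instead of logging in again.
    sessionStoreName: 'contracts-session',
});

await Actor.exit();
```

The detail actor would then read the cookies out of that named store on startup, which keeps the two runs decoupled while still sharing one login.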
Just save the cookies after login and reuse them in your requests. Also, they probably have an API, so you may not need Playwright at all.
@IrshaiD' Yes, in most cases you need to handle the initial login yourself and then persist the browser cookies (for example, to the key-value store). In a pre-navigation hook you can load the cookies from the key-value store and set them on the page context. You may also need to handle cookie expiration.
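A minimal sketch of that pre-navigation hook with Crawlee's `PlaywrightCrawler`, assuming the cookies were previously saved under a `SESSION_COOKIES` key (the key name and start URL are assumptions):

```typescript
import { PlaywrightCrawler } from 'crawlee';
import { Actor } from 'apify';

await Actor.init();

// Load the cookies saved earlier by the login step.
const cookies = await Actor.getValue<any[]>('SESSION_COOKIES');

const crawler = new PlaywrightCrawler({
    preNavigationHooks: [
        async ({ page }) => {
            // Inject the saved session into the browser context
            // before every navigation.
            if (cookies?.length) {
                await page.context().addCookies(cookies);
            }
        },
    ],
    async requestHandler({ page, request, log }) {
        log.info(`Scraping ${request.url}`);
        // ...extract contract details here...
    },
});

await crawler.run(['https://example.gov/contracts']);
await Actor.exit();
```

For cookie expiration, one approach is to detect a redirect to the login page in the request handler, re-run the login flow, and overwrite the stored cookies before retrying the request.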
Yep, got it. Thanks guys!