How do I organize one authenticated account per session, IP, and user agent?
I want to create a bunch of authenticated users, each with its own consistent browser, proxy, user agent, fingerprint, schedule, browsing pattern, etc.
13 Replies
foreign-sapphireOP•4mo ago
Found it.
dependent-tan•4mo ago
Hi, would you mind sharing your solution? I'm facing a similar issue. 🙂 Ta.
foreign-sapphireOP•4mo ago
For proxy:
- Pass the proxy when the crawler is created, or pass a function that returns a new proxy every time it's called. It goes in the constructor of the Crawlee scraper.
For session:
- Configure the session pool so that a session is invalidated after a single error.
- Use a pre-navigation hook (preNavigationHooks). In it, check whether the context's session has user data, i.e. whether it is signed in. If not, update the session with new user data and the other patterns associated with that user, sign the user in, and attach the cookies to the context. (If there is user data, the user is already signed in.)
- Size the session pool to match the number of accounts, so 1 user maps to 1 session.
For behaviour:
- Customize each user's behaviour manually, based on the context attached to the session's userData.
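To make the "1 user maps to 1 session" idea concrete, here is a minimal sketch in plain TypeScript with no Crawlee imports. It pins each session ID to one account profile the first time that ID is seen and returns the same proxy for it afterwards. The AccountProfile shape, the example accounts, the proxy URLs, and the function names are all made up for illustration. As far as I know, Crawlee's ProxyConfiguration takes a newUrlFunction that receives the session ID, so something like proxyUrlForSession could be passed there, but treat that wiring as an assumption and check it against the docs.

```typescript
// Hypothetical account profile; every field and value here is illustrative.
interface AccountProfile {
  username: string;
  proxyUrl: string;
  userAgent: string;
}

// Placeholder accounts. Real credentials and proxies would come from config.
const accounts: AccountProfile[] = [
  {
    username: 'alice',
    proxyUrl: 'http://proxy-a.example.com:8000',
    userAgent: 'ExampleAgent/1.0 (profile A)',
  },
  {
    username: 'bob',
    proxyUrl: 'http://proxy-b.example.com:8000',
    userAgent: 'ExampleAgent/1.0 (profile B)',
  },
];

// Session ID -> account, filled lazily the first time a session ID is seen.
const assignments = new Map<string, AccountProfile>();

// Returns the account pinned to this session, assigning the next free one
// if the session has not been seen before.
function profileForSession(sessionId: string | number): AccountProfile {
  const key = String(sessionId);
  const existing = assignments.get(key);
  if (existing) return existing;

  const next = accounts[assignments.size];
  if (!next) {
    throw new Error(`No free account left for session ${key}`);
  }
  assignments.set(key, next);
  return next;
}

// Same session ID in, same proxy URL out.
function proxyUrlForSession(sessionId: string | number): string {
  return profileForSession(sessionId).proxyUrl;
}
```

Note that this sketch never frees an account. If sessions are retired after a single error, the retired session's account would have to be released back to the pool somehow, which is left out here.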
foreign-sapphireOP•4mo ago
I got it wrong: only newUrl can be used to associate 1 proxy with 1 session.

foreign-sapphireOP•4mo ago
So it has to be this one:

foreign-sapphireOP•4mo ago
Optimizing web scraping: Scraping auth data using JSDOM | Crawlee ·...
Crawlee helps you build and maintain your crawlers. It's open source, but built by developers who scrape millions of pages every day for a living.
foreign-sapphireOP•4mo ago
It still doesn't work if max concurrency is more than 1.
foreign-sapphireOP•4mo ago
I think it's a bug.
I tried running it many times against a single URL, each time with a different unique key. Even with max session usage set to 1, a single session keeps being reused many times.
Maybe the session usage count gets incremented after the request instead of before? Or the batching is bugged.
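To show what that guess would look like, here is a toy model in plain TypeScript. It is not Crawlee's code and says nothing about how Crawlee actually counts usage. ToySession, isUsable, and simulatePicks are hypothetical names. The model only illustrates the race: if several concurrent requests pick a session before any of them finishes, and the counter is only bumped on completion, a session with a max usage of 1 can be handed out more than once.

```typescript
// Toy session with a usage limit. Not a Crawlee type.
interface ToySession {
  id: string;
  usageCount: number;
  maxUsageCount: number;
}

function isUsable(session: ToySession): boolean {
  return session.usageCount < session.maxUsageCount;
}

// Simulates `concurrency` requests that all try to pick the same session
// before any of them has finished. Returns how many of them got the session.
function simulatePicks(countBeforeRequest: boolean, concurrency: number): number {
  const session: ToySession = { id: 'user-1', usageCount: 0, maxUsageCount: 1 };
  const inFlight: ToySession[] = [];
  let picks = 0;

  // Phase 1: every concurrent request picks a session.
  for (let i = 0; i < concurrency; i++) {
    if (!isUsable(session)) continue;
    picks++;
    if (countBeforeRequest) session.usageCount++; // counted at pick time
    inFlight.push(session);
  }

  // Phase 2: the requests finish.
  for (const used of inFlight) {
    if (!countBeforeRequest) used.usageCount++; // counted only on completion
  }

  return picks;
}

console.log(simulatePicks(false, 3)); // counting after the request
console.log(simulatePicks(true, 3)); // counting before the request
```

In this model, counting on completion lets all three concurrent requests take the session, while counting at pick time lets only one through. With a concurrency of 1 both variants behave the same, which would match the symptom only showing up when max concurrency is above 1.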
exotic-emerald•4mo ago