Using SessionStorage in PuppeteerCrawler

How can we use SessionStorage in PuppeteerCrawler? I didn't find anything related to session storage in the documentation, so I tried to guess some reasonable config values:

useSessionPool: true,
persistCookiesPerSession: true,
maxConcurrency
sessionPoolOptions: {
    maxPoolSize: 1,
    sessionOptions: {
        maxUsageCount: 9999,
        maxErrorScore: 9999,
        maxAgeSecs: 99999,
    },
},

It still looks like the data is wiped out between requests (there is always an about:blank request showing up between the different requests). Thanks in advance!
3 Replies
ratty-blush (OP), 3y ago
Trying to find out what's happening here, I stumbled upon https://developers.apify.com/academy/puppeteer-playwright/browser-contexts#persistent-vs-non-persistent. Could it be that Puppeteer in Crawlee no longer uses a persistent context?
ratty-blush (OP), 3y ago
PuppeteerLaunchContext | API | Crawlee
Apify extends the launch options of Puppeteer. You can use any of the Puppeteer compatible LaunchOptions options by providing the launchOptions property. Example: ```js // launch a headless Chrome (not Chromium) const launchContext = { // Apify helpers useCh...
ratty-blush (OP), 3y ago
OK, that won't work. There's a comment in the source that says crawling contexts are created on a per-request basis. PuppeteerCrawler manages cookies by default (they are copied from one context to the next), but sessionStorage is not carried over. The best workaround I can think of is copying the session storage to cookies and back to session storage via pre-navigation hooks, but I'm open to suggestions ...
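
For anyone landing here later, a rough sketch of the save/restore half of that workaround. The helper names `dumpSessionStorage` and `restoreSessionStorage` are made up for illustration and are not part of Crawlee or Puppeteer; they rely only on Puppeteer's `page.evaluate` and `page.evaluateOnNewDocument`. Where the snapshot lives between requests (a cookie, as suggested above, or some other per-session store) is left to the caller, and the sketch assumes all requests share one origin.

```javascript
// Hypothetical helper: read every key/value pair from the current page's
// sessionStorage, together with the origin it belongs to.
async function dumpSessionStorage(page) {
    return page.evaluate(() => {
        const entries = {};
        for (let i = 0; i < sessionStorage.length; i++) {
            const key = sessionStorage.key(i);
            entries[key] = sessionStorage.getItem(key);
        }
        return { origin: location.origin, entries };
    });
}

// Hypothetical helper: register a script that refills sessionStorage as soon
// as a new document is created, before the page's own scripts run.
// It has to be called before navigation, e.g. from a pre-navigation hook.
async function restoreSessionStorage(page, snapshot) {
    if (!snapshot) return; // nothing saved yet (first request)
    await page.evaluateOnNewDocument((saved) => {
        // sessionStorage is scoped per origin, so skip unrelated documents
        // such as about:blank or third-party iframes.
        if (location.origin !== saved.origin) return;
        for (const [key, value] of Object.entries(saved.entries)) {
            sessionStorage.setItem(key, value);
        }
    }, snapshot);
}
```

The idea would be to call `restoreSessionStorage(page, snapshot)` from one of the `preNavigationHooks` and `dumpSessionStorage(page)` at the end of the request handler. If the cookie size limit becomes a problem, keeping the snapshot on `session.userData` instead of in a cookie might be an option, but I have not verified that against the setup described above.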