Using SessionStorage in PuppeteerCrawler
How can we use SessionStorage in Puppeteer Crawler?
I didn't find anything related to sessionStorage in the documentation, so I tried
to guess some reasonable config values:
useSessionPool: true,
persistCookiesPerSession: true,
maxConcurrency
sessionPoolOptions: {
    maxPoolSize: 1,
    sessionOptions: {
        maxUsageCount: 9999,
        maxErrorScore: 9999,
        maxAgeSecs: 99999,
    },
},
It still looks like the data is wiped out between requests
(there is always an about:blank request showing up between the different requests).
Thanks in advance!
3 Replies
ratty-blushOP•3y ago
Trying to find out what's happening here, I stumbled upon https://developers.apify.com/academy/puppeteer-playwright/browser-contexts#persistent-vs-non-persistent. Could it be that Puppeteer in Crawlee no longer uses a persistent context?
ratty-blushOP•3y ago
OK, getting closer: it can be managed in the launch context, see https://crawlee.dev/api/puppeteer-crawler/interface/PuppeteerLaunchContext#useIncognitoPages
ratty-blushOP•3y ago
OK, that won't work. There's a comment in the source that says crawling contexts are created on a per-request basis. In PuppeteerCrawler, cookies are managed by default (copied from one context to the next), but sessionStorage is not. The best workaround I can think of is copying the session storage to cookies and back to session storage via pre-navigation hooks, but I'm open to suggestions ...
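For anyone landing here later, a minimal sketch of that workaround. It is not an official Crawlee feature, and the helper names `saveSessionStorage` and `restoreSessionStorage` are made up for illustration. One deliberate difference from the idea above: instead of round-tripping through cookies, the snapshot is kept in a plain object (`store`) that outlives a single request. The only Puppeteer calls used are `page.evaluate` and `page.evaluateOnNewDocument`. Snapshots are keyed by origin because sessionStorage is scoped per origin.

```js
// Hypothetical helpers; the names are illustrative and not part of Crawlee or Puppeteer.
// `store` is any plain object that survives from one request to the next.

// After navigation: copy the page's sessionStorage into `store`, keyed by origin.
async function saveSessionStorage(page, store) {
    const { origin, data } = await page.evaluate(() => {
        const data = {};
        for (let i = 0; i < sessionStorage.length; i++) {
            const key = sessionStorage.key(i);
            data[key] = sessionStorage.getItem(key);
        }
        return { origin: location.origin, data };
    });
    store[origin] = data;
}

// Before navigation: register a script that refills sessionStorage whenever a
// new document is created, before the page's own scripts run.
async function restoreSessionStorage(page, store) {
    await page.evaluateOnNewDocument((snapshots) => {
        try {
            const data = snapshots[location.origin];
            if (!data) return; // nothing saved for this origin yet
            for (const [key, value] of Object.entries(data)) {
                sessionStorage.setItem(key, value);
            }
        } catch (err) {
            // sessionStorage is not accessible on some documents; skip those.
        }
    }, store);
}
```

Assuming the crawling context exposes `page` and `session` to the hooks, the wiring could look like `preNavigationHooks: [({ page, session }) => restoreSessionStorage(page, session.userData)]` plus a matching `postNavigationHooks` entry that calls `saveSessionStorage`. With `maxPoolSize: 1` the same session, and therefore the same snapshot, would be reused across requests. Note that the snapshot is serialized when the hook runs, so values written by the page later in the request are only picked up by the next save.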