Could not find file at storage/key_value_stores/default/SDK_SESSION_POOL_STATE.json

Hi there! πŸ‘‹ I'm crawling some pages from different countries using proxy configurations. Here is the function that runs the crawl:
async crawl(proxyUrl: string, sites: string[]) {
    const config = new Configuration({
        disableBrowserSandbox: true,
    });

    const proxyConfiguration = new ProxyConfiguration({
        proxyUrls: [proxyUrl],
    });

    const urlsHash: Record<string, string[]> = {};

    const crawler = new PuppeteerCrawler(
        {
            proxyConfiguration,
            maxRequestRetries: 0,
            requestHandler: async ({ page }) => {
                const pageUrl = page.url();

                page.on('request', (request) => {
                    ...
                });

                await this.autoScroll(page);

                await page.waitForNetworkIdle({
                    idleTime: 1000,
                });
            },
        },
        config,
    );

    await crawler.run(sites);

    return urlsHash;
}
After the first run finishes (and its promise resolves), I call this function again with other params, but I get the following error: Could not find file at /usr/src/app/storage/key_value_stores/default/SDK_SESSION_POOL_STATE.json. Could someone help me figure out what's wrong? I suppose the session was cleared after the first run and the file from the error message was deleted, but it seems like the SDK still expects this file to exist.
12 Replies
flat-fuchsiaβ€’2y ago
Hey there! The important question is: are you calling this function subsequently (i.e. letting one crawler run finish and then starting another one), or are you doing it simultaneously/concurrently?
continuing-cyanOPβ€’2y ago
Subsequently. I found a solution that fixes my issue:
await crawler.run(sites);

const store = await KeyValueStore.open();
await store.drop();

const queue = await RequestQueue.open();
await queue.drop();
Anyway thank you for response!
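For anyone landing here later, the "drop the default storages between runs" workaround can be sketched end to end. The snippet below is a self-contained toy, not Crawlee's real KeyValueStore: the class and storage names are invented stand-ins, and it only illustrates why dropping a store and re-opening it lets the next run start from a clean slate.

```typescript
// Toy stand-in for Crawlee's KeyValueStore, illustrating the workaround
// above. All names here are hypothetical; this is NOT Crawlee's code.

const storageDir = new Map<string, Map<string, unknown>>(); // fake on-disk dir

class ToyKeyValueStore {
    private constructor(public readonly name: string) {}

    // open() creates the store on first use, like Crawlee's default stores.
    static open(name = 'default'): ToyKeyValueStore {
        if (!storageDir.has(name)) storageDir.set(name, new Map());
        return new ToyKeyValueStore(name);
    }

    setValue(key: string, value: unknown): void {
        const files = storageDir.get(this.name);
        if (!files) throw new Error(`Could not find store ${this.name}`);
        files.set(key, value);
    }

    drop(): void {
        storageDir.delete(this.name); // wipe state so the next run starts clean
    }
}

// One "crawl run": write some session state, then drop the store afterwards,
// mirroring `await crawler.run(...); await store.drop();` from the message above.
function runOnce(label: string): void {
    const store = ToyKeyValueStore.open();
    store.setValue('SDK_SESSION_POOL_STATE', { run: label });
    store.drop();
}

runOnce('first');
runOnce('second'); // succeeds: open() recreates the dropped store
```

The key point is that each run re-opens (and thereby re-creates) the store instead of reusing a handle to storage that was already purged.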
flat-fuchsiaβ€’2y ago
That's still a bit weird, but if you figured it out, let's probably not go further into details. One question though: which Crawlee version are you on?
continuing-cyanOPβ€’2y ago
3.3.2
ambitious-aquaβ€’2y ago
@snowbandit7
conscious-sapphireβ€’2y ago
@Andrey Bykov seeing the same error. I did some digging, and it looks like the key-value store remains in StorageManager.cache.default even after it's purged on disk.
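That diagnosis matches a classic stale-cache failure mode: an in-memory handle outlives the files backing it on disk. The sketch below is a minimal self-contained illustration of that pattern, not Crawlee's actual StorageManager; every name in it is hypothetical.

```typescript
// Illustration of the suspected bug: an in-memory cache keeps returning a
// store handle after the backing files were purged from disk, so the next
// read fails with "Could not find file ...". Hypothetical names throughout.

const disk = new Set<string>(); // paths that currently exist "on disk"
const cache = new Map<string, StoreHandle>(); // StorageManager-style cache

class StoreHandle {
    constructor(public readonly path: string) {}

    read(): string {
        if (!disk.has(this.path)) {
            throw new Error(`Could not find file at ${this.path}`);
        }
        return 'ok';
    }
}

function openStore(path: string): StoreHandle {
    let handle = cache.get(path);
    if (!handle) {
        disk.add(path); // create the file on first open
        handle = new StoreHandle(path);
        cache.set(path, handle); // bug: never evicted when disk is purged
    }
    return handle;
}

function purgeStorageOnDisk(path: string): void {
    disk.delete(path); // files are gone, but cache still holds the handle
}

const storePath = 'storage/key_value_stores/default/SDK_SESSION_POOL_STATE.json';
openStore(storePath).read(); // first run: fine
purgeStorageOnDisk(storePath);
try {
    // second run: openStore() returns the cached, now-dangling handle
    openStore(storePath).read();
} catch (err) {
    // lands here: "Could not find file at storage/..." - same symptom as above
}
```

Evicting the cache entry in the purge path (or dropping and re-opening the store, as in the workaround earlier in the thread) avoids the dangling handle.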
flat-fuchsiaβ€’2y ago
@snowbandit7 could you confirm you're using the latest Crawlee?
conscious-sapphireβ€’2y ago
3.3.2. I see there's a newer release from 5 days ago; I can try that.
flat-fuchsiaβ€’2y ago
Yes, please give it a try. If it still has the problem - please let me know!
conscious-sapphireβ€’2y ago
Yeah, still seeing the same issue with 3.3.3.
flat-fuchsiaβ€’2y ago
@vladdy could you have a look?
absent-sapphireβ€’2y ago
Oh jeez... could you put together a minimal repro sample for this please, and I'll take a look πŸ™
