Disable writing to disk

By default, data is written to ./storage. Is there a way to turn this off and use memory instead?
6 Replies
Louis Deconinck 3mo ago
Like storing the data in an array variable?
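For illustration, the "array variable" idea could look roughly like the sketch below. It is only a minimal, library-independent example: the `ScrapedItem` shape and the `collectItem` helper are hypothetical names, not part of Crawlee or the Apify SDK. You would call the helper from your own request handler instead of pushing items to a dataset.

```typescript
// Hypothetical shape of one scraped record; adjust to your own data.
interface ScrapedItem {
    url: string;
    title: string;
}

// Module-level array that lives only in process memory.
const results: ScrapedItem[] = [];

// Hypothetical helper: call this from your request handler
// instead of pushing the item to a dataset.
function collectItem(item: ScrapedItem): void {
    results.push(item);
}

// Example usage with made-up data.
collectItem({ url: 'https://example.com/a', title: 'Page A' });
collectItem({ url: 'https://example.com/b', title: 'Page B' });

console.log(results.length); // 2
```

Note that this only keeps the scraped output in memory. Request queues and crawler statistics are still handled by whatever storage the crawler is configured with.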
azzouzana 3mo ago
What data do you want to stop writing to disk? The scraping output, the crawling stats/queues, or something else? I don't think you should disable it altogether (especially the Crawlee stats/queues).
other-emerald 3mo ago
If the problem is that data, like queues, is persisted across runs, you can try using apify run --purge: https://docs.apify.com/cli/docs/reference#apify-run
xenial-black 3mo ago
Configuration.set("persistStorage", false)
Setting this before starting your crawler should do the trick. By the way, you can also change the storage directory using something like:
const storageClient = new MemoryStorage({
localDataDirectory: crawlStoragePath,
persistStorage: true,
});
if you simply wanted to change the storage location
Pepa J 3mo ago
I'll just add another example:
import { MemoryStorage } from '@crawlee/memory-storage';
import { PlaywrightCrawler } from 'crawlee';
import { RequestQueue } from 'apify';

export const memoryRequestQueue = await RequestQueue.open(null, {
storageClient: new MemoryStorage(),
});

const crawler = new PlaywrightCrawler({
proxyConfiguration,
requestQueue: memoryRequestQueue,
// ...
});
etc.
