Only-once storage

Hello all, I’m trying to understand how Crawlee uses storage a little better and have a question about it: Crawlee truncates the storage of all indexed pages every time I run. Is there a way to stop it from doing that? I’d like to use it almost like an append-only log for newly found items. Worst case, I can keep an in-memory record of all pages and simply not write to disk when I see a page I already have. Curious what the best practices are here.
2 Replies
Hall 2mo ago
Someone will reply to you shortly. This post was marked as solved by royrusso.
flat-fuchsia (OP) 2mo ago
Ignore this. I was calling datastore.dump() and then wondering why purgeOnStart wasn’t working. :perfecto:
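
For anyone landing here who still wants the fallback from the original question (an in-memory record of pages already seen, with writes skipped for repeats), here is a minimal sketch. It does not use Crawlee's storage at all: `AppendOnlyLog`, the JSONL file layout, and the `url` key are assumptions made up for illustration, and it relies only on Node's built-in `fs` module.

```typescript
import { appendFileSync, existsSync, readFileSync } from "node:fs";

// Hypothetical helper, not part of Crawlee: an append-only JSONL file
// deduplicated by URL through an in-memory Set.
type PageRecord = { url: string; [key: string]: unknown };

class AppendOnlyLog {
  private readonly path: string;
  private readonly seen = new Set<string>();

  constructor(path: string) {
    this.path = path;
    // Rebuild the in-memory record from whatever earlier runs wrote.
    if (existsSync(path)) {
      for (const line of readFileSync(path, "utf8").split("\n")) {
        if (line.trim() === "") continue;
        const record = JSON.parse(line) as PageRecord;
        this.seen.add(record.url);
      }
    }
  }

  // Appends the item only if its URL is new.
  // Returns true when a line was written, false when it was skipped.
  add(item: PageRecord): boolean {
    if (this.seen.has(item.url)) return false;
    appendFileSync(this.path, JSON.stringify(item) + "\n");
    this.seen.add(item.url);
    return true;
  }

  // Number of distinct URLs recorded so far, across runs.
  get size(): number {
    return this.seen.size;
  }
}
```

In a crawler you would call something like `log.add({ url: request.url, title })` from the request handler in place of the normal dataset write. The file is never truncated, so each run only adds pages it has not seen before. This only makes sense if you really need storage outside Crawlee; if purging on start is already disabled and nothing else clears the dataset, the built-in storage should keep earlier results.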
