Querying a dataset on the filesystem - like SQL
Hi
Let's assume I scrape data into these folders:
./storage/datasets/
products
categories
And I have sample products:
{
"id": "product1",
"categoryId": "category1",
"name": "Hammer",
}
{
"id": "product2",
"categoryId": "category2",
"name": "Bread",
}
// ... 100 other products
And sample categories:
{
"id": "category1",
"name": "Tools",
}
{
"id": "category2",
"name": "Food",
}
How can I run queries on the datasets, similar to SQL? For example, a 'join':
select p.* from products as p
join categories as c on p.categoryId = c.id
where c.name = 'Tools'
I guess this isn't part of Crawlee, but maybe you have ideas on how to query it?
Thank you ;]
2 Replies
quickest-silver•3y ago
You could get all items from each dataset with https://crawlee.dev/api/core/class/Dataset#getData and then just work with two arrays (the first an array of products, the second an array of categories). Alternatively, you could build a third array with all the joined data and then just filter it.
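A minimal sketch of that idea, assuming both datasets have already been loaded into plain arrays. The Product and Category interfaces and the joinOnCategory helper are hypothetical names made up for illustration, not part of Crawlee; the sample records are the ones from the question.

```typescript
// Hypothetical shapes matching the sample records in the question.
interface Product { id: string; categoryId: string; name: string; }
interface Category { id: string; name: string; }
interface JoinedRow { product: Product; category: Category; }

// Inner join: products as p join categories as c on p.categoryId = c.id.
// Products whose categoryId matches no category are dropped.
function joinOnCategory(products: Product[], categories: Category[]): JoinedRow[] {
  // Index categories by id so each product needs only one lookup.
  const byId = new Map<string, Category>();
  for (const c of categories) byId.set(c.id, c);

  const rows: JoinedRow[] = [];
  for (const p of products) {
    const c = byId.get(p.categoryId);
    if (c !== undefined) rows.push({ product: p, category: c });
  }
  return rows;
}

// Sample data from the question, standing in for the loaded dataset items.
const products: Product[] = [
  { id: 'product1', categoryId: 'category1', name: 'Hammer' },
  { id: 'product2', categoryId: 'category2', name: 'Bread' },
];
const categories: Category[] = [
  { id: 'category1', name: 'Tools' },
  { id: 'category2', name: 'Food' },
];

// where c.name = 'Tools', then select p.*
const tools: Product[] = joinOnCategory(products, categories)
  .filter((row) => row.category.name === 'Tools')
  .map((row) => row.product);

console.log(tools.map((p) => p.name)); // [ 'Hammer' ]
```

In a real crawler the two arrays would presumably come from something like (await Dataset.open('products')).getData(), which resolves to an object whose items property holds the records. This keeps everything in memory, so it should be fine for a few hundred items; for much larger datasets, exporting to a real database may be a better fit.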
Dataset | API | Crawlee
The Dataset class represents a store for structured data where each object stored has the same attributes, such as online store products or real estate offers. You can imagine it as a table, where each object is a row and its attributes are columns. Dataset is an append-only storage - you can only add new records to it but you cannot modify or...
absent-sapphireOP•3y ago
Thanks, that'd work ;]