Storing and retrieving lists of entities

Moving from: https://discord.com/channels/665254494820368395/699946888572567611/1413211154020565002 From @nphair @kevinsullivan Hi all, is there a general strategy you recommend for converting from inductive data types to typedb's ER model? The simplest example that I suspect has come up before is a List. Assuming a typical definition (this is Lean)
inductive List (α : Type u) where
| nil : List α
| cons (head : α) (tail : List α) : List α
inductive List (α : Type u) where
| nil : List α
| cons (head : α) (tail : List α) : List α
Then i am trying to model data of this form for a small set of types with
ListValueEntity sub entity;
ListValueEntityToElementRel sub relation;
ListValueEntityToElementRel owns ListValueIndexAttr;
ListValueEntityToElementRel relates ListValueEntityToElementRelListRole;
ListValueEntityToElementRel relates ListValueEntityToElementRelElementRole;

ListValueEntity plays ListValueEntityToElementRel:ListValueEntityToElementRelListRole;
ListValueEntity plays ListValueEntityToElementRel:ListValueEntityToElementRelElementRole;
NatValueEntity plays ListValueEntityToElementRel:ListValueEntityToElementRelElementRole;
BoolValueEntity plays ListValueEntityToElementRel:ListValueEntityToElementRelElementRole;
StrValueEntity plays ListValueEntityToElementRel:ListValueEntityToElementRelElementRole;
...
ListValueEntity sub entity;
ListValueEntityToElementRel sub relation;
ListValueEntityToElementRel owns ListValueIndexAttr;
ListValueEntityToElementRel relates ListValueEntityToElementRelListRole;
ListValueEntityToElementRel relates ListValueEntityToElementRelElementRole;

ListValueEntity plays ListValueEntityToElementRel:ListValueEntityToElementRelListRole;
ListValueEntity plays ListValueEntityToElementRel:ListValueEntityToElementRelElementRole;
NatValueEntity plays ListValueEntityToElementRel:ListValueEntityToElementRelElementRole;
BoolValueEntity plays ListValueEntityToElementRel:ListValueEntityToElementRelElementRole;
StrValueEntity plays ListValueEntityToElementRel:ListValueEntityToElementRelElementRole;
...
Does that seem reasonable? I am able to write data of this form, but am having a hard time preserving the structure of things like Lists of Lists on fetches.
24 Replies
georgii
georgiiOP4d ago
↪️ @georgii : I don't think you need BoolValueEntity plays ListValueEntityToElementRel:ListValueEntityToElementRelElementRole if BoolValueEntity sub ListValueEntity Your approach looks quite reasonable to me. Can you share an example of a problematic query? ↪️ @krishnan : I've never thought of this in depth but I expected it to be something simpler like:
relation List relates head, relates tail, plays List:tail;
entity ListElement @abstract, plays List:Head;
relation List relates head, relates tail, plays List:tail;
entity ListElement @abstract, plays List:Head;
ListElement is then subtyped as needed. This isn't perfect since the end of the list would ideally have been a List instance that doesn't link any players, but TypeDB doesn't allow this, so you may need a entity ListEndMarker plays List: tail;
Joshua
Joshua4d ago
you can basically currently make linked lists, and indexed maps:
entity entry, owns position, owns value; (or replace value with a relation to a concept, or make this a relation and do the same)
entity entry, owns position, owns value; (or replace value with a relation to a concept, or make this a relation and do the same)
relation linked-node, relates prev, plays linked-node:prev, relates node-content;
relation linked-node, relates prev, plays linked-node:prev, relates node-content;
those are the normal workarounds
georgii
georgiiOP4d ago
↪️ @nphair : Thank you both for weighing in. It seems like there are at least two different approaches. The approach that i first posted, which minimizes subtyping; in favor of explicit role playing statements . And the linked list approach mentioned by @krishnan which depends much more on subtyping. My concern with the linked-list style is having scaling. BoolValueEntity would have to be subtyped for ever kind-of collection. It seems I should be able to define BoolValueEntity without any mention of list. Regardless of representation, what's really important to me though is being able to fetch Lists (or Lists of Lists of Lists) in a way that preserves the structure. @georgii you asked for an example query. Using my original schema, I am inserting [["a"],["b"]] with:
insert $v0 isa ListValueEntity;
$v1 isa ListValueEntity;
$v2 isa StrValueEntity;
$v3 "a" isa StrValueAttr;
$v2 has $v3;
$v4 (ListValueEntityToElementRelListRole: $v1,ListValueEntityToElementRelElementRole: $v2) isa ListValueEntityToElementRel;
$v5 0 isa ListValueIndexAttr;
$v4 has $v5;
$v6 (ListValueEntityToElementRelListRole: $v0,ListValueEntityToElementRelElementRole: $v1) isa ListValueEntityToElementRel;
$v7 0 isa ListValueIndexAttr;
$v6 has $v7;
$v8 isa ListValueEntity;
$v9 isa StrValueEntity;
$v10 "b" isa StrValueAttr;
$v9 has $v10;
$v11 (ListValueEntityToElementRelListRole: $v8,ListValueEntityToElementRelElementRole: $v9) isa ListValueEntityToElementRel;
$v12 0 isa ListValueIndexAttr;
$v11 has $v12;
$v13 (ListValueEntityToElementRelListRole: $v0,ListValueEntityToElementRelElementRole: $v8) isa ListValueEntityToElementRel;
$v14 1 isa ListValueIndexAttr;
$v13 has $v14;
insert $v0 isa ListValueEntity;
$v1 isa ListValueEntity;
$v2 isa StrValueEntity;
$v3 "a" isa StrValueAttr;
$v2 has $v3;
$v4 (ListValueEntityToElementRelListRole: $v1,ListValueEntityToElementRelElementRole: $v2) isa ListValueEntityToElementRel;
$v5 0 isa ListValueIndexAttr;
$v4 has $v5;
$v6 (ListValueEntityToElementRelListRole: $v0,ListValueEntityToElementRelElementRole: $v1) isa ListValueEntityToElementRel;
$v7 0 isa ListValueIndexAttr;
$v6 has $v7;
$v8 isa ListValueEntity;
$v9 isa StrValueEntity;
$v10 "b" isa StrValueAttr;
$v9 has $v10;
$v11 (ListValueEntityToElementRelListRole: $v8,ListValueEntityToElementRelElementRole: $v9) isa ListValueEntityToElementRel;
$v12 0 isa ListValueIndexAttr;
$v11 has $v12;
$v13 (ListValueEntityToElementRelListRole: $v0,ListValueEntityToElementRelElementRole: $v8) isa ListValueEntityToElementRel;
$v14 1 isa ListValueIndexAttr;
$v13 has $v14;
What I would expect is a fetch query that returns json structured much in the same way as the original list ↪️ @georgii : Currently, the structure of the output of fetch queries is identical to the query. The lists are not natively supported in the database, and the queries cannot generate the structure themselves, so all the inner lists should be declared in the query. You will probably need some programmatic logic on top of TypeQL to reach your goal right now ↪️ @kevinsullivan : Kevin Sullivan here. Thank you. Georgii. Nick's question comes from the project he and I have together. Let's continue this discussion here. We're not quite sure what you mean by "all the inner lists should be declared in the query." What we're querying for are those inner lists. And they are entities, not attributes, so sub-queries won't work, as we understand them. We don't want to distract you all from the hard work you're doing, but the question here seems like it'd be relevant to many potential customers: how to store and fetch list entities the elements of which are of some specified type, to include possibly list. @nphair @kevinsullivan Sorry for moving you between threads, I was not sure if your questions were related. I agree that this question can be interesting to many developers, so I moved everything to this discussion post. Let's continue here & please check this message from Joshua above
Joshua
Joshua4d ago
thanks for moving here!
nphair
nphair4d ago
they are 🙂 thanks for joining Joshua. Yes, i think we are all interested in the canonical approach to this problem
georgii
georgiiOP4d ago
Now, explaining my latest message. I was trying to warn you that fetch's structure does not change dynamically based on your data. Thus, the distinction between "lists of scalars" and "lists of lists" must be static: in your schema and in your fetch query. In case you were seeking something else. Since your questions are related, now I see that you're not interested in lists of attributes in your case, so my suggestions about built-in lists from another thread were wrong. When we talk about this structure, I think that one of the approaches we discussed above is the way to go. I don't find your initial idea too weird, although it is very verbose, and I'd personally use more inheritance in this case. Krishnan's suggestion is basically the implementation of your inductive type, while Josh shared a common list structure, similar to what you'd build in C. To me, everything works, the choice depends on the usage constraints. Building fetch queries of lists of lists should not be a problem if your structure is set in the query (you already know the difference between lists of lists and lists of scalars). However, if your query is too general and you want to "logically traverse" the list you build, I'm afraid you will need to introduce branches for "list of scalars" / "list of lists" / ......., without recursion. But you can easily produce a flat document and additionally repack it programmatically separately.
nphair
nphair4d ago
Thanks @georgii .
Thus, the distinction between "lists of scalars" and "lists of lists" must be static: in your schema and in your fetch query. In case you were seeking something else.
In our case we will always know if we are reading a list of scalars or a list of lists of scalars. We have that type information available.
Building fetch queries of lists of lists should not be a problem if your structure is set in the query.
That's good to hear. So in my example case, I know I am reading a list of list of strings. Assume there is a single one of these in the database. Then I should be able to fetch it and preserve the nested structure. @joshua mentioned the linked-list being the common workaround. Being common, are there examples you can point to? particular of nested lists fetch calls?
Joshua
Joshua4d ago
definitely users have been doing that, we've given that advice in various support contexts & had some users do longer writeups on ordering in TypeDB before that they showed us. However, in general since we can't yet handle dynamic/recursive structures in Fetch (I have some ideas, but they require a nontrivial language upgrade), you basically have to return the lists as nodes with previous pointers (or indices) + the container list identifier/some link to the parent list, which then you can reconstruct on the application side
georgii
georgiiOP4d ago
For sure. @kevinsullivan mentioned
We're having trouble with a fetch query that works in all cases.
, so I thought about generic queries with arbitrary nesting. I also heard that you don't consider subqueries. Is it true? If yes, why?
Joshua
Joshua4d ago
with fun traverse_list($list: list-head) -> { list-entry }:
match
try {
$first isa list-entry (head: $list);
{
$node is $first;
} or {
let $node in $traverse_list_rom($list, $first);
};
};
return { $node };
with fun traverse_list_from($list: list-head, $node: list-entry) -> { list-entry }:
match
$next isa list-entry (prev: $node);
{
$next-node is $next;
} or {
let $next-node in traverse_lise_from($list, $next);
};
return { $next-node };
match $l isa list-head; # list entity
fetch {
"first-list": [
match $node in traverse_list($l);
fetch {
# some properties you want from the list nodes
};
]
}
with fun traverse_list($list: list-head) -> { list-entry }:
match
try {
$first isa list-entry (head: $list);
{
$node is $first;
} or {
let $node in $traverse_list_rom($list, $first);
};
};
return { $node };
with fun traverse_list_from($list: list-head, $node: list-entry) -> { list-entry }:
match
$next isa list-entry (prev: $node);
{
$next-node is $next;
} or {
let $next-node in traverse_lise_from($list, $next);
};
return { $next-node };
match $l isa list-head; # list entity
fetch {
"first-list": [
match $node in traverse_list($l);
fetch {
# some properties you want from the list nodes
};
]
}
linked list traversal i suppose it might be simplifiable just off the top of my head
nphair
nphair4d ago
oh no we could use subqueries. just havent been able to make them work how we wanted Thank you Joshua. You mention a nontrivial language upgrade - do you mean on my end from 2.x to 3.x? Or on your end in the language. And is that example you post assuming the upgrade? Also, just to put out another concrete example...
# Example usage. Assume database has schema written but no data.
data_in: List[List[str]] = [["a"], ["b"]]
write_to_typedb(data_in)

result: Any = read_from_typedb()
data_out: List[List[str]] = process(result)

assert data_in == data_out
# Example usage. Assume database has schema written but no data.
data_in: List[List[str]] = [["a"], ["b"]]
write_to_typedb(data_in)

result: Any = read_from_typedb()
data_out: List[List[str]] = process(result)

assert data_in == data_out
because I know I am writing a List[List[str]] i would expect to be able to craft a very specific read query to get that data back and require little processing to turn back into python. Do you thing a traversal is needed to do that?
Joshua
Joshua4d ago
@nphair do you know the structure you're reading back out is exactly that?
nphair
nphair4d ago
I know it will be List[List[str]] but not if elements have been added or deleted
Joshua
Joshua4d ago
oh that should be doable then!
fetch {
"list-1": [
match
# conceptually, for each $element in the list, which let's say is a Root entity of another list
fetch {
"list-2": [
match
# conceptually, traverse through the inner $element list
fetch {
# something from the inner list elements
}
]
};
],
};
fetch {
"list-1": [
match
# conceptually, for each $element in the list, which let's say is a Root entity of another list
fetch {
"list-2": [
match
# conceptually, traverse through the inner $element list
fetch {
# something from the inner list elements
}
]
};
],
};
you might be able to stip out the objects actually ah wait i think currently fetch must return an object
nphair
nphair4d ago
ah wait i think currently fetch must return an object
yes 🙂
Joshua
Joshua4d ago
so this returns:
{
"list-1": [
{
"list-2": [ ... ],
},
{
"list-2": [...],
},
...
]
}
{
"list-1": [
{
"list-2": [ ... ],
},
{
"list-2": [...],
},
...
]
}
you can probably flatten that into [str[]]
nphair
nphair4d ago
sorry, fetch is an attribute i believe
Joshua
Joshua4d ago
fetch leaves have to be values so at some point you have to get to "key": <primitive>
nphair
nphair4d ago
thank you. so the strategy is to nest subqueries. The key for those is arbitrary - "list-<depth>" - as you had it, and then you bottom out at values
Joshua
Joshua4d ago
pretty much if you know what shape of list you're expecting, you can generate these nested subqueries and submit them the tricky bit we don't have a good way to do right now is if you have variable depths nested-ness (up to infinity!) that's when you end up having to reconstruct the lists & nesting on the client side fully
nphair
nphair4d ago
if you know what shape of list you're expecting, you can generate these nested subqueries and submit them
great, thats what i was hoping for 🙂
The tricky bit we don't have a good way to do right now is if you have variable depths nested-ness (up to infinity!)
And I see, yes, fortunately in out case I think we have much more context to generate exactly what we need. So just to be clear, is the fetch you posted assuming the linked-list schema you originally posted? or the one i had posted? or perhaps the strategy is the same regardless? Oh and isn't an outermost match a requirement?
Joshua
Joshua4d ago
you can use it in any way that encodes a list you can retrieve, just retrieve the elements in the match and yeah need to precede it with something
nphair
nphair4d ago
got it, thank you both for your help, ill go see if i can take it the rest of the way
Joshua
Joshua4d ago
:typedblove:

Did you find this page helpful?