Storing and retrieving lists of entities
Moving from: https://discord.com/channels/665254494820368395/699946888572567611/1413211154020565002
From @nphair @kevinsullivan
Hi all, is there a general strategy you recommend for converting from inductive data types to typedb's ER model?
The simplest example that I suspect has come up before is a List. Assuming a typical definition (this is Lean)
Then i am trying to model data of this form for a small set of types with
Does that seem reasonable? I am able to write data of this form, but am having a hard time preserving the structure of things like Lists of Lists on fetches.
24 Replies
↪️ @georgii :
I don't think you need
BoolValueEntity plays ListValueEntityToElementRel:ListValueEntityToElementRelElementRole
if BoolValueEntity sub ListValueEntity
Your approach looks quite reasonable to me. Can you share an example of a problematic query?
↪️ @krishnan :
I've never thought of this in depth but I expected it to be something simpler like:
ListElement
is then subtyped as needed. This isn't perfect since the end of the list would ideally have been a List instance that doesn't link any players, but TypeDB doesn't allow this, so you may need a entity ListEndMarker plays List: tail;
you can basically currently make linked lists, and indexed maps:
those are the normal workarounds
↪️ @nphair :
Thank you both for weighing in.
It seems like there are at least two different approaches. The approach that i first posted, which minimizes subtyping; in favor of explicit role playing statements . And the linked list approach mentioned by @krishnan which depends much more on subtyping.
My concern with the linked-list style is having scaling.
BoolValueEntity
would have to be subtyped for ever kind-of collection. It seems I should be able to define BoolValueEntity
without any mention of list.
Regardless of representation, what's really important to me though is being able to fetch Lists (or Lists of Lists of Lists) in a way that preserves the structure.
@georgii you asked for an example query. Using my original schema, I am inserting [["a"],["b"]]
with:
What I would expect is a fetch query that returns json structured much in the same way as the original list
↪️ @georgii :
Currently, the structure of the output of fetch
queries is identical to the query. The lists are not natively supported in the database, and the queries cannot generate the structure themselves, so all the inner lists should be declared in the query. You will probably need some programmatic logic on top of TypeQL to reach your goal right now
↪️ @kevinsullivan :
Kevin Sullivan here. Thank you. Georgii. Nick's question comes from the project he and I have together. Let's continue this discussion here. We're not quite sure what you mean by "all the inner lists should be declared in the query." What we're querying for are those inner lists. And they are entities, not attributes, so sub-queries won't work, as we understand them. We don't want to distract you all from the hard work you're doing, but the question here seems like it'd be relevant to many potential customers: how to store and fetch list entities the elements of which are of some specified type, to include possibly list.
@nphair @kevinsullivan Sorry for moving you between threads, I was not sure if your questions were related. I agree that this question can be interesting to many developers, so I moved everything to this discussion post. Let's continue here & please check this message from Joshua abovethanks for moving here!
they are 🙂
thanks for joining Joshua.
Yes, i think we are all interested in the canonical approach to this problem
Now, explaining my latest message. I was trying to warn you that
fetch
's structure does not change dynamically based on your data. Thus, the distinction between "lists of scalars" and "lists of lists" must be static: in your schema and in your fetch query. In case you were seeking something else.
Since your questions are related, now I see that you're not interested in lists of attributes in your case, so my suggestions about built-in lists from another thread were wrong. When we talk about this structure, I think that one of the approaches we discussed above is the way to go. I don't find your initial idea too weird, although it is very verbose, and I'd personally use more inheritance in this case. Krishnan's suggestion is basically the implementation of your inductive type, while Josh shared a common list structure, similar to what you'd build in C. To me, everything works, the choice depends on the usage constraints.
Building fetch queries of lists of lists should not be a problem if your structure is set in the query (you already know the difference between lists of lists and lists of scalars). However, if your query is too general and you want to "logically traverse" the list you build, I'm afraid you will need to introduce branches for "list of scalars" / "list of lists" / ......., without recursion. But you can easily produce a flat document and additionally repack it programmatically separately.Thanks @georgii .
Thus, the distinction between "lists of scalars" and "lists of lists" must be static: in your schema and in your fetch query. In case you were seeking something else.In our case we will always know if we are reading a list of scalars or a list of lists of scalars. We have that type information available.
Building fetch queries of lists of lists should not be a problem if your structure is set in the query.That's good to hear. So in my example case, I know I am reading a list of list of strings. Assume there is a single one of these in the database. Then I should be able to fetch it and preserve the nested structure. @joshua mentioned the linked-list being the common workaround. Being common, are there examples you can point to? particular of nested lists fetch calls?
definitely users have been doing that, we've given that advice in various support contexts & had some users do longer writeups on ordering in TypeDB before that they showed us.
However, in general since we can't yet handle dynamic/recursive structures in Fetch (I have some ideas, but they require a nontrivial language upgrade), you basically have to return the lists as nodes with previous pointers (or indices) + the container list identifier/some link to the parent list, which then you can reconstruct on the application side
For sure. @kevinsullivan mentioned
We're having trouble with a fetch query that works in all cases., so I thought about generic queries with arbitrary nesting. I also heard that you don't consider subqueries. Is it true? If yes, why?
linked list traversal i suppose
it might be simplifiable
just off the top of my head
oh no we could use subqueries. just havent been able to make them work how we wanted
Thank you Joshua.
You mention a nontrivial language upgrade - do you mean on my end from 2.x to 3.x? Or on your end in the language. And is that example you post assuming the upgrade?
Also, just to put out another concrete example...
because I know I am writing a
List[List[str]]
i would expect to be able to craft a very specific read query to get that data back and require little processing to turn back into python.
Do you thing a traversal is needed to do that?@nphair do you know the structure you're reading back out is exactly that?
I know it will be
List[List[str]]
but not if elements have been added or deletedoh that should be doable then!
you might be able to stip out the objects actually
ah wait i think currently fetch must return an object
ah wait i think currently fetch must return an objectyes 🙂
so this returns:
you can probably flatten that into
[str[]]
sorry, fetch is an attribute i believe
fetch leaves have to be values
so at some point you have to get to "key": <primitive>
thank you.
so the strategy is to nest subqueries. The key for those is arbitrary - "list-<depth>" - as you had it, and then you bottom out at values
pretty much
if you know what shape of list you're expecting, you can generate these nested subqueries and submit them
the tricky bit we don't have a good way to do right now is if you have variable depths nested-ness (up to infinity!)
that's when you end up having to reconstruct the lists & nesting on the client side fully
if you know what shape of list you're expecting, you can generate these nested subqueries and submit themgreat, thats what i was hoping for 🙂
The tricky bit we don't have a good way to do right now is if you have variable depths nested-ness (up to infinity!)And I see, yes, fortunately in out case I think we have much more context to generate exactly what we need. So just to be clear, is the fetch you posted assuming the linked-list schema you originally posted? or the one i had posted? or perhaps the strategy is the same regardless? Oh and isn't an outermost match a requirement?
you can use it in any way that encodes a list you can retrieve, just retrieve the elements in the match
and yeah need to precede it with something
got it, thank you both for your help, ill go see if i can take it the rest of the way
:typedblove: