TypeDB•2mo ago

Storing and retrieving lists of entities

Moving from: https://discord.com/channels/665254494820368395/699946888572567611/1413211154020565002 From @nphair @kevinsullivan Hi all, is there a general strategy you recommend for converting from inductive data types to typedb's ER model? The simplest example that I suspect has come up before is a List. Assuming a typical definition (this is Lean)

inductive List (α : Type u) where
  | nil : List α
  | cons (head : α) (tail : List α) : List α

inductive List (α : Type u) where
  | nil : List α
  | cons (head : α) (tail : List α) : List α

Then i am trying to model data of this form for a small set of types with

ListValueEntity sub entity;
ListValueEntityToElementRel sub relation;
ListValueEntityToElementRel owns ListValueIndexAttr;
ListValueEntityToElementRel relates ListValueEntityToElementRelListRole;
ListValueEntityToElementRel relates ListValueEntityToElementRelElementRole;

ListValueEntity plays ListValueEntityToElementRel:ListValueEntityToElementRelListRole;
ListValueEntity plays ListValueEntityToElementRel:ListValueEntityToElementRelElementRole;
NatValueEntity plays ListValueEntityToElementRel:ListValueEntityToElementRelElementRole;
BoolValueEntity plays ListValueEntityToElementRel:ListValueEntityToElementRelElementRole;
StrValueEntity plays ListValueEntityToElementRel:ListValueEntityToElementRelElementRole;
...

ListValueEntity sub entity;
ListValueEntityToElementRel sub relation;
ListValueEntityToElementRel owns ListValueIndexAttr;
ListValueEntityToElementRel relates ListValueEntityToElementRelListRole;
ListValueEntityToElementRel relates ListValueEntityToElementRelElementRole;

ListValueEntity plays ListValueEntityToElementRel:ListValueEntityToElementRelListRole;
ListValueEntity plays ListValueEntityToElementRel:ListValueEntityToElementRelElementRole;
NatValueEntity plays ListValueEntityToElementRel:ListValueEntityToElementRelElementRole;
BoolValueEntity plays ListValueEntityToElementRel:ListValueEntityToElementRelElementRole;
StrValueEntity plays ListValueEntityToElementRel:ListValueEntityToElementRelElementRole;
...

Does that seem reasonable? I am able to write data of this form, but am having a hard time preserving the structure of things like Lists of Lists on fetches.

24 Replies

georgiiOP•2mo ago

↪️ @georgii : I don't think you need BoolValueEntity plays ListValueEntityToElementRel:ListValueEntityToElementRelElementRole if BoolValueEntity sub ListValueEntity Your approach looks quite reasonable to me. Can you share an example of a problematic query? ↪️ @krishnan : I've never thought of this in depth but I expected it to be something simpler like:

relation List relates head, relates tail, plays List:tail;
entity ListElement @abstract, plays List:Head;

relation List relates head, relates tail, plays List:tail;
entity ListElement @abstract, plays List:Head;

ListElement is then subtyped as needed. This isn't perfect since the end of the list would ideally have been a List instance that doesn't link any players, but TypeDB doesn't allow this, so you may need a entity ListEndMarker plays List: tail;

Joshua•2mo ago

you can basically currently make linked lists, and indexed maps:

entity entry, owns position, owns value; (or replace value with a relation to a concept, or make this a relation and do the same)

entity entry, owns position, owns value; (or replace value with a relation to a concept, or make this a relation and do the same)

relation linked-node, relates prev, plays linked-node:prev, relates node-content;

relation linked-node, relates prev, plays linked-node:prev, relates node-content;

those are the normal workarounds

georgiiOP•2mo ago

↪️ @nphair : Thank you both for weighing in. It seems like there are at least two different approaches. The approach that i first posted, which minimizes subtyping; in favor of explicit role playing statements . And the linked list approach mentioned by @krishnan which depends much more on subtyping. My concern with the linked-list style is having scaling. BoolValueEntity would have to be subtyped for ever kind-of collection. It seems I should be able to define BoolValueEntity without any mention of list. Regardless of representation, what's really important to me though is being able to fetch Lists (or Lists of Lists of Lists) in a way that preserves the structure. @georgii you asked for an example query. Using my original schema, I am inserting [["a"],["b"]] with:

insert $v0 isa ListValueEntity;
$v1 isa ListValueEntity;
$v2 isa StrValueEntity;
$v3 "a" isa StrValueAttr;
$v2 has $v3;
$v4 (ListValueEntityToElementRelListRole: $v1,ListValueEntityToElementRelElementRole: $v2) isa ListValueEntityToElementRel;
$v5 0 isa ListValueIndexAttr;
$v4 has $v5;
$v6 (ListValueEntityToElementRelListRole: $v0,ListValueEntityToElementRelElementRole: $v1) isa ListValueEntityToElementRel;
$v7 0 isa ListValueIndexAttr;
$v6 has $v7;
$v8 isa ListValueEntity;
$v9 isa StrValueEntity;
$v10 "b" isa StrValueAttr;
$v9 has $v10;
$v11 (ListValueEntityToElementRelListRole: $v8,ListValueEntityToElementRelElementRole: $v9) isa ListValueEntityToElementRel;
$v12 0 isa ListValueIndexAttr;
$v11 has $v12;
$v13 (ListValueEntityToElementRelListRole: $v0,ListValueEntityToElementRelElementRole: $v8) isa ListValueEntityToElementRel;
$v14 1 isa ListValueIndexAttr;
$v13 has $v14;

insert $v0 isa ListValueEntity;
$v1 isa ListValueEntity;
$v2 isa StrValueEntity;
$v3 "a" isa StrValueAttr;
$v2 has $v3;
$v4 (ListValueEntityToElementRelListRole: $v1,ListValueEntityToElementRelElementRole: $v2) isa ListValueEntityToElementRel;
$v5 0 isa ListValueIndexAttr;
$v4 has $v5;
$v6 (ListValueEntityToElementRelListRole: $v0,ListValueEntityToElementRelElementRole: $v1) isa ListValueEntityToElementRel;
$v7 0 isa ListValueIndexAttr;
$v6 has $v7;
$v8 isa ListValueEntity;
$v9 isa StrValueEntity;
$v10 "b" isa StrValueAttr;
$v9 has $v10;
$v11 (ListValueEntityToElementRelListRole: $v8,ListValueEntityToElementRelElementRole: $v9) isa ListValueEntityToElementRel;
$v12 0 isa ListValueIndexAttr;
$v11 has $v12;
$v13 (ListValueEntityToElementRelListRole: $v0,ListValueEntityToElementRelElementRole: $v8) isa ListValueEntityToElementRel;
$v14 1 isa ListValueIndexAttr;
$v13 has $v14;

What I would expect is a fetch query that returns json structured much in the same way as the original list ↪️ @georgii : Currently, the structure of the output of fetch queries is identical to the query. The lists are not natively supported in the database, and the queries cannot generate the structure themselves, so all the inner lists should be declared in the query. You will probably need some programmatic logic on top of TypeQL to reach your goal right now ↪️ @kevinsullivan : Kevin Sullivan here. Thank you. Georgii. Nick's question comes from the project he and I have together. Let's continue this discussion here. We're not quite sure what you mean by "all the inner lists should be declared in the query." What we're querying for are those inner lists. And they are entities, not attributes, so sub-queries won't work, as we understand them. We don't want to distract you all from the hard work you're doing, but the question here seems like it'd be relevant to many potential customers: how to store and fetch list entities the elements of which are of some specified type, to include possibly list. @nphair @kevinsullivan Sorry for moving you between threads, I was not sure if your questions were related. I agree that this question can be interesting to many developers, so I moved everything to this discussion post. Let's continue here & please check this message from Joshua above

Joshua•2mo ago

thanks for moving here!

nphair•2mo ago

they are 🙂 thanks for joining Joshua. Yes, i think we are all interested in the canonical approach to this problem

georgiiOP•2mo ago

Now, explaining my latest message. I was trying to warn you that fetch's structure does not change dynamically based on your data. Thus, the distinction between "lists of scalars" and "lists of lists" must be static: in your schema and in your fetch query. In case you were seeking something else. Since your questions are related, now I see that you're not interested in lists of attributes in your case, so my suggestions about built-in lists from another thread were wrong. When we talk about this structure, I think that one of the approaches we discussed above is the way to go. I don't find your initial idea too weird, although it is very verbose, and I'd personally use more inheritance in this case. Krishnan's suggestion is basically the implementation of your inductive type, while Josh shared a common list structure, similar to what you'd build in C. To me, everything works, the choice depends on the usage constraints. Building fetch queries of lists of lists should not be a problem if your structure is set in the query (you already know the difference between lists of lists and lists of scalars). However, if your query is too general and you want to "logically traverse" the list you build, I'm afraid you will need to introduce branches for "list of scalars" / "list of lists" / ......., without recursion. But you can easily produce a flat document and additionally repack it programmatically separately.

nphair•2mo ago

Thanks @georgii .

Thus, the distinction between "lists of scalars" and "lists of lists" must be static: in your schema and in your fetch query. In case you were seeking something else.

In our case we will always know if we are reading a list of scalars or a list of lists of scalars. We have that type information available.

Building fetch queries of lists of lists should not be a problem if your structure is set in the query.

That's good to hear. So in my example case, I know I am reading a list of list of strings. Assume there is a single one of these in the database. Then I should be able to fetch it and preserve the nested structure. @joshua mentioned the linked-list being the common workaround. Being common, are there examples you can point to? particular of nested lists fetch calls?

Joshua•2mo ago

definitely users have been doing that, we've given that advice in various support contexts & had some users do longer writeups on ordering in TypeDB before that they showed us. However, in general since we can't yet handle dynamic/recursive structures in Fetch (I have some ideas, but they require a nontrivial language upgrade), you basically have to return the lists as nodes with previous pointers (or indices) + the container list identifier/some link to the parent list, which then you can reconstruct on the application side

georgiiOP•2mo ago

For sure. @kevinsullivan mentioned

We're having trouble with a fetch query that works in all cases.

, so I thought about generic queries with arbitrary nesting. I also heard that you don't consider subqueries. Is it true? If yes, why?

Joshua•2mo ago

with fun traverse_list($list: list-head) -> { list-entry }:
  match
  try {
    $first isa list-entry (head: $list);
    {
      $node is $first;
    } or {
      let $node in $traverse_list_rom($list, $first);
    };
  };
  return { $node };
with fun traverse_list_from($list: list-head, $node: list-entry) -> { list-entry }:
  match
    $next isa list-entry (prev: $node);
    {
      $next-node is $next;
    } or {
      let $next-node in traverse_lise_from($list, $next);
    };
  return { $next-node };
match $l isa list-head; # list entity
fetch {
  "first-list": [
    match $node in traverse_list($l);
    fetch {
       # some properties you want from the list nodes
    };
  ]
}

with fun traverse_list($list: list-head) -> { list-entry }:
  match
  try {
    $first isa list-entry (head: $list);
    {
      $node is $first;
    } or {
      let $node in $traverse_list_rom($list, $first);
    };
  };
  return { $node };
with fun traverse_list_from($list: list-head, $node: list-entry) -> { list-entry }:
  match
    $next isa list-entry (prev: $node);
    {
      $next-node is $next;
    } or {
      let $next-node in traverse_lise_from($list, $next);
    };
  return { $next-node };
match $l isa list-head; # list entity
fetch {
  "first-list": [
    match $node in traverse_list($l);
    fetch {
       # some properties you want from the list nodes
    };
  ]
}

linked list traversal i suppose it might be simplifiable just off the top of my head

nphair•2mo ago

oh no we could use subqueries. just havent been able to make them work how we wanted Thank you Joshua. You mention a nontrivial language upgrade - do you mean on my end from 2.x to 3.x? Or on your end in the language. And is that example you post assuming the upgrade? Also, just to put out another concrete example...

# Example usage. Assume database has schema written but no data.
data_in: List[List[str]] = [["a"], ["b"]]
write_to_typedb(data_in)

result: Any = read_from_typedb()
data_out: List[List[str]] = process(result)

assert data_in == data_out

# Example usage. Assume database has schema written but no data.
data_in: List[List[str]] = [["a"], ["b"]]
write_to_typedb(data_in)

result: Any = read_from_typedb()
data_out: List[List[str]] = process(result)

assert data_in == data_out

because I know I am writing a List[List[str]] i would expect to be able to craft a very specific read query to get that data back and require little processing to turn back into python. Do you thing a traversal is needed to do that?

Joshua•2mo ago

@nphair do you know the structure you're reading back out is exactly that?

nphair•2mo ago

I know it will be List[List[str]] but not if elements have been added or deleted

Joshua•2mo ago

oh that should be doable then!

fetch {
  "list-1": [
    match
      # conceptually, for each $element in the list, which let's say is a Root entity of another list
      fetch {
        "list-2": [
          match
          # conceptually, traverse through the inner $element list
          fetch {
            # something from the inner list elements
          }
        ]
      };
  ],
};

fetch {
  "list-1": [
    match
      # conceptually, for each $element in the list, which let's say is a Root entity of another list
      fetch {
        "list-2": [
          match
          # conceptually, traverse through the inner $element list
          fetch {
            # something from the inner list elements
          }
        ]
      };
  ],
};

you might be able to stip out the objects actually ah wait i think currently fetch must return an object

nphair•2mo ago

ah wait i think currently fetch must return an object

yes 🙂

Joshua•2mo ago

so this returns:

{
  "list-1": [
    {
      "list-2": [ ... ],
    },
    {
      "list-2": [...],
    },
    ...
  ]
}

{
  "list-1": [
    {
      "list-2": [ ... ],
    },
    {
      "list-2": [...],
    },
    ...
  ]
}

you can probably flatten that into [str[]]

nphair•2mo ago

sorry, fetch is an attribute i believe

Joshua•2mo ago

fetch leaves have to be values so at some point you have to get to "key": <primitive>

nphair•2mo ago

thank you. so the strategy is to nest subqueries. The key for those is arbitrary - "list-<depth>" - as you had it, and then you bottom out at values

Joshua•2mo ago

pretty much if you know what shape of list you're expecting, you can generate these nested subqueries and submit them the tricky bit we don't have a good way to do right now is if you have variable depths nested-ness (up to infinity!) that's when you end up having to reconstruct the lists & nesting on the client side fully

nphair•2mo ago

if you know what shape of list you're expecting, you can generate these nested subqueries and submit them

great, thats what i was hoping for 🙂

The tricky bit we don't have a good way to do right now is if you have variable depths nested-ness (up to infinity!)

And I see, yes, fortunately in out case I think we have much more context to generate exactly what we need. So just to be clear, is the fetch you posted assuming the linked-list schema you originally posted? or the one i had posted? or perhaps the strategy is the same regardless? Oh and isn't an outermost match a requirement?

Joshua•2mo ago

you can use it in any way that encodes a list you can retrieve, just retrieve the elements in the match and yeah need to precede it with something

nphair•2mo ago

got it, thank you both for your help, ill go see if i can take it the rest of the way

Joshua•2mo ago

:typedblove:

Gaming

Programming

Storing and retrieving lists of entities

Did you find this page helpful?