Isolated vertices vs connected vertices with no join benefit
Is there any downside to storing an isolated vertex with references to other nodes? Creating relationships makes the query more complicated than it needs to be, but storing references to other vertices seems like an anti-pattern/smell.
The relationship between nodes is defined as follows:
Querying this looks like:
With an isolated vertex:
Querying in this way produces the same result:
Please let me know if this doesn't make sense
Solution
We're basically talking about denormalization here. Denormalization is as common for graphs as it is for relational data structures, and it comes with the same drawbacks. Is this a case for denormalization? Based on this simple example, I'd say "no" because you really just have a single hop or so to collect what you need and you're done. But I also don't know any other statistics about your data structure or your other expected query patterns, so it's hard to say with certainty that you shouldn't denormalize.
That said, if denormalization is the answer, do you denormalize to a wholly disconnected vertex? I still don't think I'd recommend that based on what I know. Your most painful traversal is a two-hop of
__.in('relates_to_batch').out("related_to_reusable") to get an id or perhaps multiple ids. Consider adding a "reusableIds" property to "batch-a" and storing a List of the ids there (or use multi-properties, https://tinkerpop.apache.org/docs/current/reference/#vertex-properties).
That seems like the most natural model to me, since "isolated" is really just a "batch" with properties containing the various things it's connected to. It seems better to me not to introduce an "isolated" concept for that, and instead to denormalize to a thing that is actually part of your graph and connected.
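To make the suggestion concrete, here is a minimal sketch in plain Python rather than Gremlin. It models a toy graph as an edge list and shows the two-hop lookup next to the denormalized "reusableIds" property on the batch vertex. The "item-1", "reusable-1" and "reusable-2" ids, the helper names, and the in-memory representation are all assumptions made up for illustration; only the edge labels, "batch-a" and "reusableIds" come from the discussion above.

```python
# Toy in-memory graph: (source, edge label, target).
# "item-1" and the "reusable-*" ids are hypothetical placeholders.
edges = [
    ("item-1", "relates_to_batch", "batch-a"),
    ("item-1", "related_to_reusable", "reusable-1"),
    ("item-1", "related_to_reusable", "reusable-2"),
]

# Vertex properties, keyed by vertex id.
properties = {"batch-a": {}}


def in_(vertex, label):
    """Vertices that have an edge with this label pointing at `vertex`."""
    return [src for src, lbl, dst in edges if lbl == label and dst == vertex]


def out(vertex, label):
    """Vertices reached from `vertex` by following edges with this label."""
    return [dst for src, lbl, dst in edges if lbl == label and src == vertex]


def reusable_ids_via_traversal(batch):
    """Two hops: in('relates_to_batch') then out('related_to_reusable')."""
    ids = []
    for neighbour in in_(batch, "relates_to_batch"):
        ids.extend(out(neighbour, "related_to_reusable"))
    return ids


def denormalize(batch):
    """Copy the traversal result onto the batch vertex as a list property."""
    properties[batch]["reusableIds"] = reusable_ids_via_traversal(batch)


denormalize("batch-a")

# After denormalizing, the ids are a plain property read on a connected vertex.
print(properties["batch-a"]["reusableIds"])
```

The trade-off is the usual one for denormalization: the property read is simpler, but the stored list has to be refreshed whenever a "related_to_reusable" edge is added or removed, otherwise it goes stale.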
I'm still hesitant to say denormalize at all, though. I suppose you can do it for ease of querying, but it's more often done for performance reasons. I could also be missing some context with this advice, but I think I'll stick with what I've posted here as an answer.
