Limiting .path() results to a number of valid starting vertices
Hey folks, for context, we're using AWS Neptune, with Neptune Notebook for visualisation. We would like to visualise neighbourhoods of data that meet the following criteria:
- We would like an exact number of neighbourhoods, e.g. 5 distinct neighbourhoods
- We only want to consider neighbourhoods that have a particular node in their tree
Let's say the 'root' of a neighbourhood is any `Foo` vertex, and we want to graph the neighbourhoods which include a `Bar` vertex in their tree (through some explicit traversal). The important point here is that a neighbourhood starting from a `Foo` might not contain a `Bar` vertex in its tree, in which case we want to skip that one and find another. We already have a query that graphs all valid neighbourhoods.
But let's say we want exactly 5 valid neighbourhoods instead. If we write `g.V().hasLabel("Foo").limit(5)...`, then we are not guaranteed that all 5 `Foo` vertices will actually lead to a `Bar`: sometimes the traversal never makes it to a `Bar` from one of the randomly chosen `Foo` starting vertices, and we are left with fewer than 5 neighbourhoods. Placing the limit at the end, e.g. `....out("c").hasLabel("Bar").limit(5)`, limits the paths returned rather than the starting vertices.
The way we've made this work is to perform a seemingly redundant filter at the beginning to validate that we're starting from a `Foo` that definitely leads to a `Bar`, but there must be a simpler way of expressing it.
We have also experimented with grouping by the starting vertex and limiting the groups, but it seems this does not execute lazily and first collects all neighbourhoods before grouping and limiting.
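
As a rough sketch of the query shapes described above: this assumes `g` is a traversal source and that a `Foo` reaches a `Bar` through `out("a").out("b").out("c")`. Only the final `out("c")` appears in the text; the `"a"` and `"b"` edge labels and the `"start"` step label are placeholders for illustration, so the real queries may well differ.

```groovy
// Assumption: Foo -> Bar via out("a").out("b").out("c"); "a" and "b" are hypothetical labels.

// 1. Every valid neighbourhood, no limit.
g.V().hasLabel("Foo").
  out("a").out("b").out("c").hasLabel("Bar").
  path()

// 2. limit(5) up front: picks 5 Foo vertices, some of which may never reach a Bar.
g.V().hasLabel("Foo").limit(5).
  out("a").out("b").out("c").hasLabel("Bar").
  path()

// 3. limit(5) at the end: caps the number of paths, not the number of starting vertices.
g.V().hasLabel("Foo").
  out("a").out("b").out("c").hasLabel("Bar").
  path().
  limit(5)

// 4. The "redundant filter" workaround: check reachability first, limit, then traverse again.
g.V().hasLabel("Foo").
  where(out("a").out("b").out("c").hasLabel("Bar")).
  limit(5).
  out("a").out("b").out("c").hasLabel("Bar").
  path()

// 5. Grouping by starting vertex: group() is a barrier step, so all paths
//    are collected before limit(5) is applied to the groups.
g.V().hasLabel("Foo").as("start").
  out("a").out("b").out("c").hasLabel("Bar").
  path().
  group().by(select("start")).
  unfold().
  limit(5)
```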
I think this might be a trivial question, but I can try to provide some sample data to work with if needed.
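
A minimal sample graph for experimenting could look like the sketch below. It targets the Gremlin Console with an in-memory TinkerGraph rather than Neptune, and the `X` and `Y` vertex labels, the `name` property, and the `a` and `b` edge labels are invented for illustration.

```groovy
// In-memory graph for local experiments (Gremlin Console).
graph = TinkerGraph.open()
g = graph.traversal()

// foo1 has two paths to a Bar (via y1 and via y2); foo2 dead-ends at x2.
g.addV("Foo").property("name", "foo1").as("f1").
  addV("X").property("name", "x1").as("x1").
  addV("Y").property("name", "y1").as("y1").
  addV("Y").property("name", "y2").as("y2").
  addV("Bar").property("name", "bar1").as("b1").
  addV("Foo").property("name", "foo2").as("f2").
  addV("X").property("name", "x2").as("x2").
  addE("a").from("f1").to("x1").
  addE("b").from("x1").to("y1").
  addE("b").from("x1").to("y2").
  addE("c").from("y1").to("b1").
  addE("c").from("y2").to("b1").
  addE("a").from("f2").to("x2").
  iterate()

// All valid neighbourhoods, shown by vertex name.
g.V().hasLabel("Foo").
  out("a").out("b").out("c").hasLabel("Bar").
  path().by("name")
```

With this data, a limit placed after `path()` counts the two `foo1` paths separately, while a limit placed right after `hasLabel("Foo")` may select `foo2`, which contributes no path at all.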
Thanks all!
Solution
Just realized that the approach I suggested with `project()` doesn't need deduplication. You just need to `unfold()` the collection to get back to the original form.
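
One way this could look, as a sketch rather than the exact query: it assumes a traversal source `g` and the same hypothetical hops `out("a").out("b").out("c")` (only `out("c")` appears in the question), and the key name `"paths"` is arbitrary.

```groovy
// Assumption: Foo -> Bar via out("a").out("b").out("c"); "a" and "b" are hypothetical labels.
g.V().hasLabel("Foo").
  // One map per Foo; fold() always yields a list, empty if no Bar is reachable.
  project("paths").
    by(out("a").out("b").out("c").hasLabel("Bar").path().fold()).
  // Keep only Foo vertices with at least one path to a Bar.
  where(select("paths").count(local).is(gt(0))).
  // The limit now counts starting vertices, not paths.
  limit(5).
  // Flatten back to one path per result.
  select("paths").
  unfold()
```

The path-finding traversal is written only once here, and `limit(5)` sits after the filter, so it applies to valid `Foo` vertices. Whether a given engine stops early after the fifth match is worth confirming with its profiling output rather than assuming.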