Limiting .path() results to a number of valid starting vertices

Hey folks, for context, we're using AWS Neptune, and Neptune Notebook for visualisation. We would like to visualise neighbourhoods of data that meet the following criteria:
  1. We would like an exact number of neighbourhoods, e.g. 5 distinct neighbourhoods
  2. We only want to consider neighbourhoods that have a particular node in their tree
Example:
Let's say the 'root' of a neighbourhood is any Foo vertex, and we want to graph the neighbourhoods that include a Bar vertex in their tree (reached through some explicit traversal). The important point here is that a neighbourhood starting from a Foo might not contain a Bar vertex in its tree, in which case we want to skip it and find another. We know that this query will graph all valid neighbourhoods:
g.V().hasLabel("Foo")
  .out("a")
  .in("b")
  .out("c").hasLabel("Bar")
  .path()

But let's say we want exactly 5 valid neighbourhoods instead. If we write g.V().hasLabel("Foo").limit(5)..., we are not guaranteed that all 5 Foo vertices will actually lead to a Bar: sometimes the traversal never reaches a Bar from one of the randomly chosen Foo starting vertices, and we are left with fewer than 5 neighbourhoods. Placing the limit at the end, e.g. ....out("c").hasLabel("Bar").limit(5), limits the number of paths returned rather than the number of starting vertices.

The way we've made this work is to perform a seemingly redundant filter at the beginning to validate that we're starting from a Foo that definitely leads to Bar, but there must be a simpler way of expressing it:
g.V().where(hasLabel("Foo")
    .out("a")
    .in("b")
    .out("c").hasLabel("Bar"))
  .limit(5)
  .out("a")
  .in("b")
  .out("c").hasLabel("Bar")
  .path()


We have experimented with grouping by the starting vertex and limiting this, but it seems this does not execute lazily and first collects all neighbourhoods before grouping and limiting:

g.V().hasLabel("Foo").as("start") 
  .out("a")
  .in("b")
  .out("c").hasLabel("Bar")
  .path()
  .group().by(select("start").values("id"))
  .select(values).unfold()
  .limit(5)


I think this might be a trivial question, but I can try to provide some sample data to work with if needed.

Thanks all!
Solution
Just realized that the approach I suggested with project() doesn't need deduplication. You just need to unfold() the collection to get back to the original form:
gremlin> g.V().project('s','e').
......1>   by().
......2>   by(out().hasLabel('software').path().fold()).
......3>   filter(select('e').unfold()).
......4>   limit(2).
......5>   select('e').unfold()
==>[v[1],v[3]]
==>[v[4],v[5]]
==>[v[4],v[3]]
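
Note that limit(2) above caps the number of starting vertices (v[1] and v[4]), not the number of paths, which is why three paths come back. Applied to the Foo/Bar traversal from the question, the same pattern might look like the sketch below. It has not been run against Neptune or the asker's data; the labels and edge names are copied from the question, and the 's' and 'e' keys are just the ones used in the console example.

```groovy
// Sketch only: assumes the Foo/Bar labels and a/b/c edge labels from the question.
g.V().hasLabel("Foo").
  project("s", "e").
    by().                                                            // 's': the starting Foo vertex
    by(out("a").in("b").out("c").hasLabel("Bar").path().fold()).     // 'e': every path from this Foo to a Bar
  filter(select("e").unfold()).                                      // drop Foo vertices whose path list is empty
  limit(5).                                                          // keep 5 valid starting vertices
  select("e").unfold()                                               // emit the individual paths again
```

Because the fold() happens inside the by() modulator, it only collects the paths of one starting vertex at a time, so the traversal can stop as soon as 5 valid Foo vertices have been found.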