I have a Gremlin query that starts simple (a single label) and then branches out along many different paths to collect unrelated information (that is, I need to follow all of those paths). I'm considering using range() to break the query into smaller chunks of, say, 1k rows each, instead of processing the whole set of vertices with that label in one go. Of course, I'll have to run the query several times, but I expect each run to be faster and to fit better in memory. Maybe I'll also avoid some rapid degradation by keeping the load small enough.
Does that sound like a good idea?
My usual concern with this kind of partitioning is that the common part of the query (everything before the range()) is executed once per chunk, which limits the potential speedup. In the current case, though, that common part is merely a hasLabel() plus some property collection.
g.V().hasLabel('xxx').sideEffect().sideEffect().map()
vs
g.V().hasLabel('xxx').range(x, x + 1000).sideEffect().sideEffect().map()
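
To make the "run it several times" part concrete, here is a rough sketch of the driver loop I have in mind, written in Python. It is not tied to any real client: `fetch` is a hypothetical callable that would run the second query above for one window and return its rows as a list (in gremlin-python the step is spelled `range_`), and `run_in_chunks` is just an illustrative name. The sketch also assumes the traversal returns the vertices in the same order on every run; if that does not hold, chunks could overlap or skip rows.

```python
from typing import Callable, Iterator, List, TypeVar

T = TypeVar("T")


def run_in_chunks(fetch: Callable[[int, int], List[T]], page_size: int = 1000) -> Iterator[T]:
    """Call fetch(low, high) for consecutive windows until a short chunk comes back.

    `high` is exclusive, matching Gremlin's range(low, high).
    `fetch` is a hypothetical wrapper around the chunked traversal.
    """
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    low = 0
    while True:
        rows = fetch(low, low + page_size)
        yield from rows
        # A chunk shorter than page_size means the label is exhausted.
        if len(rows) < page_size:
            break
        low += page_size


if __name__ == "__main__":
    # Stand-in for the graph: an in-memory list sliced like range(low, high).
    fake_rows = list(range(2500))
    collected = list(run_in_chunks(lambda low, high: fake_rows[low:high]))
    print(len(collected))
```

Stopping on the first short chunk avoids a separate count() query up front, at the cost of one extra (empty) run when the total happens to be an exact multiple of the chunk size.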