RepeatStep does not appear to respect barriers

Lyndon 9/13/2023
I was digging into some traversal performance and had something similar to the following:
g.V(<ids>).repeat(out()).until(out().count().is(0)).toList()
For the graph implementation in question, out() is implemented on top of a CollectingBarrierStep. I noticed that repeat() does not respect this and only hands it 1 item at a time, i.e., no aggregation. I removed my strategy and changed the query to:
g.V(<ids>).repeat(barrier(10).out()).times(2).toList()
and then put breakpoints in the FlatMapStep, and it seemed that the barrier was still not respected and items came in 1 at a time. Fully verifying that this is an issue is difficult, and I'd be surprised if it had not been noticed before. Has anyone noticed anything similar, or have reason to believe this is not the case?
spmallette 9/18/2023
I'm not sure I can explain what you're seeing, but it's worth noting that adding barrier(10) introduces a NoOpBarrierStep which doesn't extend from CollectingBarrierStep so you might not be testing what you think you're testing. As far as NoOpBarrierStep goes it does seem to collect items inside the repeat() according to the size given to it. Is it possible that in extending CollectingBarrierStep you've overridden the barrier behavior in some way that isn't allowing it to work the way you expect?
Lyndon 9/20/2023
It works correctly anywhere besides the repeat. The barrier I have is a simple consumer; it does not do anything fancy, and inside the repeat it only gets 1 item at a time. I was looking at the repeat step and it seems to have no barrier-friendly logic that would let it pull more from the left before pushing right: it always grabs one item, checks the emit/until conditions, and executes the step. I will try to dig into this on the RepeatStep side sometime soon and see if I can prove it out there.
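The cost of that pull-one, check, execute loop can be sketched with a toy model in plain Python. This is not TinkerPop code: `CountingBackend`, `out_batch`, `until_one_at_a_time`, and `until_batched` are hypothetical names invented for illustration, and the model only assumes that each backend round trip can resolve out() for a whole batch of vertices at once.

```python
# Toy model only; it does not use or reproduce any TinkerPop class.
class CountingBackend:
    """Hypothetical graph backend that counts round trips."""

    def __init__(self, edges):
        self.edges = edges
        self.calls = 0

    def out_batch(self, vertices):
        # One round trip resolves the out-neighbours of every vertex in the batch.
        self.calls += 1
        return {v: list(self.edges.get(v, [])) for v in vertices}


def until_one_at_a_time(backend, frontier):
    # Models a parent step that hands the until() condition one traverser
    # per evaluation, so every check is its own round trip.
    done = []
    for v in frontier:
        if len(backend.out_batch([v])[v]) == 0:
            done.append(v)
    return done


def until_batched(backend, frontier, size):
    # Models a barrier of the given size in front of the condition:
    # up to `size` traversers are checked per round trip.
    done = []
    for i in range(0, len(frontier), size):
        chunk = frontier[i:i + size]
        result = backend.out_batch(chunk)
        done.extend(v for v in chunk if not result[v])
    return done


EDGES = {1: [2, 3], 2: [], 3: [4], 4: []}

single = CountingBackend(EDGES)
batched = CountingBackend(EDGES)
print(until_one_at_a_time(single, [1, 2, 3, 4]), single.calls)  # [2, 4] 4
print(until_batched(batched, [1, 2, 3, 4], 10), batched.calls)  # [2, 4] 1
```

Both variants find the same leaf vertices; only the number of round trips differs, which is the performance gap described above.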
Lyndon 10/12/2023
Hey @spmallette, I finally got some time to actually test this. I made this test (in a gist, since Discord's character limit gets in the way): https://gist.github.com/lyndonbauto/0cb964eaa1d4ed4ad67c41aeb97ca630 What I found was that the input to the until step arrived 1 item at a time, while the repeat step was properly pulled with the barrier. So if you were using a barrier-style strategy in your in/out step to allow batch execution, like many likely are, the performance of the until step, which runs first, would be very poor. I am working on a fix for this.
Kennh 10/13/2023
This might be expected behavior for the UntilStep. Those nested traversals are generally passed just one traverser at a time from the parent traversal. If this is something you wanted to change, we would probably have to do that for all TraversalParents to keep it consistent.
Lyndon 10/13/2023
You might be right. I talked to Marko a bit about it and he seemed to think it was a bug, but I will mention that point and see what he thinks. IMO TraversalParents should respect barriers that are in their traversals because this has very large performance implications. Forcing it to behave only one way seems to have no benefit, but maybe I am missing something. I will look into this a bit more though, because now you have me thinking and recalling that project() seems to do the same thing, and maybe others... Yeah, it seems like
g.V(?).out().project("a").by(__.barrier().out()).toList()
and
g.V(?).out().where(__.barrier().out()).toList()
both have items from the out() come in 1 at a time. You can do:
g.V(?).out().fold().project("a").by(__.unfold().barrier().out()).toList()
and
g.V(?).out().fold().where(__.unfold().barrier().out()).toList()
However, then you get different behavior than intended, because now your project doesn't come out as a bunch of "a" : <vertex list>; instead you have "a" : <list of vertex lists>, which isn't exactly desired. IMO it makes sense to allow barrier() steps inside a TraversalParent's child traversals to barrier and batch execute under the hood, and then reassign the output as appropriate, so you get the performance benefits of a barrier inside the project's by step but still have the output assigned as "a" : <vertex list>. I would like to hear others' perspectives if they think this doesn't make sense.
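The "batch under the hood, then reassign the output" idea can be sketched as a toy model in plain Python. Nothing here is TinkerPop code or a proposed implementation: `CountingBackend`, `out_batch`, and the three `project_*` functions are hypothetical names, and the output shapes simply follow the ones described in the post above.

```python
# Toy model only; the shapes mirror the ones described in the post above.
class CountingBackend:
    """Hypothetical graph backend that counts round trips."""

    def __init__(self, edges):
        self.edges = edges
        self.calls = 0

    def out_batch(self, vertices):
        # One round trip resolves the out-neighbours of every vertex in the batch.
        self.calls += 1
        return {v: list(self.edges.get(v, [])) for v in vertices}


def project_per_traverser(backend, vertices):
    # Current behaviour as described: the by() child sees one traverser at a time.
    return [{"a": backend.out_batch([v])[v]} for v in vertices]


def project_via_fold(backend, vertices):
    # fold() workaround: a single round trip, but one map holding a list of lists.
    result = backend.out_batch(vertices)
    return [{"a": [result[v] for v in vertices]}]


def project_batched_regrouped(backend, vertices):
    # Proposed behaviour: a single round trip, with results handed back
    # to the traverser that produced them.
    result = backend.out_batch(vertices)
    return [{"a": result[v]} for v in vertices]


EDGES = {1: [2, 3], 2: [], 3: [4]}

for fn in (project_per_traverser, project_via_fold, project_batched_regrouped):
    backend = CountingBackend(EDGES)
    print(fn.__name__, fn(backend, [1, 2, 3]), backend.calls)
```

In this model the regrouped variant returns the same maps as the per-traverser variant while making one backend call instead of three; the fold variant also makes one call but changes the shape of the result.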
Kennh 10/14/2023
At first glance, this would probably be ok, but I get the feeling there will be a small subset of cases that will have their semantics changed. In which case, you might want to make a devlist post.
Lyndon 10/16/2023
Yeah, I think there's more thinking I need to do here before proceeding. I will come back to this when I have thought about it more.
