LazyBarrierStrategy/NoOpBarrierStep incompatible with path-tracking
👋🏻 Hi all!
In this JanusGraph post (https://discord.com/channels/981533699378135051/1195313165278388334/1195313165278388334), we were investigating whether TreeStep could be used jointly with bulked traversers to improve traversal time.
Based on the answers there, it looks like TinkerPop's LazyBarrierStrategy explicitly excludes "path-tracking" traversals (https://github.com/apache/tinkerpop/blob/master/gremlin-core/src/main/java/org/apache/tinkerpop/gremlin/process/traversal/strategy/optimization/LazyBarrierStrategy.java#L85) and won't insert NoOpBarrierStep steps in those cases, preventing us from bulking traversers.
At a high level, I imagine this exclusion could be needed if a traverser's history were lost on bulking. However, I'm not familiar enough with the code to tell whether this is actually the reason.
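To make the concern about lost history concrete, here is a minimal, self-contained Java sketch. It is a toy model, not TinkerPop code: ToyTraverser, bulkByElement, and bulkByElementAndPath are hypothetical names invented for illustration. It assumes that bulking means merging traversers that are considered equal and summing their bulk counts. Merging on the current element alone collapses distinct histories into one, while merging on element plus path keeps them apart but bulks less.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model only: none of these types are TinkerPop classes.
final class BulkingSketch {

    // Hypothetical traverser: current element, the path walked so far, and a bulk count.
    static final class ToyTraverser {
        final String element;
        final List<String> path;
        final long bulk;

        ToyTraverser(String element, List<String> path, long bulk) {
            this.element = element;
            this.path = path;
            this.bulk = bulk;
        }

        @Override
        public String toString() {
            return element + " via " + path + " x" + bulk;
        }
    }

    // Merge traversers that share the same key, summing bulks and keeping the first path seen.
    private static List<ToyTraverser> bulk(List<ToyTraverser> input, boolean keyIncludesPath) {
        Map<Object, ToyTraverser> merged = new LinkedHashMap<>();
        for (ToyTraverser t : input) {
            Object key = keyIncludesPath ? Arrays.<Object>asList(t.element, t.path) : t.element;
            merged.merge(key, t, (a, b) -> new ToyTraverser(a.element, a.path, a.bulk + b.bulk));
        }
        return new ArrayList<>(merged.values());
    }

    // Ignores history: every traverser on the same element is merged, so other paths are dropped.
    static List<ToyTraverser> bulkByElement(List<ToyTraverser> input) {
        return bulk(input, false);
    }

    // Respects history: only traversers with the same element AND the same path are merged.
    static List<ToyTraverser> bulkByElementAndPath(List<ToyTraverser> input) {
        return bulk(input, true);
    }

    // Three traversers on element "c", reached through two different paths.
    static List<ToyTraverser> sample() {
        return Arrays.asList(
                new ToyTraverser("c", Arrays.asList("a", "c"), 1),
                new ToyTraverser("c", Arrays.asList("b", "c"), 1),
                new ToyTraverser("c", Arrays.asList("a", "c"), 1));
    }

    public static void main(String[] args) {
        System.out.println("by element:          " + bulkByElement(sample()));
        System.out.println("by element and path: " + bulkByElementAndPath(sample()));
    }
}
```

In this toy input, merging by element alone yields a single traverser that has forgotten the path through "b", which is the kind of loss a path-consuming step such as a tree aggregation could not tolerate. Whether TinkerPop's real traversers would behave like the first or the second method under a barrier is the open question of this thread, so this only illustrates the trade-off.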
I have observed some "path-tracking" traversals (with TreeStep) running properly with bulking on JanusGraph (https://discord.com/channels/981533699378135051/1195313165278388334/1196486987474022400). In addition, commenting out the check on TraverserRequirement.PATH (https://github.com/apache/tinkerpop/blob/master/gremlin-core/src/main/java/org/apache/tinkerpop/gremlin/process/traversal/strategy/optimization/LazyBarrierStrategy.java#L85) does not break gremlin-core unit tests locally.
So my question is: would anyone have context on why "path-tracking" traversals can't use bulking? Is there any chance this exclusion could be scoped down to a subset of "path-tracking" traversals maybe?
Any pointer appreciated! 🙇🏻♂️
"does not break gremlin-core unit tests locally."
i don't think the unit tests will tell you much. you will need to run the full build to find out what it breaks. mvn clean install the entire project and see what fails. essentially, you need gremlin-test to be built so that tinkergraph-gremlin can execute those tests over multiple configurations.
Thanks Stephen. I actually ran unit tests for more than that, incl. Gremlin Test and TinkerGraph Gremlin.
well, perhaps the change causes trouble for OLAP given the spark failure? what were the test failures there?
Gremlin Spark was failing due to my corp VPN (i.e. https://stackoverflow.com/questions/52133731/how-to-solve-cant-assign-requested-address-service-sparkdriver-failed-after). But now that I turned it off, all tests are passing.
My local diff (compared to master branch) is just
So it could also be that this is not tested or that another factor prevents us from hitting this condition.
hmm interesting. that bit of code tends to be tricky. you move it one way and X breaks, then you fix that and it makes Y break. i'm surprised that removing that condition had no effect at all, as virtually every change to that strategy has come as a result of a bug that had a related test. i guess we'd want to see the origin of that change to see if there is any particular hint as to why that is there. there may also be something obvious i'm currently not thinking about as to why that strategy shouldn't execute if path tracking is enabled
when you say "path tracking is enabled", you mean a step with TraverserRequirement.PATH is part of the traversal, right? Just to make sure I'm not misunderstanding.
yes
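As a rough illustration of what "path tracking is enabled" means here, the following self-contained Java sketch models a traversal as a list of steps that each declare requirements, with the traversal's requirements being their union. Requirement, ToyStep, and wouldInsertBarriers are hypothetical names, not TinkerPop's API, and the guard mirrors only the PATH condition discussed in this thread, not any other check the real strategy may perform.

```java
import java.util.Arrays;
import java.util.EnumSet;
import java.util.List;
import java.util.Set;

// Toy model only: a simplified stand-in for the PATH guard, not TinkerPop code.
final class PathGuardSketch {

    // Hypothetical, reduced set of traverser requirements.
    enum Requirement { OBJECT, BULK, PATH }

    // Hypothetical step that declares which requirements it places on traversers.
    static final class ToyStep {
        final String name;
        final Set<Requirement> requirements;

        ToyStep(String name, Requirement... requirements) {
            this.name = name;
            EnumSet<Requirement> set = EnumSet.noneOf(Requirement.class);
            set.addAll(Arrays.asList(requirements));
            this.requirements = set;
        }
    }

    // The traversal's requirements are the union of the requirements of all its steps.
    static Set<Requirement> requirementsOf(List<ToyStep> steps) {
        EnumSet<Requirement> all = EnumSet.noneOf(Requirement.class);
        for (ToyStep step : steps) {
            all.addAll(step.requirements);
        }
        return all;
    }

    // Assumed guard: a single PATH-requiring step anywhere disables barrier insertion.
    static boolean wouldInsertBarriers(List<ToyStep> steps) {
        return !requirementsOf(steps).contains(Requirement.PATH);
    }

    // A traversal shaped like out().out().count(): no step needs path history.
    static List<ToyStep> plainSample() {
        return Arrays.asList(
                new ToyStep("out", Requirement.OBJECT),
                new ToyStep("out", Requirement.OBJECT),
                new ToyStep("count", Requirement.BULK));
    }

    // A traversal shaped like out().out().tree(): the last step is modeled as requiring PATH.
    static List<ToyStep> treeSample() {
        return Arrays.asList(
                new ToyStep("out", Requirement.OBJECT),
                new ToyStep("out", Requirement.OBJECT),
                new ToyStep("tree", Requirement.PATH));
    }

    public static void main(String[] args) {
        System.out.println("plain traversal gets barriers: " + wouldInsertBarriers(plainSample()));
        System.out.println("tree traversal gets barriers:  " + wouldInsertBarriers(treeSample()));
    }
}
```

The point of the sketch is that the guard is all-or-nothing: one PATH-requiring step at the end is enough to switch off barriers for every earlier step as well, which is why scoping the exclusion down to a subset of traversals comes up as a question.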
fyi: thought I'd try mvn clean install -DskipIntegrationTests=false -DincludeNeo4j as well and it also succeeds 100% with the change
that line of code goes back to the inception of the strategy itself. as a result, it makes me feel like there is some specific need for it and that we simply don't have the test that triggers the problem.
No! Say it ain't so: as conceptually beautiful as is Gremlin on the outside, it cannot be just like real code on the inside. I am crushed.