Stack overflow when adding a large list of property values using traverser.property()

Hey, we encounter a stack overflow:
Exception during Transaction, rolling back ...
org.apache.tinkerpop.gremlin.process.traversal.step.util.AbstractStep.hasNext(org/apache/tinkerpop/gremlin/process/traversal/step/util/AbstractStep.java:150): Java::JavaLang::StackOverflowError
from org.apache.tinkerpop.gremlin.process.traversal.step.util.ExpandableStepIterator.next(org/apache/tinkerpop/gremlin/process/traversal/step/util/ExpandableStepIterator.java:55)
from org.apache.tinkerpop.gremlin.process.traversal.step.sideEffect.SideEffectStep.processNextStart(org/apache/tinkerpop/gremlin/process/traversal/step/sideEffect/SideEffectStep.java:38)
from org.apache.tinkerpop.gremlin.process.traversal.step.util.AbstractStep.hasNext(org/apache/tinkerpop/gremlin/process/traversal/step/util/AbstractStep.java:150)
when we try to add a large list of values (~4000 entries) via a traverser's property() calls. It seems the property() method is implemented using recursion, which causes the problem. Is this a known issue? See also https://github.com/JanusGraph/janusgraph/issues/3479
GitHub
StackOverflowError adding alot of vertex properties inside one trav...
Version: 0.6.1 Storage Backend: berkleyje Mixed Index Backend: lucene Expected Behavior: No Stackoverflow Error Current Behavior: Stackoverflow Error Running this gremlin query inside JRuby: traver...
kelvinl2816 54d ago
At some point, depending on the JVM thread stack size (-Xss), a traversal with a few thousand Gremlin steps chained together is highly likely to hit a stack overflow. The recommended mitigations are to either 1/ break the query up into multiple queries, or 2/ increase the -Xss value if that is something you have control over. Note that increasing the stack size is not an ideal fix, as it raises it for all JVM threads and can therefore increase memory usage significantly. You can check the stack size a JVM is currently using by running a command like this one:
$ java -XX:+PrintFlagsFinal -version | grep ThreadStackSize

intx CompilerThreadStackSize = 0
intx ThreadStackSize = 1024
intx VMThreadStackSize = 1024
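
The first mitigation above (breaking the work up into several queries) could look roughly like the sketch below. It is only an illustration: the helper name in_batches and the batch size of 500 are hypothetical, and the traversal calls are left as comments in the style of the JRuby snippets in this thread, so the block runs without a graph.

```ruby
# Hypothetical helper: yields the values in slices of at most batch_size,
# so a traversal built from one slice has a bounded number of steps.
def in_batches(values, batch_size)
  values.each_slice(batch_size) { |batch| yield batch }
end

# Stand-in data in the shape used in this thread (hashes with a "value" key).
large_list = (1..4000).map { |i| { "value" => "v#{i}" } }

batch_sizes = []
in_batches(large_list, 500) do |batch|
  # With a real graph, one traversal per batch would be built and iterated
  # here, for example (assumed, not executed in this sketch):
  #   t = dbtx.traversal.v(some_id)
  #   batch.each { |v| t.property(VertexProperty::Cardinality::list, 'search_value', v["value"]) }
  #   t.iterate()
  batch_sizes << batch.size
end

puts batch_sizes.inspect
```

A suitable batch size depends on the -Xss setting and on how many stack frames each step needs, so it would have to be found by experiment.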

mrckzgl 54d ago
Thanks for the reply. But the thing is, those steps are not chained in "userspace". These are all separate traversal.property() calls. TinkerPop is internally calling things recursively inside traversal.iterate(). This is the problem, and IMHO it would not occur if traversal.iterate() were not implemented recursively... By the way, we have also found the workaround of splitting things up into separate traversals (see the linked JanusGraph issue). For simple traversals this might be feasible, but the more complex they get, the harder this becomes to implement... Increasing the stack size is of course not an option.
kelvinl2816 54d ago
So just to clarify, you are not doing something like .property().property().property().... 4000 times but doing something different?
mrckzgl 54d ago
No, we are not. Please have a look at the linked issue, where it is described in more detail. But the essence is:
# Build one traversal; each property() call appends another step to it
traverser = dbtx.traversal.add_v()

large_list.each{|v|
  traverser.property(VertexProperty::Cardinality::list, 'search_value', v["value"])
}

traverser.iterate()
stack overflow is happening inside the iterate() call
kelvinl2816 54d ago
That's the same thing. You are just building it up stepwise. The stack overflow can (and does) still happen if you build them all inline.
mrckzgl 54d ago
But the problem is that iterate() handles all those steps recursively instead of iteratively. It does not need to be implemented like this. (And btw, it is not the same thing. In what you describe, the stack overflow happens outside of TinkerPop's code base, even before the iterate() call; in what we do, it happens inside.) I guess the problem would not be so apparent if the property() call / step accepted a list as its value argument instead of just a single value (and of course handled that list iteratively), but I haven't found a way to add multiple values inside one property() call / step.
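
The repeating hasNext / next / processNextStart frames in the stack trace at the top of the thread suggest a pull-based chain, where each step asks the step before it for its next value. A toy model of that pattern is sketched below. ToyStep and chain_of are made-up names for illustration and are not TinkerPop classes; the sketch only shows that in such a chain the nesting depth of a single pull grows with the number of steps.

```ruby
# Toy model only, not TinkerPop code: each step pulls from the step before
# it, so one pull on the last step nests one call per step in the chain.
class ToyStep
  def initialize(previous)
    @previous = previous
  end

  # Returns the call depth reached when the start of the chain is hit.
  def next_value(depth = 1)
    return depth if @previous.nil?
    @previous.next_value(depth + 1)
  end
end

# Builds a chain of `length` steps and returns the last one.
def chain_of(length)
  step = nil
  length.times { step = ToyStep.new(step) }
  step
end

puts chain_of(10).next_value
puts chain_of(1000).next_value
```

In this model it makes no difference whether the chain was built inline or one step at a time in a loop; only its final length matters for the depth.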
triggan 54d ago
stephen mallette
Inserting a Vertex Using a Map
The typical method for setting properties on a graph element, such as a Vertex, is to use the property()-step. This step looks a bit like the put() method of a Java Map which takes a key and a value as its argument (though property() can optionally take additional arguments for Cardinality and meta-properties). It’s fitting that these APIs are s...
mrckzgl 54d ago
Oh, nice one. No, I did not. I have to try this out, thanks a lot.
spmallette 54d ago
i wonder if that will work. it isn't designed to handle Cardinality.list well and that looks like what's desired here
mrckzgl 54d ago
Currently testing this:
traversal = db.traversal.v(some_id).as('vertex')
traversal.side_effect(
T.inject(large_list)
.unfold().as('value')
.select('vertex').property(VertexProperty::Cardinality::list, 'search_id', T.select('value'))
)
traversal.iterate()
At first glance, it does not produce an error on a large list of 7000 values. If I understood correctly, it should be equivalent to:
large_list.each{|v|
db.traversal.v(some_id).property(VertexProperty::Cardinality::list, 'search_id', v).next()
}
but hopefully much more performant. It does not produce the stack overflow because the steps following unfold() are handled iteratively, not recursively as in the case from the original post. Something is still wrong, though: the values cannot be found in the db afterwards. Maybe passing the JRuby version of large_list directly is the problem and converting it to a native Java List would help, but I won't pursue this further. We already implemented the split-traversal workaround, so this is fine at the moment. Still, it is not great that TinkerPop generates stack overflows on its own ...