Query optimisation

Hey, I'm optimising some queries and found that these two seemingly identical queries behave very differently in terms of performance:
g.V().
  union(
    has("account", "id", "my_account"),
    has("account", "id", "my_account").
      out("owns")).
  union(
    out("completed").values("points"),
    inE("rewarded").has("claimed", true).values("points")).
  sum().
  next()
vs
g.V().
  has("account", "id", "my_account").
  union(identity(), out("owns")).
  union(
    out("completed").values("points"),
    inE("rewarded").has("claimed", true).values("points")).
  sum().
  next()
The second query ran about 10 times faster than the first. Can anyone with experience explain the difference between the two, and what I should watch out for to avoid badly performing queries like the first one? Thank you.
Dave (12d ago)
I would suggest looking at the profile of the query and posting it here, along with which database you are using (e.g. Gremlin Server, JanusGraph, Neptune, etc.), to see where the time is being spent. Without more information it is difficult to give specifics as to why the query is slow.
Without any additional context I would take a guess that most of the difference in time is being spent doing has("account", "id", "my_account") since the first version is doing that filter twice.
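One way to collect the profile Dave asks for is to end the traversal with the profile() step instead of next(). The following is only a sketch for the Gremlin Console, shown on the second query and assuming g is already bound to the graph's traversal source:

```groovy
// Sketch: same traversal as the faster query, but terminated with profile()
// so the console prints per-step metrics instead of the summed value.
// Assumes `g` is an existing GraphTraversalSource.
g.V().
  has("account", "id", "my_account").
  union(identity(), out("owns")).
  union(
    out("completed").values("points"),
    inE("rewarded").has("claimed", true).values("points")).
  sum().
  profile()
```

Running the same change on the slower query gives two sets of metrics that can be compared step by step.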
tien (12d ago)
Oh yeah, good point. Here's a snippet of the profile (the entire thing is too large):
==>Traversal Metrics
Step                                                               Count  Traversers       Time (ms)    % Dur
=============================================================================================================
JanusGraphStep(vertex,[])                                           3043        3043         263.877    71.17
  constructGraphCentricQuery                                                                   0.006
  constructGraphCentricQuery                                                                   0.001
  GraphCentricQuery                                                                          350.768
    \_condition=()
    \_orders=[]
    \_isFitted=false
    \_isOrdered=true
    \_query=[]
    scan                                                                                     350.644
    \_query=[]
    \_fullscan=true
    \_condition=VERTEX
JanusGraphMultiQueryStep                                            3043        3043           2.992     0.81
NoOpBarrierStep(2500)                                               3043        3043           3.158     0.85
UnionStep([[JanusGraphHasStep([~label.eq(accoun...                     2           2          99.037    26.71
  JanusGraphHasStep([~label.eq(account), addres...                     1           1          57.522
It looks like the first query always traverses the entire graph. I'm using JanusGraph 1.0.0. Is this a bug, or is it expected? That is, in g.V().union(...) the V() step seems to visit every vertex before the union is applied.
Bo (12d ago)
I don't think it's a bug. It's just an optimization missing in JanusGraph. Look at JanusGraphStepStrategy if you'd like to improve this.
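To see the difference in isolation, the two filter placements can be profiled on their own. This is a minimal sketch for the Gremlin Console, assuming g is bound to the same JanusGraph traversal source; the expectations in the comments are inferred from the profile posted above, not measured separately:

```groovy
// Filter directly after V(): the has() sits next to the start step, which is
// the shape the faster query uses. Check whether the conditions show up
// inside JanusGraphStep(...) in the output.
g.V().has("account", "id", "my_account").profile()

// Filter nested inside union(): in the posted profile this shape left the
// start step as JanusGraphStep(vertex,[]) with \_fullscan=true, i.e. all
// vertices were fetched before the has() inside the union was evaluated.
g.V().union(has("account", "id", "my_account")).profile()
```

As a rule of thumb drawn from this thread, keeping the selective has() immediately after V() and branching with union(identity(), ...) afterwards avoids the full scan seen in the first query.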