Apache TinkerPop

Apache TinkerPop is an open source graph computing framework and the home of the Gremlin graph query language.

Does the TinkerGraph in-memory database support List cardinality properties for vertices?

It is my understanding that the following code should work: ```java import org.apache.tinkerpop.gremlin.process.traversal.dsl.graph.GraphTraversal; import org.apache.tinkerpop.gremlin.process.traversal.dsl.graph.GraphTraversalSource; import org.apache.tinkerpop.gremlin.structure.Vertex;...
Solution:
elementMap() assumes single cardinality for each key; if the cardinality is list, only the first item encountered is returned. To get all property values, use the valueMap() step instead. `gremlin> g.V(1).property("address", "a1").property(list, "address", "a2") ==>v[1] gremlin> g.V(1).valueMap()...

Analyzing samples of Gremlin Queries in Neptune Notebook

Hey everyone, I’m working on a project where we give internal customers access to our Neptune graph through Neptune Notebook. There are already quite a few users, and we want to analyze the queries they run to see which parts of our ontology are used more and which parts are less utilized. This is not as straightforward as retrieving all labels from the query, since our edge labels are not unique, and if people use .in or .out steps without specifying the entity name, it's almost impossible to tell which part of the ontology was visited. We also want to identify common query patterns to understand what people are usually querying for and which connections in our ontology are the most frequently used, while filtering out parts common to all queries, like g.V() or g.V(), and instead capturing information about the combinations of multiple steps that were called. We’ve figured out how to override the Gremlin magic in Neptune Notebook to add our custom logic to handle each query. For my problem I’m considering two approaches:...
Solution:
I think this is going to depend on how granular you want to get. If the intent is to see what labeled vertices or edges are accessed, then just looking at a query in the audit log would be sufficient. But, if your intent is to see every atomic component that is accessed in the database as part of query execution, that could be expensive. It is possible, though. You could run every query through the Neptune Gremlin Profiler: https://docs.aws.amazon.com/neptune/latest/userguide/gremlin-profile-api.html and set profile.indexOps to True and you'll get an output at the bottom of the profile output with every index operation that occurs. These will equate to some permutation of S-P-O-G patterns that are used in the three different built-in indexes (or fourth index, if enabled).
With the list of indexed lookup patterns, you could possibly maintain an external counter (maybe in a sorted set in Redis/Valkey) with a key of the S-P-O-G combination and the value being the number of times accessed. Just be aware that obtaining a Neptune Gremlin Profile output requires that you run the query again. So you may not be able to use this to capture writes (without rewriting the data) and it will incur additional database resources to re-run all of the read queries....
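The external counter idea could be sketched roughly as follows. This is a minimal in-memory stand-in for a Redis/Valkey sorted set (where the increment would map to `ZINCRBY`); the helper names `recordIndexOps` and `topPatterns` and the example pattern strings are assumptions for illustration, and parsing of the actual profile output is not shown.

```javascript
// In-memory stand-in for a Redis/Valkey sorted set (hypothetical helpers).
// Keys are S-P-O-G lookup patterns; values are access counts.
const patternCounts = new Map();

// Increment the counter for each pattern seen in one profiled query.
function recordIndexOps(patterns) {
  for (const pattern of patterns) {
    patternCounts.set(pattern, (patternCounts.get(pattern) || 0) + 1);
  }
}

// Return the n most frequently accessed patterns, highest count first.
function topPatterns(n) {
  return [...patternCounts.entries()]
    .sort((a, b) => b[1] - a[1])
    .slice(0, n);
}

// Illustrative input only; real patterns would come from the profile output.
recordIndexOps(["SPOG", "POGS"]);
recordIndexOps(["SPOG"]);
console.log(topPatterns(1)); // [ [ 'SPOG', 2 ] ]
```

Keeping the counter outside the database means the analysis itself adds no load to Neptune beyond the profile runs already mentioned.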

Potential bug in evaluationTimeout when using auth?

When I use g.with("evaluationTimeout", X).call(<some call step>).next() without authentication, it works. When I do it with authentication enabled, it does not work. The offending code appears to be this piece of TraversalOpProcessor.java ``` final long seto = args.containsKey(Tokens.ARGS_EVAL_TIMEOUT) ?...
Solution:
The problem appears to be in this block of code

Should `barrier` step merge Edge properties with the same key and value?

I am trying to understand whether it is expected for barrier to merge multiple traversers of different Edge properties together (a.k.a. optimization). Currently, as a result of such merging, some Edge properties might be missing from the continuing traversal. For example, the following test fails at the last line because a single property is still left after removing all "name" properties (graph.traversal().E().properties("name").barrier(5).drop().iterate()). I had the impression that the barrier step may influence query optimization but not the query result. Now I'm trying to understand whether that is the intended behavior. ``` @Test public void testDropsEdgePropertiesTinkerGraph() { Graph graph = TinkerGraph.open();...
Solution:
barrier() doesn't dedup. It bulks. https://tinkerpop.apache.org/docs/current/reference/#barrier-step I think the problem here is that unique Edge properties bulk because of how equality works for them, where you can have two key/values that are the same but do not refer to the same actual property. Note that the same doesn't happen for vertex properties, which have equality based on id: ```gremlin> g.addV().property('name','alice') ==>v[0] gremlin> g.addV().property('name','alice') ==>v[2]...

Authorization with transaction results in error

Hi All, I have configured passive authorization as described in https://tinkerpop.apache.org/docs/3.7.0/reference/#authorization. All works fine, but once I use Gremlin console with session mode, calling :remote close, the following error happens:
java.util.concurrent.ExecutionException: org.apache.tinkerpop.gremlin.driver.exception.ResponseException: Failed to authorize: This AuthorizationHandler only handles requests with OPS_BYTECODE or OPS_EVAL.
And on the server side I see (full server log enclosed):...
Solution:
I've looked into your real use case a little closer and I don't think there's a workaround at this time. Side note, you should end your traversals with a terminating step like iterate() or else they don't do anything. So your query should actually be gtx.addV().property('name', 'test1').property('age', 11).iterate(). The error you are seeing with "This AuthorizationHandler..." occurs after the transaction attempts to commit so it shouldn't actually prevent the commit from occurring. The real p...

Is the insertion order guaranteed with this example code?

Taking the following code which is found at https://tinkerpop.apache.org/docs/current/reference/#gremlin-javascript-transactions, is the insertion order guaranteed for these two new vertices? ```const g = traversal().withRemote(new DriverRemoteConnection('ws://localhost:8182/gremlin')); const tx = g.tx(); // create a Transaction ...
Solution:
Right, there are no guarantees with Promise.all. If for some reason the insertion order is important, then you need to make the calls sequentially: gtx.addV("person").property("name", "jorge").iterate(); gtx.addV("person").property("name", "josh").iterate();...
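To illustrate the difference, here is a small simulation that does not use the Gremlin driver at all. `fakeInsert` is a hypothetical stand-in for a remote write such as an `addV(...).iterate()` call, with an artificial delay in place of network latency; it only shows how completion order can differ between the concurrent and sequential styles.

```javascript
// Stand-in for a remote write; the delay simulates network latency.
// (Hypothetical helper, not part of any Gremlin API.)
function fakeInsert(log, name, delayMs) {
  return new Promise((resolve) =>
    setTimeout(() => {
      log.push(name); // record the order in which writes complete
      resolve(name);
    }, delayMs)
  );
}

// Concurrent: both requests start at once, so completion order
// depends on latency rather than on the order in the array.
async function insertConcurrently() {
  const log = [];
  await Promise.all([fakeInsert(log, "jorge", 20), fakeInsert(log, "josh", 5)]);
  return log; // ["josh", "jorge"] in this simulation
}

// Sequential: the second request is not started until the first has finished.
async function insertSequentially() {
  const log = [];
  await fakeInsert(log, "jorge", 20);
  await fakeInsert(log, "josh", 5);
  return log; // ["jorge", "josh"]
}
```

With the real driver, the sequential form would mean awaiting each `iterate()` call before issuing the next one.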

Using mergeE to create an edge with an id that depends on a lookup

I want to use mergeE to produce an edge whose id is the concatenation of the ids of its inV and outV vertices. But the inV vertex has to be looked up, the exact id is not known without a lookup. Suppose that partialMacbookId === "macbookAir" and the result of the lookup is the vertex with id "macbookAir2024" And suppose that ownerId === "1111"...
Solution:
I don't think there's any way to do that directly in Gremlin without (1) the new string steps in 3.7.x or (2) a lambda. That tends to leave folks with a third option: doing the operation with multiple queries in a transaction, where you do the concatenation client-side. I can't really be too specific but we hope to see Neptune working with 3.7.x soon.

Is tx.close() necessary in Javascript?

I have read the following two pieces of documentation and I have the question, is tx.close() necessary in Javascript?
https://tinkerpop.apache.org/docs/current/reference/#gremlin-javascript-transactions https://docs.aws.amazon.com/neptune/latest/userguide/access-graph-gremlin-transactions.html...
Solution:
not necessary, commit/rollback will also close the transaction

Using dedup with Neptune

I remember once coming across an AWS Neptune optimization guide, though I don't remember where it is now. It mentions that the .dedup() step is not optimized for Neptune, which makes performance worse. However, I have the following scenario where I need deduplication and pagination at the same time....
Solution:
I guess what I'm getting at, is that I don't know of a way to make dedup() any more performant in that sort of query with Neptune's current implementation.
As far as pagination goes, have you tried using Neptune's Query Results Cache instead of making multiple range() calls? That would significantly decrease latency for subsequent calls as you paginate across the results: https://docs.aws.amazon.com/neptune/latest/userguide/gremlin-results-cache.html...

`next(n)` with Gremlin JavaScript

I'm trying to do some basic pagination. next(n) seems perfect, but it doesn't appear to be available for JavaScript as per the documentation. Is there a reason for this limitation?...
Solution:
AFAIK, that is only possible via scripts.

Traversal Inspection for properties used

Is there any way to inspect a traversal to figure out what properties are used throughout it? I am looking at the traversal API / steps and can't see anything that looks like it would fulfill the purpose. Something that would tell what properties are used, which are returned. ...

Gremlin with AWS Neptune

Hey Friends, I am working on a project using AWS Neptune with the Gremlin query language via the gremlingo driver. I realize the latest version of Neptune is 1.3.1, and the TinkerPop version is 3.6.4....
Solution:
note that Neptune 1.3.2.1 now supports the functionality of TinkerPop 3.7.x. You can read more about it at https://aws.amazon.com/blogs/database/exploring-new-features-of-apache-tinkerpop-3-7-x-in-amazon-neptune/

Running TinkerPop test in JanusGraph repo

Hi, I'm trying to run the TinkerPop test below to test some of the changes I've made to JanusGraph, by copying the test over from the TinkerPop repo into the JanusGraph repo (just for testing one test). ```java @RunWith(Parameterized.class) public class TraversalInterruptionTest extends AbstractGremlinProcessTest {...
Solution:
You should copy the test suite definition test

configure gremlin-server to use remote Neptune server?

Hello, I’m wondering if it is possible to configure Gremlin Server to use the graph in a remote Neptune instance so that users can use my own authentication and not have to worry about AWS authentication methods? Many thanks!...

Query optimisation

Hey, I'm optimising some queries, and found that these 2 seemingly identical queries behave very differently in terms of performance ```groovy g.V(). union(...
Solution:
I would suggest looking at the profile of the query and posting it here, along with what database you are using (e.g. Gremlin Server, JanusGraph, Neptune, etc.), to see where the time is being spent. Without more information it is difficult to give specifics as to why the query is slow.
Without any additional context I would take a guess that most of the difference in time is being spent doing has("account", "id", "my_account") since the first version is doing that filter twice....

Combine two queries to perform only one

Can someone help me figure out how I can combine these two queries? groups = f"g.V().hasLabel('Groups').as('group_data').elementMap().range({start_index}, {end_index}).toList()" users = f"g.V().hasLabel('Groups').as('group_data').bothE('memberOf').otherV().as('members').select('members').by(elementMap()).range({start_index}, {end_index}).toList()"...
Solution:
A simple solution is to use the union step https://tinkerpop.apache.org/docs/current/reference/#union-step something like g.union( V().hasLabel('Groups').range({start_index}, {end_index}).elementMap(), V().hasLabel('Groups').out().range({start_index}, {end_index}).elementMap()))...

compatibility with Apache Jena Fuseki

Hello! I am new to the community and trying to figure out whether TinkerPop and Gremlin are supported by Jena Fuseki. I had a look at the TinkerPop-enabled graph systems page (https://tinkerpop.apache.org/providers.html) and Fuseki is not listed there, but I see other RDF graph databases in the list, which makes me think there might be some chance for Fuseki too. Could you please help?...
Solution:
There's a lot to unravel with that question. 🙂 Fuseki is a SPARQL server within the Apache Jena project that mainly targets RDF workloads and use cases, whereas Apache TinkerPop is a framework and reference implementation for graph databases that support the Labeled Property Graph paradigm. While there are a few efforts (https://arxiv.org/abs/2110.13348) to integrate RDF and LPG, there are enough differences to make integrating Gremlin over RDF non-trivial. (Note the opposite may be easier, as there is a SPARQL-Gremlin compiler: https://tinkerpop.apache.org/docs/current/reference/#sparql-gremlin). The few RDF stores that are listed in the supported providers list have made certain concessions to provide an integration between Gremlin and RDF. As an example, Blazegraph's implementation stores data in an RDF* format: https://github.com/blazegraph/tinkerpop3. ...

Is the first traversal pattern evaluated by Match well defined

Hi, It seems to me that if the match step is able to dynamically select the first traversal pattern (as it does all other traversal patterns), and this selection isn't the same across all traversers, the behaviour of match isn't well defined. Consider the simple graph...

.mergeV() with Javascript not working

Hi, I have a nodeJS 18 lambda which is closely modeled after this documentation: https://docs.aws.amazon.com/neptune/latest/userguide/lambda-functions-examples.html#lambda-functions-examples-javascript here is my async query function: ```async function query(context) { const { userId } = context;...
Solution:
The solution was to allow neptune to use the default mimetype; removing the mimetype header solved the issue