How to create indexes by Label?

In search of performance improvements, the AWS Neptune experts suggested that I create some indexes. For context, a single POST endpoint performs three operations against the database: a query for existing data that returns the relationships of a specific ID, a deletion of edges if a record already exists in the database, and a create/update of vertices and edges. I am currently trying to tackle two problems: the create step, which takes approximately 150 ms, and the query, which currently takes between 1.2 and 17 seconds.
Is it possible to create an index for vertices and edges by label, given that my vertices and edges have different labels with different properties? Does anyone know what this implementation would look like?
In my current implementation I do it in a simple way, as follows:


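from gremlin_python.driver import client, serializer

# neptune_url is the cluster's Gremlin endpoint, defined elsewhere
client_write = client.Client(neptune_url, "g", message_serializer=serializer.GraphSONMessageSerializer())

# Index-creation scripts, one per property key
queries = [
    "graph.createIndex('journey_id', Vertex.class)",
    "graph.createIndex('person_type', Vertex.class)",
    "graph.createIndex('relationship_type', Edge.class)",
]

# Submit each script and block until its result arrives
for query in queries:
    client_write.submit(query).all().result()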
Solution
just by calling g.V().limit(1) with concurrent calls on an r6g.2xlarge machine, the average time is 250ms

How many concurrent calls? An r6g.2xlarge instance has 8 vCPUs (and 16 available query execution threads). If you're issuing more than 16 requests in parallel, any additional concurrent requests will queue (an instance can queue up to roughly 8,000 requests). You can see this with the /gremlin/status API (or the %gremlin_status Jupyter magic), which reports the number of executing queries and the number of "accepted" queries. If you need more concurrency, you'll need to add more vCPUs (either by scaling up the instance or by scaling out with read replicas).
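To check whether requests are queuing, something along these lines could work. This is a minimal sketch, not an official client: the helper names are made up for illustration, the field names acceptedQueryCount and runningQueryCount are my recollection of the status response and should be verified against your cluster, the "two threads per vCPU" rule is inferred from the 8 vCPU / 16 thread figures above, and fetch_status assumes an endpoint reachable on port 8182 without IAM request signing.

```python
import json
import urllib.request


def query_threads(vcpus: int) -> int:
    """Query execution threads, assuming two per vCPU (8 vCPUs -> 16)."""
    return 2 * vcpus


def summarize_status(status: dict, vcpus: int) -> dict:
    """Summarize a parsed /gremlin/status response.

    Assumes 'acceptedQueryCount' covers running plus queued queries and
    'runningQueryCount' covers only those currently executing.
    """
    running = status["runningQueryCount"]
    accepted = status["acceptedQueryCount"]
    threads = query_threads(vcpus)
    return {
        "running": running,
        "queued": max(accepted - running, 0),
        "threads": threads,
        "saturated": running >= threads,
    }


def fetch_status(endpoint: str) -> dict:
    """Fetch the status document (hypothetical helper; no IAM auth)."""
    url = f"https://{endpoint}:8182/gremlin/status"
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)
```

Calling summarize_status(fetch_status(host), vcpus=8) during a load test would show whether the running count is pinned at the thread limit while the queue grows, which would point to concurrency rather than the query itself.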

But in the query mentioned, the bottleneck starts at the stage where it calls the last otherV() before path().
g.V().has(T.id, "client-id-uuid").bothE("has_profile", "has_affiliated", "has_controlling").has(T.id, containing("tenant-id-uuid")).otherV().path().unfold().dedup().elementMap().toList()

That makes sense, as you're using a text predicate here (containing()). Neptune does not maintain a full-text search index, so any use of text predicates such as containing(), startingWith(), or endingWith() will incur some form of range scan and also require dictionary materialization (we lose all of the benefits of data compression here, because each value must be fetched from the dictionary to compare with the predicate value you've provided).