Big graph causes timeouts

I am having trouble querying a big graph, especially when applying filters.
I want to order the nodes so that I can take the highest-degree ones, but the query always times out, and the only trick I am applying is pre-limiting the accessed nodes:

g.V()
.hasLabel("Word")
.tail(100000)
.order().by(outE("RetrievedBy").count(), desc)
.limit(100)
.project("term", "degree")
.by("term")
.by(outE("RetrievedBy").count())


I am using Neptune on a db.r6g.xlarge instance.
Solution
A few things here:

  1. Neptune was originally designed as a database in the TinkerPop OLTP mindset, where the best-performing queries have a constrained set of starting conditions and a limited query frontier (the projected number of objects that may need to be assessed during query computation). Queries that traverse < 1M objects in the graph will perform with ~100ms of latency. Queries that need to process more than that will have a latency that scales linearly with the query frontier.
  2. For the most part, each Gremlin query is executed on a single thread inside Neptune. Each Neptune instance has a number of query execution threads equal to 2x the number of vCPUs on that instance. More on the resource allocation here: https://docs.aws.amazon.com/neptune/latest/userguide/instance-types.html
  3. The Graviton 2 processors (the "g" in the instance type) are great for smaller OLTP queries and will show better performance than the Intel processors for those queries. It has been noted in other forums (https://www.anandtech.com/show/15578/cloud-clash-amazon-graviton2-arm-against-intel-and-amd), however, that the Graviton 2 processors have a TLB that is less performant than that of same-generation Intel processors, making memory-intensive processing (slightly) less performant. So if you plan on running queries with a larger query frontier, using the Intel processors will show some gains (and vice versa for smaller queries).
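
To put rough numbers on points 1 and 2, here is a small back-of-the-envelope sketch in Python. The vCPU counts are the published r6g sizes. The function names and the latency model are only illustrative assumptions: the model simply extends "~100ms below 1M objects, linear above" with a guessed slope, so treat its output as an order-of-magnitude hint, not a Neptune guarantee.

```python
# Back-of-the-envelope sizing based on the rules of thumb in the answer above.

# vCPU counts for a few r6g sizes (published instance specs).
VCPUS = {
    "db.r6g.large": 2,
    "db.r6g.xlarge": 4,
    "db.r6g.2xlarge": 8,
}


def query_threads(instance_type: str) -> int:
    """Query execution threads on an instance: 2 x vCPUs."""
    return 2 * VCPUS[instance_type]


def estimated_latency_ms(frontier: int,
                         base_ms: float = 100.0,
                         base_frontier: int = 1_000_000) -> float:
    """ASSUMED model: ~base_ms up to base_frontier objects, then linear growth.

    The slope (base_ms per base_frontier objects) is a guess for illustration.
    """
    return base_ms * max(1.0, frontier / base_frontier)


if __name__ == "__main__":
    print(query_threads("db.r6g.xlarge"))    # 8 threads on 4 vCPUs
    print(estimated_latency_ms(500_000))     # 100.0
    print(estimated_latency_ms(50_000_000))  # 5000.0
```

Since a single Gremlin query mostly runs on one thread, a larger instance mainly lets more queries run concurrently; it does not by itself make one large-frontier query finish sooner.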