Big graph causes query timeouts
I am having trouble querying a big graph, especially when applying filters.
I want to order the nodes by degree so that I can take the highest-degree ones, but the query always times out. The only trick I am applying so far is pre-limiting the nodes that are accessed.
I am using Neptune on a db.r6g.xlarge instance.
Solution
A few things here:
- Neptune was originally designed as a database in the mindset of TinkerPop OLTP, where the queries that perform best have a constrained set of starting conditions and a limited query frontier (the projected number of objects that may need to be assessed during query computation). Queries that traverse < 1M objects in the graph will perform with ~100ms of latency. Queries that need to process more than that will have a latency that scales linearly with the query frontier.
- For the most part, Gremlin queries run single-threaded inside Neptune. Each Neptune instance has a number of query execution threads equal to 2x the number of vCPUs on that instance. More on the resource allocation here: https://docs.aws.amazon.com/neptune/latest/userguide/instance-types.html
- The Graviton 2 processors (the "g" in the instance type) are great for smaller OLTP queries and will show better performance than the Intel processors for those queries. It has been noted in other forums (https://www.anandtech.com/show/15578/cloud-clash-amazon-graviton2-arm-against-intel-and-amd), however, that the Graviton 2 processors have a TLB that is less performant than that of same-generation Intel processors, making memory-intensive processing (slightly) less performant. So if you plan on running queries with a larger query frontier, the Intel processors will show some gains (and vice versa for smaller queries).
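One way to act on the points above is to replace the single large ordering query with many small, constrained queries that run concurrently, and to merge the results on the client. The following is only a rough sketch under several assumptions: `fetch_degrees` is a hypothetical callable standing in for a Gremlin query bounded to a batch of node IDs, the in-memory `FAKE_DEGREES` data is invented for illustration, and the 4 vCPU figure for db.r6g.xlarge should be checked against the instance types page linked above.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor


def query_threads(vcpus):
    """Query execution threads on an instance, using the 2x vCPU rule above."""
    return 2 * vcpus


def top_k_by_degree(node_ids, fetch_degrees, k, batch_size, workers):
    """Fetch degrees in small batches concurrently and merge the top k.

    fetch_degrees is a hypothetical callable: given a list of node IDs it
    returns (node_id, degree) pairs. Against Neptune it might wrap a bounded
    query such as:
        g.V(ids).project('id', 'degree').by(id).by(bothE().count())
    so that each query keeps a small frontier.
    """
    batches = [node_ids[i:i + batch_size]
               for i in range(0, len(node_ids), batch_size)]

    def batch_top_k(batch):
        # Keep only the k best pairs per batch to limit what is merged later.
        return heapq.nlargest(k, fetch_degrees(batch), key=lambda pair: pair[1])

    with ThreadPoolExecutor(max_workers=workers) as pool:
        merged = [pair for part in pool.map(batch_top_k, batches) for pair in part]

    return heapq.nlargest(k, merged, key=lambda pair: pair[1])


# Invented stand-in data so the sketch runs without a database.
FAKE_DEGREES = {"a": 3, "b": 10, "c": 1, "d": 7, "e": 5}


def fake_fetch(batch):
    return [(node_id, FAKE_DEGREES[node_id]) for node_id in batch]


workers = query_threads(4)  # assumption: db.r6g.xlarge has 4 vCPUs
top = top_k_by_degree(list(FAKE_DEGREES), fake_fetch,
                      k=2, batch_size=2, workers=workers)
print(top)
```

Capping the client's concurrency at the instance's query thread count is a design guess rather than a documented recommendation. The idea is that each single-threaded query gets its own execution thread instead of waiting behind the others.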
