Add Multiple addV() by one Iterate

Hello guys, I have a crazy question that I need some experts to help me with. I am using C# to add many nodes (20k) to AWS Neptune, but adding them one by one is going to take a very long time, so I need something like a bulk addV(). Here is my code, but it is not working as I want. Some concerns:
- What is the maximum number of Iterate() requests?
- Can I add the 20k nodes in one go, or do I need to divide them into smaller packets?
- Is there a better way to bulk add/update the graph?
7 Replies
spmallette 12mo ago
You should not send them all at once. You should batch your requests. The right batch size depends on the complexity of your load, so you might need to experiment to find what works best. If you do not batch, you will likely hit one of several errors: (1) a timeout if the request takes too long, (2) a query construction problem if your query hits JVM stack limits, or (3) a memory issue should the transaction grow beyond allowable limits. Regarding (1) and timeouts, it's also worth noting that long queries hold locks on the data, so if you're doing anything in parallel to this load you could run into transaction failures. For Neptune those can also occur on reads. Your example seems pretty simple, so perhaps a hundred at a time might be a good place to start.
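A minimal Go sketch of the batching idea, assuming a batch size of 100 as suggested above. The names `Vertex`, `chunk`, `loadInBatches` and the `send` callback are hypothetical and written for this example; they are not part of any Gremlin driver. With a real driver, `send` would build one traversal that chains an addV() per vertex and finish it with a single iterate().

```go
package main

import (
	"errors"
	"fmt"
)

// Vertex is a hypothetical record to load; the fields are illustrative.
type Vertex struct {
	ID   string
	Name string
}

// chunk splits items into consecutive batches of at most size elements.
func chunk[T any](items []T, size int) [][]T {
	if size <= 0 {
		return nil
	}
	var batches [][]T
	for start := 0; start < len(items); start += size {
		end := start + size
		if end > len(items) {
			end = len(items)
		}
		batches = append(batches, items[start:end])
	}
	return batches
}

// loadInBatches calls send once per batch and stops at the first error.
// send is a placeholder: with a real driver it would submit one traversal
// that adds every vertex in the batch.
func loadInBatches(vertices []Vertex, size int, send func([]Vertex) error) (int, error) {
	if size <= 0 {
		return 0, errors.New("batch size must be positive")
	}
	sent := 0
	for _, batch := range chunk(vertices, size) {
		if err := send(batch); err != nil {
			return sent, fmt.Errorf("batch starting at vertex %d: %w", sent, err)
		}
		sent += len(batch)
	}
	return sent, nil
}

func main() {
	vertices := make([]Vertex, 20000)
	for i := range vertices {
		vertices[i] = Vertex{ID: fmt.Sprintf("v%d", i), Name: "node"}
	}
	requests := 0
	sent, err := loadInBatches(vertices, 100, func(batch []Vertex) error {
		requests++ // a real implementation would submit the batch here
		return nil
	})
	fmt.Println(sent, requests, err)
}
```

With 20,000 vertices and 100 per batch this comes to 200 requests. Stopping at the first failed batch keeps the count of loaded vertices known, which makes it easier to retry from that point with a smaller batch size.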
Tomcat 12mo ago
Thanks a lot for the answer.
Jim Idle 11mo ago
Should that traversal variable not be just g?
spmallette 11mo ago
By convention, g is the GraphTraversalSource. You get that from anonymously calling traversal(). When you do g.V() it produces a GraphTraversal object, so if you had a variable to refer to that and you had already used the g convention, you wouldn't assign that value to g. Typically, that variable is named traversal or just t, if the variable is defined at all. You usually don't see it because you just write your Gremlin from g all the way to termination (e.g. iterate()), but sometimes, like in this example, you're building a GraphTraversal in some dynamic way or passing it around to different functions, in which case a variable is needed.
Jim Idle 11mo ago
So, in Go I have been using: t := g.Traversal(). Then adding to t in loops for, say, batches of 20, rather than t := g.V() as in the above Python code. I presume then that either is OK?
spmallette 11mo ago
I'm not sure. What does the "g" in g.Traversal() refer to in your case? The documentation for Go has it as:
g := gremlingo.Traversal_().WithRemote(remote)
In that case g is a GraphTraversalSource, and it is from that (by convention) you would do:
t := g.V()
Jim Idle 11mo ago
Yes, that's what I meant. I get the traversal source and then use that.