Performance issue in large graphs

When performing changes in large graph (ca. 100K nodes, 500K edges) which is stored in one kryo file I am experiencing some huge delays. Just as an example, when writing initially I can change 10K nodes in minutes, but when the graph is big the same changes need more than one hour. Is there any easy solution possible, i.e., like breaking down and saving in smaller files etc. Any suggestion is helpful. Initial preference is saving in file system (local or network). Thanks for your suggestions/solutions.
Solution:
I'm not sure if any of the other serializations such as GraphML or GraphSON might perform better, but I would say this is likely not a common way we see those graphs used so we may not have too much data on which techniques may work best. With the exception of TinkerGraph which is often used as an in-memory, somewhat ephemeral, graph, we typically see persistent graph stores used where the data is persisted on disk by the database and you do not need to constantly reload the data each time. If y...
K
kelvinl281661d ago
Are you using TinkerGraph and saving the updates to file, or are you using some other graph database store? In general what you describe is actually quite a small graph, but I'm not sure what technology stack you are using.
T
Tanvir61d ago
Hi Kelvin, I am using tinkergraph. I also used Janusgraph - it was even slower there and then have to switch back to tinkergraph.
K
kelvinl281660d ago
How exactly are you using Kyro here? Are you serializing the whole graph out periodically and then reloading it?
T
Tanvir60d ago
I am reading the whole graph and then adding the the nodes.
Solution
K
kelvinl281660d ago
I'm not sure if any of the other serializations such as GraphML or GraphSON might perform better, but I would say this is likely not a common way we see those graphs used so we may not have too much data on which techniques may work best. With the exception of TinkerGraph which is often used as an in-memory, somewhat ephemeral, graph, we typically see persistent graph stores used where the data is persisted on disk by the database and you do not need to constantly reload the data each time. If you ever have the need to look at commercial graph database engines, then you will find tools like bulk loaders that make loading the data easier/faster. I wonder if @spmallette has any thoughts on this one?
T
Tanvir60d ago
Thanks a lot for the prompt response. Agree, a persistent store solution will help a lot. I think, that is possibly the solution I have to opt for long run.
Want results from more Discord servers?
Add your server
More Posts
Concurrent queries to authentication required sever resulted in 401 errorHey guys, playing around with gremlin & encountered this very odd error where concurrent queries wilDiscrepancy between console server id conventions and NeptuneSo I'm working with my test server and on Neptune--and I'm noticing a difference in the type of the how to connect the amothic/neptune container to the volume?I need to know which directory needs to attach to containeer. so that the data is stored safely. eveDocker yaml authentication settings (gremlinserver.authentication) questionDoes anyone have any experience setting up authentication on Docker by using the supplied .yaml fileGremlin Injection Attacks?Is anyone talking about or looking into attacks and mitigations for Gremlin Injection Attacks? That Returned vertex properties (JS client)Hi, I've got a question regarding the returned vertex value when using the JS client. How come non-aAnyone using Tinkerpop docker as a local Cosmos replacementRunning into some random issues. Looking for tips and tricks.Configuring Websockets connection to pass through a proxy serverHey, I'm working on making G.V() fully proxy aware, but I can't seem to get websockets connection tpython goblin vs spring-data-goblin for interactions with gremlin serverI want an OGM to interact with my gremlin server. What would be a good choice?Is there any open source version of data visualizer for aws neptune?Is there any open source version of data visualizer for aws neptune. I'll need it since it essentialDynamic select within query not working.Any insights or help would be greatly appreciated. I have to pass a list of lists in the format beAdding multiple properties to a vertex using gremlin-goHello Community, I have a question regarding how multiple properties can be added to a vertex using Is it possible to walk 2 different graphs using custom TraversalStrategy in Gremlin?I have 2 different graphs in 2 different Neptune cluster. Both of them can have few reference verticSideEffect a variable, Use it later after BarrierStep?I seek a query that builds a list and then needs to both sum the list's mapped values and divide theMemory issue on repeatI am traversing all nodes occuring in the same cluster given one of the nodes in that cluster. SurpWhich database should i use for my DJ set planning software?Hi, i want to develop a software that lets DJs plan a set (i.e. playlist) and i'm wondering if graphHow will i add unique values to the vertices or edge properties in NeptuneI can't get a doc regarding adding unique data through gremlin. Is there any way to do it, other thaNot getting result in hasId() but id().is() worksI don't get any response using g.V().hasId(48). But when i use g.V().id().is(48). it shows output. S