Apache TinkerPop•2y ago

The Cascading Coalescing - Create a V then Create an E in One Shot

I have been struggling with this, and I hope an expert can suggest how to approach this type of problem.

What I want to do:

1. Find a V.
2. If the V isn't found, create it. If it exists, move on to step 3.
3. Update some properties, whether the V is new or existing.
4. Find an outgoing edge from the V and, if none is found, create one.
5. Update some edge properties, whether the edge is new or existing.
6. Bonus: return the originally found or created V.

Why do I want this?

I have a highly concurrent image-processing pipeline in which concurrent external functions generate large numbers of images and their records. This hits Neptune quite hard, and I run into concurrent-update issues that result in rare but real dangling or duplicated Vs.

Here is my example of how I plan to do the double coalesce. I am sure the use of select is wrong, but how can I get Gremlin to remember the V that I have just found or created during the second coalesce step?


g.V().
     has("Test", "name", "test7").
     fold().
     coalesce(__.unfold(), __.addV("Test")).as("x").
     property("name", "test7").
     out().hasId("dac4de07-1371-a1f7-7409-ad28d75069a5").fold().
     coalesce(__.unfold(), __.select("x").addE("Link").to(__.V("dac4de07-1371-a1f7-7409-ad28d75069a5"))).toList()


8 Replies

ManabuBeachOP•2y ago

I gave GPT the following prompt:

Find a vertex with label of Test and property of test and if it is not found create it, and then check if outgoing edge of label belongsTo exists if it does not exist create an outgoing edge to the group with the tag of Root

and I think this is probably darn close to what I was asking for.

 g.V().
     hasLabel('Test').
     has('test', 'test').
     fold().
     coalesce(unfold(), addV('Test').property('test', 'test')).as('testVertex').
     coalesce(
       outE('belongsTo').hasLabel('belongsTo'),
       addE('belongsTo').to(V().hasLabel('Group').has('tag', 'Root'))).
     path().
     dedup().
     toList()

Interestingly, GPT thought to label the vertex with as('testVertex') but never uses the label. OK, so the issue was that I almost completely misunderstood the meaning of coalesce. I assumed all along that it meant (do-success-case, do-failure-case), and that was wrong. I re-read the documentation, which says:

The coalesce()-step evaluates the provided traversals in order and returns the first traversal that emits at least one element.

This is very clear about what it does. All the examples in the documentation pass two arguments, but I assume it can take more than two, as long as one of them emits something. Now I will see if I can do the same with merge.
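As an illustration of the quoted rule, here is a minimal sketch in plain Python. It is not Gremlin and not any TinkerPop API: `coalesce`, `find_nothing`, `find_upper`, and `create_default` are hypothetical names made up for this example. It only models the "first branch that emits at least one element wins" behavior, with more than two branches.

```python
from typing import Callable, Iterable, List

# A "branch" takes the incoming value and yields zero or more results.
Branch = Callable[[str], Iterable[str]]


def coalesce(traverser: str, *branches: Branch) -> List[str]:
    """Return the output of the first branch that emits at least one element."""
    for branch in branches:
        emitted = list(branch(traverser))
        if emitted:
            # Later branches are never called once one branch has emitted.
            return emitted
    return []  # no branch emitted anything


# Hypothetical branches, for illustration only.
def find_nothing(value: str) -> Iterable[str]:
    return []


def find_upper(value: str) -> Iterable[str]:
    return [value.upper()] if value.startswith("t") else []


def create_default(value: str) -> Iterable[str]:
    return ["created:" + value]


first = coalesce("test7", find_nothing, find_upper, create_default)
second = coalesce("other", find_nothing, find_upper, create_default)
print(first, second)
```

In this model the "create" branch only runs when every earlier "find" branch came back empty, which is why the pattern reads like an upsert even though coalesce itself has no notion of success or failure.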

spmallette•2y ago

seems like you solved your problem. for if-then like semantics you would prefer choose()

ManabuBeachOP•2y ago

Actually GPT-3-Turbo + GdotV did! Amazing, really. 😯 I tried to construct a choose-based upsert, but I am finding it harder to understand what traversal I get back or whether I can just pass the traversal to the next step. My thought pattern works this way, which is obviously wrong: g.V().choose(V().has("test", "name", "test1"), then just pass along, or create a new V).property.... Coalesce seems to work better. I also looked into merge, but it did not seem appropriate for the original question of cascading upserts. Anyway, the original problem is solved, so I will mark this as solved.

spmallette•2y ago

coalesce() is probably best for upsert. it's a well established pattern at this point. i only brought up choose() because you mentioned if-then semantics of "(do-success-case, do-failure-case)"

Solution

spmallette•2y ago

mergeV/E() should be the best approach for upsert that we have right now and should cover most coalesce() patterns that people do.

g.mergeV([name:'manabububu']).as('v'). // gets or creates a vertex with name="manabububu"
  property('x', 1). // adds x=1 irrespective of whether the vertex is created or not
  mergeE([(label):'link',(from):outV,(to): 'dac4de07-1371-a1f7-7409-ad28d75069a5']). // find edge from "v" to a vertex with the uuid specified
    option(outV, select('v')).
  property('y', 10). // adds y=10 irrespective of whether the edge is created or found
  select('v') // bonus - return the start vertex

it's a bit rough as it's untested, but the above is how you could approach that with mergeV() and mergeE(). i think it would be easier to read than the coalesce() version.
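To make the intended get-or-create behavior concrete, here is a rough single-threaded sketch in plain Python. It is an in-memory model, not the TinkerPop or Neptune API: `TinyGraph`, `merge_v`, `merge_e`, and `upsert_vertex_and_edge` are hypothetical names, and the model says nothing about how a real graph database behaves under concurrent writers. The step numbers in the comments refer to the six steps in the original question.

```python
from typing import Dict, Tuple


class TinyGraph:
    """Hypothetical in-memory stand-in for a graph; not a TinkerPop API."""

    def __init__(self) -> None:
        self.vertices: Dict[str, dict] = {}                # name -> properties
        self.edges: Dict[Tuple[str, str, str], dict] = {}  # (from, label, to) -> properties

    def merge_v(self, name: str) -> dict:
        # Return the vertex with this name, creating it if it is missing.
        return self.vertices.setdefault(name, {"name": name})

    def merge_e(self, src: str, label: str, dst: str) -> dict:
        # Return the edge identified by (src, label, dst), creating it if missing.
        return self.edges.setdefault((src, label, dst), {})


def upsert_vertex_and_edge(g: TinyGraph, name: str, target_id: str) -> dict:
    v = g.merge_v(name)                     # steps 1-2: find or create the vertex
    v["x"] = 1                              # step 3: set properties either way
    e = g.merge_e(name, "link", target_id)  # step 4: find or create the edge
    e["y"] = 10                             # step 5: set edge properties either way
    return v                                # step 6: return the start vertex


g = TinyGraph()
target = "dac4de07-1371-a1f7-7409-ad28d75069a5"
v1 = upsert_vertex_and_edge(g, "manabububu", target)
v2 = upsert_vertex_and_edge(g, "manabububu", target)  # second call creates nothing new
print(len(g.vertices), len(g.edges))
```

Running the upsert twice leaves one vertex and one edge, which is the idempotence the original question is after.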

3x1•2y ago

Just to add my 2 cents on this: I also have highly concurrent "create if not exists" event processing. In general, the easiest way to deal with this (without mergeV/mergeE at the time) was to handle the concurrency upfront using some partitioning (Kinesis data streams in my case). Then you can run multiple batch queries sequentially (read all vertices at once, write all missing edges at once, etc.). I'm guessing that with mergeE/mergeV it should now be possible to do batch processing for this kind of query, but coalesce was quite hard to use in this case compared to the simple g.V(...).has(..).... I have now.
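A rough sketch of the partitioning idea, in plain Python: events that share a key always land in the same partition, so a single worker per partition can process them sequentially and two writers never race to create the same vertex. Everything here is an assumption for illustration: `partition_for` and `group_by_partition` are hypothetical helpers, the event keys are made up, and SHA-256 is just a convenient stable hash, not a description of how Kinesis assigns records to shards.

```python
import hashlib
from collections import defaultdict
from typing import Dict, Iterable, List


def partition_for(key: str, num_partitions: int) -> int:
    """Map a key to a partition index using a stable (process-independent) hash."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


def group_by_partition(keys: Iterable[str], num_partitions: int) -> Dict[int, List[str]]:
    """Group event keys by partition, preserving arrival order within each group."""
    groups: Dict[int, List[str]] = defaultdict(list)
    for key in keys:
        groups[partition_for(key, num_partitions)].append(key)
    return dict(groups)


# Hypothetical event keys; repeated keys represent concurrent upserts of one vertex.
events = ["img-1", "img-2", "img-1", "img-3", "img-2"]
groups = group_by_partition(events, 4)
for partition, keys in sorted(groups.items()):
    print(partition, keys)
```

Each partition's list can then be turned into one batch of upserts and run sequentially by the worker that owns that partition.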

ManabuBeachOP•2y ago

A side note to this. Everyone on my team "accused me" of not wrapping a Gremlin query in a transaction, making statements like "I have never seen a (stupid) database like this in my career." I said "no need", as we can execute a query atomically even if it includes more than one kind of insert. (I hope I am still right, but tell me if I am wrong in saying this.) If what I just stated is correct, then you will need to be prepared to explain this to people whose frame of mind is the SQL tx(insert, join other table) mindset.

Arthur from gdotv•2y ago

Ha! That's a very elaborate query for gpt 3 to put together!
