Neptune - multiple labels

Hey, for some reason a few of our objects ended up with multiple labels. This was done via a huge script, and we couldn't find anything in it that would cause this. Now that the vertices have multiple labels, one of which is correct, is there a way to remove the other labels, or do we have to delete the vertices and recreate them? From what I could find, the only way to get multiple labels in Neptune is g.addV('label1::label2'), but I'm certain my script never did that. Is there another way this could happen? Please advise.
Solution
triggan (17mo ago)
Gremlin does not let you remove labels. However, since Neptune supports both Gremlin and openCypher on the same data, you can use openCypher to remove the unwanted label(s). For a vertex created with Gremlin using:
g.addV('test1::test2').property(id,'testV001')
You could remove the test2 label using:
MATCH (n)
WHERE id(n) IN ['testV001']
REMOVE n:test2
RETURN n
One possible way that you could end up with multiple labels is if you ran a bulk load job that had multiple rows with different labels per the same vertex ID:
~id, ~label
testV001, test1
testV001, test2
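To check whether a migration CSV could have produced this, a short script can scan the file for vertex IDs that appear with more than one label. This is only a sketch: the function name `find_multi_label_ids` is made up for illustration, and it assumes the file uses the `~id` and `~label` column headers shown above, with `;` separating multiple labels inside a single cell.

```python
import csv
import io
from collections import defaultdict


def find_multi_label_ids(csv_text):
    """Return {vertex_id: sorted labels} for IDs that carry more than one label.

    Hypothetical helper: assumes '~id' and '~label' column headers.
    """
    labels_by_id = defaultdict(set)
    # skipinitialspace tolerates the "id, label" spacing used in the rows above.
    reader = csv.DictReader(io.StringIO(csv_text), skipinitialspace=True)
    for row in reader:
        vertex_id = (row.get("~id") or "").strip()
        raw_labels = (row.get("~label") or "").strip()
        if not vertex_id:
            continue
        # Assumption: several labels in one cell are separated by ';'.
        for label in raw_labels.split(";"):
            label = label.strip()
            if label:
                labels_by_id[vertex_id].add(label)
    # Keep only the IDs that would end up with multiple labels.
    return {vid: sorted(labels) for vid, labels in labels_by_id.items() if len(labels) > 1}


sample = """~id, ~label
testV001, test1
testV001, test2
testV002, test1
"""
print(find_multi_label_ids(sample))
```

Running it over each migration file (read with `open(path).read()`) should list the IDs worth checking in the graph.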
triggan (17mo ago)
Or if you ran the following two queries, this would append the labels to the same vertex:
g.addV('test1::test2').property(id,'testV001')
g.addV('test3::test24').property(id,'testV001')
g.V('testV001').label()

returns:
['test2', 'test1', 'test3', 'test24']
Shush (17mo ago)
The bulk job is the most likely cause. I don't think any of our team members even know about the :: syntax for multiple labels, but we do use CSV files with IDs and labels for migrations, so that could have been it. I don't know what openCypher is. Can you run openCypher commands from the same Gremlin console you use for Gremlin requests? Forgot to say, thank you for the detailed response!
triggan (17mo ago)
openCypher is just a different query language for property graphs. You won't be able to use it from the Gremlin Console. You would have to either use a Neptune Notebook (https://docs.aws.amazon.com/neptune/latest/userguide/graph-notebooks.html) or issue the queries over HTTP (https://docs.aws.amazon.com/neptune/latest/userguide/access-graph-opencypher-queries.html).
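For the HTTP route, a rough Python sketch of the label-removal query from earlier in the thread might look like the following. It assumes IAM authentication is disabled on the cluster, that the client can reach the cluster's `/openCypher` endpoint on port 8182, and that the endpoint accepts form-encoded `query` and `parameters` fields. The helper names `build_remove_label_request` and `send`, and the host name in the usage comment, are made up for illustration.

```python
import json
import re
import urllib.parse
import urllib.request


def build_remove_label_request(endpoint, vertex_id, label, port=8182):
    """Build (but do not send) a POST that removes one label from one vertex.

    Hypothetical helper; assumes IAM auth is off and the default port.
    """
    # Labels cannot be supplied as query parameters, so the label is
    # validated before being interpolated into the query text.
    if not re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", label):
        raise ValueError(f"unexpected label: {label!r}")
    query = f"MATCH (n) WHERE id(n) = $id REMOVE n:{label} RETURN n"
    body = urllib.parse.urlencode(
        {"query": query, "parameters": json.dumps({"id": vertex_id})}
    ).encode("utf-8")
    return urllib.request.Request(
        f"https://{endpoint}:{port}/openCypher", data=body, method="POST"
    )


def send(request):
    """Send the request and return the decoded JSON response."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Usage (needs network access to the cluster; the host name is a placeholder):
# req = build_remove_label_request("your-neptune-endpoint", "testV001", "test2")
# print(send(req))
```

If IAM authentication is enabled, the request would additionally need SigV4 signing, which this sketch does not cover.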
Multiple labels is one of the few areas where Neptune deviates from native TinkerPop (https://docs.aws.amazon.com/neptune/latest/userguide/access-graph-gremlin-differences.html#feature-gremlin-differences-labels). But other Property Graph implementations do support multiple labels, hence the support for removing them with the openCypher query language. FWIW, there was some discussion a few years ago about supporting multiple labels within TinkerPop. Maybe worth surfacing this at some point: https://issues.apache.org/jira/browse/TINKERPOP-2226
Shush (17mo ago)
Thank you so much! This is good reading material