Setting index in gremlin-python

I was trying to create the vote graph from the tutorial on loading data, using gremlin-python. As far as I know, you can't simply add an index from non-JVM languages because, for example, there is no TinkerGraph that you could .open(). I don't know how much better performance is with an index on 'userId', but my code simply takes too long to go through the queries from the vote file.

I tried using client functionality

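from gremlin_python.driver.client import Client
from gremlin_python.driver.driver_remote_connection import DriverRemoteConnection
from gremlin_python.process.anonymous_traversal import traversal

ws_url = 'ws://localhost:8182/gremlin'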

# Create index on userId
client = Client(ws_url, 'g')
client.submit('graph = TinkerGraph.open()')
client.submit("graph.createIndex('userId', Vertex.class)")
client.close()

conn = DriverRemoteConnection(ws_url, 'g')
g = traversal().with_remote(conn)

to do it from a string query, and I'm not sure if with_remote(conn) uses the previously assigned graph. Let me know how to do it correctly. I'm not sure how to assign to from client.submit(...).

Additionally: how does one speed up those queries if setting an index won't do it?
In my implementation

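from gremlin_python.process.graph_traversal import GraphTraversalSource, __

def idToNode(g: GraphTraversalSource, id: str):
    # Get-or-create: return the existing 'user' vertex, or add it if missing.
    return g.V().has('user', 'userId', id) \
            .fold() \
            .coalesce(__.unfold(), 
                      __.add_v('user').property('userId', id)) \
            .next()

def loadVotes():
    with open("/tmp/wiki-Vote.txt", "r") as file:
        # Skip the four header lines at the top of the file.
        for _ in range(4):
            next(file)

        for line in file:
            # Strip the trailing newline so ids[1] is a clean id.
            ids = line.strip().split('\t')
            from_node = idToNode(g, ids[0])
            to_node = idToNode(g, ids[1])
            g.add_e('votesFor').from_(from_node).to(to_node).iterate()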

the call to idToNode for each line takes too long.
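Independently of any index, one thing that may help here is avoiding repeated round trips for the same user: every id appears on many lines, so a client-side cache means each vertex is looked up or created only once. The following is a minimal sketch, not a tested loader. The helper names parse_votes and make_cached_lookup are hypothetical, and it assumes the header lines of the vote file start with '#'.

```python
from typing import Callable, Dict, Iterable, Iterator, Tuple, TypeVar

V = TypeVar("V")


def parse_votes(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (from_id, to_id) pairs, skipping blank lines and '#' comments.

    Assumption: header lines start with '#', so no fixed line count is needed.
    """
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        from_id, to_id = line.split()[:2]
        yield from_id, to_id


def make_cached_lookup(lookup: Callable[[str], V]) -> Callable[[str], V]:
    """Wrap a get-or-create function so each id hits the server at most once."""
    cache: Dict[str, V] = {}

    def cached(user_id: str) -> V:
        if user_id not in cache:
            cache[user_id] = lookup(user_id)
        return cache[user_id]

    return cached


# Hypothetical usage with the functions from the post:
#   cached = make_cached_lookup(lambda uid: idToNode(g, uid))
#   with open("/tmp/wiki-Vote.txt") as file:
#       for from_id, to_id in parse_votes(file):
#           g.add_e('votesFor').from_(cached(from_id)).to(cached(to_id)).iterate()
```

This only removes duplicate lookups on the client side. Each edge is still one request, so an index on 'userId' should still matter for the lookups that remain.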

Solution

If you simply edit that line of code to create your index and load your vote data, then every time you start Gremlin Server it will have all of that set up and ready to go.
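If editing the server-side script is not convenient, it may also be possible to create the index from Python by submitting a script against the graph the server already holds, rather than opening a new TinkerGraph inside the script. The sketch below assumes the server accepts script requests and exposes its graph under the variable name graph, as the default TinkerGraph configuration does as far as I know. The helper names build_index_script and create_vertex_index are hypothetical.

```python
def build_index_script(key: str, graph_var: str = "graph") -> str:
    """Build the Groovy script that creates a vertex index on `key`.

    Assumption: `graph_var` names a graph that the server already binds.
    """
    return f"{graph_var}.createIndex('{key}', Vertex.class)"


def create_vertex_index(client, key: str, graph_var: str = "graph"):
    """Submit the index script and wait for the server's response.

    `client` is expected to behave like gremlin_python.driver.client.Client:
    submit() returns a result set whose all() returns a future.
    """
    return client.submit(build_index_script(key, graph_var)).all().result()


# Hypothetical usage against a local server:
#   from gremlin_python.driver.client import Client
#   client = Client('ws://localhost:8182/gremlin', 'g')
#   create_vertex_index(client, 'userId')
#   client.close()
```

The difference from the attempt in the question is that no new graph is opened in the script. A graph created by TinkerGraph.open() inside a submitted script is most likely not the graph that g traverses, so an index created on it would have no effect on the queries.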
