Fulltext-search-like features without ElasticSearch, OpenSearch, Solr and such?

I've read in multiple sources that Apache TinkerPop isn't optimized for text search operations like partial string matching or regex matching. A common "solution" seems to involve integrating the database with full-text search engines like ElasticSearch or Solr. Is there another way of handling these kinds of operations without adding another tool? I'm afraid this is getting way more complex than I wanted.

Just some context: what I'm trying to do is filter nodes by one of their properties called legal_name, something similar to the SQL query `SELECT * FROM customers WHERE legal_name LIKE '%John%'`. The query itself is of course more complex than that, but that step is what makes it perform really poorly.
Bo, 28d ago
There's an ongoing effort to add Couchbase (a storage engine that supports full-text search) to JanusGraph: https://github.com/JanusGraph/janusgraph/pull/4086
GitHub
4084: adds Couchbase as JanusGraph backend by chedim · Pull Request...
Issue #4084. This PR adds the Couchbase JanusGraph backend and search. The backend is in alpha stage and is not yet recommended for production use. All the dependencies for the backend are either already...
Gil, 27d ago
that would make it support fulltext search right out of the box?
Bo, 27d ago
seems so
triggan, 26d ago
TinkerPop, by itself, is a framework, so the provider that implements the framework would need to implement things such as text search indexing. That being said, TinkerPop 3.6 added a way to provide extensions in the form of call() steps:
https://tinkerpop.apache.org/docs/3.6.0/dev/provider/#_call
https://github.com/apache/tinkerpop/blob/d174572f3fa3d8ff01e628dab18493e13359a632/gremlin-core/src/main/java/org/apache/tinkerpop/gremlin/structure/service/Service.java#L41-L51
This likely gets overlooked, as the documentation for it is pretty light. There is, however, a reference implementation of a regex-based search that creates a "service" and uses the related call() step:
https://github.com/apache/tinkerpop/blob/d174572f3fa3d8ff01e628dab18493e13359a632/tinkergraph-gremlin/src/main/java/org/apache/tinkerpop/gremlin/tinkergraph/services/TinkerTextSearchFactory.java
You could use that as the basis for creating a service that makes a remote call to something like OpenSearch.
spmallette, 26d ago
I think some of the needs described in information published in previous years might now be satisfied by Gremlin's native regex support. This was added in 3.6.0 - https://tinkerpop.apache.org/docs/current/upgrade/#_textp_regex - and that feature might satisfy some text search use cases.
triggan, 26d ago
Yes. Maybe a need for some better examples for Service Registry.
Gil, 26d ago
I'm currently using TextP.Regex for my queries (as well as "startingWith", "containing", etc.), but it absolutely kills the performance of the query, and this seems to be one of the most common reasons why people integrate it with something like ElasticSearch. Well, this still leads to me having to integrate my DB with another tool, which is exactly what I was trying to avoid.
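As a rough illustration of why "contains"-style predicates tend to be slow without a text index, here is a minimal plain-Python sketch. It does not use Gremlin or any graph provider; the sample names and both helper functions are hypothetical. It only contrasts a substring match, which has to inspect every value, with a prefix match, which can binary-search a sorted list. Whether a given graph provider can actually index prefix lookups this way is an assumption and depends on the provider.

```python
import bisect

# Hypothetical sample data standing in for legal_name property values.
names = sorted(["Acme Ltd", "John Deere", "Johnson & Co", "Saint John Bank", "Zeta Inc"])


def contains_scan(values, needle):
    """Substring match: every value is inspected, so cost grows with the data size."""
    return [v for v in values if needle in v]


def prefix_lookup(sorted_values, prefix):
    """Prefix match on a sorted list: binary search jumps to the first candidate."""
    start = bisect.bisect_left(sorted_values, prefix)
    matches = []
    for value in sorted_values[start:]:
        if not value.startswith(prefix):
            break  # sorted order means no later value can match
        matches.append(value)
    return matches


print(contains_scan(names, "John"))
print(prefix_lookup(names, "John"))
```

The substring version finds "Saint John Bank" as well, but only because it looked at every entry; that full scan is the cost a dedicated text index is meant to avoid.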
dmcmanus, 20d ago
I was thinking about this recently as well, for attempting to implement a fuzzy search on a name. I haven't fully fleshed out exactly how it would work (or if it would work at all), but essentially I wondered if it could be accomplished by creating a separate vertex for each letter in the legal_name, with a CONTAINS_LETTER edge (and maybe a positional property on the edge?). My thought was that given an input string (a name, in this example), you could use a repeat() step until() some pre-defined match criteria were met.
*Edit: I'm a relative Gremlin newbie, so forgive me if that makes zero sense!
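The letter-vertex idea can be sketched outside of Gremlin to see whether the matching logic holds up. The following is a minimal plain-Python sketch, not a Gremlin traversal: a dictionary stands in for the letter vertices, and each (name, position) pair stands in for a CONTAINS_LETTER edge with a positional property. The function names and sample data are hypothetical, and it does exact substring matching only, not fuzzy matching.

```python
from collections import defaultdict


def build_letter_index(names):
    """Map each letter to (name, position) pairs, mimicking CONTAINS_LETTER
    edges that carry a positional property."""
    index = defaultdict(set)
    for name in names:
        for pos, ch in enumerate(name.lower()):
            index[ch].add((name, pos))
    return index


def substring_search(index, query):
    """Return the names containing query by checking letters position by position."""
    query = query.lower()
    if not query:
        return set()
    # Start from every (name, position) where the first letter of the query occurs.
    candidates = set(index.get(query[0], set()))
    for offset, ch in enumerate(query[1:], start=1):
        edges = index.get(ch, set())
        # Keep a candidate only if the next letter sits at the expected position.
        candidates = {(name, start) for name, start in candidates
                      if (name, start + offset) in edges}
        if not candidates:
            break
    return {name for name, _ in candidates}


letter_index = build_letter_index(["John Smith", "Joan Jett", "Elton John"])
print(substring_search(letter_index, "john"))
```

Note that this stores one entry per character of every name, so the index grows with the total length of all indexed values.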
spmallette, 19d ago
the fun thing about graphs is that as soon as you start learning more about them, you start seeing how many problems can be put into a graph context. theoretically, i think you could model search the way you describe, but it does create a lot of extra infrastructure in your graph which might have performance/space/administrative implications.
Bo, 17d ago
> the fun thing about graphs is that as soon as you start learning more about them, you start seeing how many problems can be put into a graph context.
Haha, cannot agree more. Everything that sits in the old RDBMS space can be reinvented in the graph universe.
spmallette, 17d ago
i've even infected my children. they see stuff randomly in the world and are like, "whoa, that's a graph"