niuzj
Apache TinkerPop
Created by niuzj on 5/21/2025 in #questions
Seeking Help: Building a Text-to-Gremlin Corpus Generator - AST Parsing
Hey everyone, I'm working on fine-tuning a large language model for text-to-Gremlin generation. To do this, I need a substantial dataset of natural language queries paired with their corresponding Gremlin queries, and I'm currently building a corpus generator for it.

I've seen some work on text-to-Cypher where they parsed the Cypher AST (Abstract Syntax Tree). However, the ASTs for Cypher and Gremlin are quite different. Does anyone have suggestions on how to tackle this? Specifically:

* Are there any existing tools for parsing Gremlin ASTs?
* Alternatively, are there any methods to build such a corpus generator without relying on AST parsing?

Any help or ideas would be greatly appreciated! Thanks!
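To make the second bullet concrete, here is a rough sketch of what a generator that avoids AST parsing could look like: natural-language templates are paired with Gremlin templates and both are filled from a graph schema. Everything in it is an assumption for illustration, not an existing tool: the toy `SCHEMA`, the `TEMPLATES` list, and the `generate_pairs` helper are all hypothetical names.

```python
# Hypothetical toy schema: vertex labels mapped to their property keys.
SCHEMA = {
    "person": ["name", "age"],
    "software": ["name", "lang"],
}

# Hypothetical templates: each pairs a natural-language pattern
# with the Gremlin pattern that answers it.
TEMPLATES = [
    ("List all {label} vertices.",
     "g.V().hasLabel('{label}')"),
    ("How many {label} vertices are there?",
     "g.V().hasLabel('{label}').count()"),
    ("Show the {prop} of every {label}.",
     "g.V().hasLabel('{label}').values('{prop}')"),
]


def generate_pairs(schema, templates):
    """Fill every template with every matching schema element.

    Returns a list of (natural_language, gremlin) string pairs.
    """
    pairs = []
    for label, props in schema.items():
        for text_tpl, gremlin_tpl in templates:
            if "{prop}" in text_tpl:
                # Property-level template: one pair per property key.
                for prop in props:
                    pairs.append((
                        text_tpl.format(label=label, prop=prop),
                        gremlin_tpl.format(label=label, prop=prop),
                    ))
            else:
                # Label-level template: one pair per vertex label.
                pairs.append((
                    text_tpl.format(label=label),
                    gremlin_tpl.format(label=label),
                ))
    return pairs


if __name__ == "__main__":
    for question, query in generate_pairs(SCHEMA, TEMPLATES):
        print(f"{question}\t{query}")
```

Because the text and the query come from the same template, each pair is consistent by construction. The trade-off is limited linguistic variety, so the natural-language side would probably need paraphrasing afterwards.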
34 replies