Natural language to sql converter

Hey guys, i’m currently building an ai app that has an llm to write a sql query which i can then used to fetch the data from a postresql database. My current approach is that i pass the llm with the table schema as well as some sample data (top 5 data in the table) for it to have some context. However, the issue im currently facing is that the data the llm tries to query from the table may not actually exists or named differently. For example, suppose im querying a table for car spare parts. The llm writes a sql query like SELECT * FROM spare_parts WHERE part_name = ‘Bumper’. However, bumper is stored as Fender in the table. This messes up the data retrieval. Is my current approach good enough or should i consider switching to using embeddings ?
5 Replies
Ani
Ani2mo ago
is the point of the app to have AI write the queries or do you just need to get data from natural langauge?
/Users/Dragon
/Users/Dragon2mo ago
Making an AI write SQL queries for you is a disaster waiting to happen.
Wrath
Wrath2mo ago
You are probably better off just having the llm parse the input into a structured output that can be validated and passed into some sort of query builder
Rivenris
Rivenris2mo ago
To reiterate what's above - there is a reason we don't write sql most of the times: we don't trust ourselves, let alone LLM. But, because it's a fun side-project, I'll chip in to actual solution: - You could encourage llm in system prompt to write query that checks for multiple terms whenever it sees opportunity for it. In those case give examples, so that llm knows what you talk about SELECT * FROM spare_parts WHERE part_name IN ("Bumper", "Fender") - You can give the llm tool that will allow it to check for available values in given table/column. With proper description, LLM will check for avaliable values, before it'll decide which to search for - You can even combine both of the above solutions. As mentioned by other people, I wouldn't do that in anything near production data or in any production app. But I would do it as hobby project on my own. In final approach, much better way would be to prepare mcp server that has all the tools and queries implemented so that llm can use them, but limited only to whatever is exposed.
/Users/Dragon
/Users/Dragon5w ago
btw Oracle "AI" Database is doing this using MCP

Did you find this page helpful?