---
type: WebPage
title: Project 24:00:00 — Why Text-to-SQL fails?
description: ""
resource: https://theshresthshukla.medium.com/9be171b057e4?sk=4c202423c5592d2c55b9c7b41457fe42
tags: []
timestamp: 2026-06-23T22:02:14.115806Z
---

URL Source: https://theshresthshukla.medium.com/9be171b057e4?sk=4c202423c5592d2c55b9c7b41457fe42

Published Time: 2026-02-27T16:58:13Z

Markdown Content:
## Project 24:00:00 — Why Text-to-SQL fails? How to think about Enterprise Grade Text to SQL Data Agent?

## soon to be the only text to SQL guide you’ll ever need!

[![Image 1: Shresth Shukla](https://miro.medium.com/v2/resize:fill:32:32/1*vtWP_kJMPsPADVUX9mn2_w.jpeg)](https://theshresthshukla.medium.com/?source=post_page---byline--9be171b057e4---------------------------------------)


Feb 27, 2026

> Note: Stuck behind the Medium paywall? [Click here](https://theshresthshukla.medium.com/9be171b057e4?sk=4c202423c5592d2c55b9c7b41457fe42) to read this blog for free.

Hi everyone, this is going to be a long series of posts and a detailed one. So gear up. Welcome to UselessAI.in, and today we will be exploring all the buzz around Text-to-SQL or maybe data agents — what we call them to look cool among enterprises.

In the last few months, especially in 2025, we observed some amazing developments in advanced language models, and they were capable enough to answer anything that you need in general. And as these models grew stronger, we saw some adoption of these models in specific use cases. Code generation being one of them. Enterprises were more interested in something that they can use and, most importantly, serve to their clients as an AI solution :)

I am thinking of combining every approach available on the internet for building text-to-SQL systems, to figure out what actually fixes the problem, and building a directory on [SmilingNeuron](http://smilingneuron.com/). If you have been working in this space already, you know that the older methods do not perform well on enterprise-grade problems, largely because the benchmarks used to be simpler and the real world was the exact opposite. (Talking about Spider 1.0 here, btw.)

And you know, 60% or 70% accuracy is not something that enterprises would be open to accepting easily. Especially when you already have the data and a person, probably a data analyst or data engineer, who can write queries for you xd. I mean, in some sense, the enterprise would want accuracy over speed. But now with AI, that is shifting. Now we can have both accuracy and speed.

Text to SQL is growing rapidly, just as LLMs are getting smarter, but there’s a fundamental shift in how we think about building a data agent that works at any scale, with any number of tables, but with better results. So Project 24:00:00 is that kickstart where we will be going through the entire journey of learning from what’s already been done in the past, what has been ranked, and then build something towards the end of the series. We’ll start with why systems actually fail, because that’s what will help you design it better. Let’s go.

![Image 2](https://miro.medium.com/v2/resize:fit:201/0*8Q0XuqcZAh--0Z3a)

i love this post.

## Why Text to SQL systems fail in enterprises?

Let's understand the problem first. Text to SQL is not new. It has been in development for decades. Yes, even before ChatGPT xd. But with LLMs, a breakthrough came. Around 2023, people started experimenting with generating SQL queries via LLMs, and it worked to some extent.

The fundamental concept for using any LLM is to share as much context as possible to get the correct answer. We simply used to give all the details to the model via the prompt, and it used to generate queries. Sometimes we put the entire database schema in a single prompt to get the correct answers. And I hope you see the issue here.

Demo videos and solutions are only good for some time. When you try to implement the same thing in an enterprise, you suddenly find a new problem. You cannot share everything with the model. Enterprises can have more than 500 tables. You cannot put everything in one prompt. It'll be beyond the context limit of the model. But note that this is not the only issue. Real-world databases are complex. They are domain-specific. Some things can be interpreted easily, but some tables and columns are hard to interpret without domain knowledge.

In fact, if I talk about writing prompts in general, even if you think about sending the full schema to the model to query, math will not allow you to do so.

For example, consider 500 tables. Each has 18 columns on average, which makes a total of 9,000 columns. Now even if each column, with its name and a few other details, takes around 20 tokens, that is roughly 180,000 tokens for the schema alone, which exceeds the context length of many of the models that you'll be using. And that is without sending instructions, questions, examples, reasoning, etc. And even if you manage to fit it in, research points to another problem, which is LLM attention. Models tend to focus more on the start and end of long prompts and may miss details in the middle.
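Here is the same back-of-the-envelope math as a tiny script. It is only a rough sketch: the 20 tokens per column is the guess from above, and the context window sizes and the `reserved` budget for instructions and examples are numbers I picked for illustration, not the limits of any specific model.

```python
# Rough token budget for sending a full schema in one prompt.
# All figures are illustrative guesses, not measurements.
TABLES = 500
AVG_COLUMNS_PER_TABLE = 18
TOKENS_PER_COLUMN = 20  # name, type and a short description (assumed)


def schema_tokens(tables, avg_columns, tokens_per_column):
    """Estimate the tokens needed to describe every column once."""
    return tables * avg_columns * tokens_per_column


def fits(context_window, schema, reserved=4_000):
    """True if the schema plus a reserved budget fits in the window.

    `reserved` stands in for instructions, the question and examples.
    """
    return schema + reserved <= context_window


total_columns = TABLES * AVG_COLUMNS_PER_TABLE
total_tokens = schema_tokens(TABLES, AVG_COLUMNS_PER_TABLE, TOKENS_PER_COLUMN)

print(f"columns: {total_columns:,}")
print(f"schema tokens: {total_tokens:,}")

# Hypothetical window sizes, just to show the comparison.
for window in (8_000, 32_000, 128_000):
    print(f"{window:>7,} token window fits the schema: {fits(window, total_tokens)}")
```

Even the largest window in this example is too small, and we have not added a single instruction or example yet.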

> So, one thing is for sure: the problem exists. And you know it. But is this the only problem in enterprises while building a data agent or text-to-SQL system? Probably not!

Context that we need to send to LLMs via prompt needs detailed information about the business and the functional side of the data. For example, abbreviations may differ in different domains and sectors. Column names may have the same/different interpretations in different tables, etc.

Here are a few of the common challenges you’d face when working with enterprises in general —

1.   **_Lack of documentation_**: We take it lightly, and most companies still do not have documentation that explains what the data is all about. But nowadays, we need this information. Especially when we have AI tools that can easily understand the content in text files and when we have to pass that content to LLMs. We need column descriptions. We need to know the primary keys and the relationships between tables.
2.   **_Incomplete or Outdated Documentation_**: Databases change with time. Schemas change. New columns are added, and a few columns get dropped. Documentation stays the same. No one updates documentation often. And this is one of the biggest challenges you'll find in enterprises.
3.   **_Unclear data formats_**: Take a simple example. We know that age is generally numeric. What if the data type of that column is string and some inputs are in text like 'eighteen' instead of 18? You never know. What if they enter DOB instead of the age? What if they put a name there? Ideally it wouldn't happen, but the point is, you never know what data format is being used at the base level, and it is hard to design your process around this type of data.
4.   **_Fields with similar meaning_**: Imagine two or more columns with similar naming conventions but different meanings. For example, date or name. If you have columns like c_date, s_date, i_date in one table, how would you know which date to refer to? What do these columns actually represent? So there is ambiguity.
5.   **_Complex joins_**: When you have hundreds of tables with multiple join paths, the model is likely to fail. For example, if you have a query that involves writing 5 joins to get a result, there's a high chance that the model will get it wrong. Writing correct join expressions is difficult even for humans with business understanding, so no doubt the model will struggle as well.
6.   **_Business Metrics_**: Let's assume that your enterprise uses a certain formula to calculate a metric based on your data. The model has no clue about it, and if the formula is complex, a model without an understanding of the business process would get even simple calculations wrong.
7.   **_Actual Data_**: What if you have some weird values inside your data that make the table confusing? For example, does your business put default values in some of the tables? The model might misinterpret them as user input. There's a possibility that tables with millions of rows would have only a few hundred distinct values, out of which tens of values are not user-specific but business-specific. So that is one more thing to watch for.
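To make points 3 and 7 a bit more concrete, here is a minimal sketch of profiling a single column before you trust it. It uses an in-memory SQLite database, and the `customers` table, its `age` column and the `-1` placeholder are all made up for illustration. The idea is simply to look at what is really stored: which values are not numeric, and which values are suspiciously common.

```python
import sqlite3


def profile_column(conn, table, column, top_n=3):
    """Summarise one column: size, distinct values, non-numeric entries, top values."""
    # Identifiers are interpolated directly, so only pass trusted names here.
    cur = conn.cursor()
    total = cur.execute(f'SELECT COUNT(*) FROM "{table}"').fetchone()[0]
    distinct = cur.execute(
        f'SELECT COUNT(DISTINCT "{column}") FROM "{table}"'
    ).fetchone()[0]
    top_values = cur.execute(
        f'SELECT "{column}", COUNT(*) AS n FROM "{table}" '
        f'GROUP BY "{column}" ORDER BY n DESC, "{column}" LIMIT ?',
        (top_n,),
    ).fetchall()

    # Collect every distinct value that cannot be read as an integer.
    non_numeric = set()
    for (value,) in cur.execute(f'SELECT DISTINCT "{column}" FROM "{table}"'):
        if value is None:
            continue
        try:
            int(str(value).strip())
        except ValueError:
            non_numeric.add(value)

    return {
        "total_rows": total,
        "distinct_values": distinct,
        "non_numeric": sorted(non_numeric),
        "top_values": top_values,
    }


# Made-up data: an "age" column stored as text, with a -1 placeholder.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, age TEXT)")
ages = ["18", "42", "eighteen", "1999-04-01", "-1", "-1", "-1"]
conn.executemany("INSERT INTO customers (age) VALUES (?)", [(a,) for a in ages])

profile = profile_column(conn, "customers", "age")
print(profile)
```

In this toy data the profile flags `eighteen` and the date as non-numeric, and shows `-1` as the most frequent value, which is a hint that it may be a business default rather than real input. Whether it really is a default is something only the business can confirm.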

And many more such problems. I was reading a paper and found a few of the above examples there which were relevant for this blog. Hope you now understand the problem that we are trying to solve.

Can you think of a solution at this stage? One that might be coming to your mind is this: filter down to only a few relevant tables and send just those as context in the prompt. This reduces the length of the prompt and leaves room to add business context. And that is perfectly right, because that's what most researchers did. They thought of the exact same solution: find only the relevant tables, and sometimes the relevant columns, to send as context. And it works.
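To show the shape of this idea, here is a deliberately naive sketch that picks tables by counting shared words between the question and a short description of each table. This is my own stand-in, not what any particular paper does; real systems usually use embeddings or an LLM for this step. The table names and descriptions are hypothetical.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into a set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def select_tables(question, table_docs, top_k=2):
    """Return up to top_k table names ranked by word overlap with the question."""
    question_words = tokenize(question)
    scored = []
    for name, doc in table_docs.items():
        overlap = len(question_words & tokenize(name + " " + doc))
        if overlap:  # skip tables that share no words at all
            scored.append((overlap, name))
    # Highest overlap first; ties are broken alphabetically for stable output.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored[:top_k]]


# Hypothetical one-line descriptions for a handful of tables.
table_docs = {
    "orders": "customer orders with order date, total amount and customer id",
    "customers": "customer master data: name, region, signup date",
    "inventory": "warehouse stock levels for each product",
    "employees": "employee records, department and salary",
}

selected = select_tables("total order amount per customer region", table_docs)
print(selected)
```

Only the selected tables then go into the prompt. Notice that this only works because each table has a description to match against, which brings us right back to the documentation problem from the list above.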

But this approach has more complexity than we can think of. If we had to design a solution based on the current understanding of the problem, here's something that I would do.

I'll try to use an LLM to find the relevant tables, and then send those tables into a second prompt. And that's a valid approach. Maybe we can also think more like a data analyst and break the problem into logical steps: do this, then do that, then join with this table, add this filter, etc. Similar to how we approach a problem ourselves, right?
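In code, the steps from this paragraph could be wired together roughly like this. It is a sketch under assumptions: `llm` is any function that takes a prompt and returns text, the prompts are placeholders I made up, and `fake_llm` returns canned replies so the example runs without a real model.

```python
from typing import Callable, Dict, List

# Hypothetical interface: any function that maps a prompt to a text reply.
LLM = Callable[[str], str]


def describe(schema: Dict[str, List[str]], tables: List[str]) -> str:
    """Render the chosen tables as 'table(col1, col2)' lines."""
    return "\n".join(f"{t}({', '.join(schema[t])})" for t in tables)


def pick_tables(question: str, schema: Dict[str, List[str]], llm: LLM) -> List[str]:
    prompt = (
        "TASK: pick tables\n"
        f"Question: {question}\n"
        f"Available tables: {', '.join(schema)}\n"
        "Reply with a comma-separated list of table names."
    )
    names = [n.strip() for n in llm(prompt).split(",")]
    # Keep only names that exist, so a hallucinated table is dropped.
    return [n for n in names if n in schema]


def plan_steps(question: str, tables: List[str], schema: Dict[str, List[str]], llm: LLM) -> List[str]:
    prompt = (
        "TASK: plan steps\n"
        f"Question: {question}\n"
        f"Schema:\n{describe(schema, tables)}\n"
        "List the logical steps, one per line."
    )
    return [s.strip() for s in llm(prompt).splitlines() if s.strip()]


def write_sql(question: str, steps: List[str], tables: List[str], schema: Dict[str, List[str]], llm: LLM) -> str:
    prompt = (
        "TASK: write sql\n"
        f"Question: {question}\n"
        f"Schema:\n{describe(schema, tables)}\n"
        "Steps:\n" + "\n".join(steps) + "\n"
        "Reply with a single SQL query."
    )
    return llm(prompt).strip()


def answer(question: str, schema: Dict[str, List[str]], llm: LLM) -> dict:
    """Run the three stages in order and return every intermediate result."""
    tables = pick_tables(question, schema, llm)
    steps = plan_steps(question, tables, schema, llm)
    sql = write_sql(question, steps, tables, schema, llm)
    return {"tables": tables, "steps": steps, "sql": sql}


def fake_llm(prompt: str) -> str:
    """Canned replies so the sketch runs without any model or API key."""
    if prompt.startswith("TASK: pick tables"):
        return "orders, customers, made_up_table"
    if prompt.startswith("TASK: plan steps"):
        return "1. join orders to customers on customer_id\n2. sum amount per region"
    return (
        "SELECT c.region, SUM(o.amount) FROM orders o "
        "JOIN customers c ON o.customer_id = c.id GROUP BY c.region"
    )


schema = {
    "orders": ["id", "customer_id", "amount"],
    "customers": ["id", "region"],
    "inventory": ["product_id", "stock"],
}
result = answer("total order amount per region", schema, fake_llm)
print(result)
```

Keeping each stage as its own function means you can inspect the intermediate output, for example to see that `made_up_table` was filtered out before it ever reached the SQL prompt.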

Then the flow of the solution would look like this —


![Image 3](https://miro.medium.com/v2/resize:fit:700/1*ctquM1YXhSMnHBTiFclYJg.png)

By now, I hope you at least know the problems. We have two major ones. The first is the context length limit, which means we cannot send the full schema to the model. The second is that we need to share business context with it, which includes domain-specific keywords, any default values, column interpretations, and metadata that defines what these columns store and what they mean.
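One way to picture that second problem is as a small metadata structure that sits next to the schema and gets rendered into the prompt. The sketch below is an assumption about how you might organise it, not a standard format. The class names, the `shipments` table, the meanings I gave to `c_date`, `s_date` and `i_date`, and the glossary entry are all invented for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ColumnInfo:
    name: str
    dtype: str
    description: str
    # Values the business inserts itself (placeholders, defaults), if any.
    default_values: List[str] = field(default_factory=list)


@dataclass
class TableInfo:
    name: str
    description: str
    columns: List[ColumnInfo]
    primary_key: Optional[str] = None


def render_context(tables: List[TableInfo], glossary: Dict[str, str]) -> str:
    """Turn table metadata and a glossary into plain text for a prompt."""
    lines = []
    for table in tables:
        lines.append(f"Table {table.name}: {table.description}")
        if table.primary_key:
            lines.append(f"  primary key: {table.primary_key}")
        for col in table.columns:
            line = f"  - {col.name} ({col.dtype}): {col.description}"
            if col.default_values:
                line += f" [placeholder values: {', '.join(col.default_values)}]"
            lines.append(line)
    if glossary:
        lines.append("Glossary:")
        for term, meaning in sorted(glossary.items()):
            lines.append(f"  {term} = {meaning}")
    return "\n".join(lines)


# Invented example; the column meanings are guesses for illustration only.
shipments = TableInfo(
    name="shipments",
    description="one row per shipment",
    primary_key="shipment_id",
    columns=[
        ColumnInfo("shipment_id", "integer", "unique shipment identifier"),
        ColumnInfo("c_date", "date", "date the order was created"),
        ColumnInfo("s_date", "date", "date the order was shipped"),
        ColumnInfo("i_date", "date", "date the invoice was issued"),
        ColumnInfo("age", "text", "customer age at order time", ["-1"]),
    ],
)
glossary = {"OTD": "on-time delivery, shipped within 2 days of creation"}

context = render_context([shipments], glossary)
print(context)
```

The point is not this exact layout. The point is that column meanings, placeholder values and business terms have to live somewhere outside the model, so they can be maintained and then injected for the tables that were selected.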

The fact that I loved the most while researching this problem is that LLMs come at the very last stage of it. It's more of a systems problem than a model-capability problem. Obviously better models perform better, but if you have not designed the system properly, no matter which model you choose, you'll not get good results. LLMs are the last part of the problem when solving text-to-SQL for enterprises. Ah, btw, I hope your model is not like this lol:


![Image 4](https://miro.medium.com/v2/resize:fit:700/0*RiKTk0vMx43I5MQd.png)

thanKS

In the next blog of this series, we will start looking at potential solutions in order to build these systems, moving one step at a time. If you do not have patience, maybe I’ll write one more blog to at least design a framework or working architecture for an end‑to‑end solution that we can build. We’ll see how we can create multiple agents in our solution which have specific purposes and solve specific problems.

Till then, see ya. If you have something to share and want me to talk about it, just DM me on [LinkedIn](http://linkedin.com/in/shresthshuklaji). I'll be happy to help and spread the word. If you write blogs and want to contribute, [uselessai.in](http://uselessai.in/) is always open to contributions. Cheers.
