Creating an Agentic RAG for Text-to-SQL Applications

Agentic RAG simplifies text-to-SQL by modularizing tasks into tools like query transformation, hybrid search, and re-ranking, ensuring accuracy and scalability.

By Arjun Bali · Feb. 18, 25 · Tutorial

The blend of retrieval-augmented generation (RAG) and generative AI models has changed natural language processing by improving how queries are answered. Agentic RAG enhances the conventional reliance on a single monolithic model by introducing modularity and autonomy: by breaking the problem-solving process into tools integrated within an agent, it delivers gains in accuracy, transparency, scalability, and debuggability.

The Vision Behind Agentic RAG for Text-to-SQL

Traditional RAG systems retrieve relevant documents and rely on a single monolithic model to generate responses. Although this works well in many cases, it is less effective for structured outputs such as generated SQL. This is where we can leverage the Agentic RAG framework, in which we:

  1. Divide the tasks into smaller, more manageable tools within an agent
  2. Improve accuracy by assigning tasks to specialized tools
  3. Enhance transparency by tracing the reasoning and workflow of each tool
  4. Simplify scaling and debugging through modular design

Let's walk through how this system works and the role each component plays in transforming user questions into accurate SQL queries.

Architecture Overview

The structure comprises an agent utilizing tools within the text-to-SQL workflow. The process can be summarized as follows:

User Query → Query Transformation Tool → Few-Shot Prompting Tool → Hybrid Search Tool → Re-Ranking Tool → Table Retrieval Tool → Prompt Building Tool → LLM Execution Tool → SQL Execution Tool → Final Output
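Before diving into each tool, here is a rough sketch of how the flow chains together in plain Python. Function names mirror the tool implementations below; build_prompt and run_sql are illustrative placeholders rather than definitions from this article, and llm is the model instance created later.

Python
 
def answer_question(user_query):
    refined = transform_query(user_query)                # 1. rewrite the raw query
    example_sql = find_similar_question(refined)         # 2. fetch a few-shot example
    tables = hybrid_search(refined)                      # 3. shortlist candidate tables
    ranked = re_rank_context(refined, tables)            # 4. re-rank tables by relevance
    prompt = build_prompt(refined, ranked, example_sql)  # 5. assemble the LLM prompt (sketched later)
    sql = llm.predict(prompt)                            # 6. generate SQL with the LLM
    return run_sql(sql)                                  # 7. execute the SQL (hypothetical helper)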

1. User Query Transformation Tool

This tool preprocesses the user query so that the LLM can interpret it correctly. It resolves ambiguities, rephrases user questions, expands abbreviations into their full forms, and adds context when necessary.

Enhancements

  • Handle temporal references. Map terms like "as of today" or "till now" to explicit dates.
  • Replace ambiguous words. For example, "recent" could be replaced by "last 7 days."
  • Expand shorthand and abbreviations into their full forms.

Example

Input: "Show recent sales MTD."

Transformed query: "Retrieve sales data for the last 7 days (Month to Date)."

Python
 
from datetime import date, timedelta
from langchain.agents import Tool

def transform_query(user_query):
    # Resolve open-ended temporal references to explicit dates
    today = date.today()
    transformations = {
        "as of today": f"up to {today}",
        "till now": f"up to {today}",
        "recent": "last 7 days",
        "last week": f"from {today - timedelta(days=7)} to {today}",
    }
    for key, value in transformations.items():
        user_query = user_query.replace(key, value)

    # Expand common abbreviations into their full forms
    abbreviations = {
        "MTD": "Month to Date",
        "YTD": "Year to Date",
    }
    for abbr, full_form in abbreviations.items():
        user_query = user_query.replace(abbr, full_form)

    return user_query

query_transform_tool = Tool(
    name="Query Transformer",
    func=transform_query,
    description="Refines user queries for clarity and specificity; handles abbreviations and open-ended terms."
)
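Note that these literal string replacements produce a blunter rewrite than the idealized example above:

Python
 
print(transform_query("Show recent sales MTD."))
# -> "Show last 7 days sales Month to Date."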


2. Few-Shot Prompting Tool

This tool calls the LLM to match the user's question against a set of predefined question types (template matching). The matched question's example SQL is then used to enrich the prompt.

Example Workflow

1. Input question: "Show me total sales by product for the last 7 days."

2. Predefined templates:

  • "Show sales grouped by region." → Example SQL; SELECT region, SUM(sales) ...
  • "Show total sales by product." → Example SQL; SELECT product_name, SUM(sales) ...

3. Most similar question: "Show total sales by product."

4. Output example SQL: SELECT product_name, SUM(sales) FROM ...

Python
 
from langchain.chat_models import ChatOpenAI

llm = ChatOpenAI(model="gpt-4")

predefined_examples = {
    "Show sales grouped by region": "SELECT region, SUM(sales) FROM sales_data GROUP BY region;",
    "Show total sales by product": "SELECT product_name, SUM(sales) FROM sales_data GROUP BY product_name;",
}

def find_similar_question(user_query):
    # Ask the LLM which predefined question type the query matches best
    prompt = "Find the most similar question type for the following user query:\n"
    prompt += f"User Query: {user_query}\n\nOptions:\n"
    for example in predefined_examples.keys():
        prompt += f"- {example}\n"
    prompt += "\nRespond with the closest match."

    most_similar = llm.predict(prompt).strip()
    return predefined_examples.get(most_similar, "")

few_shot_tool = Tool(
    name="Few-Shot Prompting",
    func=find_similar_question,
    description="Finds the most similar question type using an additional LLM call and retrieves the corresponding example SQL."
)


3. Hybrid Search Tool

For robust retrieval, this tool combines semantic search, BM25 keyword search, and keyword-to-table mapping. The results from these methods are merged using reciprocal rank fusion.

How does it all come together?

Keyword Table Mapping

This approach maps the tables to the keywords that are contained in the query. For example:

  • The presence of "sales" results in the sales table being shortlisted.
  • The presence of "product" results in the products table being shortlisted. 

Keyword Overlap Mapping (BM25)

This is a keyword-overlap search that shortlists tables by relevance. For this, we apply the BM25 technique, which ranks documents by relevance to a user search. BM25 builds on TF-IDF (Term Frequency-Inverse Document Frequency) and additionally accounts for term saturation and document length.

Term Frequency (TF) measures how often a term appears in a given document. Inverse Document Frequency (IDF) down-weights terms that appear in many documents, emphasizing distinctive ones.

Length normalization takes document length into account to prevent bias toward longer documents.
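For reference, the standard BM25 scoring function (not specific to this system) is:

score(D, Q) = Σ(qᵢ ∈ Q) IDF(qᵢ) × f(qᵢ, D) × (k₁ + 1) / (f(qᵢ, D) + k₁ × (1 − b + b × |D| / avgdl))

Where:

  • f(qᵢ, D) is the frequency of term qᵢ in document D
  • |D| is the document length and avgdl is the average document length in the corpus
  • k₁ (typically 1.2–2.0) controls term-frequency saturation
  • b (typically 0.75) controls document-length normalization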

Given:

  • sales_data: Contains terms like "sales," "date," "product."
  • products: Contains terms like "product," "category."
  • orders: Contains terms like "order," "date," "customer."
  • financials: Contains terms like "revenue," "profit," "expense."

User query: "Show total sales by product."

  • Identify terms in the user query: ["sales", "product"].
  • Score every table (treated as a document) by the frequency and relevance of these terms.

Relevance of documents:

  • sales_data: High relevance due to both "sales" and "product."
  • products: High relevance due to "product."
  • orders: Low relevance, with only weak term overlap.
  • financials: Not relevant.

Output:

Ranked list: [products, sales_data, orders, financials]

Semantic Search

In this search method, as the name suggests, we find semantically similar tables utilizing vector embeddings. We achieve this by calculating a similarity score, such as cosine similarity, between the document (table vectors) and user query vectors.
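The code later in this section refers to a vector_store object. As an illustration, it could be built over short table descriptions with LangChain's FAISS wrapper and OpenAI embeddings; the descriptions and setup here are assumptions, not part of the original system:

Python
 
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import FAISS

# One short natural-language description per table, prefixed with the table name
table_descriptions = [
    "sales_data: sales amounts by date and product_id",
    "products: product_id and product_name",
    "orders: orders with date and customer",
    "financials: revenue, profit, and expenses",
]
vector_store = FAISS.from_texts(table_descriptions, OpenAIEmbeddings())

# Returns the descriptions whose embeddings are closest to the query embedding
vector_store.similarity_search("Show total sales by product", k=2)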

Reciprocal Rank Fusion

Combines the BM25 and semantic search results using the reciprocal rank fusion (RRF) strategy. RRF is a method for merging results from multiple ranking algorithms (e.g., BM25 and semantic search). It assigns each document a score based on its rank under each individual method, giving higher scores to documents ranked highly across multiple methods.

RRF formula:

RRF(d) = Σ(r ∈ R) 1 / (k + r(d))

Where:

  • d is a document
  • R is the set of rankers (search methods)
  • k is a constant (typically 60)
  • r(d) is the rank of document d in search method r

Step-by-Step Example

Input data.

1. BM25 ranking results:

  • products (Rank 1)
  • sales_data (Rank 2)
  • orders (Rank 3)

2. Semantic search ranking results:

  • sales_data (Rank 1)
  • financials (Rank 2)
  • products (Rank 3)

Step-by-Step Fusion

For each table, compute the score:

1. sales_data

  • BM25 Rank = 2, Semantic Rank = 1
  • RRF Score = 1/(60+2) + 1/(60+1) = 0.03252

2. products

  • BM25 Rank = 1, Semantic Rank = 3
  • RRF Score = 1/(60+1) + 1/(60+3) = 0.03227

3. orders

  • BM25 Rank = 3, Semantic Rank = Not Ranked
  • RRF Score = 1/(60+3) = 0.01587

4. financials

  • BM25 Rank = Not Ranked, Semantic Rank = 2
  • RRF Score = 1/(60+2) = 0.01613

5. Sort by RRF score

  • sales_data (highest score: top rank in semantic search plus a high BM25 rank).
  • products (high score from BM25).
  • financials (scored by semantic search only).
  • orders (scored by BM25 only).

Final output: ['sales_data', 'products', 'financials', 'orders']

Tables retrieved using Keyword Table mapping are always included.
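As a quick standalone check of the arithmetic above (separate from the tool code below):

Python
 
def rrf_score(ranks, k=60):
    # Sum 1 / (k + rank) over every search method that ranked the document
    return sum(1 / (k + r) for r in ranks)

print(round(rrf_score([2, 1]), 5))  # sales_data -> 0.03252
print(round(rrf_score([1, 3]), 5))  # products   -> 0.03227
print(round(rrf_score([3]), 5))     # orders     -> 0.01587
print(round(rrf_score([2]), 5))     # financials -> 0.01613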

Python
 
from rank_bm25 import BM25Okapi

def hybrid_search(query):
    # Keyword-based table mapping: tables matched here are always included
    keyword_to_table = {
        "sales": "sales_data",
        "product": "products",
    }
    keyword_results = [table for keyword, table in keyword_to_table.items() if keyword in query.lower()]

    # BM25 search: the corpus must be tokenized, and get_top_n needs the original documents
    documents = ["sales_data", "products", "orders", "financials"]
    tokenized_corpus = [doc.replace("_", " ").split() for doc in documents]
    bm25 = BM25Okapi(tokenized_corpus)
    bm25_results = bm25.get_top_n(query.lower().split(), documents, n=4)

    # Semantic search over table descriptions ("table_name: description"),
    # recovering the table name from each hit
    semantic_docs = vector_store.similarity_search(query, k=4)
    semantic_results = [doc.page_content.split(":")[0] for doc in semantic_docs]

    # Reciprocal rank fusion with k = 60, as described above
    def reciprocal_rank_fusion(ranked_lists, k=60):
        scores = {}
        for ranked in ranked_lists:
            for rank, table in enumerate(ranked, start=1):
                scores[table] = scores.get(table, 0) + 1 / (k + rank)
        return sorted(scores, key=scores.get, reverse=True)

    combined_results = reciprocal_rank_fusion([bm25_results, semantic_results])

    # Merge, preserving rank order and deduplicating
    final, seen = [], set()
    for table in keyword_results + combined_results:
        if table not in seen:
            seen.add(table)
            final.append(table)
    return final

hybrid_search_tool = Tool(
    name="Hybrid Search",
    func=hybrid_search,
    description="Combines keyword mapping, BM25, and semantic search with RRF for table retrieval."
)


4. Re-Ranking Tool

This tool ensures the most relevant tables are prioritized.

Example

  • Input tables: ["sales_data", "products", "financials"]
  • Re-ranking logic
    • For each table, compute a relevance score by pairing the query with the table description.
    • Sort by relevance score.
  • Output: ["sales_data", "products"]

A little more on the re-ranking logic:

The cross-encoder calculates a relevance score by analyzing the concatenated query and table description as a single input pair. This process involves:

  • Pair input. The query and each table description are paired and passed as input to the cross-encoder.
  • Joint encoding. Unlike separate encoders (e.g., bi-encoders), the cross-encoder jointly encodes the pair, allowing it to better capture context and dependencies between the query and the table description.
  • Scoring. The model outputs a relevance score for each pair, indicating how well the table matches the query.
Python
 
from sentence_transformers import CrossEncoder

# sentence-transformers' CrossEncoder scores (query, document) pairs jointly,
# which matches the pair-input behavior described above
reranker = CrossEncoder("cross-encoder/ms-marco-TinyBERT-L-2")

def re_rank_context(query, results):
    # One relevance score per (query, table description) pair
    scores = reranker.predict([(query, doc) for doc in results])
    ranked = sorted(zip(results, scores), key=lambda x: x[1], reverse=True)
    return [doc for doc, score in ranked]

re_rank_tool = Tool(
    name="Re-Ranker",
    func=re_rank_context,
    description="Re-ranks the retrieved tables based on relevance to the query."
)


5. Prompt Building Tool

This tool constructs a detailed prompt for the language model, incorporating the user’s refined query, retrieved schema, and examples from the Few-Shot Prompting Tool.

An example of a constructed prompt:

Assume you are someone who is proficient in generating SQL queries. Generate an SQL query to: Retrieve total sales grouped by product for the last 7 days.

Relevant tables:

  1. sales_data: Contains columns [sales, date, product_id].
  2. products: Contains columns [product_id, product_name].

Example SQL:

Plain Text
 
SELECT product_name, SUM(sales) FROM sales_data JOIN products ON sales_data.product_id = products.product_id GROUP BY product_name;
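The article shows the prompt itself rather than code for this tool; a minimal sketch consistent with the tools above might look like this (the function signature and prompt wording are assumptions):

Python
 
def build_prompt(query, tables, example_sql):
    # Assemble the final prompt from the refined query, table schemas, and few-shot example
    prompt = "Assume you are someone who is proficient in generating SQL queries.\n"
    prompt += f"Generate an SQL query to: {query}\n\nRelevant tables:\n"
    for i, table in enumerate(tables, start=1):
        prompt += f"{i}. {table}\n"
    if example_sql:
        prompt += f"\nExample SQL:\n{example_sql}\n"
    return prompt

prompt_building_tool = Tool(
    name="Prompt Builder",
    func=build_prompt,
    description="Builds the final LLM prompt from the refined query, retrieved schema, and example SQL."
)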

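Finally, the tools can be registered with a single agent. A minimal sketch using LangChain's initialize_agent follows; the LLM execution and SQL execution tools from the workflow diagram are omitted because they are not defined above:

Python
 
from langchain.agents import initialize_agent, AgentType

tools = [
    query_transform_tool,
    few_shot_tool,
    hybrid_search_tool,
    re_rank_tool,
    prompt_building_tool,
]

agent = initialize_agent(
    tools,
    llm,  # the ChatOpenAI instance created earlier
    agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
    verbose=True,
)

agent.run("Show recent sales MTD.")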

Future Scope

While this system uses a single agent with multiple tools to keep the design modular and the complexity manageable, a multi-agent framework could be explored in the future. Possibilities include:

  1. Dedicated agents for context retrieval. Separate agents for semantic and keyword searches.
  2. Task-specific agents. Agents specialized in SQL validation or optimization.
  3. Collaboration between agents. Using a coordination agent to manage task delegation.

This approach could enhance scalability and allow for more sophisticated workflows, especially in enterprise-level deployments.

Conclusion

Agentic RAG for text-to-SQL applications offers a scalable, modular approach to solving structured query tasks. By incorporating hybrid search, re-ranking, few-shot prompting, and dynamic prompt construction within a single-agent framework, this system ensures accuracy, transparency, and extensibility. This enhanced workflow demonstrates a powerful blueprint for turning natural language questions into actionable SQL queries.


