Creating an Agentic RAG for Text-to-SQL Applications

Agentic RAG simplifies text-to-SQL by modularizing tasks into tools like query transformation, hybrid search, and re-ranking, ensuring accuracy and scalability.

By Arjun Bali · Feb. 18, 25 · Tutorial

The blend of retrieval-augmented generation (RAG) and generative AI models has changed natural language processing by improving how queries are answered. Agentic RAG enhances the conventional reliance on a single monolithic model by introducing modularity and autonomy: by breaking the problem-solving process into tools integrated within an agent, it delivers gains in accuracy, transparency, scalability, and debuggability.

The Vision Behind Agentic RAG for Text-to-SQL

Traditional RAG systems retrieve relevant documents and rely on a single monolithic model to generate responses. Although this works well in many cases, it is less effective for structured outputs such as generated SQL. This is where we can leverage the Agentic RAG framework, in which we:

  1. Divide the tasks into smaller, more manageable tools within an agent
  2. Improve accuracy by assigning tasks to specialized tools
  3. Enhance transparency by tracing the reasoning and workflow of each tool
  4. Simplify scaling and debugging through modular design

Let's walk through how this system works and the role each component plays in transforming user questions into accurate SQL queries.

Architecture Overview

The structure comprises an agent utilizing tools within the text-to-SQL workflow. The process can be summarized as follows:

User Query → Query Transformation Tool → Few-Shot Prompting Tool → Hybrid Search Tool → Re-Ranking Tool → Table Retrieval Tool → Prompt Building Tool → LLM Execution Tool → SQL Execution Tool → Final Output
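Before diving into each tool, here is a rough sketch of how the flow chains together in plain Python. Function names mirror the tool implementations below; build_prompt and run_sql are illustrative placeholders rather than definitions from this article, and llm is the model instance created later.

Python
 
def answer_question(user_query):
    refined = transform_query(user_query)                # 1. rewrite the raw query
    example_sql = find_similar_question(refined)         # 2. fetch a few-shot example
    tables = hybrid_search(refined)                      # 3. shortlist candidate tables
    ranked = re_rank_context(refined, tables)            # 4. re-rank tables by relevance
    prompt = build_prompt(refined, ranked, example_sql)  # 5. assemble the LLM prompt (sketched later)
    sql = llm.predict(prompt)                            # 6. generate SQL with the LLM
    return run_sql(sql)                                  # 7. execute the SQL (hypothetical helper)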

1. User Query Transformation Tool

This tool preprocesses the user query so that the LLM can interpret it correctly. It resolves ambiguities, rephrases user questions, expands abbreviations into their full forms, and adds context when necessary.

Enhancements

  • Handle temporal references. Map terms like "as of today" or "till now" to explicit dates.
  • Replace ambiguous words. For example, "recent" could be replaced by "last 7 days."
  • Expand shorthand and abbreviations into their full forms.

Example

Input: "Show recent sales MTD."

Transformed query: "Retrieve sales data for the last 7 days (Month to Date)."

Python
 
from datetime import date, timedelta
from langchain.agents import Tool

def transform_query(user_query):
    # Resolve open-ended temporal references to explicit dates
    today = date.today()
    transformations = {
        "as of today": f"up to {today}",
        "till now": f"up to {today}",
        "recent": "last 7 days",
        "last week": f"from {today - timedelta(days=7)} to {today}",
    }
    for key, value in transformations.items():
        user_query = user_query.replace(key, value)

    # Expand common abbreviations into their full forms
    abbreviations = {
        "MTD": "Month to Date",
        "YTD": "Year to Date",
    }
    for abbr, full_form in abbreviations.items():
        user_query = user_query.replace(abbr, full_form)

    return user_query

query_transform_tool = Tool(
    name="Query Transformer",
    func=transform_query,
    description="Refines user queries for clarity and specificity; handles abbreviations and open-ended terms."
)
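Note that these literal string replacements produce a blunter rewrite than the idealized example above:

Python
 
print(transform_query("Show recent sales MTD."))
# -> "Show last 7 days sales Month to Date."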


2. Few-Shot Prompting Tool

This tool calls the LLM to match the user's question against a set of predefined question types (template matching). The matched question's example SQL is then used to enrich the prompt.

Example Workflow

1. Input question: "Show me total sales by product for the last 7 days."

2. Predefined templates:

  • "Show sales grouped by region." → Example SQL; SELECT region, SUM(sales) ...
  • "Show total sales by product." → Example SQL; SELECT product_name, SUM(sales) ...

3. Most similar question: "Show total sales by product."

4. Output example SQL: SELECT product_name, SUM(sales) FROM ...

Python
 
from langchain.chat_models import ChatOpenAI

llm = ChatOpenAI(model="gpt-4")

predefined_examples = {
    "Show sales grouped by region": "SELECT region, SUM(sales) FROM sales_data GROUP BY region;",
    "Show total sales by product": "SELECT product_name, SUM(sales) FROM sales_data GROUP BY product_name;",
}

def find_similar_question(user_query):
    # Ask the LLM which predefined question type the query matches best
    prompt = "Find the most similar question type for the following user query:\n"
    prompt += f"User Query: {user_query}\n\nOptions:\n"
    for example in predefined_examples.keys():
        prompt += f"- {example}\n"
    prompt += "\nRespond with the closest match."

    most_similar = llm.predict(prompt).strip()
    return predefined_examples.get(most_similar, "")

few_shot_tool = Tool(
    name="Few-Shot Prompting",
    func=find_similar_question,
    description="Finds the most similar question type using an additional LLM call and retrieves the corresponding example SQL."
)


3. Hybrid Search Tool

For robust retrieval, this tool combines semantic search, BM25 keyword search, and keyword-to-table mapping. The results from these methods are merged using reciprocal rank fusion.

How does it all come together?

Keyword Table Mapping

This approach maps the tables to the keywords that are contained in the query. For example:

  • The presence of "sales" results in the sales table being shortlisted.
  • The presence of "product" results in the products table being shortlisted. 

Keyword Overlap Mapping (BM25)

This is a keyword-overlap search that shortlists tables by relevance. For this, we apply the BM25 technique, which ranks documents by relevance to a user search. BM25 builds on TF-IDF (Term Frequency-Inverse Document Frequency) and additionally accounts for term saturation and document length.

Term Frequency (TF) measures how often a term appears in a given document. Inverse Document Frequency (IDF) down-weights terms that appear in many documents, emphasizing distinctive ones.

Length normalization takes document length into account to prevent bias toward longer documents.
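For reference, the standard BM25 scoring function (not specific to this system) is:

score(D, Q) = Σ(qᵢ ∈ Q) IDF(qᵢ) × f(qᵢ, D) × (k₁ + 1) / (f(qᵢ, D) + k₁ × (1 − b + b × |D| / avgdl))

Where:

  • f(qᵢ, D) is the frequency of term qᵢ in document D
  • |D| is the document length and avgdl is the average document length in the corpus
  • k₁ (typically 1.2–2.0) controls term-frequency saturation
  • b (typically 0.75) controls document-length normalization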

Given:

  • sales_data: Contains terms like "sales," "date," "product."
  • products: Contains terms like "product," "category."
  • orders: Contains terms like "order," "date," "customer."
  • financials: Contains terms like "revenue," "profit," "expense."

User query: "Show total sales by product."

  • Identify terms in the user query: ["sales", "product"].
  • Score every table (treated as a document) by the frequency and relevance of these terms.

Relevance of documents:

  • sales_data: High relevance due to both "sales" and "product."
  • products: High relevance due to "product."
  • orders: Low relevance, with only weak term overlap.
  • financials: Not relevant.

Output:

Ranked list: [products, sales_data, orders, financials]

Semantic Search

In this search method, as the name suggests, we find semantically similar tables utilizing vector embeddings. We achieve this by calculating a similarity score, such as cosine similarity, between the document (table vectors) and user query vectors.
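The code later in this section refers to a vector_store object. As an illustration, it could be built over short table descriptions with LangChain's FAISS wrapper and OpenAI embeddings; the descriptions and setup here are assumptions, not part of the original system:

Python
 
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import FAISS

# One short natural-language description per table, prefixed with the table name
table_descriptions = [
    "sales_data: sales amounts by date and product_id",
    "products: product_id and product_name",
    "orders: orders with date and customer",
    "financials: revenue, profit, and expenses",
]
vector_store = FAISS.from_texts(table_descriptions, OpenAIEmbeddings())

# Returns the descriptions whose embeddings are closest to the query embedding
vector_store.similarity_search("Show total sales by product", k=2)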

Reciprocal Rank Fusion

Combines the BM25 and semantic search results using the reciprocal rank fusion (RRF) strategy. RRF is a method for merging results from multiple ranking algorithms (e.g., BM25 and semantic search). It assigns each document a score based on its rank under each individual method, giving higher scores to documents ranked highly across multiple methods.

RRF formula:

RRF(d) = Σ(r ∈ R) 1 / (k + r(d))

Where:

  • d is a document
  • R is the set of rankers (search methods)
  • k is a constant (typically 60)
  • r(d) is the rank of document d in search method r

Step-by-Step Example

Input data.

1. BM25 ranking results:

  • products (Rank 1)
  • sales_data (Rank 2)
  • orders (Rank 3)

2. Semantic search ranking results:

  • sales_data (Rank 1)
  • financials (Rank 2)
  • products (Rank 3)

Step-by-Step Fusion

For each table, compute the score:

1. sales_data

  • BM25 Rank = 2, Semantic Rank = 1
  • RRF Score = 1/(60+2) + 1/(60+1) = 0.03252

2. products

  • BM25 Rank = 1, Semantic Rank = 3
  • RRF Score = 1/(60+1) + 1/(60+3) = 0.03227

3. orders

  • BM25 Rank = 3, Semantic Rank = Not Ranked
  • RRF Score = 1/(60+3) = 0.01587

4. financials

  • BM25 Rank = Not Ranked, Semantic Rank = 2
  • RRF Score = 1/(60+2) = 0.01613

5. Sort by RRF score

  • sales_data (highest score: top rank in semantic search plus a high BM25 rank).
  • products (high score from BM25).
  • financials (scored by semantic search only).
  • orders (scored by BM25 only).

Final output: ['sales_data', 'products', 'financials', 'orders']

Tables retrieved using Keyword Table mapping are always included.
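As a quick standalone check of the arithmetic above (separate from the tool code below):

Python
 
def rrf_score(ranks, k=60):
    # Sum 1 / (k + rank) over every search method that ranked the document
    return sum(1 / (k + r) for r in ranks)

print(round(rrf_score([2, 1]), 5))  # sales_data -> 0.03252
print(round(rrf_score([1, 3]), 5))  # products   -> 0.03227
print(round(rrf_score([3]), 5))     # orders     -> 0.01587
print(round(rrf_score([2]), 5))     # financials -> 0.01613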

Python
 
from rank_bm25 import BM25Okapi

def hybrid_search(query):
    # Keyword-based table mapping: tables matched here are always included
    keyword_to_table = {
        "sales": "sales_data",
        "product": "products",
    }
    keyword_results = [table for keyword, table in keyword_to_table.items() if keyword in query.lower()]

    # BM25 search: the corpus must be tokenized, and get_top_n needs the original documents
    documents = ["sales_data", "products", "orders", "financials"]
    tokenized_corpus = [doc.replace("_", " ").split() for doc in documents]
    bm25 = BM25Okapi(tokenized_corpus)
    bm25_results = bm25.get_top_n(query.lower().split(), documents, n=4)

    # Semantic search over table descriptions ("table_name: description"),
    # recovering the table name from each hit
    semantic_docs = vector_store.similarity_search(query, k=4)
    semantic_results = [doc.page_content.split(":")[0] for doc in semantic_docs]

    # Reciprocal rank fusion with k = 60, as described above
    def reciprocal_rank_fusion(ranked_lists, k=60):
        scores = {}
        for ranked in ranked_lists:
            for rank, table in enumerate(ranked, start=1):
                scores[table] = scores.get(table, 0) + 1 / (k + rank)
        return sorted(scores, key=scores.get, reverse=True)

    combined_results = reciprocal_rank_fusion([bm25_results, semantic_results])

    # Merge, preserving rank order and deduplicating
    final, seen = [], set()
    for table in keyword_results + combined_results:
        if table not in seen:
            seen.add(table)
            final.append(table)
    return final

hybrid_search_tool = Tool(
    name="Hybrid Search",
    func=hybrid_search,
    description="Combines keyword mapping, BM25, and semantic search with RRF for table retrieval."
)


4. Re-Ranking Tool

This tool ensures the most relevant tables are prioritized.

Example

  • Input tables: ["sales_data", "products", "financials"]
  • Re-ranking logic
    • For each table, compute a relevance score by pairing the query with the table description.
    • Sort by relevance score.
  • Output: ["sales_data", "products"]

A little more on the re-ranking logic:

The cross-encoder calculates a relevance score by analyzing the concatenated query and table description as a single input pair. This process involves:

  • Pair input. The query and each table description are paired and passed as input to the cross-encoder.
  • Joint encoding. Unlike separate encoders (e.g., bi-encoders), the cross-encoder jointly encodes the pair, allowing it to better capture context and dependencies between the query and the table description.
  • Scoring. The model outputs a relevance score for each pair, indicating how well the table matches the query.
Python
 
from sentence_transformers import CrossEncoder

# sentence-transformers' CrossEncoder scores (query, document) pairs jointly,
# which matches the pair-input behavior described above
reranker = CrossEncoder("cross-encoder/ms-marco-TinyBERT-L-2")

def re_rank_context(query, results):
    # One relevance score per (query, table description) pair
    scores = reranker.predict([(query, doc) for doc in results])
    ranked = sorted(zip(results, scores), key=lambda x: x[1], reverse=True)
    return [doc for doc, score in ranked]

re_rank_tool = Tool(
    name="Re-Ranker",
    func=re_rank_context,
    description="Re-ranks the retrieved tables based on relevance to the query."
)


5. Prompt Building Tool

This tool constructs a detailed prompt for the language model, incorporating the user’s refined query, retrieved schema, and examples from the Few-Shot Prompting Tool.

An example of a constructed prompt:

Assume you are someone who is proficient in generating SQL queries. Generate an SQL query to: Retrieve total sales grouped by product for the last 7 days.

Relevant tables:

  1. sales_data: Contains columns [sales, date, product_id].
  2. products: Contains columns [product_id, product_name].

Example SQL:

Plain Text
 
SELECT product_name, SUM(sales) FROM sales_data JOIN products ON sales_data.product_id = products.product_id GROUP BY product_name;
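The article shows the prompt itself rather than code for this tool; a minimal sketch consistent with the tools above might look like this (the function signature and prompt wording are assumptions):

Python
 
def build_prompt(query, tables, example_sql):
    # Assemble the final prompt from the refined query, table schemas, and few-shot example
    prompt = "Assume you are someone who is proficient in generating SQL queries.\n"
    prompt += f"Generate an SQL query to: {query}\n\nRelevant tables:\n"
    for i, table in enumerate(tables, start=1):
        prompt += f"{i}. {table}\n"
    if example_sql:
        prompt += f"\nExample SQL:\n{example_sql}\n"
    return prompt

prompt_building_tool = Tool(
    name="Prompt Builder",
    func=build_prompt,
    description="Builds the final LLM prompt from the refined query, retrieved schema, and example SQL."
)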

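Finally, the tools can be registered with a single agent. A minimal sketch using LangChain's initialize_agent follows; the LLM execution and SQL execution tools from the workflow diagram are omitted because they are not defined above:

Python
 
from langchain.agents import initialize_agent, AgentType

tools = [
    query_transform_tool,
    few_shot_tool,
    hybrid_search_tool,
    re_rank_tool,
    prompt_building_tool,
]

agent = initialize_agent(
    tools,
    llm,  # the ChatOpenAI instance created earlier
    agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
    verbose=True,
)

agent.run("Show recent sales MTD.")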

Future Scope

While this system uses a single agent with multiple tools to keep the design modular and the complexity manageable, a multi-agent framework could be explored in the future. Possibilities include:

  1. Dedicated agents for context retrieval. Separate agents for semantic and keyword searches.
  2. Task-specific agents. Agents specialized in SQL validation or optimization.
  3. Collaboration between agents. Using a coordination agent to manage task delegation.

This approach could enhance scalability and allow for more sophisticated workflows, especially in enterprise-level deployments.

Conclusion

Agentic RAG for text-to-SQL applications offers a scalable, modular approach to solving structured query tasks. By incorporating hybrid search, re-ranking, few-shot prompting, and dynamic prompt construction within a single-agent framework, this system ensures accuracy, transparency, and extensibility. This enhanced workflow demonstrates a powerful blueprint for turning natural language questions into actionable SQL queries.


