Creating an Agentic RAG for Text-to-SQL Applications
Agentic RAG simplifies text-to-SQL by modularizing tasks into tools like query transformation, hybrid search, and re-ranking, ensuring accuracy and scalability.
The blend of retrieval-augmented generation (RAG) and generative AI models has changed natural language processing by improving responses to queries. Agentic RAG enhances the conventional approach of relying on a single monolithic model by introducing modularity and autonomy. By breaking the problem-solving process into tools integrated within an agent, Agentic RAG provides benefits such as improved accuracy, transparency, scalability, and easier debugging.
The Vision Behind Agentic RAG for Text-to-SQL
Traditional RAG systems retrieve relevant documents and rely on a single monolithic model to generate responses. Although this works well in many cases, it falls short for structured outputs such as generated SQL. This is where we can leverage the power of the Agentic RAG framework, where we:
- Divide the tasks into smaller, more manageable tools within an agent
- Improve accuracy by assigning tasks to specialized tools
- Enhance transparency by tracing the reasoning and workflow of each tool
- Simplify scaling and debugging through modular design
Let's walk through how this system works and the role each component plays in transforming user questions into accurate SQL queries.
Architecture Overview
The architecture comprises a single agent that orchestrates a set of tools within the text-to-SQL workflow. The process can be summarized as follows:
User Query → Query Transformation Tool → Few-Shot Prompting Tool → Hybrid Search Tool → Re-Ranking Tool → Table Retrieval Tool → Prompt Building Tool → LLM Execution Tool → SQL Execution Tool → Final Output
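Concretely, the whole pipeline can sit behind one agent. Below is a minimal sketch using LangChain's classic agent API, assuming the Tool objects defined in the sections that follow:

from langchain.agents import initialize_agent, AgentType
from langchain.chat_models import ChatOpenAI

# The Tool objects (query_transform_tool, few_shot_tool, hybrid_search_tool,
# re_rank_tool) are defined in the sections below.
tools = [query_transform_tool, few_shot_tool, hybrid_search_tool, re_rank_tool]

agent = initialize_agent(
    tools,
    ChatOpenAI(model="gpt-4"),
    agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
    verbose=True,  # print each tool call, which aids transparency and debugging
)

print(agent.run("Show recent sales MTD"))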
1. User Query Transformation Tool
This tool preprocesses the user query so the LLM can better understand it. It resolves ambiguities, rephrases user questions, expands abbreviations into their full forms, and adds context when necessary.
Enhancements
- Handle temporal references. Map terms like "as of today" or "till now" to explicit dates.
- Replace ambiguous words. For example, "recent" could be replaced by "last 7 days."
- Expand shorthand or abbreviations into their full forms.
Example
Input: "Show recent sales MTD."
Transformed query: "Show last 7 days sales Month to Date."
from datetime import date, timedelta
from langchain.agents import Tool

def transform_query(user_query):
    # Handle open-ended temporal references
    today = date.today()
    transformations = {
        "as of today": f"up to {today}",
        "till now": f"up to {today}",
        "recent": "last 7 days",
        "last week": f"from {today - timedelta(days=7)} to {today}",
    }
    for key, value in transformations.items():
        user_query = user_query.replace(key, value)
    # Expand common abbreviations into their full forms
    abbreviations = {
        "MTD": "Month to Date",
        "YTD": "Year to Date",
    }
    for abbr, full_form in abbreviations.items():
        user_query = user_query.replace(abbr, full_form)
    return user_query

query_transform_tool = Tool(
    name="Query Transformer",
    func=transform_query,
    description="Refines user queries for clarity and specificity, handles abbreviations and open-ended terms."
)
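Running the tool on the earlier example (output shown as a comment):

print(transform_query("Show recent sales MTD"))
# -> "Show last 7 days sales Month to Date"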
2. Few-Shot Prompting Tool
This tool makes an additional LLM call to identify which predefined question template the user query most closely matches. The matched template enriches the prompt with an example SQL query.
Example Workflow
1. Input question: "Show me total sales by product for the last 7 days."
2. Predefined templates:
- "Show sales grouped by region." → Example SQL; SELECT region, SUM(sales) ...
- "Show total sales by product." → Example SQL; SELECT product_name, SUM(sales) ...
3. Most similar question: "Show total sales by product."
4. Output example SQL: SELECT product_name, SUM(sales) FROM ...
from langchain.chat_models import ChatOpenAI
from langchain.agents import Tool

llm = ChatOpenAI(model="gpt-4")

predefined_examples = {
    "Show sales grouped by region": "SELECT region, SUM(sales) FROM sales_data GROUP BY region;",
    "Show total sales by product": "SELECT product_name, SUM(sales) FROM sales_data GROUP BY product_name;",
}

def find_similar_question(user_query):
    # Ask the LLM to pick the closest predefined question template
    prompt = "Find the most similar question type for the following user query:\n"
    prompt += f"User Query: {user_query}\n\nOptions:\n"
    for example in predefined_examples:
        prompt += f"- {example}\n"
    prompt += "\nRespond with the closest match."
    most_similar = llm.predict(prompt).strip()
    # Return the matched template's example SQL (empty string if no match)
    return predefined_examples.get(most_similar, "")

few_shot_tool = Tool(
    name="Few-Shot Prompting",
    func=find_similar_question,
    description="Finds the most similar question type using an additional LLM call and retrieves the corresponding example SQL."
)
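A quick usage sketch (assuming the LLM responds with one of the option strings verbatim; in practice, fuzzy matching on the response would be safer):

example_sql = find_similar_question("Show me total sales by product for the last 7 days")
print(example_sql)
# -> "SELECT product_name, SUM(sales) FROM sales_data GROUP BY product_name;"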
3. Hybrid Search Tool
For robust retrieval, this tool combines semantic search, BM25 keyword search, and keyword-based table mapping. The results from these methods are combined using reciprocal rank fusion.
How does it all come together?
Keyword Table Mapping
This approach maps the tables to the keywords that are contained in the query. For example:
- The presence of "sales" results in the sales table being shortlisted.
- The presence of "product" results in the products table being shortlisted.
Keyword Overlap Mapping (BM25)
This is a keyword-overlap search that shortlists tables by relevance using the BM25 ranking function, which orders documents by their relevance to a user query. BM25 builds on TF-IDF (Term Frequency-Inverse Document Frequency) while also accounting for term saturation and document length:
- Term Frequency (TF) measures how often a term occurs in a given document.
- Inverse Document Frequency (IDF) down-weights terms that appear in many documents, since they carry less discriminating power.
- Length normalization accounts for document length to prevent bias toward longer documents.
Given:
- sales_data: Contains terms like "sales," "date," "product."
- products: Contains terms like "product," "category."
- orders: Contains terms like "order," "date," "customer."
- financials: Contains terms like "revenue," "profit," "expense."
User query: "Show total sales by product."
- Identify the terms in the user query: ["sales", "product"].
- Score each table's document based on the frequency and relevance of these terms.
Relevance of documents:
- sales_data: High relevance due to both "sales" and "product."
- products: High relevance due to "product."
- orders: Low relevance; it contains none of the query terms.
- financials: Not relevant.
Output:
Ranked list: [products, sales_data, orders, financials]
Semantic Search
In this search method, as the name suggests, we find semantically similar tables using vector embeddings. We achieve this by calculating a similarity score, such as cosine similarity, between the document vectors (table descriptions) and the user query vector.
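As an illustration, here is a minimal sketch of that scoring, assuming a sentence-transformers model; the table descriptions below are illustrative stand-ins for richer schema text:

from sentence_transformers import SentenceTransformer, util

# Illustrative table descriptions; a real system would embed fuller schema text
table_descriptions = {
    "sales_data": "sales figures by date and product",
    "products": "product names and categories",
    "orders": "customer orders with dates",
    "financials": "revenue, profit, and expenses",
}

model = SentenceTransformer("all-MiniLM-L6-v2")
query_vec = model.encode("Show total sales by product")

# Cosine similarity between the query vector and each table-description vector
for table, desc in table_descriptions.items():
    score = util.cos_sim(query_vec, model.encode(desc)).item()
    print(f"{table}: {score:.3f}")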
Reciprocal Rank Fusion
Reciprocal Rank Fusion (RRF) combines the results of multiple ranking methods (here, BM25 and semantic search). It assigns each document a score based on its rank under each individual method, so documents that rank highly across several methods receive the highest combined scores.
RRF formula:
RRF(d) = Σ(r ∈ R) 1 / (k + r(d))
Where:
- d is a document
- R is the set of rankers (search methods)
- k is a constant (typically 60)
- r(d) is the rank of document d in search method r
Step-by-Step Example
Input data.
1. BM25 ranking results:
- products (Rank 1)
- sales_data (Rank 2)
- orders (Rank 3)
2. Semantic search ranking results:
- sales_data (Rank 1)
- financials (Rank 2)
- products (Rank 3)
Step-by-Step Fusion
For each table, compute the score:
1. sales_data
- BM25 Rank = 2, Semantic Rank = 1
- RRF Score = 1/(60+2) + 1/(60+1) = 0.03252
2. products
- BM25 Rank = 1, Semantic Rank = 3
- RRF Score = 1/(60+1) + 1/(60+3) = 0.03227
3. orders
- BM25 Rank = 3, Semantic Rank = Not Ranked
- RRF Score = 1/(60+3) = 0.01587
4. financials
- BM25 Rank = Not Ranked, Semantic Rank = 2
- RRF Score = 1/(60+2) = 0.01613
5. Sort by RRF score
- sales_data (highest score; ranked first in semantic search and second in BM25).
- products (high score from its top BM25 rank).
- financials (scores just above orders thanks to its semantic rank of 2).
- orders (lowest score; appears only in the BM25 ranking).
Final output: ['sales_data', 'products', 'financials', 'orders']
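This arithmetic is easy to check in a few lines of Python (ranks taken from the example above):

# Verify the worked RRF example (k = 60)
bm25_ranks = {"products": 1, "sales_data": 2, "orders": 3}
semantic_ranks = {"sales_data": 1, "financials": 2, "products": 3}

k = 60
scores = {}
for ranking in (bm25_ranks, semantic_ranks):
    for table, rank in ranking.items():
        scores[table] = scores.get(table, 0) + 1 / (k + rank)

for table, score in sorted(scores.items(), key=lambda x: x[1], reverse=True):
    print(f"{table}: {score:.5f}")
# sales_data: 0.03252, products: 0.03227, financials: 0.01613, orders: 0.01587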
Tables retrieved using Keyword Table mapping are always included.
from rank_bm25 import BM25Okapi

def hybrid_search(query):
    # Keyword-based table mapping
    keyword_to_table = {
        "sales": "sales_data",
        "product": "products",
    }
    keyword_results = [table for keyword, table in keyword_to_table.items() if keyword in query.lower()]

    # BM25 search over table descriptions (BM25Okapi expects a tokenized corpus);
    # the descriptions here are illustrative
    table_docs = {
        "sales_data": "sales date product",
        "products": "product category",
        "orders": "order date customer",
        "financials": "revenue profit expense",
    }
    table_names = list(table_docs)
    bm25 = BM25Okapi([desc.split() for desc in table_docs.values()])
    bm25_results = bm25.get_top_n(query.lower().split(), table_names, n=4)

    # Semantic search (assumes a pre-built vector_store that indexes table names)
    semantic_results = [doc.page_content for doc in vector_store.similarity_search(query, k=4)]

    # Reciprocal rank fusion across the two ranked lists (k=60, as in the formula above)
    def reciprocal_rank_fusion(rankings, k=60):
        scores = {}
        for ranking in rankings:
            for rank, table in enumerate(ranking, start=1):
                scores[table] = scores.get(table, 0) + 1 / (k + rank)
        return sorted(scores, key=scores.get, reverse=True)

    combined_results = reciprocal_rank_fusion([bm25_results, semantic_results])
    # Keyword-mapped tables are always included, ahead of the fused results
    return keyword_results + [t for t in combined_results if t not in keyword_results]

hybrid_search_tool = Tool(
    name="Hybrid Search",
    func=hybrid_search,
    description="Combines keyword mapping, BM25, and semantic search with RRF for table retrieval."
)
4. Re-Ranking Tool
This tool ensures the most relevant tables are prioritized.
Example
- Input tables: ["sales_data", "products", "financials"]
- Re-ranking logic
- For each table, compute a relevance score by concatenating the query and the table description.
- Sort by relevance score.
- Output: ["sales_data", "products"]
A little more on the re-ranking logic:
The cross-encoder calculates a relevance score by analyzing the concatenated query and table description as a single input pair. This process involves:
- Pair input. The query and each table description are paired and passed as input to the cross-encoder.
- Joint encoding. Unlike separate encoders (e.g., bi-encoders), the cross-encoder jointly encodes the pair, allowing it to better capture context and dependencies between the query and the table description.
- Scoring. The model outputs a relevance score for each pair, indicating how well the table matches the query.
from transformers import pipeline

# Cross-encoder re-ranker trained on MS MARCO passage ranking
reranker = pipeline("text-classification", model="cross-encoder/ms-marco-TinyBERT-L-2")

def re_rank_context(query, results):
    # Score each (query, table description) pair jointly with the cross-encoder
    scores = [(doc, reranker({"text": query, "text_pair": doc})[0]["score"]) for doc in results]
    return [doc for doc, _ in sorted(scores, key=lambda x: x[1], reverse=True)]

re_rank_tool = Tool(
    name="Re-Ranker",
    func=re_rank_context,
    description="Re-ranks the retrieved tables based on relevance to the query."
)
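For instance, re-ranking the tables retrieved earlier (using table names as stand-ins for fuller schema descriptions):

ranked_tables = re_rank_context(
    "Show total sales by product",
    ["sales_data", "products", "financials"],
)
print(ranked_tables)  # most relevant tables first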
5. Prompt Building Tool
This tool constructs a detailed prompt for the language model, incorporating the user’s refined query, retrieved schema, and examples from the Few-Shot Prompting Tool.
Example prompt:
Assume you are someone who is proficient in generating SQL queries. Generate an SQL query to: Retrieve total sales grouped by product for the last 7 days.
Relevant tables:
- sales_data: Contains columns [sales, date, product_id].
- products: Contains columns [product_id, product_name].
Example SQL:
SELECT product_name, SUM(sales) FROM sales_data JOIN products ON sales_data.product_id = products.product_id GROUP BY product_name;
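For completeness, here is a minimal sketch of this step, assuming the refined query, retrieved tables, and example SQL produced by the earlier tools are passed in; the table_schemas dictionary below is illustrative:

# Illustrative schema metadata; a real system would pull this from the database catalog
table_schemas = {
    "sales_data": "[sales, date, product_id]",
    "products": "[product_id, product_name]",
}

def build_prompt(refined_query, tables, example_sql=""):
    # Assemble the final LLM prompt from the outputs of the earlier tools
    prompt = "Assume you are someone who is proficient in generating SQL queries.\n"
    prompt += f"Generate an SQL query to: {refined_query}\n\nRelevant tables:\n"
    for table in tables:
        prompt += f"- {table}: Contains columns {table_schemas.get(table, '[unknown]')}.\n"
    if example_sql:
        prompt += f"\nExample SQL:\n{example_sql}\n"
    return prompt

prompt_building_tool = Tool(
    name="Prompt Builder",
    # One possible wiring: chain the earlier tools to gather inputs (illustrative only)
    func=lambda q: build_prompt(q, hybrid_search(q), find_similar_question(q)),
    description="Builds the final LLM prompt from the refined query, retrieved tables, and example SQL."
)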
Future Scope
While this system uses a single agent with multiple tools to simplify modularity and reduce complexity, a multi-agent framework could be explored in the future. We could possibly explore the following:
- Dedicated agents for context retrieval. Separate agents for semantic and keyword searches.
- Task-specific agents. Agents specialized in SQL validation or optimization.
- Collaboration between agents. Using a coordination agent to manage task delegation.
This approach could enhance scalability and allow for more sophisticated workflows, especially in enterprise-level deployments.
Conclusion
Agentic RAG for text-to-SQL applications offers a scalable, modular approach to solving structured query tasks. By incorporating hybrid search, re-ranking, few-shot prompting, and dynamic prompt construction within a single-agent framework, this system ensures accuracy, transparency, and extensibility. This enhanced workflow demonstrates a powerful blueprint for turning natural language questions into actionable SQL queries.