DZone
Thanks for visiting DZone today,
Edit Profile
  • Manage Email Subscriptions
  • How to Post to DZone
  • Article Submission Guidelines
Sign Out View Profile
  • Post an Article
  • Manage My Drafts
Over 2 million developers have joined DZone.
Log In / Join
Refcards Trend Reports
Events Video Library
Refcards
Trend Reports

Events

View Events Video Library

Related

  • Exploring the New Boolean Data Type in Oracle 23c AI
  • How To Call Cohere and Hugging Face AI From Within an Oracle Database Using JavaScript
  • AI Paradigm Shift: Analytics Without SQL
  • Custom Model Context Protocol (MCP) for NL2SQL: A Rigorous Evaluation Framework on Oracle Database

Trending

  • How to Format Articles for DZone
  • Operationalizing Enterprise AI at Scale: Architecture, Governance, and Adoption
  • Build a GitHub Slack Bot With AWS Bedrock and MCP, Part 1
  • Compliance Automated Standard Solution (COMPASS), Part 10: How OSCAL Mapping Paves the Way for Continuous Compliance Scalability
  1. DZone
  2. Data Engineering
  3. Databases
  4. SELECT AI Query Integration Using Oracle Autonomous Database 26AI and OpenAI

SELECT AI Query Integration Using Oracle Autonomous Database 26AI and OpenAI

A custom framework testing Oracle 26ai’s ability to convert natural language into SQL using the 22 TPC-H benchmark. With no prompt engineering, it achieves high accuracy.

By 
Sanjay Mishra user avatar
Sanjay Mishra
·
Apr. 01, 26 · Analysis
Likes (0)
Comment
Save
Tweet
Share
3.3K Views

Join the DZone community and get the full member experience.

Join For Free

This is one of the first studies on Oracle 26 AI, which was launched in January 2026. Instead of writing complex SQL queries, Oracle Autonomous Database introduced Oracle 26ai, which uses natural-language programming techniques to write SQL queries. As part of this research paper, I am examining the capabilities of the SELECT AI. I will be examining the accuracy and latency of LLMs.

Setup

I used approximately 10 GB of data generated by my Python script (the underlying tables are not different from what is available in TPC-H and Kaggle). I inserted my own data to enable this in a faster way. Please refer to https://github.com/sanmish4ds/oracle26ai-eval/blob/main/data_setup.py.

I have also created an Oracle Autonomous Database environment, which is free, and I found it online easy enough for a 5–10 minute setup:

  1. Sign up for Oracle Cloud.
  2. Go to Autonomous Database in the OCI console.
  3. Click "Create Autonomous Database."
  4. Select the "Always Free" tier.
  5. Choose "Data Warehouse" workload.
  6. Version: 26ai.
  7. Download wallet (credentials).
  8. You'll get a wallet.zip file with connection info.
  9. Once you follow the above and log in, create an OpenAI or any LLM key. I used OpenAI.
    SQL
     
    BEGIN
      DBMS_CLOUD.CREATE_CREDENTIAL(
        credential_name => 'OPENAI_CRED',
        username        => 'openai',
        password        => 'sk-your-openai-api-key-here'  -- Get from openai.com
      );
    END;
    /
  10. Configure Select AI to use OpenAI.
    SQL
     
    BEGIN
      DBMS_CLOUD_AI.CREATE_PROFILE(
        profile_name => 'EVAL_PROFILE',
        attributes   => '{"provider": "openai",
                          "credential_name": "OPENAI_CRED",
                          "model": "gpt-4",
                          "temperature": 0.1,
                          "max_tokens": 1000}'
      );
    END;
    /
  11. Set as the default profile.
    SQL
     
    BEGIN
      DBMS_CLOUD_AI.SET_PROFILE(profile_name => 'EVAL_PROFILE');
    END;
    /
  12. Test it works!
    SQL
     
    SELECT DBMS_CLOUD_AI.GENERATE(
             prompt       => 'show me all customer ',
             profile_name => 'OPENAI_GPT4',
             action       => 'runsql'
           ) FROM DUAL;

Refer to the following GitHub file to further see the file https://github.com/sanmish4ds/oracle26ai-eval/blob/main/Data-Setup-Help-Sqls-26ai.sql.

I have created the NL_SQL_TEST_QUERIES table with 22 TPC-H benchmark queries (simple/medium/complex) and stored the ground truth SQL results as a comparison baseline. We are all set to test how SELECT AI works and the accuracy with which it creates SQL queries.

Here is the ground truth Queries link on GitHub: https://github.com/sanmish4ds/oracle26ai-eval/blob/main/TPCH_22_QUERIES.txt.

Execution

Here is the Python project structure:

Python project structure


I created a Python orchestrator (main.py) that runs two experiments: 

  • Accuract test
  • Latency test

Accuracy Test

The accuracy test compares AI-generated SQL result counts with ground truth. 

Step-by-Step

  1. Load ground truth questions: Gets 22 test questions ordered Q1-Q22
  2. For each question:
    • Generate SQL: Asks Oracle AI to write SQL from the English question (using DBMS_CLOUD_AI)
    • Run AI SQL: Run the AI-generated SQL and counts results using COUNT(*)
    • Run ground truth SQL: Run the ground truth SQL and counts results
    • Compare: Checks if both returned the same number of rows
  3. Smart pattern detection:
    • If row counts of both AI and Ground Truth match → marks as successful.
    • Also recognize equivalent SQL patterns (e.g., ROWNUM vs. FETCH FIRST). Sometimes the LLM chooses a different—and even better—approach for generating queries; for example, it may avoid using COUNT(*) when counting rows.
  4. Save results: Saves to a CSV file with columns: query_id, question, AI SQL, row counts, success status, and visualizes the comparison.

Accuracy Test Summary and Result Visualization 

Total queries tested: 22

By SQL complexity: The table below shows that, even though queries are simpler, the LLM is not giving an exact match, so we probably need a better prompt.

Category Success Rate Exact Match Semantic Match Total Queries
Complex 90.0% 90.0% 90.0% 10
Medium 100.0% 100.0% 100.0% 8
Simple 100.0% 75.0% 75.0% 4
Overall Total 95.5% 90.9% 90.9% 22

Accuracy Metrics Visually Presented by Graphs

Overall accuracy metrics

Accuracy by complexity

Latency Test

The latency test measures LLM generation time and Oracle execution time. 

Step-by-Step

  1. Load questions: Gets 22 ground truth test questions
  2. For each question (3-stage timing):
    • Stage 1: LLM generation time (thinking)
      • Asks Oracle AI to generate SQL from English
      • Stops before running the SQL
      • Records how long AI took to think and write the SQL
      • Result: llm_latency_ms (e.g., 2500ms = 2.5 seconds)
    • Stage 2: Oracle execution time (doing)
      • Runs the ground truth SQL 
      • Records database query execution time
      • Result: oracle_exe_ms (e.g., 500ms = 0.5 seconds)
    • Stage 3: Ground truth execution
      • Runs the correct SQL to compare timing
      • Records for comparison
  3. Calculate metrics:
    • Total time = LLM time + Oracle execution time
    • Overhead ratio = LLM time ÷ Oracle time (how much slower is LLM vs. SQL?)
    • Example: If AI takes 3000ms and database takes 1000ms → Ratio = 3x (AI is 3x slower than database)
  4. Save and report:
    • Saves to CSV with columns: query_id, llm_latency_ms, oracle_exe_ms, total_latency_ms, overhead_ratio
    • Prints statistics: Mean, median, P95, P99 latencies
    • Shows breakdown: "Average LLM: 3500ms, Average Oracle: 500ms, Ratio: 7x"

Latency Test Summary and Result Visualization

Latency analysis: LLM is doing the heavy lifting, as shown in the table below.

Metric Latency (ms)
Mean (Average) 3,906.64 ms
Median 3,872.82 ms
P95 (95th Percentile) 4,812.40 ms
P99 (99th Percentile) 5,018.30 ms


Latency breakdown: The table below explains that the time spent during SQL by the LLM generative phase is significantly more time-intensive than the execution phase.

Component Avg. Time (ms) % of Total Latency
LLM Generation 3,691.35 ms 94.5%
Oracle Execution 215.29 ms 5.5%
Total 3,906.64 ms 100%


Visualization Using a Graph

LLM vs. Oracle execution (top 10)

Top 10 queries by LLM overhead

Query latency distribution

Conclusion

I built a comprehensive framework to measure Oracle Database 26ai's capability to generate SQL from natural language, utilizing 22 TPC-H benchmark queries. While this technology is highly promising and can save developers, engineers, and data scientists significant time, thorough validation remains essential.

One of the most critical observations in my research is the model's ability to generate efficient queries; for instance, where the ground truth was a broad COUNT(*), the AI intelligently selected only the specific columns required. 

Do note that even the latency metrics indicate the LLM takes time to generate SQL, it is important to understand that a developer would likely take longer to manually draft a complete, complex query.

I kept prompt engineering outside the current scope of my research, as it has the potential to drive even higher accuracy, making SELECT AI of Oracle 26ai an invaluable tool for software engineers.

 Refer to the GitHub code repo for full functional code: https://github.com/sanmish4ds/oracle26ai-eval/tree/main.

AI Oracle Database sql

Opinions expressed by DZone contributors are their own.

Related

  • Exploring the New Boolean Data Type in Oracle 23c AI
  • How To Call Cohere and Hugging Face AI From Within an Oracle Database Using JavaScript
  • AI Paradigm Shift: Analytics Without SQL
  • Custom Model Context Protocol (MCP) for NL2SQL: A Rigorous Evaluation Framework on Oracle Database

Partner Resources

×

Comments

The likes didn't load as expected. Please refresh the page and try again.

  • RSS
  • X
  • Facebook

ABOUT US

  • About DZone
  • Support and feedback
  • Community research

ADVERTISE

  • Advertise with DZone

CONTRIBUTE ON DZONE

  • Article Submission Guidelines
  • Become a Contributor
  • Core Program
  • Visit the Writers' Zone

LEGAL

  • Terms of Service
  • Privacy Policy

CONTACT US

  • 3343 Perimeter Hill Drive
  • Suite 215
  • Nashville, TN 37211
  • [email protected]

Let's be friends:

  • RSS
  • X
  • Facebook