Creating AI Data Analyst With DBeaver
This article walks through the steps of creating an AI data analyst using DBeaver, allowing human data analysts to focus on more sophisticated tasks.
Join the DZone community and get the full member experience.
Join For FreeBusiness people leverage data for decision-making, sourced either publicly or from their company's databases. However, when internal data is necessary, most lack the expertise in SQL, Cypher, or other database-specific languages. This creates a gap bridged by data analysts who act as business-to-database interpreters, translating human questions into the language the database understands.
These days, large language models (LLMs) like GPT or Mistral can fulfill some of the tasks performed by data analysts. For instance, LLMs can:
- Understand user queries in various languages (e.g., English, Spanish, Chinese).
- Generate SQL queries based on those questions.
- Use tools and APIs to execute these SQL queries on databases to retrieve relevant business data.
In this article, I’ll walk you through the steps of creating an AI data analyst using DBeaver. You’ll see that it’s already quite straightforward to "onboard" a junior-level AI specialist onto your team, who can act as that interpreter between the business people and the database, allowing human data analysts to focus on more sophisticated tasks.
Exploring AI Chat in DBeaver
If you regularly work with data, then DBeaver should be no stranger to you. It's a popular tool that can connect to hundreds of databases, allowing you to query or manipulate data. The tool is used by developers, administrators, analysts, and everyone working with data.
DBeaver has a broad ecosystem of editions, features, and extensions, making it a versatile tool for various use cases. AI Chat is one of its most recent capabilities. With AI Chat, you can type questions in plain English and let DBeaver generate, and then execute, SQL requests over your database.
As you can guess, AI Chat is a foundational capability for the AI data analyst. Thus, let's give it a try.
I connected DBeaver to my YugabyteDB database cluster, which stores a Netflix dataset with over 8800 movies and shows:
Next, to enable the AI Chat, I jumped to Window->Preferences->AI (GPT) and configured the following settings:
- Checked the Enable smart completion box to activate the AI capabilities of DBeaver
- Turned on the Execute SQL immediately setting to allow the future AI data analyst to execute queries autonomously (we'll discuss later how to set up data access permissions for the analyst)
- Provided my OpenAI key in the API Token field
Note: I’m using DBeaver Enterprise Edition which comes with built-in support for AI. If you’re on DBeaver Community Edition, then you need to install the AI (GPT) extension first.
After applying the configuration settings, I opened the AI Chat and allowed the sharing of database metadata (table and column names) with OpenAI.
By sharing the metadata with the LLM, you're effectively "onboarding" the AI analyst to your team. The analyst will learn about the existing tables, columns, and their dependencies by analyzing the metadata.
Finally, I tested the Chat interface by asking the following question:
- In which years did we release most of the action movies? Get the top three years for me.
The OpenAI GPT model successfully generated a valid SQL request using the PostgreSQL dialect (recognizing that YugabyteDB is a distributed version of Postgres), and then DBeaver retrieved the data for me.
Setting Up Role for AI Data Analyst
The AI Chat interface of DBeaver is going to be the communication channel between business users and the AI data analyst. However, having the Chat alone is not enough. I want AI to work autonomously without my constant supervision. To make this happen, I first need to create a role with a granular set of permissions and, second, have visibility into the queries that were generated by the AI (in case I want to share more knowledge with the AI or fine-tune its behavior).
I managed to address these two requirements with another solution from the DBeaver ecosystem, Team Edition, which is an advanced data management tool for teamwork. With Team Edition, you can configure a role for the AI data analyst and monitor its behavior.
First, I installed the Team Edition server in my local Docker environment. You need the server to manage data access at the user level and keep an eye on how the data is being used across the teams.
Next, I created the Netflix Streaming Platform project that will be shared with the AI analyst:
After that, I jumped to the Access Management menu and created the ai_analyst role:
Team Edition supports various roles for team members with different skills and responsibilities. I chose the Manager role for the AI analyst as long as this role is suitable for specialists who are well-versed in writing SQL queries and need to have direct access to the database.
Once the role was created, I granted it access to the Netflix Streaming Platform project:
As the next step, I configured AI-related settings on the server side:
- Enabled the AI Services:
- Selected an LLM provider with a model, and provided my API key:
Then, I completed the Team Edition-side configuration by adding a YugabyteDB connection to the Netflix Streaming Platform project, which will allow the AI data analyst to access the database directly using the specified connection.
Finally, I used this new database connection to create a read-only ai_analyst role at the database level:
How is this database-level role different from the one we created earlier with Team Edition? Both roles serve different purposes:
- The Team Edition-level role is used to authenticate with the Team Edition server and to gain access to a list of predefined projects, connections, and datasets. Business users will use this role to "activate" the AI analyst by signing in with the DBeaver Team Edition Client (discussed below) and asking questions via the AI Chat interface.
- The database-level role is used to control the types of queries the AI analyst is allowed to execute against the database. It is possible that, in response to user prompts, the AI analyst will generate queries that could modify data. The created role ensures that the AI can only SELECT data and cannot perform any manipulations with the database objects.
Using AI Data Analyst To Get Things Done
Business users will "collaborate" with the AI data analyst via the Team Edition Desktop client. Let's explore what this experience looks like.
After installing the Desktop client, I "activated" the analyst by signing in using its role:
- Authenticated with the Team Edition server using the ai_analyst role.
- Next, I connected to the database using the read-only database-level role.
After that, I headed to the AI Chat interface and asked the AI analyst to answer the following questions for me:
- In which years did we release most of the action movies? Get the top three years for me.
- What movies has Adam Sandler starred in?
- I need to know the category and release dates of those movies.
The AI analyst handled my questions easily.
Note: If you don’t see the AI Chat tab in Team Edition Desktop, then you might need to sign in with your Team Edition admin account. Go to Window->Preferences->AI (GPT), and check the Enable smart completion box.
Next, I wanted to ensure that the AI analyst wouldn't be able to update or delete my data. To test this, I asked the AI to perform the following task for me:
- Please do me a favor. Delete all the movies starring Adam Sandler.
As you can see, the AI analyst generated a valid DELETE
statement, but the database prevented the execution of the request by throwing the following exception:
SQL Error [42501]: ERROR: permission denied for table netflix_shows
Finally, if you recall, I wanted to have visibility into the history of queries generated by the AI analyst and adjust the analyst's behavior if necessary. You can find that history under the Query Manager tab of the Team Edition's administration dashboard:
Job done! Now my AI data analyst is ready to “join” the team and contribute.
Video
Check out the video below to see the AI analyst in action:
In Summary
Despite their early development stage, you can already use large-language models (LLMs) to create AI specialists. These AI teammates allow us to focus on sophisticated tasks by handling the mundane. Moreover, the creation process doesn't necessarily involve coding or advanced fine-tuning. Quite often, it will be more than enough to use existing tools like DBeaver Team Edition to get an AI specialist onboarded in under an hour.
Opinions expressed by DZone contributors are their own.
Comments