Talk to Your Database
Talk to Your Database
In this article, let's take a look at a demo of how to talk to your database in plain English.
Join the DZone community and get the full member experience.Join For Free
DISCLAIMER: This post in based on personal experiences and the situations explained here may not apply in other context.
Figures displayed on the examples are just samples for demo purposes, not actual data.
In every company in the world, employees need access to information. Most companies purchase and install expensive software solutions or even spend years developing complex reporting systems on-site.
However, they all fall short satisfying user needs. They are either too complex, and non-technical people can’t understand how to use those tools, or they are too user-friendly and they lack the flexibility these users need.
So, how about removing these applications and let the people just ask what they need in plain English?
You may also like: Spring Data JPA and Database Relationships
In this post, I’ll explain a proof of concept of this approach for a retail clothing store.
1. The Data
The first thing you need is a place where you can have all your business data together.
In a best-case scenario, you have a corporate data warehouse where all the information is available under strict data quality measures. Another alternative is having an Operational Data Layer providing access to different data sources seamlessly.
Having this, you are able to run queries against a single data source or data service.
2. The Interface
OK, so how can non-technical people use all this data without a complex tool? Easy, just let them ask what they need in their own words.
After studying the domain and all its details, you can design a language model that represents your business reality. We have:
They represent the user's intention; what they want to get. For example, the intent for, “How much did we sell yesterday” might be Sales.getTotalSales.
As you can see, in this PoC, we split intents by domain. We have Sales.* and Stock.* groups. Then, each intent refers to a use case the same way you write methods in a class.
Among intentions, we had to detect specific entities the user may be referring to. For example, consider this:
Total units sold yesterday in the UK by product type and section.
Our user wants to get the units sold yesterday in the UK, grouped by product type (Clothing or Shoes) and section (Woman, Man or Kids).
- Attributes: group by fields, which are also projected
- Indicators: figures or similar, usually using a group function
- Filters: conditions used in the where clause
Once these entities are detected and classified, building the actual SQL query is quite easy.
3. LUIS: Language Understanding Intelligent Service
For this PoC, we’ll be using LUIS (https://luis.ai/) to perform the language analysis, it’s a great, simple-to-use service by Microsoft that fits great into our approach.
As we discussed early, we defined the entities in the language model:
- Sales.get: when a user wants to get sales-related data
- Stock.get: same for stock
For now, this modeling is really simple, as it’s on PoC. To make the system work, we just have to provide some examples for each intent and then train and publish the model. All this without leaving the dashboard.
And the entities as well, which you must be already familiar with: Attributes, Filters, and Indicators.
Once the model is defined, built, and trained, we can do some testing right in the web portal.
For us, it’d be perfect the user just wrote “section=1” in his phrase, however, this isn’t exactly natural language, right?
So we figured out a little workaround using LUIS, consisting of declaring the entities as these cryptic expressions and then giving plain English synonyms to each expression.
Using It as a Service
Another cool feature from LUIS is exposing this model as a service, allowing us to query some HTTP endpoint.
4. Putting It All Together
To test this entire flow, we’ll need a simple web app with a search box, where the business users can write their queries using their own words.
Submitting this query triggers the text analysis request, providing the application of the language analysis and the actual SQL query.
Afterward, the app runs this query on the database and sends the result set back to the front end.
We could just render a table with the resultset, but since we have the full language analysis, we can do better :).
By having a couple of rules, like rendering a line chart when the date attribute is present, we can suggest to the user different visualizations depending on the query attributes.
Thanks for reading!
Published at DZone with permission of Alejandro Martin . See the original article here.
Opinions expressed by DZone contributors are their own.