Watson NLU-Based Q+A System for Online TV Guide
With this solution, you don't need to do any work to find out information like name, time, date, and channel for your favorite TV shows — the AI will do it for you!
Note: Previous experience with Python development is a bonus.
This recipe shows how to use Knowledge Studio to create a machine learning annotator model and use it to extract entities on questions about a TV guide service.
- Bluemix account; sign up for a trial account if you do not already have one.
- Knowledge Studio account; sign up for a trial account if you do not already have one.
- Python 2.7
- Get the Watson Developer Cloud Python SDK
Let's go through all the steps to build our project!
It is assumed that you have an IBM Bluemix account. Sign in to your account, select Catalog, and search for Understanding. The service that we are interested in is Natural Language Understanding.
Select the Natural Language Understanding service and create a new instance.
Click Service Credentials and then View Credentials to get the details (such as the username and password) that we need to populate the Python script.
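As a small sketch of how those details might be carried into a script: the snippet below parses a credentials JSON document into a dictionary. It assumes the JSON shown by View Credentials carries url, username, and password fields; the helper name load_credentials and the sample values are placeholders, not part of the original scripts.

```python
import json


def load_credentials(credentials_json):
    """Parse the JSON copied from View Credentials into a plain dict.

    Assumption: the JSON has "url", "username", and "password" keys.
    """
    creds = json.loads(credentials_json)
    return {
        "url": creds["url"],
        "username": creds["username"],
        "password": creds["password"],
    }


# Placeholder values only; paste the JSON shown for your own service instance.
sample = ('{"url": "https://example.invalid/api", '
          '"username": "my-username", "password": "my-password"}')
credentials = load_credentials(sample)
```

Keeping the credentials in one dictionary makes it easy to move them out of the source file later, for example into an environment variable or a local file that is not committed.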
Configure Machine Learning Annotator in IBM Knowledge Studio
As defined in the Requirements section, it is assumed that you have a Knowledge Studio account. Sign in to your account and launch Knowledge Studio.
Add Entity Types
The first step is to click Type System > Add Entity Type. We defined ten entity types; the four with the prefix Req_ correspond to the four question types (time, channel, date, and TV program).
Import Documents for Annotation
The next step is to click Documents > Import Document Set. These documents contain questions about the TV programs.
Following is an example of the document's contents:
Create Annotation Set
The system needs human annotators to identify all entities in each document containing questions. You must create an annotation set and assign a human annotator to it. Finally, you will need to add annotation tasks to track the work of human annotators in Watson Knowledge Studio. For more information, please visit the Watson Knowledge Studio documentation.
Open the Test annotation task you created above and select each document containing questions. We annotated all TV show names with the Program entity type, and information about time, date, and channel with Tiempo, Fecha, and Canal respectively. Finally, requests for time, date, and channel information were annotated with the Req_Tiempo, Req_Fecha, and Req_Canal entity types respectively, and requests for a program with the fourth Req_ entity type.
Create a Machine Learning Annotator Model
We trained and evaluated the machine learning annotator model with 70% and 23% of the questions respectively, and kept the remaining 7% as a blind set.
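The 70/23/7 split can be illustrated in a few lines of Python. This is only a sketch of the arithmetic, not how Knowledge Studio partitions documents internally; the function name split_questions, the fixed seed, and the placeholder questions are assumptions.

```python
import random


def split_questions(questions, train=0.70, test=0.23, seed=0):
    """Shuffle the questions and split them into train, test, and blind sets.

    Whatever is left after the train and test shares becomes the blind set.
    """
    shuffled = list(questions)
    random.Random(seed).shuffle(shuffled)
    n_train = int(round(len(shuffled) * train))
    n_test = int(round(len(shuffled) * test))
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_test],
            shuffled[n_train + n_test:])


# Hypothetical question set of 100 placeholder strings.
questions = ["question %d" % i for i in range(100)]
train_set, test_set, blind_set = split_questions(questions)
```

With 100 questions this yields 70 for training, 23 for testing, and 7 held back as the blind set.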
Deploy the Machine Learning Annotator Model
Go to Machine Learning Annotator Model Details > Deploy > Natural Language Understanding.
Click on the NLU link to get the model ID that we need to populate the Python script.
Back-End on Python
We separated the back-end code into three Python scripts (nlu.py, tvguide.py, and Respond.py). nlu.py and tvguide.py are interfaces for the Watson NLU and TV guide services, respectively, and Respond.py is the service for questions and answers regarding TV programs, which mashes up the information given by the nlu.py and tvguide.py Python scripts.
Use NLU Machine Learning Model
We created an interface to the Natural Language Understanding API using a Python script (nlu.py) and imported the NLU client class from the Watson Developer Cloud Python SDK. Within the nlu.py script, in the function text_analisis, we added username, password, and version variables with the credential information and a model variable with the model ID. This function cleans up the entity information and returns a list of dictionaries with entity names and values.
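As a rough sketch of the clean-up step in text_analisis: the function below takes an NLU-style response and reduces it to a list of dictionaries with entity names and values. The helper name clean_entities, the output keys, and the sample response are assumptions for illustration. The call to the service itself is left out, and only the entities, type, and text fields of the response are used.

```python
def clean_entities(response):
    """Reduce an NLU-style response to [{'entity': ..., 'value': ...}, ...]."""
    cleaned = []
    for item in response.get("entities", []):
        # Keep only the entity type and the matched text; drop everything else.
        cleaned.append({"entity": item["type"], "value": item["text"]})
    return cleaned


# Hypothetical response for a question asking at what time the news is on.
sample_response = {
    "language": "es",
    "entities": [
        {"type": "Req_Tiempo", "text": "hora", "count": 1},
        {"type": "Program", "text": "noticias", "count": 1},
    ],
}

entities = clean_entities(sample_response)
```

Returning plain dictionaries keeps the rest of the back end independent of the SDK's response format.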
Use Online TV Guide Service
We created an interface (tvguide.py) to get information about TV programs. We used a REST API service that gives information (name, time, date, and channel) about each TV show in the TV guide. The function BuscarProgramas in the tvguide.py script receives keywords about TV shows (for example, noticias, Spanish for "news") and returns a data structure with information about the matching TV shows and their corresponding schedules.
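The keyword lookup can be sketched without the real service. In the snippet below, an in-memory list stands in for the REST response, and buscar_programas is a hypothetical stand-in for BuscarProgramas; the record keys (name, time, date, channel) and the sample shows are assumptions based on the fields the article lists.

```python
# Hypothetical guide data standing in for the REST service's response.
GUIDE = [
    {"name": "Noticias de la noche", "time": "21:00",
     "date": "2017-06-01", "channel": "5"},
    {"name": "Cine del domingo", "time": "16:00",
     "date": "2017-06-04", "channel": "7"},
]


def buscar_programas(keywords, guide=GUIDE):
    """Return the guide entries whose name contains every keyword.

    Matching is case-insensitive and done on the show name only.
    """
    wanted = [k.lower() for k in keywords]
    return [show for show in guide
            if all(k in show["name"].lower() for k in wanted)]


matches = buscar_programas(["noticias"])
```

Here matches holds the single news entry; a keyword that appears in no show name yields an empty list.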
The response is generated by a Python script (Respond.py). This script sends the question given by the user to Watson NLU and receives a data structure with the detected entity types. The entity information is sent to the TV schedule service as keywords, and in return the script receives a data structure with information about TV shows and their corresponding schedules.
The response is a concatenation of the TV show information and a predesigned script. The result is a sentence such as "#T.Vshow saldrá el día #date a las #time por el canal #channel" (in English, "#T.Vshow will air on #date at #time on channel #channel").
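Filling such a predesigned sentence can be sketched as plain placeholder substitution. The function name build_answer and the keys of the show record are assumptions; the actual Respond.py may assemble the sentence differently.

```python
# -*- coding: utf-8 -*-
TEMPLATE = u"#T.Vshow saldrá el día #date a las #time por el canal #channel"


def build_answer(show, template=TEMPLATE):
    """Fill the placeholders of the template with one show's schedule."""
    replacements = [
        ("#T.Vshow", show["name"]),
        ("#date", show["date"]),
        ("#time", show["time"]),
        ("#channel", show["channel"]),
    ]
    answer = template
    for placeholder, value in replacements:
        answer = answer.replace(placeholder, value)
    return answer


# Hypothetical show record, shaped like one entry from the TV guide interface.
show = {"name": "Noticias", "date": "2017-06-01",
        "time": "21:00", "channel": "5"}
answer = build_answer(show)
```

A fuller version would pick the template according to the detected Req_ entity type, so that a question about the channel only answers with the channel.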
And that's it!