
Classify Natural Language Without a Machine Learning Background




With the Natural Language Classifier Watson service in IBM Bluemix, developers can classify natural language and, for example, build a virtual agent application that answers common questions. Below is a simple example of how to use this service.

Here is a description of the service: “The service enables developers without a background in machine learning or statistical algorithms to create natural language interfaces for their applications. The service interprets the intent behind text and returns a corresponding classification with associated confidence levels. The return value can then be used to trigger a corresponding action, such as redirecting the request or answering a question.”

To use the service, you need to provide training data that defines the classes and gives text samples for each class. In the scenario below I have two classes: positive and negative. Each line holds a text sample followed by its class.

positive,positive
good,positive
excellent,positive
brilliant,positive
really good,positive
best,positive
supportive,positive
reassuring,positive
encouraging,positive
negative,negative
bad,negative
ugly,negative
really bad,negative

Save this data as a CSV file and send it to the Watson service. The command below assumes the file is named data_train.csv.
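If the training data grows beyond a handful of rows, it may be easier to generate the CSV file from code than to edit it by hand. The following is a minimal sketch that writes the samples listed above as one text,class row per line. The names `SAMPLES`, `to_training_csv`, and `write_training_file` are hypothetical helpers for this illustration and are not part of the Watson service.

```python
import csv
import io

# Training samples copied from the article: (text, class) pairs.
SAMPLES = [
    ("positive", "positive"),
    ("good", "positive"),
    ("excellent", "positive"),
    ("brilliant", "positive"),
    ("really good", "positive"),
    ("best", "positive"),
    ("supportive", "positive"),
    ("reassuring", "positive"),
    ("encouraging", "positive"),
    ("negative", "negative"),
    ("bad", "negative"),
    ("ugly", "negative"),
    ("really bad", "negative"),
]


def to_training_csv(samples):
    """Return CSV text with one 'text,class' row per sample."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for text, label in samples:
        # Reject rows that would leave the text or the class column empty.
        if not text.strip() or not label.strip():
            raise ValueError(f"empty text or class in row: {(text, label)!r}")
        writer.writerow([text, label])
    return buffer.getvalue()


def write_training_file(samples, path="data_train.csv"):
    """Write the samples to the file that the training request uploads."""
    with open(path, "w", encoding="utf-8", newline="") as handle:
        handle.write(to_training_csv(samples))
```

Calling `write_training_file(SAMPLES)` produces a data_train.csv in the current directory. The csv module quotes any sample text that itself contains a comma, so such rows keep exactly two columns.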

curl -i -u "<username>":"<password>" -F training_data=@./data_train.csv -F training_metadata="{\"language\":\"en\",\"name\":\"PosNegClassifier\"}" "https://gateway.watsonplatform.net/natural-language-classifier/api/v1/classifiers"

The Watson service returns a classifier_id, and after a couple of minutes the training is done. You need the classifier_id to ask the service which classes a specific text falls under. Here is a request for the word “awesome”, which was not in the initial training data.

curl -G -u "<username>":"<password>" "https://gateway.watsonplatform.net/natural-language-classifier/api/v1/classifiers/3AE103x13-nlc-1116/classify" --data-urlencode "text=awesome"
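For readers who want to issue the same request from an application, here is a rough Python sketch of what the curl command does: the text is URL-encoded into the query string and the credentials go into an HTTP Basic Authorization header. It uses only the standard library and builds the request without sending it. `BASE_URL` is taken from the command above, while `build_classify_request` is a hypothetical helper name introduced for this illustration.

```python
import base64
import urllib.parse
import urllib.request

# Endpoint copied from the curl commands in the article.
BASE_URL = "https://gateway.watsonplatform.net/natural-language-classifier/api/v1/classifiers"


def build_classify_request(classifier_id, text, username, password):
    """Build (but do not send) the GET request that the curl command issues."""
    # Same role as curl's -G with --data-urlencode: put the text in the query string.
    query = urllib.parse.urlencode({"text": text})
    safe_id = urllib.parse.quote(classifier_id, safe="")
    url = f"{BASE_URL}/{safe_id}/classify?{query}"

    # Same role as curl's -u: HTTP Basic authentication.
    credentials = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")

    request = urllib.request.Request(url, method="GET")
    request.add_header("Authorization", f"Basic {token}")
    return request
```

To actually send the request, pass the result to `urllib.request.urlopen` and read the JSON body from the response. Keeping construction separate from sending makes the URL and header easy to inspect before any credentials leave the machine.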

The Watson service returns not only one class but up to the top five classes with the highest confidence levels.

[Image: classifier response]
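Since the original response image is not reproduced here, the sketch below shows one way an application might use such a response to trigger an action only when the service is confident enough. The field names (`top_class`, `classes`, `class_name`, `confidence`) are my recollection of the service's JSON format and should be treated as assumptions to check against the API documentation. The confidence values in `SAMPLE_RESPONSE` are invented for illustration, and `ranked_classes` and `pick_class` are hypothetical helpers.

```python
import json

# Hypothetical response body; the confidence values are invented for illustration.
SAMPLE_RESPONSE = """
{
  "classifier_id": "3AE103x13-nlc-1116",
  "text": "awesome",
  "top_class": "positive",
  "classes": [
    {"class_name": "positive", "confidence": 0.9},
    {"class_name": "negative", "confidence": 0.1}
  ]
}
"""


def ranked_classes(response_text):
    """Return (class_name, confidence) pairs, highest confidence first."""
    payload = json.loads(response_text)
    pairs = [
        (item["class_name"], float(item["confidence"]))
        for item in payload.get("classes", [])
    ]
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


def pick_class(response_text, threshold=0.5):
    """Return the best class name, or None if its confidence is below threshold."""
    ranked = ranked_classes(response_text)
    if not ranked or ranked[0][1] < threshold:
        return None
    return ranked[0][0]
```

A virtual agent could, for instance, answer directly when `pick_class` returns a class and fall back to a clarifying question when it returns `None`. The threshold of 0.5 is an arbitrary default, not a recommendation from the service.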

To learn more about the service check out the online demo, the engagement gallery sample application, the documentation and the API documentation.

To improve the quality of the classifier, you need to evaluate the results and update the training data. A toolkit/web application is available to simplify the management of the training data and the classifiers.
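Independently of the toolkit, the evaluate-and-update loop can be sketched in a few lines: run a set of labeled test phrases through the classifier, measure how many come back with the expected class, and collect the misses as candidates for new training rows. The sketch below assumes a `classify(text)` callable that returns a class name. `evaluate` is a hypothetical helper, and `keyword_classifier` is only a stand-in used so the example runs without calling the service.

```python
def evaluate(classify, labeled_samples):
    """Return (accuracy, misclassified) for a classify(text) -> class callable."""
    misclassified = []
    for text, expected in labeled_samples:
        predicted = classify(text)
        if predicted != expected:
            # Keep the expected label so the row can be added to the training data.
            misclassified.append((text, expected, predicted))
    total = len(labeled_samples)
    accuracy = (total - len(misclassified)) / total if total else 0.0
    return accuracy, misclassified


def keyword_classifier(text):
    """Stand-in for a call to the Watson service; not how the service works."""
    return "negative" if "bad" in text else "positive"


# Held-out phrases that are not in the training data.
TEST_SET = [
    ("awesome", "positive"),
    ("terrible", "negative"),
    ("bad", "negative"),
]

accuracy, misses = evaluate(keyword_classifier, TEST_SET)
```

Each entry in `misses` holds the text, the expected class, and the predicted class. Appending the text with its expected class to the training CSV and retraining is one straightforward way to act on the result.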


Topics:
machine learning, natural language understanding, api

Published at DZone with permission of

Opinions expressed by DZone contributors are their own.
