Gleaning Insights From Content With IBM Watson

Want to see how quick and easy it is to use the Watson API in conjunction with Exaptive Studio to create a word cloud that can discern between people, keywords, and concepts?

In recent years, machine learning as a service has come of age, with robust capabilities from Amazon, Google, Microsoft, and others now available through REST APIs for a fraction of the cost of developing or deploying your own. One of the better-known offerings, though one where it is not always easy to separate hype from reality, is IBM Watson. While Watson gained fame as the Jeopardy!-winning supercomputer, IBM now uses the brand for a wide variety of machine learning capabilities, from speech-to-text and conversational bots to text mining algorithms for understanding the concepts, references, and tone of text-based content.

In this post, we'll cover how to integrate one such IBM service – Natural Language Understanding – and rapidly prototype an application that you can try on your own content. It includes how to get started with IBM's hosted service Bluemix and the Python code to connect to the REST API. I've also included a working data application that you can run with your own text. 

The code and application below use an API username and password for IBM Watson's free plan, which is limited to a few hundred requests per day. You can sign up for your own username and password by visiting IBM Bluemix. You can use their services for a one-month trial, after which you will need to provide a credit card (even for the free plan). Note that IBM Bluemix is a full application development, compute, and storage environment akin to Amazon's AWS or Microsoft Azure. It has starter components as well, intended to make application development simpler, like Salesforce's Heroku platform.

Once you have valid API credentials, you can connect to Watson services using any programming language capable of making simple web-based REST API calls. We use Python because of its ubiquity as a data programming language, though similar code would work fine in most other languages. IBM also offers SDKs for Python and other languages that wrap the API calls for you.
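One way to keep those credentials out of your source code is to read them from environment variables. The following is a minimal sketch of that idea, not something IBM prescribes: the variable names WATSON_NLU_USERNAME and WATSON_NLU_PASSWORD and the helper load_watson_credentials are hypothetical names chosen for illustration.

```python
import os


def load_watson_credentials(env=os.environ):
    """Return (username, password) read from environment variables.

    The variable names are hypothetical, picked for this sketch only.
    """
    username = env.get("WATSON_NLU_USERNAME")
    password = env.get("WATSON_NLU_PASSWORD")
    if not username or not password:
        # Fail early with a clear message instead of sending a bad request.
        raise RuntimeError(
            "Set WATSON_NLU_USERNAME and WATSON_NLU_PASSWORD first"
        )
    return username, password
```

The returned pair could then be passed to HTTPBasicAuth in place of hard-coded values.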

Each service has a different endpoint and slightly different parameters for passing content and configuration options. For this post, we include information on how to use the Natural Language Understanding (NLU) service. Consult IBM's developer cloud documentation for Watson services to learn about other capabilities.

The following Python function connects to the NLU service, sends the text of interest, and returns the response object as a Python dictionary.

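import json
import time

import requests
from requests.auth import HTTPBasicAuth

# Replace these placeholders with the credentials from your own Bluemix account.
YOUR_API_USERNAME = 'your-api-username'
YOUR_API_PASSWORD = 'your-api-password'


def call_watson_api(text):
    # Wait one second before each request as a simple throttle.
    time.sleep(1)

    url_base = 'https://gateway.watsonplatform.net/natural-language-understanding/api/v1/analyze?version=2017-02-27'

    # Each key names a feature to extract; each value holds that feature's options.
    features = {}

    features['keywords'] = {
        "sentiment": True,
        "limit": 10
    }

    features['concepts'] = {
        "limit": 10
    }

    features['entities'] = {
        "sentiment": False,
        "limit": 10
    }

    # Request body: the text to analyze plus the features to run on it.
    data = {
        "text": text,
        "features": features
    }

    creds = HTTPBasicAuth(YOUR_API_USERNAME, YOUR_API_PASSWORD)

    headers = {'Content-Type': 'application/json'}

    # POST the JSON body with basic auth and parse the JSON response into a dict.
    res = requests.post(url_base, data=json.dumps(data), auth=creds, headers=headers)
    res_json = res.json()
    return res_json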

In this instance, we configured the algorithm to pick up concepts, keywords, and entities (people, places, companies, etc.). The NLU service can also detect emotion and sentiment, relationships, and other text-based features; you request them by adding entries to the features object. The service returns a dictionary whose keys are the requested items (concepts, etc.) and whose values are lists of discovered items, each with a name, type, and relevance, among other attributes.
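To make that response shape concrete, here is a rough sketch that flattens the dictionary into a single list of items. It assumes each discovered item stores its name under a "text" key and its score under "relevance", and that only entities carry a "type"; check those field names against a real response before relying on them. The helper flatten_watson_response and the sample data are hypothetical and hand-written, not real Watson output.

```python
def flatten_watson_response(response):
    """Turn the NLU response dictionary into one flat list of items."""
    items = []
    for feature in ("concepts", "keywords", "entities"):
        for found in response.get(feature, []):
            items.append({
                "text": found.get("text"),
                # Items without a type fall back to the feature name.
                "type": found.get("type", feature),
                "relevance": found.get("relevance", 0.0),
            })
    return items


# Hand-written sample in the assumed shape (not real service output).
sample = {
    "concepts": [{"text": "Machine learning", "relevance": 0.9}],
    "keywords": [{"text": "REST APIs", "relevance": 0.7}],
    "entities": [{"text": "IBM", "type": "Company", "relevance": 0.8}],
}
items = flatten_watson_response(sample)
```

A flat list like this is convenient to hand to a visualization, since every item then has the same three fields regardless of which feature produced it.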

The relevance score gives you a sense of both how confident the algorithm is in what it discovered and how prominent the item is in the content. The score ranges from 0 (little confidence/prominence) to 1 (very high confidence/prominence).
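Because the score runs from 0 to 1, it is easy to use as a cutoff. The sketch below keeps only the items at or above a threshold and sorts them from most to least relevant. The function name top_items, the 0.5 default threshold, and the limit of 10 are arbitrary choices for illustration, not values recommended by IBM.

```python
def top_items(items, min_relevance=0.5, limit=10):
    """Keep items at or above min_relevance, most relevant first."""
    kept = [
        item for item in items
        if item.get("relevance", 0.0) >= min_relevance
    ]
    kept.sort(key=lambda item: item.get("relevance", 0.0), reverse=True)
    return kept[:limit]
```

Raising the threshold trades recall for precision: fewer items survive, but those that do are the ones the service scored as most confident and prominent.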

Like other text mining algorithms, Watson's NLU service works best on modern everyday communication. The service's usefulness out of the box is lower for outdated text or industry-specific cases like law or medicine. In those instances, you will likely need to either build a custom model, which Watson provides facilities for, or do pre- and post-processing to remove noise and otherwise improve upon the service.
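As a small illustration of what pre- and post-processing might look like, the sketch below strips HTML tags and extra whitespace before the text is sent, and drops unwanted terms from the results afterward. Both helpers, clean_text and drop_noise, are hypothetical; real noise removal depends heavily on your content, and the regular expression here is a crude tag stripper, not a full HTML parser.

```python
import html
import re


def clean_text(raw):
    """Pre-process: remove HTML tags, decode entities, collapse whitespace."""
    text = re.sub(r"<[^>]+>", " ", raw)   # crude tag removal
    text = html.unescape(text)            # e.g. &amp; becomes &
    text = re.sub(r"\s+", " ", text)      # collapse runs of whitespace
    return text.strip()


def drop_noise(items, noise_terms):
    """Post-process: drop items whose text is in a custom noise list."""
    noise = {term.lower() for term in noise_terms}
    return [
        item for item in items
        if (item.get("text") or "").lower() not in noise
    ]
```

The noise list is something you would build up by hand after looking at a few responses for your own content.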

So now it's time to give the API a whirl. Copy and paste some text into the application below and click the button. The word cloud will show you items that Watson discovered, coloring them by type and giving more prominence to those with higher relevance scores. Using just a handful of pre-built components, I built this application in about 15 minutes. (You can view it full screen and edit your own version by signing up for the community edition of the Exaptive Studio with this link, which adds the application and its components to your studio so you can get started quickly.)
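The mapping from relevance to prominence and from type to color can be sketched in a few lines. This is not how the Exaptive word cloud component is implemented; it is an illustrative guess that scales font size linearly with relevance. The function word_cloud_entries, the size range, the palette, and its type names are all assumptions made for this example.

```python
DEFAULT_COLOR = "#7f7f7f"  # gray for any type missing from the palette


def word_cloud_entries(items, min_size=12, max_size=48, palette=None):
    """Map each item to a font size (by relevance) and a color (by type)."""
    if palette is None:
        # Hypothetical type names and colors, for illustration only.
        palette = {
            "concepts": "#1f77b4",
            "keywords": "#2ca02c",
            "Person": "#d62728",
        }
    entries = []
    for item in items:
        # Clamp relevance into the 0 to 1 range before scaling.
        relevance = min(max(item.get("relevance", 0.0), 0.0), 1.0)
        size = round(min_size + relevance * (max_size - min_size))
        entries.append({
            "text": item.get("text"),
            "size": size,
            "color": palette.get(item.get("type"), DEFAULT_COLOR),
        })
    return entries
```

A linear scale is the simplest choice; if most scores cluster near the top of the range, a steeper curve may separate them better.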

NOTE: The application is only pictured below, but doesn't function on DZone; clicking on it will take you to the actual application that is fully functional and embedded in the original blog post. 

[Image: example input and output from the NLU text analysis application]

Above, you'll see an example of input and output from the NLU text analysis. Click on the image to test the fully interactive application for yourself on the original blog post.



Here's a dataflow diagram of what's going on under the hood of the xap. Dataflow programming works well for quickly communicating and digesting how an application works. Follow the flow left to right.

[Image: dataflow diagram of the xap]

Pretty simple. Watson did most of the work. My script above is encapsulated in the component "IBMWatson..." All I had to do was add a visualization, the word cloud, and some UI elements to make IBM Watson accessible to just about anyone. What a world?! If you decide to iterate on it yourself in the Studio, let me know what you do with it!

Published at DZone with permission of Matt Coatney, DZone MVB. See the original article here.
