New AI Turns to Google to Help When It Realizes It Isn't Smart Enough

An AI that uses Google searches to augment its knowledge is outperforming existing AI techniques by over 10%, which raises interesting questions about how to improve existing ML.

Despite the vast quantities of data and information online, answering specific questions from it remains difficult. This is in large part because computers struggle to classify plain text, which is still a major challenge for AI researchers today.

A recent paper highlights a new approach to the extraction of this information that promises to turn conventional machine learning on its head.

A Fresh Approach

The conventional approach typically requires training an algorithm on labeled data so that it can search for patterns that match those supplied by human annotators. The general feeling is that the more data the algorithm is fed, the better equipped it will be to deal with new challenges.

The paper argues for the opposite approach, however, with the algorithms trained on minimal data.  This option is often forced on researchers because of a paucity of appropriate data.

“In information extraction, traditionally, in natural-language processing, you are given an article and you need to do whatever it takes to extract correctly from this article,” the authors say. “That’s very different from what you or I would do. When you’re reading an article that you can’t understand, you’re going to go on the web and find one that you can understand.”

Going Surfing

So that’s what the algorithm was programmed to do. It begins by assigning each classification a confidence score that reflects how accurate it believes the classification to be. If this score is too low, it automatically queries a search engine on the topic to draw on other text about it.

It looks at each search result, in turn, continuously re-evaluating the confidence score based upon the new information, and returning to the knowledge pool whenever the score remains too low.

“The base extractor isn’t changing,” the team say. “You’re going to find articles that are easier for that extractor to understand. So you have something that’s a very weak extractor, and you just find data that fits it automatically from the web.”

The interesting thing is that all of this takes place autonomously, from the evaluation of weaknesses to the construction of the search query to the whole cycle starting over again.
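The loop described above can be sketched in a few lines of Python. This is a minimal illustration and not the paper's implementation: the names `Extraction`, `extract_with_search`, `build_query`, `threshold`, and `max_articles` are all hypothetical, the extractor and search engine are passed in as plain functions, and the fixed confidence threshold is an assumption made for simplicity. The toy extractor and toy search at the bottom exist only to make the sketch runnable.

```python
from dataclasses import dataclass
from itertools import islice
from typing import Callable, Iterable, Optional


@dataclass
class Extraction:
    value: Optional[str]   # what the extractor pulled out, if anything
    confidence: float      # how sure the extractor is, from 0.0 to 1.0


def extract_with_search(
    document: str,
    extractor: Callable[[str], Extraction],
    search: Callable[[str], Iterable[str]],
    build_query: Callable[[str], str],
    threshold: float = 0.8,   # assumed cut-off, not taken from the article
    max_articles: int = 10,   # assumed cap on how many results to read
) -> Extraction:
    """Return the most confident extraction, reading search results if needed."""
    best = extractor(document)
    if best.confidence >= threshold:
        return best  # the original document was good enough

    # The extractor itself never changes; only the text it reads does.
    query = build_query(document)
    for article in islice(search(query), max_articles):
        candidate = extractor(article)
        if candidate.confidence > best.confidence:
            best = candidate
        if best.confidence >= threshold:
            break  # confident enough, stop reading further results
    return best


# Toy stand-ins so the sketch runs without a network connection.
def toy_extractor(text: str) -> Extraction:
    if "answer:" in text:
        return Extraction(text.split("answer:", 1)[1].strip(), 0.9)
    return Extraction(None, 0.2)


def toy_search(query: str) -> Iterable[str]:
    return ["nothing useful here", "answer: 42", "answer: 7"]


result = extract_with_search(
    "a vague article", toy_extractor, toy_search, build_query=lambda doc: doc
)
```

With these toy functions, `result.value` is `"42"`: the original document and the first search result both score too low, and the second result is the first one the weak extractor can read with confidence.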

In initial experiments, in which the system was given around 300 documents to begin with, the algorithm proved adept at choosing appropriate search terms to fill the gaps in its knowledge, mining on average around 10 articles from the web in order to do so.

When this was compared against more traditional machine-learning approaches on the same task, the new method outperformed them by around 10%.

Published at DZone with permission of Adi Gaskell, DZone MVB. See the original article here.
