Intelligent Agents: Machine Reading Comprehension
Read on in order to discover more information on AI models and machine reading comprehension. Are AI models surpassing humans?
On 5th of January this year, an AI model, for the first time, outperformed humans in reading comprehension. The SLQA+ (ensemble) model from Alibaba recorded an exact match score of 82.44 against the human score of 82.304, on the SQuAD dataset.
It turns out that the Microsoft r-net+ (ensemble) model had achieved 82.650 two days prior to that, and since then two other models have also gone on to beat the human exact match (EM) score. While none of the models has yet beaten the human F1 score of 91.21 (F1 combines precision and recall), these events underline the frantic pace at which reading comprehension (RC) models are evolving. That is great news, because RC is a key element of intelligent agent systems.
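To make the two metrics concrete, here is a minimal sketch of how EM and F1 can be computed for a single prediction, written in the spirit of the SQuAD evaluation. The function names and the normalization details are my own assumptions for illustration, and the leaderboard additionally takes the best score over several reference answers per question, which this sketch leaves out.

```python
import re
import string
from collections import Counter


def normalize(text: str) -> str:
    """Lowercase, strip punctuation, drop English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, truth: str) -> float:
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(truth))


def f1_score(prediction: str, truth: str) -> float:
    """Token-level F1: harmonic mean of precision and recall over shared tokens."""
    pred_tokens = normalize(prediction).split()
    true_tokens = normalize(truth).split()
    if not pred_tokens or not true_tokens:
        # Both empty counts as a match; one empty counts as a miss.
        return float(pred_tokens == true_tokens)
    # Multiset intersection counts each shared token at most as often as it appears in both.
    overlap = sum((Counter(pred_tokens) & Counter(true_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(true_tokens)
    return 2 * precision * recall / (precision + recall)
```

EM is all-or-nothing, while F1 gives partial credit: an answer of "in Paris France" against the reference "Paris" scores 0 on EM but 0.5 on F1, which is one reason the two human baselines differ so much.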
Intelligent Agents & Machine Reading Comprehension
Building intelligent agents that can answer open-domain and even closed-domain questions with high accuracy has been a key goal of most AI labs. Intelligent agents with RC and question-answering (QA) abilities can help AI personal assistants like Alexa, Google Assistant, Siri, and Cortana perform better. They can also help enterprises deploy intelligent agent bots that supplement human agents or directly process chat and messaging traffic, and maybe even voice to some extent.
Machine Comprehension/Machine Reading Comprehension/Machine Reading models enable computers to read a document and answer general questions against it. While this is a relatively elementary task for a human, it's not that straightforward for AI models. There are multiple NTM (Neural Turing Machine), Memory Network, and Attention models for reading comprehension available. The list of SQuAD models can be accessed here.
As a first step towards building our intelligent agent system (humanly.ai), we are also building a machine reading system. Our implementation is based on the BiDAF (Bi-Directional Attention Flow) ensemble model and Textual Entailment. It's still a work in progress (EM 67% and F1 77%), and sometimes it gives funny answers, but you can try it out here.
One of the basic challenges we faced was handling questions that require a yes/no type of answer, which calls for a further inference between the question, the answer, and the document. That is why we implemented the Textual Entailment module. The other observation was that the system should respond in full sentences ("Yes, Narendra Modi is the Prime Minister of India" instead of just "Yes" to the question "Is Narendra Modi the Prime Minister of India?"). For that, we are planning to implement a Seq2Seq model to format our responses in the next product increment.
But one major challenge that all machine reading systems face, especially in practical implementations for specific domains or verticals, is the absence of supervised learning data (labeled data) for that domain. All the contemporary reading comprehension models are built on supervised training data: labeled questions and answers, a paragraph containing the answer, and so on. So when it comes to new domains, while enterprises have artifacts and data, the absence of labeled data presents a challenge.
We are currently experimenting with an ensemble of machine reading comprehension models, each trained on a specific dataset, so that the learning is incremental. While the scores are improving for the model, the need for labeled domain data to train the MRC model in the first place still persists. For this problem, I came across two very neat solutions from Microsoft that attempt domain transfer, SynNet and ReasoNet, which we intend to explore further.
The 'two-stage Synthesis Networks' or SynNet model first gets trained on supervised data for a given vertical and learns the technique to identify patterns for critical information (named entities, knowledge points etc.) and then generates questions around these answers. Once trained, it can then generate pseudo-questions and answers against artifacts for the new domain. These can then be used to train the MRC on the new domain.
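The two-stage data flow can be sketched with toy stand-ins. This is only an illustration of the pipeline shape, not SynNet itself: in the real model both stages are learned networks, whereas here stage one is a regular expression that picks capitalized spans and stage two is a fill-in-the-blank template. All function names are hypothetical.

```python
import re
from typing import Dict, List


def propose_answers(paragraph: str) -> List[str]:
    """Stage 1 stand-in: treat runs of capitalized words as candidate answers."""
    return re.findall(r"[A-Z][a-z]+(?: [A-Z][a-z]+)*", paragraph)


def generate_question(paragraph: str, answer: str) -> str:
    """Stage 2 stand-in: build a cloze question by blanking out the answer."""
    return "Fill in the blank: " + paragraph.replace(answer, "___", 1)


def synthesize_training_data(paragraph: str) -> List[Dict[str, str]]:
    """Produce pseudo-labeled (context, question, answer) records for a new domain."""
    return [
        {
            "context": paragraph,
            "question": generate_question(paragraph, answer),
            "answer": answer,
        }
        for answer in propose_answers(paragraph)
    ]


if __name__ == "__main__":
    for record in synthesize_training_data(
        "Narendra Modi is the Prime Minister of India."
    ):
        print(record["question"], "->", record["answer"])
```

The point of the sketch is the output shape: unlabeled domain text goes in, and question-answer pairs that an MRC model can be trained on come out.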
The Reasoning Network, or ReasoNet, essentially uses reinforcement learning to dynamically figure out when it has enough information to answer a question and should stop reading. This is a deviation from the approach of using a fixed number of turns while inferring the relationship between the questions, artifacts, and answers. It has also performed exceptionally well on the SQuAD dataset.
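The contrast between a fixed number of turns and a dynamic stop can be shown with a toy loop. This is a sketch under heavy assumptions: the "state" is a running sum of made-up evidence scores, and the termination gate is a fixed sigmoid with a hard 0.5 cut-off, whereas in ReasoNet the state update and the gate are learned, with the gate trained through reinforcement learning. The names and the bias value are hypothetical.

```python
import math
from typing import List, Tuple


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def read_until_confident(
    evidence: List[float], bias: float = -2.0, max_steps: int = 5
) -> Tuple[int, float]:
    """Accumulate evidence turn by turn and stop as soon as the gate fires.

    Returns the number of turns taken and the final stop probability.
    """
    state = 0.0
    steps_taken = 0
    stop_prob = sigmoid(bias)
    for score in evidence[:max_steps]:
        steps_taken += 1
        state += score  # stand-in for the learned state update
        stop_prob = sigmoid(state + bias)  # stand-in for the learned termination gate
        if stop_prob > 0.5:
            break  # enough information gathered: stop reading early
    return steps_taken, stop_prob
```

With strong evidence such as `[1.0, 1.5, 0.5]` the loop stops after two turns, while weak evidence runs all the way to `max_steps`, so easy questions cost fewer reasoning turns than hard ones.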
We Shall Overcome
As various models continue to emerge, it's a reasonable guess that sooner rather than later (especially with so many datasets available and growing rapidly) machine comprehension models will overcome the key challenges and get us closer to the goal of intelligent agents that can be trained on standard documents and answer general questions, as humans do.
I do hope you found the post useful in getting some basic understanding of machine comprehension. As always, do leave your comments & thoughts — including any aspects that I might have missed. I will be more than happy to incorporate them.
Disclaimers: The above post in no way claims any copyright to any of the images or literature presented.
Published at DZone with permission of Somnath Biswas . See the original article here.
Opinions expressed by DZone contributors are their own.