Integrating Apache OpenNLP Into Apache NiFi For Real-Time Natural Language Processing of Live Data Streams

In this quick post, we will see how developers can go about using Natural Language Processing with Apache NiFi on live data.

This is an update to the existing processor. This one seems to work better and faster.

Versions

Apache OpenNLP 1.8.4 with Name, Location, and Date Processing.

I also improved the output format and added Date parsing.

Example Output

nlp_location_1 China

nlp_name_1 Andrew Turner

Release

https://github.com/tspannhw/nifi-nlp-processor/releases/tag/2.0

Installation

  1. Download the NAR here: https://github.com/tspannhw/nifi-nlp-processor/releases/tag/2.0
  2. Copy the NAR file to /usr/hdf/current/nifi/lib/.
  3. Create a model directory that the NiFi user has permission to read.
  4. Download models (see below).
  5. Restart Apache NiFi via Ambari.
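
Steps 2 and 3 can be scripted. The following is a minimal sketch, not the author's own install script: the function names, the model directory path, and the `nifi:nifi` owner are assumptions chosen for illustration, and only the NiFi lib directory comes from the steps above.

```shell
#!/usr/bin/env bash
# Sketch of installation steps 2 and 3.
# Only /usr/hdf/current/nifi/lib comes from the article; every other
# path, file name, and user below is a hypothetical placeholder.

# install_nar NAR_FILE LIB_DIR
# Copy the downloaded NAR file into NiFi's lib directory.
install_nar() {
  local nar_file="$1" lib_dir="$2"
  cp "$nar_file" "$lib_dir/"
}

# create_model_dir DIR [OWNER]
# Create the model directory, make it readable, and hand it to OWNER
# (for example the account NiFi runs as) when one is given.
create_model_dir() {
  local dir="$1" owner="${2:-}"
  mkdir -p "$dir"
  chmod 755 "$dir"
  if [ -n "$owner" ]; then
    chown -R "$owner" "$dir"
  fi
}

# Example invocation (adjust before running; names are hypothetical):
# install_nar ./downloaded-processor.nar /usr/hdf/current/nifi/lib
# create_model_dir /opt/nlp/models nifi:nifi
```

Keeping the steps as functions makes it easy to try them against a scratch directory before touching a live NiFi install.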

Download Models

wget http://opennlp.sourceforge.net/models-1.5/en-ner-date.bin

wget http://opennlp.sourceforge.net/models-1.5/en-ner-location.bin

wget http://opennlp.sourceforge.net/models-1.5/en-ner-money.bin

wget http://opennlp.sourceforge.net/models-1.5/en-ner-organization.bin

wget http://opennlp.sourceforge.net/models-1.5/en-ner-percentage.bin

wget http://opennlp.sourceforge.net/models-1.5/en-ner-person.bin

wget http://opennlp.sourceforge.net/models-1.5/en-ner-time.bin

wget http://opennlp.sourceforge.net/models-1.5/en-chunker.bin

wget http://opennlp.sourceforge.net/models-1.5/en-parser-chunking.bin

wget http://opennlp.sourceforge.net/models-1.5/en-token.bin

wget http://opennlp.sourceforge.net/models-1.5/en-sent.bin

wget http://opennlp.sourceforge.net/models-1.5/en-pos-maxent.bin

wget http://opennlp.sourceforge.net/models-1.5/en-pos-perceptron.bin
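
If you would rather not run thirteen separate commands, the same downloads can be wrapped in a loop. This is a sketch under assumptions: the base URL and model names are copied from the wget commands above, while the function names and the target directory argument are hypothetical.

```shell
#!/usr/bin/env bash
# Base URL and model names are taken from the wget commands above.
BASE_URL="http://opennlp.sourceforge.net/models-1.5"
MODELS="en-ner-date en-ner-location en-ner-money en-ner-organization
en-ner-percentage en-ner-person en-ner-time en-chunker en-parser-chunking
en-token en-sent en-pos-maxent en-pos-perceptron"

# model_urls: print one download URL per model, one per line.
model_urls() {
  local name
  for name in $MODELS; do
    printf '%s/%s.bin\n' "$BASE_URL" "$name"
  done
}

# download_models DIR
# Fetch every model into DIR, skipping files that are already present.
download_models() {
  local dir="$1" url
  mkdir -p "$dir"
  model_urls | while read -r url; do
    [ -f "$dir/$(basename "$url")" ] || wget -q -P "$dir" "$url"
  done
}

# Example invocation (the directory is a hypothetical placeholder):
# download_models /opt/nlp/models
```

Running `model_urls` on its own prints the list of URLs, which is a cheap way to check the names before downloading anything.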
