Integrating Apache OpenNLP Into Apache NiFi For Real-Time Natural Language Processing of Live Data Streams

In this quick post, we look at how developers can use Natural Language Processing with Apache NiFi on live data.

This is an update to the existing processor. The new version seems to work better and faster. It uses Apache OpenNLP 1.8.4 with name, location, and date processing.

I also improved the output format and added date parsing.

Example Output

nlp_location_1 China

nlp_name_1 Andrew Turner

  1. Download the NAR here: https://github.com/tspannhw/nifi-nlp-processor/releases/tag/2.0
  2. Install the NAR file to /usr/hdf/current/nifi/lib/.
  3. Create a model directory with permissions for the NiFi user.
  4. Download models (see below).
  5. Restart Apache NiFi via Ambari.
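
Steps 2 and 3 can be scripted. The following is a minimal sketch and is not part of the processor release: the function names `install_nar` and `create_model_dir` are made up for illustration, and the 755 mode is an assumption about what "permissions for the NiFi user" should mean on your system. Only the /usr/hdf/current/nifi/lib path comes from the steps above.

```shell
#!/usr/bin/env bash
# Sketch of install steps 2 and 3. Function names and the 755 mode are
# assumptions for illustration, not something the processor requires.
set -eu

# install_nar NAR_FILE NIFI_LIB_DIR
# Copies the downloaded NAR into NiFi's lib directory.
install_nar() {
  local nar_file="$1" lib_dir="$2"
  [ -f "$nar_file" ] || { echo "NAR not found: $nar_file" >&2; return 1; }
  [ -d "$lib_dir" ] || { echo "lib dir not found: $lib_dir" >&2; return 1; }
  cp "$nar_file" "$lib_dir/"
}

# create_model_dir MODEL_DIR [OWNER]
# Creates the model directory; if OWNER is given, hands the directory to
# that user (changing ownership usually requires root).
create_model_dir() {
  local model_dir="$1" owner="${2:-}"
  mkdir -p "$model_dir"
  chmod 755 "$model_dir"   # assumed mode: owner writes, everyone reads
  if [ -n "$owner" ]; then
    chown -R "$owner" "$model_dir"
  fi
}
```

For example, `install_nar /path/to/downloaded.nar /usr/hdf/current/nifi/lib` followed by `create_model_dir /path/to/models nifi`, where both paths and the `nifi` user name are placeholders for your environment.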

Download Models

wget http://opennlp.sourceforge.net/models-1.5/en-ner-date.bin
wget http://opennlp.sourceforge.net/models-1.5/en-ner-location.bin
wget http://opennlp.sourceforge.net/models-1.5/en-ner-money.bin
wget http://opennlp.sourceforge.net/models-1.5/en-ner-organization.bin
wget http://opennlp.sourceforge.net/models-1.5/en-ner-percentage.bin
wget http://opennlp.sourceforge.net/models-1.5/en-ner-person.bin
wget http://opennlp.sourceforge.net/models-1.5/en-ner-time.bin
wget http://opennlp.sourceforge.net/models-1.5/en-chunker.bin
wget http://opennlp.sourceforge.net/models-1.5/en-parser-chunking.bin
wget http://opennlp.sourceforge.net/models-1.5/en-token.bin
wget http://opennlp.sourceforge.net/models-1.5/en-sent.bin
wget http://opennlp.sourceforge.net/models-1.5/en-pos-maxent.bin
wget http://opennlp.sourceforge.net/models-1.5/en-pos-perceptron.bin
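
The same downloads can be driven from a loop so they all land in the model directory. This is a sketch, assuming wget is installed and the SourceForge URLs listed above are still reachable. The function names `model_urls` and `download_models` and the target directory argument are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: fetch the OpenNLP 1.5 models listed in this post into one directory.
set -eu

BASE_URL="http://opennlp.sourceforge.net/models-1.5"
MODELS="en-ner-date en-ner-location en-ner-money en-ner-organization
en-ner-percentage en-ner-person en-ner-time en-chunker en-parser-chunking
en-token en-sent en-pos-maxent en-pos-perceptron"

# Prints one download URL per model name.
model_urls() {
  local m
  for m in $MODELS; do
    printf '%s/%s.bin\n' "$BASE_URL" "$m"
  done
}

# download_models TARGET_DIR
# Downloads every model into TARGET_DIR, skipping files already present.
download_models() {
  local target_dir="$1" url
  mkdir -p "$target_dir"
  for url in $(model_urls); do
    # -nc: do not re-download existing files; -P: directory to save into
    wget -nc -P "$target_dir" "$url"
  done
}
```

Calling `download_models /path/to/models` (the path is a placeholder) fetches all thirteen files. Because of `-nc`, rerunning it after a partial failure only fetches the files that are missing.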