Apache NiFi Overview

In this post, we explore the basics of NiFi and learn why it's a great tool in any data scientist's belt.

by Manoj G T · Updated Aug. 06, 19 · Analysis

What Is Apache NiFi?

Apache NiFi is a robust open-source framework for data ingestion and distribution. It can move data of almost any kind from almost any source to almost any destination.

NiFi is based on a programming paradigm called Flow-Based Programming (FBP). Rather than start from the formal definition of FBP, I will explain how NiFi works, and then you can connect it with that definition.

How NiFi Works

NiFi consists of atomic elements which can be combined into groups to build simple or complex dataflows.

NiFi has processors and process groups.

What Is a Processor in NiFi?

A processor is an atomic element in NiFi that performs one specific task.

At the time of writing, the latest version of NiFi has more than 280 processors, and each has its own responsibility.

For example, the GetFile processor can read a file from a specific location, whereas the PutFile processor can write a file to a particular location. There are many other processors, each of which addresses a single aspect of the data pipeline.
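To make the "one processor, one task" idea concrete, here is a minimal Python sketch. It is not NiFi's API: `get_file`, `put_file`, and `run_demo` are hypothetical stand-ins that only mimic the roles of GetFile and PutFile as described above, using temporary directories as the source and destination.

```python
import tempfile
from pathlib import Path


def get_file(source_dir: Path) -> list:
    """Toy stand-in for GetFile: read every file in a directory (assumption, not NiFi code)."""
    return [(p.name, p.read_bytes()) for p in sorted(source_dir.iterdir()) if p.is_file()]


def put_file(target_dir: Path, files: list) -> None:
    """Toy stand-in for PutFile: write each payload under its original name."""
    target_dir.mkdir(parents=True, exist_ok=True)
    for name, data in files:
        (target_dir / name).write_bytes(data)


def run_demo() -> list:
    """Chain the two toy processors and return the names found at the destination."""
    with tempfile.TemporaryDirectory() as tmp:
        src, dst = Path(tmp) / "in", Path(tmp) / "out"
        src.mkdir()
        (src / "a.csv").write_text("id,name\n1,nifi\n")
        put_file(dst, get_file(src))
        return sorted(p.name for p in dst.iterdir())
```

Each function does exactly one job, and the pipeline is just the two of them chained together. In NiFi itself you would configure the real processors in the UI rather than write code like this.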

We have processors to get data from various data sources and processors to write data to various data sources.

The data source can be almost anything. It can be a SQL database server like Postgres, Oracle, or MySQL, or a NoSQL database like MongoDB, Couchbase, or HBase. It can also be a search engine like Solr or Elasticsearch, or a cache server like Redis. NiFi can even connect to Kafka for messaging.

NiFi also has a rich set of processors for connecting to AWS services like S3 buckets and DynamoDB.

NiFi has a processor for almost everything you need when you're working with data. We will go deeper into the various types of processors available in NiFi in later posts. Even if you don’t find the right processor for your requirements, NiFi gives you a simple way to write your own custom processors.

Now let’s move on to the next term, FlowFile.

What Is a FlowFile in NiFi?

The actual data in NiFi propagates in the form of a FlowFile. A FlowFile can contain any data, such as CSV, JSON, XML, or plain text; it can even hold SQL queries or binary data.

The FlowFile abstraction is the reason that NiFi can propagate any data from any source to any destination. A processor can process a FlowFile to generate a new FlowFile.
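The idea that a processor takes one FlowFile and produces a new one can be sketched in a few lines of Python. This is an illustrative model only, not NiFi code: the `FlowFile` class and the `to_upper` processor are hypothetical names. The split into opaque `content` bytes plus key/value `attributes` follows how NiFi's documentation describes a FlowFile; the article itself does not go into that detail.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass(frozen=True)
class FlowFile:
    """Toy FlowFile: opaque content bytes plus string attributes (illustrative model)."""
    content: bytes
    attributes: Dict[str, str] = field(default_factory=dict)


def to_upper(flowfile: FlowFile) -> FlowFile:
    """Hypothetical processor: builds a new FlowFile and leaves the input untouched."""
    new_attrs = {**flowfile.attributes, "transformed": "upper"}
    return FlowFile(content=flowfile.content.upper(), attributes=new_attrs)


# The content is just bytes, so the same wrapper works for CSV, JSON, or binary data.
csv_in = FlowFile(b"id,name\n1,nifi\n", {"filename": "a.csv"})
csv_out = to_upper(csv_in)
```

Because the processor never needs to know what format the bytes are in unless its own task requires it, the same wrapper can carry any payload between any pair of systems.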

The next important term is connections.

What Is a Connection in NiFi?

In NiFi, processors can be connected to one another to create a data flow. The link between two processors is called a connection. Each connection also acts as a queue for the FlowFiles passing between them.
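The queueing behavior of a connection can be sketched as a small FIFO buffer between a producing step and a consuming step. This is a simplified model under my own assumptions, not NiFi's implementation: `Connection`, `produce`, and `consume` are hypothetical names.

```python
from collections import deque
from typing import Deque, List, Optional


class Connection:
    """Toy connection: a FIFO queue sitting between two processors."""

    def __init__(self) -> None:
        self._queue: Deque[bytes] = deque()

    def put(self, item: bytes) -> None:
        self._queue.append(item)

    def poll(self) -> Optional[bytes]:
        # Return the oldest queued item, or None when the queue is empty.
        return self._queue.popleft() if self._queue else None

    def __len__(self) -> int:
        return len(self._queue)


def produce(out: Connection) -> None:
    """Upstream processor: pushes three payloads into the connection."""
    for payload in (b"a", b"b", b"c"):
        out.put(payload)


def consume(inp: Connection) -> List[bytes]:
    """Downstream processor: drains the connection and transforms each payload."""
    results = []
    while (item := inp.poll()) is not None:
        results.append(item.upper())
    return results


conn = Connection()
produce(conn)
queued = len(conn)          # items waiting while the consumer has not run yet
processed = consume(conn)   # consumer drains the queue in arrival order
```

The point of the queue is decoupling: the upstream step can finish its work even if the downstream step has not started, and the items simply wait in the connection.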

The next one is the process group and input/output ports.

What Are Process Groups, Input Ports, and Output Ports in NiFi?

In NiFi, one or more connected processors can be combined into a process group. When you have a complex data flow, it’s better to combine processors into logical process groups. This makes the flows easier to maintain.

Process groups can have input and output ports which are used to move data between them.
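As a rough analogy, a process group behaves like a named bundle of steps with one way in and one way out. The Python sketch below is an assumption-laden illustration, not NiFi's API: `ProcessGroup` and its `receive` method are hypothetical, with the method argument playing the role of the input port and the return value playing the role of the output port.

```python
from typing import Callable, List

# A toy processor is just a function from one string payload to another.
Processor = Callable[[str], str]


class ProcessGroup:
    """Toy process group: an ordered set of processors behind a single entry point."""

    def __init__(self, name: str, processors: List[Processor]) -> None:
        self.name = name
        self.processors = processors

    def receive(self, data: str) -> str:
        # Input port: data enters here and passes through each processor in order.
        for processor in self.processors:
            data = processor(data)
        # Output port: the final result leaves the group.
        return data


cleanse = ProcessGroup("cleanse", [str.strip, str.lower])
enrich = ProcessGroup("enrich", [lambda s: s + ",source=demo"])

# The output of one group feeds the input of the next.
result = enrich.receive(cleanse.receive("  ID,Name  "))
```

Grouping the steps this way means a reader of the top-level flow sees only "cleanse, then enrich" and can open a group when the details matter.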

The final term you should know is the controller service.

What Is a Controller Service in NiFi?

Controller services are shared services that can be used by processors. For example, processors that read from or write to a SQL database can share a controller service that holds the required DB connection details.
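The shared-service idea can be sketched with Python's built-in `sqlite3` module. This is a toy model, not how NiFi implements controller services: `DbConnectionService`, `put_rows`, and `get_rows` are hypothetical names, and an in-memory SQLite database stands in for a real SQL server.

```python
import sqlite3
from typing import List, Tuple


class DbConnectionService:
    """Toy controller service: the connection details live in exactly one place."""

    def __init__(self, database: str) -> None:
        self._conn = sqlite3.connect(database)

    def connection(self) -> sqlite3.Connection:
        return self._conn


def put_rows(service: DbConnectionService, rows: List[Tuple[int, str]]) -> None:
    """Hypothetical 'put' processor: writes rows using the shared service."""
    conn = service.connection()
    conn.execute("CREATE TABLE IF NOT EXISTS events (id INTEGER, name TEXT)")
    conn.executemany("INSERT INTO events VALUES (?, ?)", rows)
    conn.commit()


def get_rows(service: DbConnectionService) -> List[Tuple[int, str]]:
    """Hypothetical 'get' processor: reads rows using the same shared service."""
    cursor = service.connection().execute("SELECT id, name FROM events ORDER BY id")
    return cursor.fetchall()


# Both processors receive the same service instead of their own connection settings.
service = DbConnectionService(":memory:")
put_rows(service, [(1, "nifi"), (2, "kafka")])
rows = get_rows(service)
```

Neither processor knows the database location; changing it means editing the service once rather than every processor that touches the database.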

The controller service is not limited to DB connections.

Happy learning!


To learn more about Apache NiFi, please visit my YouTube channel. I have created a playlist especially for beginners.

NiFi Introduction - YouTube Playlist

After finishing my YouTube tutorial, if you wish to dive deeper into advanced topics, you can opt for my Udemy course.

Apache NiFi - The Complete Guide (Udemy Course)

You can take the same course on Skillshare for free using the referral link below.

Apache NiFi - The Complete Guide (Skillshare Course)


Published at DZone with permission of Manoj G T. See the original article here.

Opinions expressed by DZone contributors are their own.
