
IoT Edge Processing With Apache NiFi, MiniFi, and Multiple Deep Learning Libraries: Part 1

Let's look at how to use the new HDP 3.0 and HDF 3.2 stacks to develop IoT applications with Deep Learning.

By Tim Spann · Sep. 04, 18 · Tutorial


In preparation for my talk on utilizing edge devices for Deep Learning, IoT sensor reading, and Big Data processing, I have updated my environment to the latest and greatest tools available.

With the upgrade of HDF to 3.2, I can now use Apache NiFi 1.7 and MiniFi 0.5 for IoT data ingestion, simple event processing, conversion, data processing, data flow, and storage.

The architecture diagram above shows the basic flow we are utilizing.

Step-by-Step

  1. A Raspberry Pi with the latest patches, Python, GPS software, a USB camera, sensor libraries, Java 8, MiniFi 0.5, TensorFlow, and Apache MXNet installed.
  2. The MiniFi flow pushes JSON and JPEGs over HTTP(S) / Site-to-Site to an Apache NiFi gateway server.
  3. Optionally, NiFi can push to a central NiFi cloud cluster and/or a Kafka cluster, both of which run in HDF 3.2 environments.
  4. The Apache NiFi cluster pushes to Hive, HDFS, a Dockerized API running in HDP 3.0, and third-party APIs.
  5. NiFi and Kafka integrate with Schema Registry for our tabular data, including the rainbow and gps JSON data.
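
Step 2 has MiniFi pushing JSON from the Pi. As a rough sketch of what one such record could look like, the Python below assembles a reading using the column names of the rainbow table defined later in this article. The function names, parameters, and value formats (for example, the diskfree string and the timestamp layout) are assumptions for illustration, not the author's script. The sensor readings are passed in by the caller rather than read from hardware, and since the article does not say what tempf2 holds, it is simply passed through.

```python
import json
import shutil
import socket
import time
import uuid

# Column names copied from the rainbow table definition in this article.
RAINBOW_FIELDS = (
    "tempf", "cputemp", "pressure", "host", "uniqueid", "ipaddress",
    "temp", "diskfree", "altitude", "ts", "tempf2", "memory",
)


def build_rainbow_record(temp_c, pressure, altitude, cputemp, memory, tempf2,
                         ipaddress="127.0.0.1"):
    """Assemble one reading as a dict keyed by the rainbow column names.

    All readings are supplied by the caller (hypothetical parameters);
    no sensor library is called here.
    """
    free_mb = shutil.disk_usage("/").free / (1024 * 1024)
    return {
        "tempf": temp_c * 9.0 / 5.0 + 32.0,       # Celsius to Fahrenheit
        "cputemp": float(cputemp),
        "pressure": float(pressure),
        "host": socket.gethostname(),
        "uniqueid": str(uuid.uuid4()),            # assumed ID format
        "ipaddress": ipaddress,                   # placeholder default
        "temp": float(temp_c),
        "diskfree": "{:.1f} MB".format(free_mb),  # assumed string format
        "altitude": float(altitude),
        "ts": time.strftime("%Y-%m-%d %H:%M:%S"), # assumed timestamp layout
        "tempf2": float(tempf2),
        "memory": float(memory),
    }


def to_json(record):
    """Serialize the record as the JSON text a flow could pick up."""
    return json.dumps(record)


if __name__ == "__main__":
    print(to_json(build_rainbow_record(20.0, 1013.2, 12.5, 48.3, 41.0, 68.0)))
```

Keeping the keys identical to the table's column names means the JSON can be mapped to the Hive schema without renaming fields along the way.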

SQL Tables in Hive

I stream my data into Apache ORC files stored on HDP 3.0 HDFS directories and build external tables on them.

CREATE EXTERNAL TABLE IF NOT EXISTS rainbow (tempf DOUBLE, cputemp DOUBLE, pressure DOUBLE, host STRING, uniqueid STRING, ipaddress STRING, temp DOUBLE, diskfree STRING, altitude DOUBLE, ts STRING, 
 tempf2 DOUBLE, memory DOUBLE) 
STORED AS ORC LOCATION '/rainbow';

CREATE EXTERNAL TABLE IF NOT EXISTS gps (speed STRING, diskfree STRING, altitude STRING, ts STRING, cputemp DOUBLE, latitude STRING, track STRING, memory DOUBLE, host STRING, uniqueid STRING, ipaddress STRING, epd STRING, utc STRING, epx STRING, epy STRING, epv STRING, ept STRING, eps STRING, longitude STRING, mode STRING, time STRING, climb STRING, epc STRING) 
STORED AS ORC LOCATION '/gps';
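
Because the tables above fix a type for every column, it can help to check records against that schema before they are sent on. The following is a minimal sketch in Python, assuming records arrive as parsed JSON dicts. The column list is copied from the rainbow DDL above, while the `check_record` helper and its type rules (DOUBLE accepts any non-boolean number, STRING accepts only text) are hypothetical choices made for this example.

```python
# Column name -> Hive type, copied from the rainbow DDL in this article.
RAINBOW_COLUMNS = {
    "tempf": "DOUBLE", "cputemp": "DOUBLE", "pressure": "DOUBLE",
    "host": "STRING", "uniqueid": "STRING", "ipaddress": "STRING",
    "temp": "DOUBLE", "diskfree": "STRING", "altitude": "DOUBLE",
    "ts": "STRING", "tempf2": "DOUBLE", "memory": "DOUBLE",
}


def _fits(value, hive_type):
    """Return True if a parsed JSON value is acceptable for the Hive type."""
    if hive_type == "DOUBLE":
        # bool is a subclass of int in Python, so exclude it explicitly.
        return isinstance(value, (int, float)) and not isinstance(value, bool)
    if hive_type == "STRING":
        return isinstance(value, str)
    return False


def check_record(record, columns=RAINBOW_COLUMNS):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    for name, hive_type in columns.items():
        if name not in record:
            problems.append("missing column: " + name)
        elif not _fits(record[name], hive_type):
            problems.append("{}: expected {}".format(name, hive_type))
    for name in record:
        if name not in columns:
            problems.append("unexpected field: " + name)
    return problems
```

For example, a record whose tempf arrives as the string "70" instead of a number would be reported as `tempf: expected DOUBLE`.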

For my processing needs, I also have a Hive 3 ACID table for general table usage and updates.

CREATE TABLE rainbowacid (tempf DOUBLE, cputemp DOUBLE, pressure DOUBLE, host STRING, uniqueid STRING, ipaddress STRING, temp DOUBLE, diskfree STRING, altitude DOUBLE, ts STRING,
 tempf2 DOUBLE, memory DOUBLE)
STORED AS ORC
TBLPROPERTIES ('transactional'='true');

CREATE TABLE IF NOT EXISTS gpsacid (speed STRING, diskfree STRING, altitude STRING, ts STRING, cputemp DOUBLE, latitude STRING, track STRING, memory DOUBLE, host STRING, uniqueid STRING, ipaddress STRING, epd STRING, utc STRING, epx STRING, epy STRING, epv STRING, ept STRING, eps STRING, longitude STRING, mode STRING, time STRING, climb STRING, epc STRING)
STORED AS ORC
TBLPROPERTIES ('transactional'='true');

Then I load my initial data.

insert into rainbowacid
select * from rainbow;

insert into gpsacid 
select * from gps;

Hive 3.x Updates

%jdbc(hive) CREATE TABLE Persons_default (
    ID Int NOT NULL,
    Name String NOT NULL,
    Age Int,
    Creator String DEFAULT CURRENT_USER(),
    CreateDate Date DEFAULT CURRENT_DATE()
)

One of the cool new features in Hive is support for column defaults, as you can see above. They are helpful for values you would otherwise have to supply on every insert, such as the current user or the current date. This gives us even more relational-style features in Hive.

Another very interesting feature is materialized views, which store the precomputed result of a query so that queries built on it stay clean and fast. Here is a cool example:

CREATE MATERIALIZED VIEW mv1
AS
SELECT dest,origin,count(*)
FROM flights_hdfs 
GROUP BY dest,origin
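
To make concrete what a view like mv1 precomputes, here is a small Python sketch of the same aggregation over a handful of made-up rows. The sample data and the `count_by_dest_origin` helper are hypothetical and only stand in for flights_hdfs. In Hive the grouping runs inside the engine, not in client code.

```python
from collections import Counter

# Made-up sample rows standing in for flights_hdfs: (dest, origin) pairs only.
flights = [
    ("SFO", "JFK"),
    ("SFO", "JFK"),
    ("ORD", "JFK"),
    ("SFO", "EWR"),
]


def count_by_dest_origin(rows):
    """Mimic SELECT dest, origin, count(*) ... GROUP BY dest, origin."""
    return Counter(rows)


summary = count_by_dest_origin(flights)

# One line per (dest, origin) group, like one row of the materialized view.
for (dest, origin), n in sorted(summary.items()):
    print(dest, origin, n)
```

The point of the materialized view is that this counting work is done once and stored, so later queries read the small summary instead of rescanning every flight row.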

Thanks, and let me know your thoughts in the comments section!
