
Top Articles on Apache Hadoop: Pig, NiFi, Python, and More

Another week of interesting articles from the Hortonworks community site, including Pig, NiFi, Zeppelin, and Python tutorials.

It has been another exciting week on Hortonworks Community Connection (HCC). We continue to see great activity and recommend the following assets from last week.

Top Articles from HCC

  1. Pig Doing Yoga: How to Build Superflexible Pig Scripts by: gkeys

    We know that parameter passing is valuable for Pig script reuse. A lesser-known point is that parameters do not simply pass variables to Pig scripts; more fundamentally, they pass text that replaces placeholders in the script. This is a subtle but powerful difference. Read more on HCC.
  2. Incremental Fetch in Apache NiFi with QueryDatabaseTable by: mburgess

    NiFi is most effectively used as an “always-on” system, meaning that the data flows are often always operational (running). Doing batch processing is a more difficult task and usually requires some user intervention (such as stopping a source processor)… Read more on HCC.
  3. Teragen, Terasort, and Teravalidate Performance testing on Bigstep by: smanjee

    Faster and cheaper data processing on IaaS. Real-world experience with typical IaaS providers has generally been slow on performance. That is not to say Hadoop/HBase/Spark/etc. jobs will not perform; however, you need to be familiar with what you’re getting into and set realistic expectations. Read more on HCC.
  4. How do I login to Apache Zeppelin when Security is enabled using HDP 2.5 Tech Preview Sandbox by: phargis

    Apache Zeppelin (version 0.6.0) includes the ability to securely authenticate users and require logins. It uses the Apache Shiro security framework to accomplish this objective. Read more on HCC.
  5. Python Script in Apache NiFi by: vvagias

    In NiFi, the data being passed between operators is referred to as a FlowFile and can be accessed via various scripting languages in the ExecuteScript operator. In order to access the data in the FlowFile, you need to understand a few requirements first. Read more on HCC.
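
The point in the first article, that parameters pass text rather than values, can be illustrated with a small simulation. This is a minimal Python sketch, not Pig itself and not the code from the HCC article: it uses `string.Template`, whose `$name` placeholders resemble Pig's, and the script text, paths, and the `render` helper are all hypothetical.

```python
from string import Template

# Hypothetical Pig-style script text. $input, $filter_clause and $output
# are placeholders that get replaced by literal text, not typed values.
SCRIPT = Template(
    "data = LOAD '$input' USING PigStorage(',');\n"
    "$filter_clause\n"
    "STORE data INTO '$output';\n"
)


def render(params):
    """Replace each $placeholder with the text supplied for it."""
    return SCRIPT.substitute(params)


# A parameter can be empty, so the optional step disappears entirely...
plain = render({"input": "/tmp/in", "filter_clause": "", "output": "/tmp/out"})

# ...or it can carry a whole statement, because substitution is textual.
filtered = render({
    "input": "/tmp/in",
    "filter_clause": "data = FILTER data BY $0 IS NOT NULL;",
    "output": "/tmp/out",
})

print(filtered)
```

Because the replacement is plain text, one template can yield structurally different scripts, which is the flexibility the article title alludes to.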

Top 5 Questions — Last Week

  1. How do you fix login issues after restarting Cloudbreak deployer instance on Amazon?
  2. I am facing an issue with a huge amount of data in a MySQL table that is growing very fast, so what is the alternative for scaling?
  3. Can I change log location in HDP installation?
  4. NiFi JsonSplit Doesn’t Work
  5. HortonWorks License Vs HortonWorks Free Comparison


Published at DZone with permission of Mark Herring, DZone MVB. See the original article here.

