Top Articles from HCC: Hive, NiFi, Kafka, and Hadoop

It has been another exciting week on Hortonworks Community Connection (HCC). We continue to see great activity and recommend the following assets from last week.

Top Articles from HCC

  1. Implementing a real-time Hive Streaming example by: mjohnson The Hive Streaming API enables near real-time data ingestion into Hive. This two-part post reviews some of the design decisions necessary to produce a healthy Hive Streaming ingest process, from which you can run queries on the ingested data in near real time.
  2. Using NiFi to ingest and transform RSS feeds to HDFS using an external config file by: gkeys This article shows a simple NiFi data flow from the web to HDFS that demonstrates several fundamental capabilities of NiFi.
  3. Kafka 0.9 Configuration Best Practices (unofficial) by: vjain This article applies to anyone deploying a Kafka cluster for production use. It highlights high-level considerations to think through before deploying your cluster.
  4. Demystifying Delegation Token by: jramakrishnan Delegation tokens were introduced to avoid frequent authentication checks against Kerberos (AD/MIT). After the initial authentication against the NameNode using Kerberos, any subsequent authentication can be done without a Kerberos service ticket (or TGT).
  5. Using Apache NiFi for Slowly Changing Dimensions on Hadoop Part 1 by: smanjee EDW design patterns. Applying well-established relational concepts to Hadoop yields many anti-patterns. Which patterns work? Let's get to work…

Top 5 Questions — last week

  1. NiFi: How to check if large file has completely been written to directory without using Minimum File Age setting?
  2. can i use the same disk with diff mount points to store Data nodes data, NN namenode data, SN data, JT data (master node data) and /usr and /var
  3. Can you please have a look at below mentioned HUE issue and help in understanding root cause ?
  4. Apache NiFi – What is the difference between ConsumeKafka and GetKafka
  5. HDP stack versions not listed

Come over to HCC and participate.

Published at DZone with permission of Mark Herring, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.
