
Hortonworks Community Connection

A summary of the top articles written by the Hortonworks community, including new articles on using NiFi 1.0.0 and debugging HDFS.



It has been another exciting week on Hortonworks Community Connection (HCC). We continue to see great activity and recommend the following assets from last week.

Top Articles from HCC

  1. NiFi 1.0.0 Beta UI Introduction by: hsowell
    The Apache NiFi community recently released the beta version of Apache NiFi 1.0.0. This version comes with significant updates, including a UI refresh, a transition to zero-master clustering, multi-tenant authorization, and deterministically ordered templates, which allow templates to be version controlled.
  2. Five Infrequently Known Commands To Debug Your HDFS Issues by: mliu
    When you run “hdfs dfs -cat file1” from the command line, you get an exception saying “Cannot obtain block length for LocatedBlock”. Usually this means the file is still being written. Read the article for tips and techniques for debugging.
  3. Dynamic List Filtering with NiFi by: cstanca
    This article is not meant to show how to install NiFi or create a “Hello World” data flow. Instead, it shows how to solve a data filtering problem with NiFi using two approaches: a filter list stored as a file on disk, which can be static or dynamic, and a list stored in a distributed cache populated from the same file.
  4. Using Pig to convert uncompressed data to compressed data in HDFS by: myoung
    ORC provides many benefits, including support for compression. However, there are times when you need to store “raw” data in HDFS, but want to take advantage of compression. One such use case is with Hive external tables. 
  5. Simple Kafka Producer using Java in a Kerberized cluster by: sgowda
    From version 0.9 onwards, Kafka supports SASL_PLAINTEXT (authenticated, non-encrypted) for communication between brokers and between consumers/producers and brokers.

Top 5 Questions — last week

  1. Ambari REST API to restart all services
  2. HIVE and ACID table performance for updates
  3. How to block Hive CLI access?
  4. Multiple listeners of Kafka in Kerberized Cluster
  5. Connecting hive – Beeline vs hive?

Come over to HCC and participate.



Published at DZone with permission of Mark Herring, DZone MVB. See the original article here.
