Top Articles in HCC: NiFi Hive Streaming, Spark, and More

The top articles include Hive for weblog parsing, NiFi Hive Streaming, Spark performance testing, Hadoop IO, and installing the Spark 2.0 preview.

It has been another exciting week on Hortonworks Community Connection (HCC). We continue to see great activity and recommend the following articles and questions from last week.

Top Articles from HCC

  1. Hive on Tez vs PySpark for weblogs parsing by: bmathew

    Synopsis
    Both Hive on Tez and Spark (PySpark) excel at iterative data processing against weblog data in delimited text format. Is one faster than the other?
  2. Stream data into HIVE like a Boss using NiFi HiveStreaming – Olympics 1896-2008 by: nshawa

    An easy tutorial showing how to stream data from CSV files directly into Hive tables and start working with it.
  3. Performance of Apache Spark on HDP/HDFS vs Spark on EMR/S3 by: bmathew

    Which is faster when analyzing data using Spark 1.6.1: HDP with HDFS for storage, or EMR using S3 for storage?
  4. More Hadoop nodes = faster IO and processing time? by: smanjee

    This article covers Hadoop performance on various IaaS providers in the hope of finding additional performance insights.
  5. How to install and run Spark 2.0 on HDP 2.5 Sandbox by: phargis

Top Questions from HCC

  1. run simple hive — Invalid distance too far back.
  2. Add new host into the cluster.
  3. Hadoop files list sorted by time.
  4. Is there any workaround to map CSV columns to Hive columns?
  5. Data transfer between two clusters.

Come join us on HCC and get your questions answered.

Hortonworks Community Connection (HCC) is an online collaboration destination for developers, DevOps, customers, and partners to get answers to questions, collaborate on technical articles, and share code examples from GitHub. Join the discussion.

Topics:
hortonworks, big data, hadoop, nifi, hive

Published at DZone with permission of
