
Learning Big Data Tools in 2016





There are a lot of tools around Big Data. For 2016, there are a few that I recommend taking a look at. Visualization, Spark, CDAP and HBase allow you to get the most out of your Hadoop cluster and your existing investment. Adding a few more open source tools can really maximize your big data investment.


D3.js is awesome, but there are libraries that make it easier to use. NVD3 provides a great set of reusable charts like Pie and Bar. I think this library is great for visualizing data from Hadoop.
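
As a rough illustration of getting Hadoop output into a chart, here is a minimal sketch that reshapes tab-separated key/count lines (the default text output layout of many Hadoop jobs) into a series-style JSON structure. The function name `to_nvd3_series`, the sample data, and the `label`/`value` field names are assumptions made for this example. The chart's x and y accessors on the JavaScript side would need to point at whichever field names you choose.

```python
import json


def to_nvd3_series(lines, series_name):
    """Turn 'key<TAB>count' lines into a one-series list of label/value points.

    Hypothetical helper: the output shape (a list of series, each with a
    'key' and a list of 'values') is an assumption about what the chart
    on the page will be configured to read.
    """
    values = []
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines in the job output
        label, _, count = line.partition("\t")
        values.append({"label": label, "value": int(count)})
    return [{"key": series_name, "values": values}]


# Made-up sample standing in for a part-00000 file from a counting job.
sample = ["error\t42\n", "warn\t17\n", "info\t230\n"]
chart_data = to_nvd3_series(sample, "Log levels")
print(json.dumps(chart_data, indent=2))
```

The resulting JSON can be served from any small web endpoint and handed to the chart in the browser.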

Apache Spark and Big Data

This is a great presentation on Spark Streaming and HBase. Here is another great presentation on Spark and HBase.


Here is a great article to start with: an interesting REST app using HBase. A really nice use case for HBase is time series data. Look at HBase with Phoenix for SQL. My favorite-named SlideShare of the week, DeathStar, talks about production HBase on YARN. This is cool as it combines a few cool technologies and some great tips on CDAP and HBase.
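
To make the time series use case a bit more concrete, here is a minimal sketch of one common row key layout: the metric name followed by a reversed timestamp, so that the newest readings for a metric sort first in a scan. This is not taken from the linked material. The function names, the NUL separator, and the millisecond timestamps are assumptions chosen for illustration, and the code only builds byte keys without talking to HBase.

```python
import struct

MAX_TS = 2**63 - 1  # largest signed 64-bit value


def make_row_key(metric: str, ts_millis: int) -> bytes:
    """Build <metric> 0x00 <MAX_TS - timestamp as 8 big-endian bytes>.

    Hypothetical layout: subtracting from MAX_TS makes newer timestamps
    produce smaller byte strings, so they sort first lexicographically.
    """
    if not 0 <= ts_millis <= MAX_TS:
        raise ValueError("timestamp out of range")
    if "\x00" in metric:
        raise ValueError("metric name must not contain NUL")
    reversed_ts = MAX_TS - ts_millis
    return metric.encode("utf-8") + b"\x00" + struct.pack(">q", reversed_ts)


def parse_row_key(key: bytes):
    """Recover (metric, ts_millis) from a key built by make_row_key."""
    metric = key[:-9].decode("utf-8")  # drop separator + 8 timestamp bytes
    (reversed_ts,) = struct.unpack(">q", key[-8:])
    return metric, MAX_TS - reversed_ts


older = make_row_key("cpu.load", 1_450_000_000_000)
newer = make_row_key("cpu.load", 1_450_000_060_000)
print(newer < older)         # newer reading sorts ahead of the older one
print(parse_row_key(newer))
```

Keys that start with the same metric name stay adjacent in the table, which keeps a scan over one metric contiguous. If a single metric is written very heavily, a salt or bucket prefix is a common follow-up to spread the load.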

CDAP / Cask Data Application Platform

CDAP puts a nice abstraction layer on top of Hadoop to make development quicker and easier. It's an open source collection of projects for data collection, exploration, storage and serving. Here is a quick overview of CDAP and some applications on Hadoop. There is a great documentation set here. It uses HBase, YARN and Hive, so you get to reuse a lot of your Hadoop knowledge. It adds some cool things on top of Hadoop, including Tephra (for transactions on HBase) and Tigon (real-time streaming with Apache Twill). Here is a cool presentation on streaming analytics.
