Quickly Launch Hortonworks Data Platform in Amazon Web Services

Running Hive and Spark in the AWS cloud quickly and easily with a UI.

Big data is changing the way enterprises interact with and consume data. Modern data platforms, such as Hortonworks Data Platform (HDP) and Hortonworks DataFlow (HDF), are driving a data revolution by powering new workloads and analytic applications.

Last week, thousands of attendees gathered in San Jose at Hadoop Summit 2016 to learn about the technologies and business drivers that are transforming how the enterprise harnesses data. We also announced the latest innovations coming in Hortonworks Data Platform 2.5 and provided a look at the new and exciting technologies in the pipeline, such as Hive LLAP.

While Big Data and Hadoop have been eating the world, the Cloud has been steadfast at work enabling the enterprise to save time, save money and scale fast.

Separately, Big Data and Cloud are each creating new opportunities and efficiencies for the enterprise. But when used together, enterprises can realize business value and achieve insight into data more quickly and with greater flexibility than ever before. To make this combination achieve its full potential, Big Data and Cloud need an experience that marries ease of use with infrastructure agility so that a user can get their analytics “tool of choice” in their hands exactly when they need (and want) it.

Introducing the Hortonworks Connected Data Cloud Technical Preview

To this end, we are introducing the “Hortonworks Connected Data Cloud” Technical Preview. This Technical Preview gives you a way to quickly spin up Apache Hive and Apache Spark clusters that are ready to run ephemeral workloads in your Amazon Web Services (AWS) environment.


Using “Hortonworks Cloud for AWS”, you can create clusters by choosing from a set of prescriptive cluster configurations. It’s not meant for the infinite configuration possibilities that Hadoop provides. Instead, it’s about hiding those complexities under the hood so people can get Spark and Hive running quickly in their AWS environments to start modeling and analyzing data sets. When you are done with your analysis, you can give back the resources to the Cloud just as easily as you got them.

How Do I Get Started?

Instructions on how to spin up the Technical Preview in your AWS environment are found here: http://hortonworks.github.io/hdp-aws/. The high-level points are:

  1. Start with your AWS account and launch the cloud controller into your AWS environment.
  2. Log into the cloud controller and start creating Hive or Spark clusters that are ready to use for analysis.
  3. Scale up, scale down, and clone those clusters, then terminate them when you are done.

What Hive and Spark Cluster Configurations are Available?

The Technical Preview includes a prescriptive set of cluster configurations for Hive and Spark, from the most stable to the more experimental. If you are looking to grab HDP 2.4 with the Hive and Spark versions you know and love, go for it. Or if you are looking to explore the latest HDP innovations, try the configurations with Spark 2.0-preview or Hive LLAP.


Where Do I Get Help?

Once you start working with the Technical Preview, Hortonworks Community Connection is a great resource for help. Hortonworks cloud subject matter experts are moderating the “Cloud & Operations” Track for questions related to this Technical Preview. When asking a question related to this Technical Preview, be sure to select the “Cloud & Operations” Track and add the following tag: hortonworks-cloud.


What’s Next?

We are excited for you to try the Technical Preview and look forward to seeing your feedback on the Hortonworks Community Connection.

Published at DZone with permission of Jeff Sposetti.
