
Getting Hadoop, Hive and HBase Up and Running in Less than 15 Minutes

By Eric Gregory · Feb. 15, 2013

Note: This tutorial comes from guest writer Mark Grover. Enjoy.

Introduction

If you have delved into Apache Hadoop and related projects, you know that installing and configuring Hadoop is hard. Often, a minor mistake made while installing or configuring from messy tarballs lurks undetected for a long time, until some otherwise innocuous change to the system or workload brings it to the surface. Moreover, there is little to no integration testing among the different projects in the ecosystem (e.g., Hadoop, Hive, HBase, ZooKeeper). Apache Bigtop is an open source project aimed at bridging exactly those gaps by:

1. Making it easier for users to deploy and configure Hadoop and related projects on their bare-metal or virtualized clusters (a minimal deployment sketch follows this list).

2. Performing integration testing among various components in the Hadoop ecosystem.
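
To make the first point concrete, here is a rough sketch of what a package-based deployment looks like on an Ubuntu Precise machine. The repository URL, key location and package names (hadoop-conf-pseudo, hive, hbase) are assumptions based on the Bigtop 0.5.0 release layout; the Bigtop wiki has the authoritative instructions for each supported platform.

    # Register the Bigtop 0.5.0 apt repository and its signing key.
    # NOTE: the exact URLs and file names below are assumptions -- check the
    # Bigtop wiki for the correct repository for your platform and release.
    wget -O- http://archive.apache.org/dist/bigtop/bigtop-0.5.0/repos/GPG-KEY-bigtop \
      | sudo apt-key add -
    sudo wget -O /etc/apt/sources.list.d/bigtop.list \
      http://archive.apache.org/dist/bigtop/bigtop-0.5.0/repos/precise/bigtop.list

    # Pull Hadoop (with a pseudo-distributed configuration), Hive and HBase
    # straight from the packaged, integration-tested artifacts.
    sudo apt-get update
    sudo apt-get install hadoop-conf-pseudo hive hbase

The point is not the exact commands, but that the entire stack arrives as ordinary Debian or RPM packages rather than hand-managed tarballs.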

More about Apache Bigtop

The primary goal of Apache Bigtop is to build a community around the packaging and interoperability testing of Hadoop related projects. This includes testing at various levels (packaging, platform, runtime, upgrade, etc.) developed by a community with a focus on the system as a whole, rather than individual projects.

The latest released version of Apache Bigtop is Bigtop 0.5, which integrates the latest versions of various projects including Hadoop, Hive, HBase, Flume, Sqoop, Oozie and many more! The supported platforms include CentOS/RHEL 5 and 6, Fedora 16 and 17, SuSE Linux Enterprise 11, OpenSuSE 12.2, Ubuntu LTS Lucid and Precise, and Ubuntu Quantal.

Who uses Bigtop?

Folks who use Bigtop fall into two major categories: those who leverage Bigtop to power their own Hadoop distributions, and those who use Bigtop for deployment.

In alphabetical order, they are:

  • Cloudera leverages Bigtop in Cloudera's Distribution Including Apache Hadoop (CDH), a 100% open source Hadoop distribution based on Apache Bigtop.
  • EMC/Greenplum uses Bigtop extensively as a build framework for its 1,000-node Analytics Workbench cluster.
  • Juju Charms for Hadoop, HBase, Hive and ZooKeeper, and the associated packages for Ubuntu, are derived from Apache Bigtop.
  • Magna Tempus Group provides a ready-to-use, well-integrated open source stack for intensive, high-performance in-memory data analysis, based on widely accepted technologies such as Bigtop, Hadoop, HBase, Hive and many others.
  • Trend Micro uses Bigtop as the basis for its internal custom distribution of Hadoop, which starts with Bigtop but then pulls features from different upstream versions and includes Apache-licensed non-core contributions as its platform needs dictate.
  • Uniting Data's 100% open source platform is a Hadoop distribution based on Apache Bigtop.
  • WANdisco bases its 100% open source distro, WANdisco Distro (WDD), on Apache Bigtop.

Using Bigtop

Whether or not you have dabbled with Hadoop before, Apache Bigtop can go a long way towards making your life easier by providing infrastructure for easy deployment along with the latest Debian and RPM artifacts for various projects. Moreover, these artifacts have been integration tested, so you can rely on having a trustworthy, cutting-edge distribution of Hadoop and related projects on your cluster. You can use the wiki instructions to set up a pseudo-distributed cluster in no time, or use the Puppet recipes to set up a fully distributed cluster. You can also make use of the soon-to-be-introduced Bigtop integration with Apache Whirr.
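
For reference, here is a minimal sketch of what the pseudo-distributed bring-up looks like once the packages above are installed. The service names (hadoop-hdfs-namenode, hadoop-yarn-resourcemanager, hbase-master, and so on) are assumptions based on the Hadoop 2 packaging that Bigtop 0.5 produces; the wiki instructions remain the authoritative reference.

    # Format HDFS and start the core daemons (service names are assumptions
    # based on Bigtop 0.5's Hadoop 2 packaging).
    sudo -u hdfs hdfs namenode -format
    sudo service hadoop-hdfs-namenode start
    sudo service hadoop-hdfs-datanode start
    sudo service hadoop-yarn-resourcemanager start
    sudo service hadoop-yarn-nodemanager start

    # Smoke test: write a file to HDFS, run a trivial Hive statement, and
    # check HBase status. Depending on its configuration, HBase may also
    # need a running ZooKeeper (zookeeper-server package).
    sudo -u hdfs hadoop fs -mkdir -p /user/$USER
    sudo -u hdfs hadoop fs -chown $USER /user/$USER
    hadoop fs -put /etc/hosts hosts
    hive -e 'SHOW TABLES;'
    sudo service hbase-master start
    echo "status" | hbase shell

If those commands come back cleanly, you have HDFS, YARN, Hive and HBase running on a single machine, which is usually all you need to start experimenting.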

If you are a novice and would like to learn more about how you can use Apache Bigtop to quickly deploy Hadoop on your laptop and give it a test drive, or if you are a veteran and are curious to find out how Apache Bigtop can make your cluster more robust and easier to deploy, drop by my talk on Apache Bigtop at ApacheCon NA 2013 on February 26th, 2013.
