
Glue and Big Data: Getting Started, Part 1


Where to start?

Glue is split into three parts:
  • glue-rest - the workflow engine that executes your jobs
  • gluecron - the cron/data-driven daemon that launches workflows based on a cron schedule or on data in HDFS
  • glue-ui - a simple UI that gives you insight into the running workflows and their output


Initial Requirements

As with all things Hadoop-related, it's best to use Linux. Technically Glue does not require a Linux machine because it runs on the JVM, but even for trying out the examples it's best to create a Linux VM (Ubuntu or CentOS) using VirtualBox or another VM app.

Java 6+

Nothing more is required; Glue is packaged with its own libraries.
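A quick way to check the Java requirement is to read the version that the installed java binary reports. The following is only a sketch: java_major and installed_java_version are hypothetical helper names, not part of Glue, and the parsing assumes the usual `java -version` output with a quoted version string on its first line.

```shell
# Hypothetical helper: reduce a Java version string to its major version.
# Old-style strings such as 1.6.0_45 carry the major version in the second
# field; newer ones such as 11.0.2 carry it in the first.
java_major() {
  case "$1" in
    1.*) echo "$1" | cut -d. -f2 ;;
    *)   echo "$1" | cut -d. -f1 ;;
  esac
}

# Hypothetical helper: extract the quoted version string from `java -version`,
# which writes its report to stderr rather than stdout.
installed_java_version() {
  java -version 2>&1 | sed -n 's/.*version "\([^"]*\)".*/\1/p' | head -n 1
}
```

Calling java_major "$(installed_java_version)" should then print a number that can be compared against 6. If java is not on the PATH, the version string is empty and the check should be treated as failed.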

Install Glue Rest

Ok, I could have chosen a better name, but the naming sort of stuck ever since Glue was created. This is a simple step: download the rpm from https://sourceforge.net/projects/glueworkflows/files
If you're using Ubuntu, use sudo alien 'rpm' to convert it to a deb.

To install, type the command that matches your package format (rpm or deb):

sudo rpm -i 'rpm'
sudo dpkg -i 'deb'
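The choice between the two install commands can be sketched as a small shell function. This is only an illustration: install_cmd is a hypothetical helper and the file names passed to it are placeholders for whatever package you actually downloaded. It prints the command instead of running it, so you can inspect it first.

```shell
# Hypothetical helper: print the install command for a package file,
# chosen by the file extension (rpm for CentOS, deb for Ubuntu).
install_cmd() {
  case "$1" in
    *.rpm) echo "sudo rpm -i $1" ;;
    *.deb) echo "sudo dpkg -i $1" ;;
    *)     echo "unsupported package: $1" >&2; return 1 ;;
  esac
}
```

For example, install_cmd glue.rpm prints the rpm command for a placeholder file named glue.rpm, and any other extension is rejected with a non-zero exit status.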

The package installs to /opt/glue, and you can start it with either of:

service glue-server start
/etc/init.d/glue-server start

Install Glue Cron

Download the gluecron rpm from https://sourceforge.net/projects/glueworkflows/files
Again, for a deb use sudo alien 'rpm'.

To install, type the command that matches your package format (rpm or deb):

sudo rpm -i 'rpm'
sudo dpkg -i 'deb'

The package installs to /opt/gluecron, and you can start it with either of:

service gluecron start
/etc/init.d/gluecron start
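Both glue-server and gluecron are started in the same two ways, so the choice can be sketched once. The function below is a hypothetical helper, not something shipped with Glue: it prints the service form when a service command is available on the machine and falls back to the init script path otherwise.

```shell
# Hypothetical helper: print the start command for a Glue service name
# (glue-server or gluecron), preferring `service` when it is on the PATH.
start_cmd() {
  name="$1"
  if command -v service >/dev/null 2>&1; then
    echo "service $name start"
  else
    echo "/etc/init.d/$name start"
  fi
}
```

Running start_cmd glue-server or start_cmd gluecron prints one of the two commands shown above, depending on the machine.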

Don't worry if gluecron or glue give you errors on startup at the moment.
Both still need to be configured first.
That is the aim of part 2 (coming soon).

To explore more please go to: http://gerritjvv.github.io/glue/documentation.html



