BigData Workflows Made Easy -> Glue

Glue is a job execution engine, written in Java and Groovy. 

Workflows are written in a Groovy DSL, Jython, Clojure, or JRuby, and use pre-developed modules to interact with external resources such as databases, Hadoop, Netezza, and FTP.

Glue is not XML-based and is not a BI tool. It is a tool that lets programmers write workflows for a production environment in any of the supported languages.

The nicest thing about Glue is its modules, which let you interact with databases, Hadoop clusters, and other resources using tested methods. A module is set up once and reused in every workflow. This keeps configuration out of the workflows and saves a lot of time otherwise spent debugging.
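
To make the "set up once, reuse everywhere" idea concrete, here is a minimal Python sketch of the pattern. It is not Glue's actual API: `ModuleRegistry`, `FakeSqlModule`, the connection URL, and both workflow functions are hypothetical names invented for illustration, and the SQL module only records queries instead of running them.

```python
# Hypothetical sketch of the configure-once module pattern; NOT Glue's real API.


class ModuleRegistry:
    """Holds modules that are configured once and shared by every workflow."""

    def __init__(self):
        self._modules = {}

    def register(self, name, module):
        self._modules[name] = module

    def get(self, name):
        return self._modules[name]


class FakeSqlModule:
    """Stand-in for a database module; it records queries instead of running them."""

    def __init__(self, connection_url):
        self.connection_url = connection_url  # configured once, at setup time
        self.executed = []

    def execute(self, query):
        self.executed.append(query)
        return f"ran on {self.connection_url}: {query}"


# One-time setup, kept out of the workflows themselves (the URL is made up).
registry = ModuleRegistry()
registry.register("sql", FakeSqlModule("jdbc:example://reports-db"))


# Two workflows that share the same module and carry no connection details.
def daily_report_workflow(ctx):
    return ctx.get("sql").execute("SELECT count(*) FROM events")


def cleanup_workflow(ctx):
    return ctx.get("sql").execute("DELETE FROM staging")
```

Calling `daily_report_workflow(registry)` and `cleanup_workflow(registry)` runs both against the same configured module, so changing the connection means editing the setup code only, not each workflow.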

Another cool feature is the ability to run data-driven workflows from Hadoop: you can register N workflows against an HDFS directory path and have them run automatically as data arrives in that directory.
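
The data-driven behaviour described above can be sketched roughly as follows. This is an assumption-heavy illustration, not how Glue implements triggers: it polls a local directory with `os.listdir` instead of HDFS, and `DirectoryTrigger`, `register`, and `poll` are hypothetical names.

```python
# Hypothetical sketch of directory-triggered workflows; NOT Glue's real trigger API.
# A local directory stands in for an HDFS path.
import os


class DirectoryTrigger:
    """Runs every workflow registered for a directory once per newly seen file."""

    def __init__(self):
        self._workflows = {}  # directory path -> list of workflow callables
        self._seen = set()    # file paths that have already been processed

    def register(self, directory, workflow):
        self._workflows.setdefault(directory, []).append(workflow)

    def poll(self):
        """Check each watched directory and run workflows for files not seen before."""
        results = []
        for directory, workflows in self._workflows.items():
            for name in sorted(os.listdir(directory)):
                path = os.path.join(directory, name)
                if path in self._seen:
                    continue
                self._seen.add(path)
                for workflow in workflows:
                    results.append(workflow(path))
        return results
```

A scheduler would call `poll()` periodically. Each new file is handed to every workflow registered for its directory, and files already processed are skipped on later polls.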

http://gerritjvv.github.io/glue/documentation.html

http://gerritjvv.github.io/glue/triggers.html




