Deployment of Apache Oozie 4.1.0 in Hadoop Cluster and Schedule a MR Job with Oozie

This demo gives step-by-step instructions for deploying Apache Oozie on Hadoop and for running a MapReduce job through Oozie.

  1. If you plan to install Oozie 4.0.1 or an earlier version, JDK 1.6 is required. If the JDK on Ubuntu is 1.7 or later, you need to make changes in the pom.xml file.
  2. If you install Oozie 4.1.0 or later, JDK 1.7 is fine.
  3. The MapReduce job history server needs to be configured and started successfully, and the remaining Hadoop and YARN daemons should be running.
  4. Hadoop itself should be running, i.e. the HDFS, MapReduce, and YARN services. Hadoop 2.6.0 is compatible with Oozie 4.1.0.
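The daemon prerequisites above can be checked from the shell before installing Oozie. The following is a minimal sketch, not part of the original walkthrough: the function name `missing_daemons` and the list of required daemons are assumptions for a single-node Hadoop 2.6.0 setup, and the daemon names are the class names that `jps` normally prints.

```shell
#!/usr/bin/env bash
# Daemons assumed to be required for this setup (names as printed by jps).
REQUIRED_DAEMONS="NameNode DataNode ResourceManager NodeManager JobHistoryServer"

# Reads jps-style output ("<pid> <ClassName>") on stdin and prints
# each required daemon that is absent, one per line.
missing_daemons() {
  local running d
  running="$(awk '{print $2}')"
  for d in $REQUIRED_DAEMONS; do
    # -x matches the whole line, so SecondaryNameNode does not count as NameNode.
    if ! printf '%s\n' "$running" | grep -qx "$d"; then
      printf '%s\n' "$d"
    fi
  done
}

# Check the live JVM list only when jps is available on this machine.
if command -v jps >/dev/null 2>&1; then
  jps | missing_daemons
fi
```

An empty result means every daemon in the list was found. Any name that is printed should be started before continuing.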

5. In this video, I walk step by step through installing Apache Oozie on a Hadoop cluster and starting the Oozie web console.

Once Oozie is installed successfully, you can start scheduling a MapReduce job on the Hadoop cluster using Oozie.

6. First, extract the oozie-examples.tar.gz file.

$ cd $OOZIE_HOME
$ tar -xvf oozie-examples.tar.gz

7. Next, edit the job.properties files in the examples directory so that the NameNode and ResourceManager ports match your cluster.

/usr/local/oozie/oozie-bin$ find examples/ -name "job.properties" -exec sed -i "s/localhost:8020/localhost:9000/g" '{}' \;
/usr/local/oozie/oozie-bin$ find examples/ -name "job.properties" -exec sed -i "s/localhost:8021/localhost:8032/g" '{}' \;
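To see what these two substitutions do before touching the real files, they can be tried on sample input. This is a sketch under assumptions: the function name `rewrite_ports` is hypothetical, and the two sample property lines are assumed to resemble what the example job.properties files contain. Port 9000 is presumably the NameNode port configured for this cluster, and 8032 is the usual default port of the YARN ResourceManager.

```shell
#!/usr/bin/env bash
# Applies the same two sed expressions as the find/sed commands above,
# but as a filter (stdin to stdout) so that no file is modified.
rewrite_ports() {
  sed -e 's/localhost:8020/localhost:9000/g' \
      -e 's/localhost:8021/localhost:8032/g'
}

# Assumed sample content; the real files may contain more properties.
printf '%s\n' \
  'nameNode=hdfs://localhost:8020' \
  'jobTracker=localhost:8021' | rewrite_ports
```

If the output shows the expected host:port pairs, the in-place `sed -i` variant can be run against the examples directory.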

8. Now, check the job.properties file located in the directory $OOZIE_HOME/examples/apps/map-reduce.

[Image: Oozie job.properties]

9. Finally, copy the local files to HDFS and start the MR job with Oozie.

    $ hadoop fs -put examples examples

    $ hdfs dfs -mkdir -p /user/oozieuser/examples/apps/map-reduce/lib

    $ hdfs dfs -copyFromLocal $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.0.jar /user/oozieuser/examples/apps/map-reduce/lib/hadoop-mapreduce-examples-2.6.0.jar
    $ cd /usr/local/oozie/oozie-bin/
    /usr/local/oozie/oozie-bin$ oozie job -oozie http://localhost:11000/oozie -config $OOZIE_HOME/examples/apps/map-reduce/job.properties -run
    job: 0000000-150216182818445-oozie-user-W
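The job ID printed by the `-run` command is needed for the follow-up commands, so it can be convenient to capture it in a variable. The sketch below assumes the output contains a line of the form `job: <id>`, as in the transcript above. The helper name `extract_job_id` is hypothetical.

```shell
#!/usr/bin/env bash
# Prints the ID from the first "job: <id>" line found on stdin.
extract_job_id() {
  sed -n 's/^[[:space:]]*job:[[:space:]]*//p' | head -n 1
}

# Example with the output line from the transcript above.
printf 'job: 0000000-150216182818445-oozie-user-W\n' | extract_job_id
```

In practice the submit command could be piped through the helper, for example `JOB_ID="$(oozie job -oozie http://localhost:11000/oozie -config $OOZIE_HOME/examples/apps/map-reduce/job.properties -run | extract_job_id)"`.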

10. Once you have submitted the MapReduce job through Oozie, check the status and the log of the job execution.

/usr/local/oozie/oozie-bin$ oozie job -oozie http://localhost:11000/oozie -info 0000000-150216182818445-oozie-user-W
/usr/local/oozie/oozie-bin$ oozie job -oozie http://localhost:11000/oozie -log 0000000-150216182818445-oozie-user-W
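Instead of running `-info` by hand repeatedly, the status can be polled until the workflow reaches a terminal state. This is only a sketch: it assumes the `-info` output contains a line beginning with `Status` followed by a colon and the state, and the helper names `extract_status` and `wait_for_job` as well as the 10-second interval are hypothetical choices.

```shell
#!/usr/bin/env bash
# Assumed Oozie server URL, taken from the commands above.
OOZIE_URL_DEFAULT="http://localhost:11000/oozie"

# Prints the value of the first "Status : <STATE>" line found on stdin.
extract_status() {
  awk -F: '/^Status/ {gsub(/^[ \t]+|[ \t]+$/, "", $2); print $2; exit}'
}

# Polls `oozie job -info` until the workflow job finishes.
# Returns 0 on SUCCEEDED, 1 on KILLED or FAILED, 2 if no status could be read.
wait_for_job() {
  local job_id="$1" url="${2:-$OOZIE_URL_DEFAULT}" status
  while true; do
    status="$(oozie job -oozie "$url" -info "$job_id" | extract_status)"
    echo "status: $status"
    case "$status" in
      SUCCEEDED) return 0 ;;
      KILLED|FAILED) return 1 ;;
      "") echo "could not read job status" >&2; return 2 ;;
    esac
    sleep 10
  done
}
```

A call such as `wait_for_job 0000000-150216182818445-oozie-user-W` would then block until the job ends. The `-log` command remains the place to look when the job is killed or fails.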

[Image: Oozie job console]

Topics:
bigdata, hadoop, big data, oozie

Published at DZone with permission of
