
Deploying Apache Oozie 4.1.0 on a Hadoop Cluster and Scheduling an MR Job with Oozie


This demo gives step-by-step instructions for deploying Apache Oozie on Hadoop and for running a MapReduce job through Oozie.

  1. If you plan to install Oozie 4.0.1 or an earlier version, JDK 1.6 is required. If the JDK on Ubuntu is 1.7 or later, you need to make changes in the pom.xml file.
  2. If you install Oozie 4.1.0 or later, JDK 1.7 is fine.
  3. The MapReduce job history server must be configured and started successfully, and the remaining Hadoop and YARN daemons should be running.
  4. Hadoop should be running, i.e. the HDFS, MapReduce, and YARN services should all be up. Hadoop 2.6.0 is compatible with Oozie 4.1.0.
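
The JDK requirement in items 1 and 2 can be checked before building. This is a minimal sketch, assuming the JDK reports its version in the usual `1.7.0_75` form. The function name `java_version_ok` is hypothetical and is not part of Oozie or Hadoop:

```shell
#!/bin/sh
# Hypothetical helper: succeeds when a Java version string is 1.7 or newer.
# Accepts legacy "1.x.y_z" strings as well as newer "11.0.2"-style strings.
java_version_ok() {
  major=$(printf '%s\n' "$1" | cut -d. -f1)
  minor=$(printf '%s\n' "$1" | cut -d. -f2)
  # Reject empty or non-numeric components instead of guessing.
  case "$major" in ''|*[!0-9]*) return 1 ;; esac
  if [ "$major" -gt 1 ]; then
    return 0
  fi
  case "$minor" in ''|*[!0-9]*) return 1 ;; esac
  [ "$minor" -ge 7 ]
}

# Read the installed version, if a JDK is on the PATH.
if command -v java >/dev/null 2>&1; then
  installed=$(java -version 2>&1 | awk -F'"' '/version/ {print $2; exit}')
  if java_version_ok "$installed"; then
    echo "JDK $installed satisfies the 1.7 requirement"
  else
    echo "JDK $installed is older than 1.7 or could not be parsed"
  fi
else
  echo "java not found on PATH"
fi
```

The check only covers the JDK. Whether the Hadoop, YARN, and job history server daemons are up still has to be confirmed separately.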

5. The video shows a step-by-step guide to installing Apache Oozie on a Hadoop cluster and starting the Oozie web console.

Once Oozie is installed successfully, you can start scheduling a MapReduce job on the Hadoop cluster using Oozie.

6. First, extract the oozie-examples.tar.gz file.

$ cd $OOZIE_HOME
$ tar -xvf oozie-examples.tar.gz

7. Next, edit the job.properties files in the extracted examples directory.

/usr/local/oozie/oozie-bin$ find examples/ -name "job.properties" -exec sed -i "s/localhost:8020/localhost:9000/g" '{}' \;
/usr/local/oozie/oozie-bin$ find examples/ -name "job.properties" -exec sed -i "s/localhost:8021/localhost:8032/g" '{}' \;
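
To see what the two `sed` substitutions change without touching the real examples, they can be tried on a scratch copy. This is a sketch, assuming GNU `sed` (for `-i` without a backup suffix). The sample file contents and the temporary directory are made up for illustration, and the two substitutions are merged into a single `sed` call:

```shell
#!/bin/sh
# Build a throwaway directory that mimics the examples layout.
demo=$(mktemp -d)
mkdir -p "$demo/examples/apps/map-reduce"

# Sample job.properties (illustrative contents, not copied from the tarball).
cat > "$demo/examples/apps/map-reduce/job.properties" <<'EOF'
nameNode=hdfs://localhost:8020
jobTracker=localhost:8021
queueName=default
EOF

# Same substitutions as in step 7, combined with two -e expressions.
(
  cd "$demo" &&
  find examples/ -name "job.properties" -exec sed -i \
    -e "s/localhost:8020/localhost:9000/g" \
    -e "s/localhost:8021/localhost:8032/g" '{}' \;
)

# Show the rewritten file.
cat "$demo/examples/apps/map-reduce/job.properties"
```

After the run, the sample file reads `nameNode=hdfs://localhost:9000` and `jobTracker=localhost:8032`, with `queueName` unchanged. The target ports should match your own cluster configuration: 8032 is the default YARN ResourceManager address port, while 9000 is only correct if that is what your `fs.defaultFS` setting uses.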

8. Now, check the job.properties file located in the directory $OOZIE_HOME/examples/apps/map-reduce.

[Screenshot: Oozie job.properties]

9. Finally, copy the local files to HDFS and start the MR job with Oozie.

$ hadoop fs -put examples examples
$ hdfs dfs -mkdir -p /user/oozieuser/examples/apps/map-reduce/lib
$ hdfs dfs -copyFromLocal $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.0.jar /user/oozieuser/examples/apps/map-reduce/lib/hadoop-mapreduce-examples-2.6.0.jar
$ cd /usr/local/oozie/oozie-bin/
/usr/local/oozie/oozie-bin$ oozie job -oozie http://localhost:11000/oozie -config $OOZIE_HOME/examples/apps/map-reduce/job.properties -run
job: 0000000-150216182818445-oozie-user-W
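
The job ID printed by `-run` is needed for the status commands in the next step, so it can help to capture it in a variable. This is a sketch, assuming the CLI prints a single line of the form `job: <id>` as in the output above. `extract_job_id` is a hypothetical helper, and `OOZIE_URL` is the environment variable the Oozie CLI falls back to when `-oozie` is omitted:

```shell
#!/bin/sh
# Hypothetical helper: pull the workflow ID out of "job: <id>" output.
extract_job_id() {
  printf '%s\n' "$1" | sed -n 's/^job: *//p'
}

# Default server URL for later oozie commands (used when -oozie is omitted).
export OOZIE_URL=http://localhost:11000/oozie

# Submit only if the oozie client is installed and OOZIE_HOME is set;
# otherwise fall back to the sample line from the run above so the
# parsing can still be demonstrated.
if command -v oozie >/dev/null 2>&1 && [ -n "${OOZIE_HOME:-}" ]; then
  run_output=$(oozie job -config "$OOZIE_HOME/examples/apps/map-reduce/job.properties" -run)
else
  run_output="job: 0000000-150216182818445-oozie-user-W"
fi

job_id=$(extract_job_id "$run_output")
echo "Submitted workflow: $job_id"
```

With `job_id` set, the later `-info` and `-log` calls can use `"$job_id"` instead of a pasted ID.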

10. Once you have submitted the MapReduce job through Oozie, check the status of the job execution.

/usr/local/oozie/oozie-bin$ oozie job -oozie http://localhost:11000/oozie -info 0000000-150216182818445-oozie-user-W
/usr/local/oozie/oozie-bin$ oozie job -oozie http://localhost:11000/oozie -log 0000000-150216182818445-oozie-user-W
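
Rather than re-running `-info` by hand, the status line can be extracted and polled. This is a sketch, assuming the `-info` output contains a line that begins with `Status` and ends with the workflow state. The sample text and the helpers `workflow_status` and `wait_for_job` are illustrative and were not captured from a real run:

```shell
#!/bin/sh
# Hypothetical helper: read `oozie job -info` text on stdin, print the state.
workflow_status() {
  awk '/^Status/ {print $NF; exit}'
}

# Illustrative sample of the relevant part of -info output.
sample_info='Job ID : 0000000-150216182818445-oozie-user-W
Workflow Name : map-reduce-wf
Status        : RUNNING'

printf '%s\n' "$sample_info" | workflow_status

# Poll a real job until it leaves the PREP/RUNNING states.
# Requires the oozie client; it is defined here but not called.
wait_for_job() {
  while :; do
    state=$(oozie job -oozie http://localhost:11000/oozie -info "$1" | workflow_status)
    case "$state" in
      PREP|RUNNING) sleep 10 ;;
      *) echo "$state"; return 0 ;;
    esac
  done
}
```

On a machine with the Oozie client, `wait_for_job 0000000-150216182818445-oozie-user-W` would block until the workflow reaches a state such as SUCCEEDED, KILLED, or FAILED, and then print it.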

[Screenshot: Oozie job console]


Topics:
bigdata, hadoop, big data, oozie

Published at DZone with permission of Anindita Basak, DZone MVB. See the original article here.
