Over a million developers have joined DZone.

Enabling JMX Monitoring for Hadoop & Hive

DZone 's Guide to

Enabling JMX Monitoring for Hadoop & Hive

· DevOps Zone ·
Free Resource

Hadoop’s NameNode and JobTracker expose interesting metrics and statistics over the JMX. Hive seems not to expose anything intersting but it still might be useful to monitor its JVM or do simpler profiling/sampling on it. Let’s see how to enable JMX and how to access it securely, over SSH.

Background: We run NameNode, JobTracker and Hive on the same server. Monitoring og TaskTrackers and DataNodes isn’t that interesting but still might be useful to have.



diff --git a/etc/hadoop/hadoop-env.sh b/etc/hadoop/hadoop-env.sh
index 69a13b1..e8ca596 100644
--- a/etc/hadoop/hadoop-env.sh
+++ b/etc/hadoop/hadoop-env.sh
@@ -14,7 +14,8 @@ export HADOOP_CONF_DIR=${HADOOP_CONF_DIR:-"/etc/hadoop"}

 # Extra Java runtime options. Empty by default.
-export HADOOP_OPTS="-Djava.net.preferIPv4Stack=true $HADOOP_CLIENT_OPTS"
+# Added $HIVE_OPTS that is set by hive-env.sh when starting hiveserver
+export HADOOP_OPTS="-Djava.net.preferIPv4Stack=true $HADOOP_CLIENT_OPTS $HIVE_OPTS"

 # Command specific options appended to HADOOP_OPTS when specified
 export HADOOP_NAMENODE_OPTS="-Dhadoop.security.logger=INFO,DRFAS -Dhdfs.audit.logger=INFO,DRFAAUDIT $HADOOP_NAMENODE_OPTS"
@@ -43,3 +44,16 @@ export HADOOP_SECURE_DN_PID_DIR=/var/run/hadoop

 # A string representing this instance of hadoop. $USER by default.
+### JMX settings
+export JMX_OPTS=" -Dcom.sun.management.jmxremote.authenticate=false \
+    -Dcom.sun.management.jmxremote.ssl=false \
+    -Dcom.sun.management.jmxremote.port"
+#    -Dcom.sun.management.jmxremote.password.file=$HADOOP_HOME/conf/jmxremote.password \
+#    -Dcom.sun.management.jmxremote.access.file=$HADOOP_HOME/conf/jmxremote.access"

The JMX setting is used for Hadoop’s daemons while the HIVE_OPTS was added for Hive.

<hive home>/conf/hive-env.sh

Enable JMX when running the Hive thrift server (we don’t want it when running the command-line client etc. since it’s pointless and we wouldn’t need to make sure that each of them has a unique port):

if [ "$SERVICE" = "hiveserver" ]; then
  JMX_OPTS="-Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.port=8008"


When you start Hive server via hive –service hiveserver then it actually executes “hadoop jar …” so to be able to pass options from hive-env.sh to the JVM we had to add $HIVE_OPTS in hadoop-env.sh. (I haven’t found a cleaner way to do it.)


When we now start Hive or any of the Hadoop daemons, they will expose their metrics at their respective ports (NameNode – 8006, JobTracker – 8007, Hive – 8008).

(If you are running DataNode and/or TaskTracker on the same machine then you’ll need to change their ports to be unique.)

Secure Connection Over SSH

Read the post VisualVM: Monitoring Remote JVM Over SSH (JMX Or Not) to find out how to connect securely to the JMX ports over ssh, f.ex. with VisualVM (spolier: ssh -D 9696 hostname; use proxy at localhost:9696).




Published at DZone with permission of

Opinions expressed by DZone contributors are their own.

{{ parent.title || parent.header.title}}

{{ parent.tldr }}

{{ parent.urlSource.name }}