
Enabling JMX Monitoring for Hadoop & Hive


Hadoop’s NameNode and JobTracker expose interesting metrics and statistics over JMX. Hive does not seem to expose anything interesting, but it can still be useful to monitor its JVM or to do simple profiling/sampling on it. Let’s see how to enable JMX and how to access it securely over SSH.

Background: We run NameNode, JobTracker and Hive on the same server. Monitoring of TaskTrackers and DataNodes isn’t that interesting, but it still might be useful to have.



diff --git a/etc/hadoop/hadoop-env.sh b/etc/hadoop/hadoop-env.sh
index 69a13b1..e8ca596 100644
--- a/etc/hadoop/hadoop-env.sh
+++ b/etc/hadoop/hadoop-env.sh
@@ -14,7 +14,8 @@ export HADOOP_CONF_DIR=${HADOOP_CONF_DIR:-"/etc/hadoop"}

 # Extra Java runtime options. Empty by default.
-export HADOOP_OPTS="-Djava.net.preferIPv4Stack=true $HADOOP_CLIENT_OPTS"
+# Added $HIVE_OPTS that is set by hive-env.sh when starting hiveserver
+export HADOOP_OPTS="-Djava.net.preferIPv4Stack=true $HADOOP_CLIENT_OPTS $HIVE_OPTS"

 # Command specific options appended to HADOOP_OPTS when specified
 export HADOOP_NAMENODE_OPTS="-Dhadoop.security.logger=INFO,DRFAS -Dhdfs.audit.logger=INFO,DRFAAUDIT $HADOOP_NAMENODE_OPTS"
@@ -43,3 +44,16 @@ export HADOOP_SECURE_DN_PID_DIR=/var/run/hadoop

 # A string representing this instance of hadoop. $USER by default.
+### JMX settings
+export JMX_OPTS=" -Dcom.sun.management.jmxremote.authenticate=false \
+    -Dcom.sun.management.jmxremote.ssl=false \
+    -Dcom.sun.management.jmxremote.port"
+#    -Dcom.sun.management.jmxremote.password.file=$HADOOP_HOME/conf/jmxremote.password \
+#    -Dcom.sun.management.jmxremote.access.file=$HADOOP_HOME/conf/jmxremote.access"

The JMX_OPTS setting is used for Hadoop’s daemons, while $HIVE_OPTS was added to HADOOP_OPTS for Hive.
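Note that JMX_OPTS in the diff ends with -Dcom.sun.management.jmxremote.port and no value, which suggests that each daemon appends its own port. The following is a minimal sketch of how that could look, assuming the ports named in this article (8006 for NameNode, 8007 for JobTracker). The two export lines are an assumption and are not part of the diff shown here; HADOOP_JOBTRACKER_OPTS is the usual Hadoop 1.x per-daemon variable.

```shell
# JMX_OPTS as in the diff: it deliberately ends with "...jmxremote.port" (no value).
JMX_OPTS=" -Dcom.sun.management.jmxremote.authenticate=false \
    -Dcom.sun.management.jmxremote.ssl=false \
    -Dcom.sun.management.jmxremote.port"

# Assumed wiring (not shown in the diff): append "=<port>" once per daemon
# so that every JVM on the machine listens on its own JMX port.
export HADOOP_NAMENODE_OPTS="$JMX_OPTS=8006 $HADOOP_NAMENODE_OPTS"
export HADOOP_JOBTRACKER_OPTS="$JMX_OPTS=8007 $HADOOP_JOBTRACKER_OPTS"

# Print the result to check that the port flag was completed as intended.
echo "$HADOOP_NAMENODE_OPTS"
```

Keeping the port out of JMX_OPTS means the shared flags are written once and only the port differs per daemon.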

<hive home>/conf/hive-env.sh

Enable JMX only when running the Hive Thrift server. We don’t want it for the command-line client etc.: it would be pointless there, and we would have to make sure that each instance gets a unique port.

if [ "$SERVICE" = "hiveserver" ]; then
  JMX_OPTS="-Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.port=8008"
  export HIVE_OPTS="$HIVE_OPTS $JMX_OPTS"
fi


When you start the Hive server via hive --service hiveserver, it actually executes “hadoop jar …”, so to be able to pass options from hive-env.sh to the JVM we had to add $HIVE_OPTS to HADOOP_OPTS in hadoop-env.sh. (I haven’t found a cleaner way to do it.)


When we now start Hive or any of the Hadoop daemons, they will expose their metrics at their respective ports (NameNode – 8006, JobTracker – 8007, Hive – 8008).

(If you are running DataNode and/or TaskTracker on the same machine then you’ll need to change their ports to be unique.)
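After restarting the services, a quick way to confirm that the JMX ports are actually listening is to try opening a TCP connection to each of them on the server. This is only a sketch: it assumes bash (for its /dev/tcp feature) and the coreutils timeout command are available, and check_port is a hypothetical helper name.

```shell
# Report whether each JMX port from the article accepts TCP connections.
check_port() {
  # $1 = host, $2 = port; returns 0 if a TCP connection can be opened within 2 s.
  timeout 2 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
}

for entry in NameNode:8006 JobTracker:8007 Hive:8008; do
  name=${entry%%:*}   # text before the colon
  port=${entry##*:}   # text after the colon
  if check_port localhost "$port"; then
    echo "$name: JMX port $port is open"
  else
    echo "$name: JMX port $port is NOT reachable"
  fi
done
```

An open port only shows that something is listening; connecting with a JMX client is still needed to confirm that the metrics are exposed.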

Secure Connection Over SSH

Read the post VisualVM: Monitoring Remote JVM Over SSH (JMX Or Not) to find out how to connect securely to the JMX ports over SSH, e.g. with VisualVM (spoiler: ssh -D 9696 hostname; use the SOCKS proxy at localhost:9696).
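As a rough illustration of the spoiler, the sketch below builds the standard JMX-over-RMI service URL and prints the two commands involved. It uses jconsole rather than VisualVM only because jconsole takes everything on the command line; the host name "hostname" is a placeholder, jmx_url is a hypothetical helper, and the socksProxyHost/socksProxyPort system properties are the standard Java networking properties, passed to the tool's JVM with -J.

```shell
# Placeholder values: replace "hostname" with your Hadoop server.
HADOOP_HOST="hostname"
SOCKS_PORT=9696

jmx_url() {
  # Standard JMX-over-RMI service URL for host $1 and JMX port $2.
  printf 'service:jmx:rmi:///jndi/rmi://%s:%s/jmxrmi\n' "$1" "$2"
}

# Step 1 (terminal A): open a SOCKS proxy over SSH; it stays up while the session lasts.
echo "ssh -D $SOCKS_PORT $HADOOP_HOST"

# Step 2 (terminal B): start the JMX client with its traffic routed through the proxy.
echo "jconsole -J-DsocksProxyHost=localhost -J-DsocksProxyPort=$SOCKS_PORT $(jmx_url "$HADOOP_HOST" 8006)"
```

A SOCKS proxy is convenient here because JMX over RMI typically opens a second, dynamically chosen port in addition to the configured one, which a single forwarded port would not cover.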





Published at DZone with permission of Jakub Holý, DZone MVB. See the original article here.

