Over a million developers have joined DZone.
{{announcement.body}}
{{announcement.title}}

Connecting to MapR via JDBC

DZone's Guide to

Connecting to MapR via JDBC

Learn to do some connection testing via JDBC to Hiveserver 2 in MapR 4.0.2 in a secure and kerberized cluster. Connect via JDBC to a Kerberized cluster.

· Big Data Zone
Free Resource

Access NoSQL and Big Data through SQL using standard drivers (ODBC, JDBC, ADO.NET). Free Download 

I recently needed to do some connection testing via JDBC to Hiveserver 2 in MapR 4.0.2 in a secure and kerberized cluster. Since this was new to me I thought it would be worthwhile to write-up my notes as I will sure forget half of this in less than a few weeks. Just in case you are needing to connect via JDBC to a Kerberized cluster here are some tips.

MapR has its own login system called maprlogin for use in a secure cluster setup that requires authentication via password or Kerberos to use the cluster.

hadoop fs -ls /
ls: failure to login: Unable to obtain MapR credentials

Once completed users can act on the cluster via the command line normally via the hadoop commands (or through the Linux command line in NFS mounted MapR-FS).

kinit -V -kt ./mapr.keytab mapr/my.cluster.com@TEST.DOMAIN.COMUsing default cache: /tmp/krb5cc_7286Using principal: mapr/my.cluster.com@TEST.DOMAIN.COMUsing keytab: ./mapr.keytabAuthenticated to Kerberos v5klistTicket cache: FILE:/tmp/krb5cc_7286Default principal: mapr/my.cluster.com@TEST.DOMAIN.COM Valid starting  Expires  Service principal06/16/15 10:20:51  06/16/15 20:20:51  krbtgt/TEST.DOMAIN.COM@TEST.DOMAIN.COM  renew until 06/18/15 10:20:51maprlogin kerberosMapR credentials of user 'mapr' for cluster 'my.cluster.com' are written to '/tmp/maprticket_7286'hadoop fs -ls /Found 8 itemsdrwxr-xr-x  - 2000 2000  0 2015-01-24 16:33 /appsdrwxrwxrwx  - 2000 2000  16 2015-06-11 16:02 /datadrwxr-xr-x  - 2000 2000  0 2015-01-24 16:33 /hbasedrwxr-xr-x  - 2000 2000  1 2015-06-12 09:34 /ooziedrwxr-xr-x  - 2000 2000  1 2015-02-13 10:54 /sasdrwxrwxrwx  - 2000 2000  91 2015-06-16 10:16 /tmpdrwxr-xr-x  - 2000 2000  36 2015-06-15 09:43 /userdrwxr-xr-x  - 2000 2000  1 2015-01-24 16:33 /var

For JDBC connections users must use the principal configured by the Kerberos and Hadoop Admins for the hive service (not their user principal – that is used in the previous step). See the setting “hive.server2.authentication.kerberos.principal” along with the matching keytab specified in “hive.server2.authentication.kerberos.keytab” which in MapR is typically under /opt/mapr/conf/mapr.keytab.

As a quick refresher here is what we are dealing with in terms of a Kerberos principal for a JDBC connection string:

jdbc:hive2://HS2.FQDN:10000/default;principal=hive/fully.qual.dn.com@ TEST.DOMAIN.COM

HS2.FQDN – Hive server 2 fully qualified Domain name and port – Default is 10000

primary – username – user must exist on Hiveserver2 node. You will normally see a user principal that matches the service name – i.e., hive, yarn etc.

Instance – use a FQDN that is resolvable both forward and reverse. This is a property of Apache Hive not specific to the MapR distribution (Hive version 0.13 was used in this testing against MapR 4.0.2). Hive attempts to resolve the hostname (in green above) when connecting via Kerberos.

Realm – correct Kerberos Realm name – see your local /etc/krb5.conf or consult your Kerberos Administrator.

And here are some examples of how to connect. Beeline is a Hive service you can use that is built into Hive. There are many tools that use a similar methods of connection that can be used.

hive --service beeline

Beeline version 0.13.0-mapr-1501 by Apache Hive

beeline> !connect jdbc:hive2://hs2.server.name:10000/default;principal=hive/hs2.server.name.com@TEST.DOMAIN.COM

scan complete in 5ms

Connecting to jdbc:hive2://hs2.server.name:10000/default;principal=hive/hs2.server.name.com@TEST.DOMAIN.COM

Enter username for jdbc:hive2://hs2.server.name:10000/default;principal=hive/hs2.server.name.com@TEST.DOMAIN.COM:

Enter password for jdbc:hive2://hs2.server.name:10000/default;principal=hive/hs2.server.name.com@TEST.DOMAIN.COM:

Connected to: Apache Hive (version 0.13.0-mapr-1501)

Driver: Hive JDBC (version 0.13.0-mapr-1501)

Transaction isolation: TRANSACTION_REPEATABLE_READ

0: jdbc:hive2://hs2.server.name:10000/default> show tables;

+-----------+

| tab_name  |

+-----------+

+-----------+

No rows selected (0.284 seconds)

0: jdbc:hive2://hs2.server.name:10000/default>

You may need to either call the beeline client with a fake user name and password or hit enter twice interactively. This is a known bug.

Java Client

import java.io.*;
import java.util.*;
import java.sql.*;

public class HiveJdbcClient3 {
    public static void main(String[] args) throws SQLException,ClassNotFoundException,InstantiationException,IllegalAccessException {

        Driver hiveDriver;
        Connection con;
        String connection_string;
        Properties props = new Properties();
        connection_string = "jdbc:hive2://" + args[0] + ":" + args[1] + "/default;principal=" + args[2];
        System.out.println( "Full connection string is: " + connection_string );
        hiveDriver = (Driver)Class.forName("org.apache.hive.jdbc.HiveDriver").newInstance();
        System.out.println("BEFORE");
        con = hiveDriver.connect( connection_string, props );
        System.out.println("AFTER");
        Statement stmt = con.createStatement();

        // See if show tables works
        ResultSet res;
        String sql = "show tables" ;
        System.out.println("Running: " + sql);
        stmt.executeQuery(sql);
        System.out.println("Done with show tables");

}
}


The fastest databases need the fastest drivers - learn how you can leverage CData Drivers for high performance NoSQL & Big Data Access.

Topics:
jdbc ,big data ,hiveserver ,mapr ,bigdata

Published at DZone with permission of Adam Diaz, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.

{{ parent.title || parent.header.title}}

{{ parent.tldr }}

{{ parent.urlSource.name }}