Using an Impala JDBC Driver to Query Apache Kudu

Learn how to access the Progress DataDirect Impala JDBC driver so that you can query Kudu tablets using Impala SQL syntax.

Apache Kudu is a columnar storage manager for the Apache Hadoop platform. It provides fast analytical and real-time capabilities, efficient use of CPU and I/O resources, in-place updates, and a simple, evolvable data model. You can learn more about Apache Kudu's features in the documentation.

One of the features of Apache Kudu is its tight integration with Apache Impala, which lets you insert, update, delete, and query Kudu data, along with several other operations. In this tutorial, we will walk you through using the Progress DataDirect Impala JDBC driver to query Kudu tablets with Impala SQL syntax.


Before you start with this tutorial, we expect you to have an existing Apache Kudu instance with Impala installed. If you don’t, you can follow this getting started tutorial to spin up an Apache Kudu VM and load the data into it.

This tutorial also assumes that you have the Progress DataDirect Impala JDBC driver. If you do not, follow these three simple steps:

  1. Download the Progress DataDirect Impala JDBC driver.
  2. Once the package is downloaded, unzip it and run PROGRESS_DATADIRECT_JDBC_INSTALL.exe.
  3. The installation is straightforward: just follow the instructions. For most users, the default settings are sufficient to install the driver successfully.

Configure and Test Connection

  1. To configure and connect to Apache Kudu using the DataDirect Impala JDBC driver, we will be using SQL Workbench.
  2. Open SQL Workbench and go to File > Connect Window, which will open a new window. On the bottom left of that window, you will find a button named Manage Drivers. Click on it.
  3. Add a new driver by clicking the New button. Name it Impala and browse to impala.jar, which is in the lib folder of the driver's installation directory. Click OK once you are finished.


  4. You should be back at the Connect window. Create a new connection, give it any name, and choose Impala (com.ddtek.jdbc.impala.ImpalaDriver) as your driver.

  5. Fill in the connection URL for your Impala server and your credentials in the respective fields.



  6. Click the Test button to confirm that the connection succeeds. Then click OK, and you should be able to query your Apache Kudu instance.
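
The same connection test can also be made from code rather than from SQL Workbench. The following is a minimal Java sketch, not the driver's documented setup: the URL layout (jdbc:datadirect:impala://host:port;DatabaseName=name), the port 21050, the database name default, and the credentials are all assumptions, so check them against your driver's documentation and your own cluster. It also assumes that impala.jar is on the classpath and that the driver registers itself with DriverManager, as JDBC 4 drivers normally do.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

public class KuduImpalaConnect {

    // Assumed DataDirect-style URL layout; verify it against your driver's documentation.
    public static String buildUrl(String host, int port, String database) {
        return "jdbc:datadirect:impala://" + host + ":" + port + ";DatabaseName=" + database;
    }

    public static void main(String[] args) {
        // Hypothetical host, port, database, and credentials; replace with your own.
        String url = buildUrl("localhost", 21050, "default");
        try (Connection conn = DriverManager.getConnection(url, "user", "password")) {
            System.out.println("Connected to: " + conn.getMetaData().getDatabaseProductName());
        } catch (SQLException e) {
            // Reached when the driver jar is missing or the server cannot be contacted.
            System.err.println("Connection failed: " + e.getMessage());
        }
    }
}
```

If the program prints a connection failure, first check that impala.jar is on the classpath and that the host and port match your Impala daemon.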

Sample Queries

Once you have loaded the data by following the getting started tutorial for Apache Kudu, you can run queries against it.

For example, here are a few basic queries to test it out:

SELECT * FROM sfmta LIMIT 1;
INSERT INTO sfmta VALUES (1323, 123, -122.32, 32.22, 12.322, 52.0);
SELECT * FROM sfmta WHERE report_time = 1323;
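
The same three statements can be run over JDBC instead of typing them into SQL Workbench. The sketch below is one way to do it, assuming you already hold an open java.sql.Connection to Impala and that the sfmta table exists. The class and method names (SfmtaSampleQueries, printRows, runAll) are hypothetical names chosen for illustration. Column names are read from the result set metadata, so the code does not assume anything about the table's schema.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.ResultSetMetaData;
import java.sql.SQLException;
import java.sql.Statement;

public class SfmtaSampleQueries {

    // The three statements from the article, with only keyword casing normalized.
    public static final String SELECT_ONE = "SELECT * FROM sfmta LIMIT 1";
    public static final String INSERT_ROW =
        "INSERT INTO sfmta VALUES (1323, 123, -122.32, 32.22, 12.322, 52.0)";
    public static final String SELECT_BY_TIME = "SELECT * FROM sfmta WHERE report_time = 1323";

    // Prints each row of a query as tab-separated name=value pairs.
    static void printRows(Statement stmt, String sql) throws SQLException {
        try (ResultSet rs = stmt.executeQuery(sql)) {
            ResultSetMetaData meta = rs.getMetaData();
            while (rs.next()) {
                StringBuilder line = new StringBuilder();
                for (int i = 1; i <= meta.getColumnCount(); i++) {
                    if (i > 1) {
                        line.append('\t');
                    }
                    line.append(meta.getColumnName(i)).append('=').append(rs.getString(i));
                }
                System.out.println(line);
            }
        }
    }

    // Runs the sample statements in order on an already-open connection.
    public static void runAll(Connection conn) throws SQLException {
        try (Statement stmt = conn.createStatement()) {
            printRows(stmt, SELECT_ONE);
            stmt.executeUpdate(INSERT_ROW);
            printRows(stmt, SELECT_BY_TIME);
        }
    }
}
```

To use it, obtain a Connection from DriverManager.getConnection with your Impala URL and credentials, then call SfmtaSampleQueries.runAll(conn). The final query should return the row that the INSERT statement just added.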

We hope this tutorial helped you get connected to Apache Kudu using the Progress DataDirect Impala JDBC driver.

Published at DZone with permission of Saikrishna Teja Bobba, DZone MVB. See the original article here.
