Using an Impala JDBC Driver to Query Apache Kudu

Learn how to access the Progress DataDirect Impala JDBC driver so that you can query Kudu tablets using Impala SQL syntax.

by Saikrishna Teja Bobba · Jul. 25, 17 · Tutorial

Apache Kudu is a columnar storage manager for the Apache Hadoop platform. It provides fast analytical and real-time capabilities, efficient use of CPU and I/O resources, in-place updates, and a simple, evolvable data model. You can learn more about Apache Kudu's features in the documentation.

Apache Kudu integrates tightly with Apache Impala, which lets you insert, update, delete, and query Kudu data, along with several other operations. This tutorial walks you through using the Progress DataDirect Impala JDBC driver to query Kudu tablets with Impala SQL syntax.

Prerequisites

Before you start with this tutorial, we expect you to have an existing Apache Kudu instance with Impala installed. If you don’t, you can follow this getting started tutorial to spin up an Apache Kudu VM and load the data into it.

This tutorial also assumes that you have the Progress DataDirect Impala JDBC driver. If you do not, follow these three simple steps:

  1. Download the Cloudera Impala JDBC driver.
  2. Once the package is downloaded, unzip it and run PROGRESS_DATADIRECT_JDBC_INSTALL.exe.
  3. Follow the installer's instructions. For most users, the default settings are sufficient to install the driver successfully.
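
If you also plan to use the driver from your own Java code, a quick way to confirm that the installed jar is visible to the JVM is to try loading the driver class. The following is a minimal sketch, not part of the official installation steps. It assumes the driver class name used later in this tutorial (com.ddtek.jdbc.impala.ImpalaDriver) and that impala.jar from the installation's lib folder has been added to the classpath. The class and method names (DriverCheck, isOnClasspath) are hypothetical.

```java
public class DriverCheck {
    // Driver class name as it appears later in this tutorial.
    static final String IMPALA_DRIVER = "com.ddtek.jdbc.impala.ImpalaDriver";

    /** Returns true if the named class can be loaded from the current classpath. */
    static boolean isOnClasspath(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Prints a hint instead of failing when the jar is missing.
        if (isOnClasspath(IMPALA_DRIVER)) {
            System.out.println("DataDirect Impala driver found");
        } else {
            System.out.println("Driver not found; add impala.jar to the classpath");
        }
    }
}
```

Run it with impala.jar on the classpath (for example through the -cp option of the java command) to see whether the driver is picked up.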

Configure and Test Connection

  1. We will use SQL Workbench to configure and test the connection to Apache Kudu through the DataDirect Impala JDBC driver.
  2. Open SQL Workbench and go to File > Connect Window, which will open a new window. On the bottom left of that window, you will find a button named Manage Drivers. Click on it.
  3. Add a new driver by clicking the New button. Name it Impala and browse to impala.jar, which is in the lib folder of the driver's installation directory. Click OK once you are finished.

  4. You should be back at the Connect window. Create a new connection, give it any name, and choose Impala (com.ddtek.jdbc.impala.ImpalaDriver) as your driver.

  5. Fill in the connection URL in the following format, and enter your credentials in the respective fields.

    jdbc:datadirect:impala://<Server_Address>:<port>

  6. Click the Test button to verify that the connection succeeds. Click OK, and you can now query your Apache Kudu instance.
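
The same connection can be opened from plain Java through the standard JDBC API. The sketch below is an illustration under assumptions, not the driver's documented usage: it builds the URL in the format from step 5 and hands it to DriverManager. The class and method names (ImpalaConnect, buildUrl, open) are hypothetical, the host, port, user, and password come from the command line, and impala.jar is assumed to be on the classpath. Impala's default port for JDBC clients is 21050, but your cluster may use a different one.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

public class ImpalaConnect {
    /** Builds a URL in the jdbc:datadirect:impala://<Server_Address>:<port> format. */
    static String buildUrl(String host, int port) {
        return "jdbc:datadirect:impala://" + host + ":" + port;
    }

    /** Opens a connection; requires the DataDirect Impala driver jar on the classpath. */
    static Connection open(String host, int port, String user, String password)
            throws SQLException {
        return DriverManager.getConnection(buildUrl(host, port), user, password);
    }

    public static void main(String[] args) throws SQLException {
        if (args.length < 4) {
            System.out.println("usage: ImpalaConnect <host> <port> <user> <password>");
            return;
        }
        int port = Integer.parseInt(args[1]);
        // try-with-resources closes the connection even if a later call fails.
        try (Connection conn = open(args[0], port, args[2], args[3])) {
            System.out.println("Connected, closed=" + conn.isClosed());
        }
    }
}
```

Keeping the URL construction in its own method makes it easy to check the format without a running cluster.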

Sample Queries

If you followed the getting started tutorial for Apache Kudu mentioned in the prerequisites, you can now run queries against the data it loaded.

For example, here are a few basic queries to test it out:

SELECT * FROM sfmta LIMIT 1;
INSERT INTO sfmta VALUES (1323, 123, -122.32, 32.22, 12.322, 52.0);
SELECT * FROM sfmta WHERE report_time = 1323;
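
The three statements above can also be run from Java over an open JDBC connection. The following sketch assumes the sfmta table from the getting started tutorial exists and that the JDBC URL, user, and password are passed on the command line. The class and method names (SfmtaQueries, printRows, run) are hypothetical. Rows are printed by column position because the table's full schema is not listed here, and the value returned by executeUpdate may depend on the driver.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.ResultSetMetaData;
import java.sql.SQLException;
import java.sql.Statement;

public class SfmtaQueries {
    static final String SELECT_ONE = "SELECT * FROM sfmta LIMIT 1";
    static final String INSERT_ROW =
            "INSERT INTO sfmta VALUES (1323, 123, -122.32, 32.22, 12.322, 52.0)";
    static final String SELECT_BY_TIME = "SELECT * FROM sfmta WHERE report_time = 1323";

    /** Runs a query and prints each row as label=value pairs. */
    static void printRows(Statement stmt, String sql) throws SQLException {
        try (ResultSet rs = stmt.executeQuery(sql)) {
            ResultSetMetaData meta = rs.getMetaData();
            while (rs.next()) {
                StringBuilder row = new StringBuilder();
                for (int i = 1; i <= meta.getColumnCount(); i++) {
                    if (i > 1) row.append(", ");
                    row.append(meta.getColumnLabel(i)).append('=').append(rs.getObject(i));
                }
                System.out.println(row);
            }
        }
    }

    /** Executes the three sample statements in the order given in the tutorial. */
    static void run(Connection conn) throws SQLException {
        try (Statement stmt = conn.createStatement()) {
            printRows(stmt, SELECT_ONE);
            int result = stmt.executeUpdate(INSERT_ROW);
            System.out.println("Insert returned: " + result);
            printRows(stmt, SELECT_BY_TIME);
        }
    }

    public static void main(String[] args) throws SQLException {
        if (args.length < 3) {
            System.out.println("usage: SfmtaQueries <jdbc-url> <user> <password>");
            return;
        }
        try (Connection conn = DriverManager.getConnection(args[0], args[1], args[2])) {
            run(conn);
        }
    }
}
```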

We hope this tutorial helped you connect to Apache Kudu using the Progress DataDirect Impala JDBC driver.


Published at DZone with permission of Saikrishna Teja Bobba, DZone MVB. See the original article here.
