
Using a JDBC Driver with Apache Zeppelin

This article is a short introduction to Apache Zeppelin and its interpreters. It walks through several how-tos, including configuring the DataDirect Oracle JDBC interpreter.



Have you heard of Apache Zeppelin? If you haven’t, you will. In this tutorial, learn how to use Progress JDBC connectors with this one-stop notebook to satisfy all your BI needs.

Apache Zeppelin is a one-stop notebook designed by the Apache open source community. This web-based notebook can help you with:

  • Data Ingestion
  • Data Discovery
  • Data Analytics
  • Data Visualization and Collaboration

What sets Zeppelin apart from similar tools is its interpreter concept. An interpreter lets you plug a data-processing language or back end into Zeppelin, so you can write code for it directly in a note. Zeppelin supports interpreters for Apache Spark, R, Hive, Shell, Cassandra, and more. With the 0.6.0 release, Apache Zeppelin added JDBC as an interpreter. This means you can now use Progress JDBC drivers to connect to relational, SaaS/cloud, Big Data, and NoSQL data sources.

To get you started, we created an easy tutorial on how to use Apache Zeppelin with the Progress DataDirect Oracle Database JDBC driver. Note that you can follow a similar process with any of our JDBC drivers.

Before You Start

  1. Make sure that Java is installed on your machine. You can check this by running the command java -version in your terminal.
  2. Install Apache Zeppelin by cloning this GitHub repository and following the instructions in the repository's README file.

Installing DataDirect Oracle JDBC Driver

  1. Download the DataDirect Oracle JDBC driver from here.
  2. To install the driver, you have to execute the downloaded .jar package. You can do this from a terminal by passing the installer file to java -jar.
  3. This launches an interactive Java installer, which you can use to install the Oracle JDBC driver to your desired location as either a licensed or an evaluation installation.

Configure DataDirect Oracle JDBC interpreter in Apache Zeppelin

  1. If you haven’t started Apache Zeppelin yet, start it by running zeppelin-daemon.sh start in a terminal from zeppelin_install_dir/bin.
  2. Browse to http://localhost:8080/ to access Zeppelin in your browser. If you have connected successfully, you should see the Zeppelin welcome screen.
  3. To configure an interpreter for Oracle JDBC, click your username in the navigation bar and then click the ‘Interpreter’ option. This opens a new page that shows all the interpreters that have already been configured.
  4. To create a new interpreter, click the ‘Create’ button.
  5. Name your interpreter as you like, and for the interpreter group choose ‘jdbc.’ You should see a new form with some default values filled in. I changed them as follows to connect to Oracle DB using the DataDirect Oracle JDBC driver.  Properties:
    • default.url:
    • default.driver:
    • default.user:
    • default.password:
      You can remove all the other properties by clicking the ‘X’ button under actions.
    • artifact:
    Save the interpreter with these settings and your interpreter will be created.
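
As a rough illustration of how the four properties above fit together, the following Java sketch assembles them and tries the same connection outside Zeppelin. Everything specific in it is an assumption rather than something taken from this tutorial: the host, port, service name, and credentials are placeholders, and the URL format and driver class name follow the usual DataDirect naming conventions, so check them against the driver documentation. The class and method names are hypothetical.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.util.LinkedHashMap;
import java.util.Map;

class InterpreterSettingsSketch {

    // Assumed DataDirect Oracle URL shape; confirm it in the driver documentation.
    static String buildUrl(String host, int port, String serviceName) {
        return "jdbc:datadirect:oracle://" + host + ":" + port + ";ServiceName=" + serviceName;
    }

    // Collects the four interpreter properties in the order they are listed above.
    static Map<String, String> interpreterProperties(String url, String user, String password) {
        Map<String, String> props = new LinkedHashMap<>();
        props.put("default.url", url);
        props.put("default.driver", "com.ddtek.jdbc.oracle.OracleDriver"); // assumed class name
        props.put("default.user", user);
        props.put("default.password", password);
        return props;
    }

    public static void main(String[] args) {
        // Placeholder values; replace them with your own server details.
        Map<String, String> props = interpreterProperties(
                buildUrl("localhost", 1521, "XE"), "your_user", "your_password");

        // Print the settings, masking the password.
        props.forEach((key, value) ->
                System.out.println(key + " = " + ("default.password".equals(key) ? "****" : value)));

        // Try the same connection with plain JDBC. Without the driver jar on the
        // classpath, or without a reachable database, this reports the failure.
        try (Connection conn = DriverManager.getConnection(
                props.get("default.url"), props.get("default.user"), props.get("default.password"))) {
            System.out.println("Connected to " + conn.getMetaData().getDatabaseProductName());
        } catch (SQLException e) {
            System.out.println("Connection failed: " + e.getMessage());
        }
    }
}
```

Running this with the driver jar on the classpath is a quick way to confirm that the URL and credentials work before you enter them in the interpreter form.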

Visualize Your Data

  1. Create a new Notebook by clicking on the Notebook drop-down menu on the navigation bar, and then clicking on ‘Create new note.’ Name your new note and click OK.
  2. Before you start, check that the interpreter binding is set up correctly for this particular note. To do that, open the interpreter binding settings by clicking the small gear icon on your notebook.
  3. Make sure that you select the interpreter created above. Deselect everything else that you don’t need and save the binding.
  4. By default, you should see %spark. Change it to %jdbc so that the notebook uses the JDBC interpreter created in the steps above.
  5. On the next line, run any query to fetch data from Oracle DB into Apache Zeppelin according to your needs. In this tutorial, I have the Northwind sample data set in my Oracle DB, and I ran a query to fetch all the data from the Orders table.
  6. Now that you have fetched the data you need, change the result view from the tabular view to bar charts, pie charts, or line charts to see visualizations of your data. For example, a simple line chart can show the number of orders from each country.
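
To make explicit what a chart of orders per country summarizes, here is a minimal Java sketch of the same grouping done over a handful of made-up rows. The sample values, the class name, and the query string are all illustrative assumptions; in particular, ShipCountry is the column name used in the standard Northwind schema and may differ in your copy of the data.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;
import java.util.stream.Collectors;

class OrdersPerCountrySketch {

    // Hypothetical query that asks the database for the same counts.
    // ShipCountry is an assumed column name.
    static final String QUERY =
            "SELECT ShipCountry, COUNT(*) AS OrderCount FROM Orders GROUP BY ShipCountry";

    // Counts how many orders belong to each country, sorted by country name.
    static Map<String, Long> countByCountry(List<String> shipCountries) {
        return shipCountries.stream()
                .collect(Collectors.groupingBy(Function.identity(), TreeMap::new, Collectors.counting()));
    }

    public static void main(String[] args) {
        // Made-up sample: one entry per order, holding only the country.
        List<String> sample = Arrays.asList(
                "Germany", "France", "Germany", "Brazil", "France", "Germany");

        // One line per country: the points a line or bar chart would plot.
        countByCountry(sample).forEach((country, orders) ->
                System.out.println(country + "\t" + orders));
    }
}
```

In a note you would normally let the database do this work by running a grouping query such as the one in QUERY under %jdbc, and then pick the chart type in the result view.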



Published at DZone with permission of Nishanth Kadiyala, DZone MVB. See the original article here.

