DZone
Big Data Zone
Thanks for visiting DZone today,
Edit Profile
  • Manage Email Subscriptions
  • How to Post to DZone
  • Article Submission Guidelines
Sign Out View Profile
  • Post an Article
  • Manage My Drafts
Over 2 million developers have joined DZone.
Log In / Join
  • Refcardz
  • Trend Reports
  • Webinars
  • Zones
  • |
    • Agile
    • AI
    • Big Data
    • Cloud
    • Database
    • DevOps
    • Integration
    • IoT
    • Java
    • Microservices
    • Open Source
    • Performance
    • Security
    • Web Dev
DZone > Big Data Zone > Big Data Specifications, Part I: Configuring a MySQL Metastore in Apache Hive

Big Data Specifications, Part I: Configuring a MySQL Metastore in Apache Hive

Apache Hive lets users load, analyze, and query data. One way to implement these queries is to configure a MySQL metastore.

Harsh Sharma Khandal user avatar by
Harsh Sharma Khandal
·
Dec. 19, 16 · Big Data Zone · Tutorial
Like (4)
Save
Tweet
5.74K Views

Join the DZone community and get the full member experience.

Join For Free

Apache Hive is used as a data warehouse over Hadoop to provide users a way to load, analyze, and query the data from various resources. Data is stored in databases or file systems like HDFS (Hadoop Distributed File System). Hive can use Spark SQL or HiveQL for the implementation of queries.

Now, Hive uses its metastore, which contains the following information,

  • IDs of tables.
  • IDs of databases.
  • Time of creation.
  • Table names.
  • Type of the table.
  • Its owners' names.

Hive metastore is constructed with the following.

Metastore DB

Metastore DB is defined as a Relational Database Management System (RDBMS), which contains the metadata for the schema and the two major types of tables:

  1. Managed tables.
  2. External tables.

Metastore Service

Metastore runs a background service as a metastore service, which is used to perform the database operations, manage the metastore data, and store the data into Hive tables.

Warehouse

Hive basically uses the HDFS to store the data retrieved to the tables, usually under the directory user/hive/warehouse.

Steps to Set Up a MySQL Metastore

Install the MySQL server.

sudo apt-get install mysql-server

Install the MySQL Java connector.

sudo apt-get install libmysql-java

If you are using the Spark’s internal hive, then copy the connector jar file into Spark’s lib folder as:

cp /usr/share/java/mysql-connector-java.jar $SPARK_HOME/lib/mysql-connector-java.jar

If you are using the hive apart from Spark, then copy the connector jar file into Hive’s lib folder as:

cp /usr/share/java/mysql-connector-java.jar $HIVE_HOME/lib/mysql-connector-java.jar

Now, we will use the script hive-schema-0.14.0.mysql.sql according to the version of Hive to create the initial schema for metastore database as below.

First, create the database as a metastore and load the initial schema as:

mysql> CREATE DATABASE metastore;
mysql> USE metastore;
mysql> SOURCE PATH-TO-SCRIPT/hive-schema-0.14.0.mysql.sql;

Then, create a user for Hive and grant the permissions to it:

mysql> CREATE USER 'hiveuser'@'%' IDENTIFIED BY 'hivepassword';
mysql> GRANT all on *.* to 'hiveuser'@localhost identified by 'hivepassword';
mysql> flush privileges;

Now, enter the below code and create the hive-site.xml file in $HIVE_HOME or the $SPARK_HOME/conf folder.

<configuration>
   <property>
      <name>javax.jdo.option.ConnectionURL</name>
      <value>jdbc:mysql://localhost/metastore?createDatabaseIfNotExist=true</value>
      <description>metadata is stored in a MySQL server</description>
   </property>
   <property>
      <name>javax.jdo.option.ConnectionDriverName</name>
      <value>com.mysql.jdbc.Driver</value>
      <description>MySQL JDBC driver class</description>
   </property>
   <property>
      <name>javax.jdo.option.ConnectionUserName</name>
      <value>hiveuser</value>
      <description>user name for connecting to mysql server</description>
   </property>
   <property>
      <name>javax.jdo.option.ConnectionPassword</name>
      <value>hivepassword</value>
      <description>password for connecting to mysql server</description>
   </property>
</configuration>

After we create a table in Hive, we can see the metadata by executing the following queries in MySQL:

mysql> use metastore;
mysql> select * from TBLS;

You're done!

Big data MySQL Apache Hive Relational database

Published at DZone with permission of Harsh Sharma Khandal, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.

Popular on DZone

  • Migrating Legacy Applications and Services to Low Code
  • Understanding Kubernetes Resource Types
  • NextJS Vs React: Key Differences, Advantages and Limitations
  • Java Microservices: Code Examples, Tutorials, and More

Comments

Big Data Partner Resources

X

ABOUT US

  • About DZone
  • Send feedback
  • Careers
  • Sitemap

ADVERTISE

  • Advertise with DZone

CONTRIBUTE ON DZONE

  • Article Submission Guidelines
  • MVB Program
  • Become a Contributor
  • Visit the Writers' Zone

LEGAL

  • Terms of Service
  • Privacy Policy

CONTACT US

  • 600 Park Offices Drive
  • Suite 300
  • Durham, NC 27709
  • support@dzone.com
  • +1 (919) 678-0300

Let's be friends:

DZone.com is powered by 

AnswerHub logo