
Jumping from MySQL to Cassandra: A Success Story

By Jose Alvarez Muguerza · Jan. 18, 12 · Interview

Today I’m going to share my experience of getting started with Apache Cassandra. One of the hardest steps in learning any NoSQL technology is letting go of normalization principles and relational DB structures. Relational databases are designed to persist normalized data, without duplication. One of the main changes here is that you need to design for your queries: work out what your reports or finder methods need, and build the persistent structure accordingly.

Hundreds of web pages, books and papers explain what Cassandra is, what Hazelcast is, what Hadoop, MemcacheDB, MongoDB, etc. are. But none of them explains HOW TO migrate your data from a relational DB to one of them.

We wanted to migrate the persistent data of two of our modules, Turmeric SOA Monitoring and Turmeric SOA Rate Limiting. In Turmeric we use MySQL as the relational database. After a week of reading about and analyzing several NoSQL options, we decided on Cassandra. (I hope to write another post about the whys.) By the way, I highly recommend this book: Cassandra: The Definitive Guide.

From Relational tables to Keyspaces


The big question now is how to migrate them. Well, this is what we did:
Following an Agile best practice, if something is too hard or complex, just break it into small challenges. After all, we still had a good gap for an MMF (“Minimal Marketable Feature”; refer to Software by Numbers). So:

Step 1: Move our relational DB tables to Cassandra Column Families
Step 2: Customize our new Column Families so that they hold all the needed data without JOIN-like operators
Step 3: Explode those Column Families as the finder and query methods require. Typically a finder or query method should use 1 Column Family
Step 4: Customize the creator and updater methods according to the previous changes. Don’t be scared if you are saving duplicated data. Keep in mind: “think for your queries! Forget the normalization rules.”
Step 5: while (!pleased) -> do steps 3 and 4
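
To make steps 1 to 4 a bit more concrete, here is a minimal sketch in plain Java. It does not talk to Cassandra at all: ordinary maps stand in for column families, and the "metric" table, the MetricRow class and the finder names are hypothetical, invented only for this illustration. The point is just to show one structure per finder and a writer that duplicates data on purpose.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative only: plain maps stand in for Cassandra column families. */
class QueryDrivenModel {

    /** One row of a hypothetical relational "metric" table. */
    static final class MetricRow {
        final String id;
        final String service;
        final String day;
        final long value;

        MetricRow(String id, String service, String day, long value) {
            this.id = id;
            this.service = service;
            this.day = day;
            this.value = value;
        }
    }

    // Step 1: the table copied as-is, keyed by its primary key.
    private final Map<String, MetricRow> metricById = new LinkedHashMap<>();

    // Step 3: one structure per finder, keyed by what that finder searches on.
    private final Map<String, List<MetricRow>> metricsByService = new LinkedHashMap<>();
    private final Map<String, List<MetricRow>> metricsByDay = new LinkedHashMap<>();

    // Step 4: the writer fills every structure, duplicating the row on purpose.
    // (Updates of an already saved id are not handled in this sketch.)
    void save(MetricRow row) {
        metricById.put(row.id, row);
        metricsByService.computeIfAbsent(row.service, k -> new ArrayList<>()).add(row);
        metricsByDay.computeIfAbsent(row.day, k -> new ArrayList<>()).add(row);
    }

    MetricRow findById(String id) {
        return metricById.get(id);
    }

    // Each finder reads exactly one structure; no JOIN-like step is needed.
    List<MetricRow> findByService(String service) {
        return metricsByService.getOrDefault(service, Collections.emptyList());
    }

    List<MetricRow> findByDay(String day) {
        return metricsByDay.getOrDefault(day, Collections.emptyList());
    }

    public static void main(String[] args) {
        QueryDrivenModel model = new QueryDrivenModel();
        model.save(new MetricRow("1", "billing", "2012-01-18", 10L));
        model.save(new MetricRow("2", "billing", "2012-01-19", 20L));
        model.save(new MetricRow("3", "search", "2012-01-18", 30L));
        System.out.println("billing rows: " + model.findByService("billing").size());
        System.out.println("rows on 2012-01-18: " + model.findByDay("2012-01-18").size());
    }
}
```

Every new finder costs one more structure and one more line in save(), which is exactly the trade-off behind the step 5 loop.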

A Cassandra DAO


Now, the hardest step is #1. Don’t panic: we developed a kind of generic (in fact it uses Java Generics) Cassandra DAO for your migration. As all this work was needed for the project I’m currently working on, you will find it as a submodule of TurmericSOA, but under the Apache License you can use it through your Maven dependency file.

<dependency>
    <groupId>org.ebayopensource.turmeric.utils</groupId>
    <artifactId>turmeric-utils-cassandra</artifactId>
    <version>1.2.0.0-SNAPSHOT</version>
    <type>jar</type>
</dependency>


Features

  • 100% Java code
  • It can run an Embedded Cassandra Service or just talk to your external Cassandra Service
  • Uses Hector library as Java Cassandra client
  • Dynamic [Super] Column Family creation
  • Key Types and Data Types defined at runtime with the use of Generics
  • Main CRUD methods supported:

 

boolean containsKey(KeyType key);

void delete(KeyType key);

T find(KeyType key);

Map> findItems(final List keys, final Long rangeFrom, final Long rangeTo);

Set findItems(final List keys, final String rangeFrom, final String rangeTo);

Set getKeys();

void save(KeyType key, T model);
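
To get a feel for this contract without a running cluster, here is a minimal in-memory sketch of the simple CRUD methods. It is not the Turmeric implementation: the InMemoryDao class is hypothetical, a map replaces the column family, the return type of getKeys() is assumed to be Set<KeyType>, and the findItems range methods are left out.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical in-memory stand-in for a column family DAO (not the Turmeric class). */
class InMemoryDao<KeyType, T> {

    // One entry per row key; the value plays the role of the stored POJO.
    private final Map<KeyType, T> rows = new LinkedHashMap<>();

    boolean containsKey(KeyType key) {
        return rows.containsKey(key);
    }

    void delete(KeyType key) {
        rows.remove(key);
    }

    /** Returns the stored model, or null when the key is unknown. */
    T find(KeyType key) {
        return rows.get(key);
    }

    /** Returns a copy, so callers cannot modify the DAO through the key set. */
    Set<KeyType> getKeys() {
        return new LinkedHashSet<>(rows.keySet());
    }

    void save(KeyType key, T model) {
        rows.put(key, model);
    }

    public static void main(String[] args) {
        InMemoryDao<String, String> dao = new InMemoryDao<>();
        dao.save("foo-1", "first row");
        System.out.println(dao.containsKey("foo-1"));
        System.out.println(dao.find("foo-1"));
        dao.delete("foo-1");
        System.out.println(dao.getKeys().isEmpty());
    }
}
```

A stand-in like this can also be handy in unit tests of code that only depends on the DAO contract.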

 

Main Classes
This util module contains the following packages and classes:

org.ebayopensource.turmeric.utils.cassandra.service

  • CassandraManager: initializes a static EmbeddedCassandraService instance based on a yaml configuration file

org.ebayopensource.turmeric.utils.cassandra.hector

  • HectorManager: Manages keyspace and column family creation and reading. It uses the Hector API
  • HectorHelper: Includes some utility methods based on Java Reflection and Java Generics, e.g. retrieving the field names from a POJO, which are used as column names in Cassandra keyspaces
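
The reflection idea behind HectorHelper can be sketched in a few lines. This is not the actual HectorHelper code: the FieldNameSketch class, its columnNames method and the sample FooPojo fields are hypothetical, and skipping static fields is an assumption made here.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical sketch: derive column names from the fields of a POJO. */
class FieldNameSketch {

    /** Sample POJO; its instance fields become the column names. */
    static final class FooPojo {
        private String name;
        private long hits;
        static int instances; // static, so not treated as a column
    }

    static List<String> columnNames(Class<?> pojoClass) {
        List<String> names = new ArrayList<>();
        for (Field field : pojoClass.getDeclaredFields()) {
            // Skip static and compiler-generated fields: they are not row data.
            if (Modifier.isStatic(field.getModifiers()) || field.isSynthetic()) {
                continue;
            }
            names.add(field.getName());
        }
        // getDeclaredFields() guarantees no particular order, so sort for stability.
        Collections.sort(names);
        return names;
    }

    public static void main(String[] args) {
        System.out.println(columnNames(FooPojo.class));
    }
}
```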

org.ebayopensource.turmeric.utils.cassandra.dao

  • AbstractColumnFamilyDao: As its name says, this is the base class that every DAO should extend. It defines and implements the basic DAO operations using the Hector API.

Configuration files

  • log4j.properties: Log4j properties file
  • cassandra.yaml: Storage configuration file. For more info, see the storage configuration setup.

Here is the directory structure of the configuration files:

META-INF/
         security/
                  config/
                         cassandra/
                                   cassandra.properties


An example of this property file:

cassandra-cluster-name=TurmericCluster
cassandra-host-ip=127.0.0.1
cassandra-rpc-port=9160
cassandra-my-keyspace=My-keyspace

#column families
cassandra-foo-column-family=foo
cassandra-bar-column-family=bar
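
If you want to read these values in your own code, for example to pass them to a DAO constructor, java.util.Properties is enough. The sketch below is an assumption about how you might do it, not how the Turmeric module reads the file: the CassandraSettings class is hypothetical, and only the key names come from the example above.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

/** Hypothetical holder for the connection values of cassandra.properties. */
class CassandraSettings {
    final String clusterName;
    final String host;
    final int rpcPort;
    final String keyspace;

    CassandraSettings(String clusterName, String host, int rpcPort, String keyspace) {
        this.clusterName = clusterName;
        this.host = host;
        this.rpcPort = rpcPort;
        this.keyspace = keyspace;
    }

    /** Parses the properties text; fails fast when a required key is missing. */
    static CassandraSettings load(Reader source) {
        Properties props = new Properties();
        try {
            props.load(source);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new CassandraSettings(
                require(props, "cassandra-cluster-name"),
                require(props, "cassandra-host-ip"),
                Integer.parseInt(require(props, "cassandra-rpc-port")),
                require(props, "cassandra-my-keyspace"));
    }

    private static String require(Properties props, String key) {
        String value = props.getProperty(key);
        if (value == null) {
            throw new IllegalArgumentException("Missing property: " + key);
        }
        return value.trim();
    }

    public static void main(String[] args) {
        // Inline text keeps the sketch self-contained; a real application would
        // open the file from the classpath instead.
        String text = "cassandra-cluster-name=TurmericCluster\n"
                + "cassandra-host-ip=127.0.0.1\n"
                + "cassandra-rpc-port=9160\n"
                + "cassandra-my-keyspace=My-keyspace\n";
        CassandraSettings settings = load(new StringReader(text));
        System.out.println(settings.clusterName + " @ " + settings.host + ":" + settings.rpcPort);
    }
}
```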


How to use it….


It is very intuitive. Let’s suppose we have a Foo table in our relational DB, e.g. MySQL.
So:

Create the BaseDao interface

public interface BaseDao {
		  // Row keys are plain Strings in this example
		  public void delete(String key);
		  public Set<String> getKeys();
		  public boolean containsKey(String key);
		  public void save(String key, FooPojo fooPojo);
		  public FooPojo find(String key);
}


Create the FooDao interface

public interface FooDao extends BaseDao  {
}

 

Create the FooDao implementation

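// The type arguments <String, FooPojo> are assumed to follow the
// KeyType / T order used by the DAO methods listed above.
public class FooDaoImpl extends AbstractColumnFamilyDao<String, FooPojo>
		implements FooDao {
	public FooDaoImpl(final String clusterName, final String host, final String keySpace, final String cf, final Class<String> kTypeClass) {
		super(clusterName, host, keySpace, kTypeClass, FooPojo.class, cf);
	}

}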


… in your code

//initiates an embedded Cassandra Service
CassandraManager.initialize();

//creates our Foo Column Family
FooDao fooDao = new FooDaoImpl("myCluster", "127.0.0.1", "myKeyspace",
				"myColumnFamilyName", String.class);


and voilà, you have your relational table migrated as a Cassandra column family!!!

You can also browse the unit test classes to see how they are implemented…

enjoy it!!!


Source: http://itsecrets.wordpress.com/2012/01/12/jumping-from-mysql-to-cassandra-a-success-story/
