An Introduction To Cassandra: The Data Model

By James Sugrue · Sep. 14, 10 · Java Zone
I'm fairly new to the whole NoSQL game, and one thing I keep hearing is how great Cassandra is. Built by Facebook and open sourced in 2008, Cassandra is probably the most popular NoSQL implementation: "A massively scalable, decentralized, structured data store". Cassandra takes its distribution features from Dynamo and its data model from BigTable.

Before we look at using Cassandra, we first need to understand the data model. For developers who are new to Cassandra and come from a relational database background, the data model can be a bit confusing. Here's a summary of how the Cassandra data model is composed:

Column

A Column is the most basic element in Cassandra: a simple tuple that contains a name, a value and a timestamp. All three are set by the client. That's an important consideration for the timestamp, as it means you'll need clock synchronization.
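
To make the clock synchronization point concrete, here is a minimal sketch in Java. The `Column` class and its `reconcile` method are illustrative names made up for this article, not part of any Cassandra client library, and the tie-breaking rule is a simplification. The sketch assumes that when two versions of the same column meet, the one with the higher timestamp is kept.

```java
/** Illustrative model of a column; not a class from any Cassandra library. */
final class Column {
    final String name;
    final String value;
    final long timestamp; // supplied by the client, not by the server

    Column(String name, String value, long timestamp) {
        this.name = name;
        this.value = value;
        this.timestamp = timestamp;
    }

    /** Keeps whichever version carries the higher timestamp (ties keep this one). */
    Column reconcile(Column other) {
        return other.timestamp > this.timestamp ? other : this;
    }
}

final class ColumnDemo {
    public static void main(String[] args) {
        // Hypothetical scenario: client B writes later in real time,
        // but its clock runs behind, so it sends a lower timestamp.
        Column fromClientA = new Column("email", "a@example.com", 1000L);
        Column fromClientB = new Column("email", "b@example.com", 900L);

        Column winner = fromClientA.reconcile(fromClientB);
        System.out.println(winner.value); // prints "a@example.com"
    }
}
```

In this sketch the later write from client B is discarded because its timestamp is lower, which is the kind of surprise that synchronized clocks help avoid.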



SuperColumn

A SuperColumn is a column that stores an associative array of columns. You could think of it as similar to a HashMap in Java, with an identifying name and, as its value, a collection of columns keyed by column name. The key difference between a Column and a SuperColumn is that the value of a Column is a string, whereas the value of a SuperColumn is a map of Columns. Note that SuperColumns have no timestamp, just a name and a value.
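
As a rough sketch of that shape in Java (all class names here are hypothetical and chosen for illustration only), a SuperColumn can be modeled as a name plus a map from column name to Column. A `TreeMap` is used on the assumption that sub-columns are kept sorted by name.

```java
import java.util.SortedMap;
import java.util.TreeMap;

/** Illustrative column: name, value and client-supplied timestamp. */
final class Column {
    final String name;
    final String value;
    final long timestamp;

    Column(String name, String value, long timestamp) {
        this.name = name;
        this.value = value;
        this.timestamp = timestamp;
    }
}

/** Illustrative SuperColumn: a name and a map of Columns, with no timestamp of its own. */
final class SuperColumn {
    final String name;
    final SortedMap<String, Column> columns = new TreeMap<>();

    SuperColumn(String name) {
        this.name = name;
    }

    /** Adds or overwrites a sub-column, keyed by its name. */
    SuperColumn put(Column column) {
        columns.put(column.name, column);
        return this;
    }
}

final class SuperColumnDemo {
    public static void main(String[] args) {
        SuperColumn address = new SuperColumn("homeAddress")
                .put(new Column("street", "1 Main St", 1L))
                .put(new Column("city", "Durham", 1L));

        // Sub-columns are looked up by name, like entries in a map.
        System.out.println(address.columns.get("city").value); // prints "Durham"
        System.out.println(address.columns.keySet());          // prints "[city, street]"
    }
}
```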



ColumnFamily

A ColumnFamily holds a number of rows. Each row is a sorted map that matches column names to column values, so you can reference a column by its name. The ColumnFamily itself is similar to the table concept from relational databases.

A ColumnFamily can be one of two types, Standard or Super. Standard ColumnFamilies contain a map of normal columns, while Super ColumnFamilies contain rows of SuperColumns.
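
One way to picture the two types is as nested maps, where a Super ColumnFamily simply has one more level of nesting than a Standard one. The Java sketch below uses hypothetical class names and leaves out timestamps to keep it short. Rows live in a plain `HashMap` here, which is a simplification and says nothing about how Cassandra actually orders or distributes rows.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

/** Standard ColumnFamily sketch: row key -> (column name -> value). */
final class StandardColumnFamily {
    final String name;
    final Map<String, SortedMap<String, String>> rows = new HashMap<>();

    StandardColumnFamily(String name) {
        this.name = name;
    }

    void insert(String rowKey, String column, String value) {
        rows.computeIfAbsent(rowKey, k -> new TreeMap<>()).put(column, value);
    }

    /** Returns the value, or null when the row or column is absent. */
    String get(String rowKey, String column) {
        SortedMap<String, String> row = rows.get(rowKey);
        return row == null ? null : row.get(column);
    }
}

/** Super ColumnFamily sketch: row key -> (super column name -> (column name -> value)). */
final class SuperColumnFamily {
    final String name;
    final Map<String, SortedMap<String, SortedMap<String, String>>> rows = new HashMap<>();

    SuperColumnFamily(String name) {
        this.name = name;
    }

    void insert(String rowKey, String superColumn, String column, String value) {
        rows.computeIfAbsent(rowKey, k -> new TreeMap<>())
            .computeIfAbsent(superColumn, k -> new TreeMap<>())
            .put(column, value);
    }

    /** Returns the value, or null when any level of the path is absent. */
    String get(String rowKey, String superColumn, String column) {
        SortedMap<String, SortedMap<String, String>> row = rows.get(rowKey);
        if (row == null) {
            return null;
        }
        SortedMap<String, String> sub = row.get(superColumn);
        return sub == null ? null : sub.get(column);
    }
}

final class ColumnFamilyDemo {
    public static void main(String[] args) {
        StandardColumnFamily users = new StandardColumnFamily("Users");
        users.insert("jsmith", "email", "jsmith@example.com");
        System.out.println(users.get("jsmith", "email"));

        SuperColumnFamily addressBook = new SuperColumnFamily("AddressBook");
        addressBook.insert("jsmith", "home", "city", "Durham");
        System.out.println(addressBook.get("jsmith", "home", "city"));
    }
}
```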



KeySpaces

KeySpaces are the largest container, each holding an ordered list of ColumnFamilies, similar to a database in an RDBMS. The KeySpace is normally named after the application.

Multiple KeySpaces reside in a cluster, the set of machines (nodes) that make up a Cassandra instance.
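
Putting the levels together, a value is addressed by KeySpace, ColumnFamily, row key and column name. The Java sketch below models only that addressing path for a Standard ColumnFamily. `KeySpace`, `insert` and `get` are hypothetical names, and creating a ColumnFamily on first insert is a shortcut taken for brevity, not a description of how Cassandra defines ColumnFamilies.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

/** KeySpace sketch: column family name -> row key -> (column name -> value). */
final class KeySpace {
    final String name;
    final Map<String, Map<String, SortedMap<String, String>>> columnFamilies = new LinkedHashMap<>();

    KeySpace(String name) {
        this.name = name;
    }

    void insert(String columnFamily, String rowKey, String column, String value) {
        columnFamilies.computeIfAbsent(columnFamily, k -> new LinkedHashMap<>())
                      .computeIfAbsent(rowKey, k -> new TreeMap<>())
                      .put(column, value);
    }

    /** Walks the path level by level; returns null if any level is missing. */
    String get(String columnFamily, String rowKey, String column) {
        Map<String, SortedMap<String, String>> rows = columnFamilies.get(columnFamily);
        if (rows == null) {
            return null;
        }
        SortedMap<String, String> row = rows.get(rowKey);
        return row == null ? null : row.get(column);
    }
}

final class KeySpaceDemo {
    public static void main(String[] args) {
        // The KeySpace is named after a made-up application.
        KeySpace blog = new KeySpace("BlogApp");
        blog.insert("Authors", "jsugrue", "fullName", "James Sugrue");
        blog.insert("Posts", "intro-to-cassandra", "title", "The Data Model");

        System.out.println(blog.get("Authors", "jsugrue", "fullName"));
        System.out.println(blog.columnFamilies.keySet());
    }
}
```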

 

For another summary of the Cassandra data model, check out the (nicely titled) "WTF is a SuperColumn".

In the next article in this introduction series, we'll move on to the good stuff: using Cassandra in Java.
