
Why Cassandra is No Good for ETL




According to a recent blog post from Aras Can Akin, Cassandra is no good for ETL. That's not to say that Cassandra is no good at all: Akin is a current Cassandra user and has good things to say about it. But he takes issue with the perception of Cassandra as a do-all replacement for something like MySQL. He says:

. . . Cassandra IS NOT the mysql replacement. Tech people need to know that. It’s a fantastic distributed key-value store, but currently, it’s nothing more than that. However, Cassandra developers keeps saying that it’s designed for time-series data or it’s good at ETL. However, it isn’t. It is only a scalable distributed key-value store, nothing more.

Most of the post is a case study involving Akin's own experiences migrating from MySQL to Cassandra, and the various problems and workarounds that popped up during the process. Ultimately, Akin concludes, the workarounds are not solutions that he is happy with.

What do you think? Is Cassandra anything more than a distributed key-value store? Is there misinformation out there as to its strengths?



