Why Cassandra is No Good for ETL


According to a recent blog post from Aras Can Akin, Cassandra is no good for ETL. That's not to say that Cassandra is no good at all - Akin is a current Cassandra user and has good things to say about it - but he takes issue with the perception of Cassandra as a do-all replacement for something like MySQL. He says:

. . . Cassandra IS NOT the mysql replacement. Tech people need to know that. It’s a fantastic distributed key-value store, but currently, it’s nothing more than that. However, Cassandra developers keeps saying that it’s designed for time-series data or it’s good at ETL. However, it isn’t. It is only a scalable distributed key-value store, nothing more.

Most of the post is a case study of Akin's own experience migrating from MySQL to Cassandra, and of the problems and workarounds that came up along the way. Ultimately, Akin concludes, the workarounds are not solutions he is happy with.

What do you think? Is Cassandra anything more than a distributed key-value store? Is there misinformation out there as to its strengths?

Opinions expressed by DZone contributors are their own.
