Twitter Passes On Cassandra-based Tweet Storage For Now

by Mitch Pronschinske · Jul. 12, 10 · Interview
An update on the state of Cassandra at Twitter sparked a huge controversy over the weekend.  Flame wars broke out, with some declaring that Cassandra had fallen from grace as the champion of the NoSQL movement.  Others stood by Apache's prized project, defending the data store against technical criticisms and saying that detractors had grossly misinterpreted Twitter's announcement.  Let's look at the facts first, and then the flames.

By no means has Twitter stopped using Cassandra.  They have stated that it's currently being used to store geolocation data and data mining results that feed into things like local trends and @toptweets.  Twitter is also planning to use Cassandra as a part of their monetization strategy.  Specifically, they are using it for a realtime analytics product (to be used "internally and externally") that is currently under development.

What sparked the controversy and speculation was Twitter's decision to hold off on moving tweet storage over to Cassandra.  "This is a change in strategy," said Ryan King, an architect at Twitter.  "Instead we're going to continue to maintain our existing Mysql-based storage. We believe that this isn't the time to make large scale migration to a new technology."

Now here's an example of the backlash against Cassandra and the responses to detractors:

"The bloom is starting to come off NoSQL, which is normal - it means that people & firms are trying to do more with it and most probably realizing that all of the tools, support, infrastructure, etc. surrounding alternative solutions isn't such a bad thing.  And that the world of NoSQL had start to come up with a better mantra than "joins are bad, dude", and "you're just protecting the status quo."  There's a *lot more* big data wrapped up inside of SQL databases and only a fraction of the in NoSQL - and there's a lot of reasons for it." --Colin Clark

"You are, for whatever reason, using the dullest of cliches as if they were informed opinion.  Nobody with actual knowledge of the space says "joins are bad, dude".  What they might say is "When you have petabytes and low latency requirements, joins are an expensive proposition".  That is clearly a true statement and constructing indices in a column store to avoid joins is a reasonable decision to avoid that expense.  Is it free?  Of course not, nothing is." --Response, Benjamin Black

"For example, do I *really* need Cassandra if MySQL will work for me and I just want to get up and running quickly without writing a bunch of code?  My team was pushing greater than 20k updates per second into, GASP, Oracle 5 years ago.  Sure, it was expensive.  But it worked.  And it was worth it - or we wouldn't have spent the $$.  What's your data worth if you don't have your data? zero." --Colin Clark

"Had you spent any time on the irc channel you would've seen this advice given repeatedly.  If you don't need what Cassandra does, don't use it.  That you have seen 20k updates/sec on really expensive hardware with a SQL store is neither surprising nor relevant.  As you must realize, though choose to ignore, Cassandra is about more than just high, per-node write throughput.  It is about seamless scale-out of a single cluster, robustness in the face of node failure and network partition, etc.  Can you do that with a SQL store?  Certainly.  Expect to pay 5x in hardware and not be able to operate multi-DC.  It's what folks call a trade-off." --Response, Benjamin Black

"And then there's support - internal support.  Picking a database du-jour is organizationally expensive.  Especially when there's probably one or two databases that Twitter could have bought off the shelf that would have solved their problems." --Colin Clark

"You have no idea what their actual problems are and are merely engaging in the favorite game of HN and similar venues: armchair engineering." --Response, Benjamin Black

And here was one blogger's take on the issue:

"Twitter is busy fighting other fires and they don't have the time to retrofit something that is (more or less) working, namely their MySQL based tweet storage, with a completely new technology based on Cassandra. Does this mean Cassandra and NoSQL suck? No, I think it's just smart project planning." --Todd Hoff, High Scalability

Perhaps a change is what Twitter needs, though.  Of all the sites I visit frequently, Twitter is the buggiest and has the most downtime.  I'm not saying that Cassandra could fix that (I have no idea), but I hope they will start making more money so they can bring in some heavy-duty solutions and make their site super-reliable.

Thoughts?  On Twitter... or Cassandra?

