Cassandra Design Best Practices

Take a look at some best practices you might want to consider employing with your own Cassandra-backed project. Read on for the low-down.

By Moshe Kaplan · Jul. 24, 17 · Opinion

Cassandra is a great NoSQL product. It provides near real-time performance for the queries it was designed for, and its eventually consistent model enables high availability with linear scaling.

In this post, we will focus on some best practices for this great product.

How Many Nodes Do You Need? 

The number of nodes should be odd in order to support votes during downtime or a network partition.

The minimum should be 5 nodes. A smaller cluster (such as 3 nodes, where the replication factor is 2) puts high stress on the machines during a node failure, because each node has to read 50% of the data and write 50% of the data. When you set the replication factor to 3 on a 5-node cluster, each node needs to read only 15% of the data and write 15% of the data. Recovery is therefore much faster, and there is a higher chance that performance and availability will not be affected.
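The voting mentioned above is driven by the replication factor (RF): a QUORUM read or write needs a majority of the replicas, that is floor(RF / 2) + 1. The following is a minimal sketch of that arithmetic only; the function names are illustrative and not part of any Cassandra API.

```python
def quorum(replication_factor: int) -> int:
    """Replicas that must respond for a QUORUM read or write."""
    if replication_factor < 1:
        raise ValueError("replication factor must be at least 1")
    return replication_factor // 2 + 1


def tolerated_replica_failures(replication_factor: int) -> int:
    """Replicas that can be down while QUORUM still succeeds."""
    return replication_factor - quorum(replication_factor)


if __name__ == "__main__":
    for rf in (2, 3, 5):
        print(f"RF={rf}: quorum={quorum(rf)}, "
              f"tolerated failures={tolerated_replica_failures(rf)}")
```

With a replication factor of 2, QUORUM cannot tolerate any replica being down, while a replication factor of 3 tolerates one.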

How Big Should Your C* Instances Be?

C*, like any other data store, loves fast disks (SSD), despite its SSTables and INSERT-only architecture, and as much memory as your data. In particular, your nodes should have 32GB to 512GB of RAM each (and no less than 8GB in production and 4GB in development). Memory sizing is a common issue since C* is written in Java.

C* is also CPU-intensive, and 16 cores are recommended (and not less than 2 cores for development).
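As a rough illustration, the sizing numbers above can be turned into a small pre-deployment check. This is only a sketch: the constants are the figures quoted in this section, and the function name and environment labels are hypothetical.

```python
# Figures quoted in this section; adjust them for your own workload.
MIN_RAM_GB = {"production": 8, "development": 4}
RECOMMENDED_RAM_GB = 32
MIN_CORES_DEVELOPMENT = 2
RECOMMENDED_CORES = 16


def sizing_warnings(ram_gb, cores, environment="production"):
    """Return human-readable warnings for an undersized node."""
    warnings = []
    minimum_ram = MIN_RAM_GB[environment]
    if ram_gb < minimum_ram:
        warnings.append(
            f"{ram_gb}GB RAM is below the {minimum_ram}GB minimum for {environment}")
    elif environment == "production" and ram_gb < RECOMMENDED_RAM_GB:
        warnings.append(
            f"{ram_gb}GB RAM is below the recommended {RECOMMENDED_RAM_GB}GB")
    if environment == "development" and cores < MIN_CORES_DEVELOPMENT:
        warnings.append(
            f"{cores} cores is below the {MIN_CORES_DEVELOPMENT}-core minimum for development")
    elif environment == "production" and cores < RECOMMENDED_CORES:
        warnings.append(
            f"{cores} cores is below the recommended {RECOMMENDED_CORES}")
    return warnings


if __name__ == "__main__":
    print(sizing_warnings(ram_gb=16, cores=8))
```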

Repair and Replace Strategy

nodetool repair is probably one of the most common tasks on a C* cluster.

  1. You can run it on a single node or on a whole cluster.
  2. Repair should run before gc_grace_seconds (default: 10 days) elapses, after which tombstones are removed.
  3. You should run it during off-peak hours (probably during the weekend) if you keep the default gc_grace_seconds.
  4. You can lower this number, but doing so will affect your backup and recovery strategy (see details about recovery from failure using hints).
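One way to read item 2 is that each node should finish a repair at least once per gc_grace_seconds window. The sketch below computes that deadline from the time the last repair finished; the function names are hypothetical, and 864,000 seconds is the 10-day default mentioned above.

```python
from datetime import datetime, timedelta

DEFAULT_GC_GRACE_SECONDS = 864_000  # 10 days, the default quoted above


def repair_deadline(last_repair_finished, gc_grace_seconds=DEFAULT_GC_GRACE_SECONDS):
    """Latest time by which the next repair should have completed."""
    return last_repair_finished + timedelta(seconds=gc_grace_seconds)


def repair_overdue(last_repair_finished, now, gc_grace_seconds=DEFAULT_GC_GRACE_SECONDS):
    """True once the gc_grace_seconds window has passed without a repair."""
    return now >= repair_deadline(last_repair_finished, gc_grace_seconds)


if __name__ == "__main__":
    last = datetime(2017, 7, 1)
    print(repair_deadline(last))
    print(repair_overdue(last, now=datetime(2017, 7, 5)))
```

If you lower gc_grace_seconds on a table, pass the lowered value so the deadline shrinks with it.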

You can optimize the repair process by using the following flags:

  1. -seq: Repair sequentially, one node after another: slower and safer.
  2. -local: Repair only nodes in the local data center, so that both data centers are never under repair load at the same time.
  3. -par (--parallel): Fastest mode: repair all nodes in all data centers in parallel.
  4. -j: The number of parallel jobs on a node (1-4); using more threads will stress the nodes but will help end the task faster.

We recommend selecting your strategy based on the height of your peaks and the sensitivity of your data. If your system has the same level of traffic 24/7, consider doing things slowly and sequentially. The higher your peaks, the more stress you should put on your system during off-peak hours.
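The flag choices can be captured in a small helper that assembles the command line without running it. This is a sketch under assumptions: the helper and its parameters are hypothetical, the short spellings -seq, -par, -local and -j are the ones listed by nodetool help repair in the versions I am aware of, and which combinations are accepted (as well as the default mode) varies by Cassandra version, so check your own installation.

```python
def build_repair_command(mode="sequential", local_dc_only=False,
                         job_threads=1, keyspace=None):
    """Assemble a `nodetool repair` argument list (nothing is executed)."""
    if mode not in ("sequential", "parallel"):
        raise ValueError("mode must be 'sequential' or 'parallel'")
    if not 1 <= job_threads <= 4:
        raise ValueError("job_threads must be between 1 and 4")

    command = ["nodetool", "repair"]
    command.append("-seq" if mode == "sequential" else "-par")
    if local_dc_only:
        command.append("-local")
    if job_threads > 1:
        command += ["-j", str(job_threads)]
    if keyspace:
        command.append(keyspace)  # hypothetical keyspace name supplied by the caller
    return command


if __name__ == "__main__":
    # Steady 24/7 traffic: slow and sequential.
    print(" ".join(build_repair_command()))
    # High peaks: push harder during the off-peak window.
    print(" ".join(build_repair_command("parallel", job_threads=4)))
```

Returning a list rather than a string makes it straightforward to hand the result to subprocess.run from a scheduled job.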

Backup Strategy

There are several backup strategies you can have:

  1. Utilize your storage/cloud storage snapshot capabilities. 
  2. Use the C* nodetool snapshot command. This is very similar to your storage snapshot capabilities, but it lets you back up only the data and not the whole machine.
  3. Use C* incremental backups, which enable point-in-time recovery. This is not a daily process; it requires copying and managing small files all the time.
  4. Mix C* snapshots and incremental backups to minimize recovery time while keeping the point-in-time recovery option.
  5. Snapshots and commit log: supports point-in-time recovery, but the recovery process is complex because you need to replay the commit log.

We recommend using a daily snapshot if your data is not critical or if you want to minimize your Ops costs, and a mix of C* snapshots and incremental backups when you must have point-in-time recovery.
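A daily snapshot job could start from something like the sketch below. It only builds the nodetool snapshot argument list with a dated tag (-t) and encodes the recommendation above as a simple choice. The helper names, the tag format, and the keyspace are hypothetical. Incremental backups are switched on separately, through the incremental_backups setting in cassandra.yaml.

```python
from datetime import date


def build_snapshot_command(keyspace, day):
    """Assemble `nodetool snapshot` with a dated tag (nothing is executed)."""
    tag = f"daily-{day.isoformat()}"  # hypothetical naming scheme
    return ["nodetool", "snapshot", "-t", tag, keyspace]


def recommended_strategy(needs_point_in_time_recovery):
    """Map the recommendation in this section onto a strategy label."""
    if needs_point_in_time_recovery:
        return "snapshots + incremental backups"
    return "daily snapshot"


if __name__ == "__main__":
    print(" ".join(build_snapshot_command("my_keyspace", date.today())))
    print(recommended_strategy(needs_point_in_time_recovery=False))
```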

Monitoring

There are several approaches to go with.

  1. Commercial software:
    1. DataStax OpsCenter: Like almost every other OSS vendor, DataStax provides a commercial version of C* and a paid-for management and monitoring solution.
  2. Commercial services, including:
    1. NewRelic: Provides a C* plugin as part of its platform.
    2. DataDog: With a nice hint on what should be monitored.
  3. Open source with common integrations:
    1. Graphite, Grafana, or Prometheus: three tools that can work together or separately to collect and display the relevant metrics as time series.
    2. Old-style Nagios and Zabbix, which provide community plugins.

If you choose a DIY solution, there are some hints that you can find in the commercial products and services and also in the following resources:

  1. Basic monitoring thresholds.
  2. Nagios out-of-the-box plugins that thresholds can be extracted from.

For example:

  1. Heap usage: 85% (warning), 95% (error).
  2. GC ConcurrentMarkSweep: 9 (warning), 15 (error).
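In a DIY setup, thresholds like these usually end up as a lookup table plus a classification step. A minimal sketch, using only the two example thresholds above; the metric keys and the function name are made up for illustration, and the GC value is compared in whatever unit your plugin reports it, since the source does not state one.

```python
# (warning, error) pairs taken from the examples above.
THRESHOLDS = {
    "heap_usage_percent": (85, 95),
    "gc_concurrent_mark_sweep": (9, 15),  # unit not stated in the source
}


def classify(metric, value):
    """Return 'ok', 'warning', or 'error' for a sampled metric value."""
    warning, error = THRESHOLDS[metric]
    if value >= error:
        return "error"
    if value >= warning:
        return "warning"
    return "ok"


if __name__ == "__main__":
    print(classify("heap_usage_percent", 90))
    print(classify("gc_concurrent_mark_sweep", 3))
```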

Our recommendation is to start (when possible) with an existing service or product, gain experience with the metrics that are relevant for your environment, and, if needed, build your own setup based on them.

Lightweight Transactions

Lightweight transactions are meant to enable use cases that require sequencing (or some type of transaction) in an eventually consistent environment.

Note, however, that this is a minimal solution aimed at serializing tasks in a single table. We believe that this is a good solution, but if your data requires a consistent solution, you should avoid eventually consistent solutions and look for SQL solutions (with native transactions) or a NoSQL solution like MongoDB.
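To make the semantics concrete, here is a plain-Python model of a conditional insert, the most common lightweight transaction. It mimics only the outcome (applied or not applied) on a single table and does not model how Cassandra reaches agreement between replicas. The table, column names, and function name are hypothetical.

```python
# CQL form of the operation modeled below (hypothetical table and columns):
#   INSERT INTO users (username, email) VALUES (?, ?) IF NOT EXISTS;


def insert_if_not_exists(table, key, row):
    """Insert `row` under `key` only if the key is absent.

    Returns True when the insert was applied and False when an existing
    row blocked it, similar to the [applied] column Cassandra returns.
    """
    if key in table:
        return False
    table[key] = row
    return True


if __name__ == "__main__":
    users = {}
    print(insert_if_not_exists(users, "alice", {"email": "alice@example.com"}))
    print(insert_if_not_exists(users, "alice", {"email": "other@example.com"}))
    print(users["alice"]["email"])
```

The second call is rejected and the first row is left untouched, which is the serialization guarantee the feature provides within one table.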

C* Internals

Want to know more? Check out these videos or this book.

Bottom Line

C* is indeed a great product. However, it definitely is not an entry-level solution for data storage, and managing it requires skills and expertise.

Published at DZone with permission of Moshe Kaplan, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.
