Taking a Global Snapshot of Cassandra Clusters

Dumping a keyspace on every node gives you a backup of your entire Cassandra cluster. A mix of built-in and third-party tools makes it possible.

by Shivansh Srivastava · Dec. 22, 16 · Tutorial

Snapshots are taken per node using the nodetool snapshot command. To take a global snapshot, run the nodetool snapshot command using a parallel ssh utility, such as pssh.

A snapshot first flushes all in-memory writes to the disk, then makes a hard link of the SSTable files for each keyspace. You must have enough free disk space on the node to accommodate making snapshots of your data files. A single snapshot requires little disk space, but snapshots can cause your disk usage to grow more quickly over time because a snapshot prevents old, obsolete data files from being deleted. After the snapshot is complete, you can move the backup files to another location if needed, or you can leave them in place.
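The hard-link behavior described above can be illustrated without a running cluster. The following is a minimal sketch that uses plain files in a temporary directory as stand-ins for SSTables; the function name and file names are hypothetical and only mimic Cassandra's layout.

```shell
# Demonstrates why a hard-linked snapshot keeps obsolete files on disk.
# Uses plain files in a temporary directory, not a real Cassandra data directory.
demo_hard_link_snapshot() {
    workdir=$(mktemp -d) || return 1
    mkdir -p "$workdir/table/snapshots/demo"

    # Stand-in for an SSTable data file.
    printf 'row data\n' > "$workdir/table/demo-Data.db"

    # "Snapshot" it with a hard link: no data is copied.
    ln "$workdir/table/demo-Data.db" "$workdir/table/snapshots/demo/demo-Data.db"

    # Simulate compaction deleting the now-obsolete original file.
    rm "$workdir/table/demo-Data.db"

    # The snapshot link still references the data, so the disk blocks are not freed.
    cat "$workdir/table/snapshots/demo/demo-Data.db"

    rm -rf "$workdir"
}
```

Calling demo_hard_link_snapshot prints the file contents even though the original file has been removed, which is why old snapshots should be cleared once they are no longer needed.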

Note: Cassandra can only restore data from a snapshot when the table schema exists. It is recommended that you also back up the schema.
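One way to back up the schema is to dump it with cqlsh. This is a sketch, assuming cqlsh is on the PATH and can connect to the node with its default settings; the function name backup_schema and the default output file name are illustrative, not part of Cassandra.

```shell
# Sketch: write the CQL needed to recreate a keyspace into a file.
# Assumes cqlsh is installed and can reach the node (hypothetical helper name).
backup_schema() {
    keyspace=$1
    # Default output file name is an assumption; pass a second argument to override.
    outfile=${2:-"${keyspace}_schema.cql"}
    cqlsh -e "DESCRIBE KEYSPACE ${keyspace};" > "$outfile"
}
```

For example, backup_schema mykeyspace would write mykeyspace_schema.cql to the current directory, which can then be stored alongside the snapshot files.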

Procedure

Run the nodetool snapshot command, specifying the hostname, JMX port, and keyspace. For example:

$ nodetool -h localhost -p 7199 snapshot mykeyspace


Results

The snapshot is created in the data_directory_location/keyspace_name/table_name-UUID/snapshots/snapshot_name directory. Each snapshot directory contains numerous .db files that hold the data as of the time of the snapshot. For example:

Package Installations

/var/lib/cassandra/data/mykeyspace/users-081a1500136111e482d09318a3b15cc2/snapshots/1406227071618/mykeyspace-users-ka-1-Data.db


Tarball Installations

install_location/data/data/mykeyspace/users-081a1500136111e482d09318a3b15cc2/snapshots/1406227071618/mykeyspace-users-ka-1-Data.db
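Given the directory layout above, the files belonging to one snapshot can be located with find. This is a small sketch; the function name list_snapshot_files is hypothetical, and the data directory must be supplied by the caller because it depends on the installation type.

```shell
# Sketch: list every file that belongs to a given snapshot under a data directory.
# Relies on the keyspace/table-UUID/snapshots/snapshot_name layout shown above.
list_snapshot_files() {
    data_dir=$1
    snapshot_name=$2
    find "$data_dir" -type f -path "*/snapshots/${snapshot_name}/*" | sort
}
```

For a package installation, this might be called as list_snapshot_files /var/lib/cassandra/data 1406227071618.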

Taking a Global Snapshot

As stated earlier, a global snapshot can be taken using the pssh tool. So let us configure this tool first.

The steps for configuring pssh are:

  1. Install the pssh tool using the following commands:
    sudo apt-get install python-pip
    sudo pip install pssh
  2. Create a hosts file that contains the IP addresses of all the nodes in the cluster, and name it something like pssh-hosts.

    It should look something like this:
    192.168.2.123
    192.168.2.125
    192.168.2.120
  3. Now run the following command to create a snapshot on every node. Because no keyspace is specified, nodetool snapshots all keyspaces:
     pssh -h pssh-hosts -P "/root/cassandra/bin/nodetool -h localhost -p 7199 snapshot"
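By default, each node names its snapshot with its own timestamp, so the directory names may differ from node to node. A variation on the command above is to pass a shared tag with nodetool's -t option so that every node uses the same snapshot name. The sketch below assumes the nodetool path used in this article; the function name global_snapshot and the NODETOOL variable are hypothetical.

```shell
# Sketch: take a snapshot with the same tag on every node listed in a hosts file.
# NODETOOL defaults to the path used in this article; override it for other layouts.
NODETOOL=${NODETOOL:-/root/cassandra/bin/nodetool}

global_snapshot() {
    hosts_file=$1
    tag=$2   # shared snapshot name; should not contain spaces
    pssh -h "$hosts_file" -P "${NODETOOL} -h localhost -p 7199 snapshot -t ${tag}"
}
```

For example, global_snapshot pssh-hosts nightly would create a snapshot named nightly for all keyspaces on each node.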

You now have a dump of the data on each node. You can download it using secure copy (scp) and restore it as needed. I am still working on automating the process of downloading the dump, but I will update you all as soon as it is done!
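One possible way to script the download is sketched here. It is not a finished tool: it streams a tar archive over ssh rather than calling scp, and it assumes key-based ssh access to each node, GNU tar on the nodes, a snapshot taken with a known name, and the package-install data directory as the default. The function name download_snapshots and its arguments are hypothetical.

```shell
# Sketch: pull the files of one named snapshot from every node into local archives.
# Assumes passwordless ssh and GNU tar (for -T -) on each node.
download_snapshots() {
    hosts_file=$1
    tag=$2
    dest=$3
    data_dir=${4:-/var/lib/cassandra/data}   # assumed default; differs for tarball installs

    mkdir -p "$dest"
    while IFS= read -r host || [ -n "$host" ]; do
        [ -n "$host" ] || continue
        # -n stops ssh from consuming the hosts file on stdin.
        ssh -n "$host" \
            "cd '$data_dir' && find . -type f -path '*/snapshots/${tag}/*' | tar -czf - -T -" \
            > "$dest/${host}-${tag}.tar.gz" \
            || echo "download failed for $host" >&2
    done < "$hosts_file"
}
```

For example, download_snapshots pssh-hosts nightly ./backups would leave one archive per node in ./backups, named after the node's address and the snapshot name.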

I hope you've enjoyed this article! If you have any queries, ping me here or on Twitter, and I'll be happy to help you out.

Published at DZone with permission of Shivansh Srivastava, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.
