Easy Data Purge in Cassandra
Easy Data Purge in Cassandra
Data purging is an essential part of having a database server that can handle all the incoming big data. Check out some commands that you can use to do this in Cassandra.
Join the DZone community and get the full member experience.Join For Free
Compliant Database DevOps and the role of DevSecOps DevOps is becoming the new normal in application development, and DevSecOps is now entering the picture. By balancing the desire to release code faster with the need for the same code to be secure, it addresses increasing demands for data privacy. But what about the database? How can databases be included in both DevOps and DevSecOps? What additional measures should be considered to achieve truly compliant database DevOps? This whitepaper provides a valuable insight. Get the whitepaper
Data purging is key to ensuring that your database servers always have enough free space to store the ever-incoming humongous business data. The operational/transactional database needs to be rid of old data that is no longer necessary as per your business rules.
Cassandra being one of the most popular databases, it does offer a really simplistic approach to support your data purge needs — that is, an expiration time for each of the records that go inside. The technique is known as TTL (time to live).
After the expiration time has reached or time to live has ended, the record is automatically marked for deletion by what is known as putting a tombstone on that record.
There are a set of CQLSH commands to utilize this feature:
- Use the
INSERTcommand to set employee details in the
mytabletable to expire in 86,400 seconds (one day).
cqlsh> INSERT INTO mykeyspace.mytable (emp_id, emp_name) VALUES (200, 'Franc') USING TTL 86400;
- Extend the expiration period to three days by using the
UPDATEcommand and change the employee's name.
cqlsh> UPDATE mykeyspace USING TTL 259200 SET emp_name = 'Frank' WHERE emp_id = 200 ;
- Delete a column’s existing TTL by setting its value to zero:
cqlsh> UPDATE mykeyspace.mytable USING TTL 0 SET emp_name = 'somefancyname' WHERE emp_id = 200 ;
- Check how much time is left for a record before it gets deleted:
cqlsh> SELECT emp_id,TTL(emp_name) FROM mykeyspace.mytable;
Using TTL With Cassandra Sink Connectors
The best feature of this comes into picture when you're using a sink connector to insert data into Cassandra from a Kafka topic on-the-fly. You can use the KCQL to specify a TTL right there to ensure your records are inserted with a default expiration time.
"connect.cassandra.kcql": "INSERT INTO mytable SELECT * FROM my-topic TTL=31536000"
31536000 is actually equivalent to a year in seconds.
Published at DZone with permission of Abhinav Sinha , DZone MVB. See the original article here.
Opinions expressed by DZone contributors are their own.