What Is Persistent Data?
Find out more about the ability to retain data in a durable and recoverable form even as hardware, software, and devices evolve.
To gather insights for DZone's Data Persistence Research Guide, scheduled for release in March 2016, we spoke to 16 executives from 13 companies who develop databases and manage persistent data in their own companies or help clients do so.
Here's who we talked to:
Satyen Sangani, CEO, Alation | Sam Rehman, CTO, Arxan | Andy Warfield, Co-Founder/CTO, Coho Data | Rami Chahine, V.P. Product Management and Dan Potter, CMO, Datawatch | Eric Frenkiel, Co-Founder/CEO, MemSQL | Will Shulman, CEO, MongoLab | Philip Rathle, V.P. of Product, Neo Technology | Paul Nashawaty, Product Marketing and Strategy, Progress | Joan Wrabetz, CTO, Qualisystems | Yiftach Shoolman, Co-Founder and CTO and Leena Joshi, V.P. Product Marketing, Redis Labs | Partha Seetala, CTO, Robin Systems | Dale Lutz, Co-Founder, and Paul Nalos, Database Team Lead, Safe Software | Jon Bock, VP of Product and Marketing, Snowflake Computing
Persistent data is data that is considered durable at rest despite the coming and going of software and devices. It is stable master data that is set and recoverable, whether it is held in flash or in backed memory.
Here's what we heard when we asked, "How do you define persistent data?":
- The opposite of dynamic—it doesn’t change and is not accessed very frequently
- Core information, also known as dimensional information in data warehousing; demographics of entities—customers, suppliers, and orders
- Master data that’s stable
- Data that exists from one instance to another: Data that exists across time, independent of the systems that created it. Now there’s always a secondary use for data, so there’s more persistent data. A persistent copy may be made, or it may be aggregated. The idea of persistence is becoming more fluid.
- Stored in actual format and stays there versus in-memory where you have it once, close the file, and it’s gone: You can retrieve persistent data again and again. It is data that is written to disk; however, the speed of the disks is a bottleneck for the database. That is why there is a push to move to memory, which this respondent put at 16X faster.
- Every client has their own threshold for criticality (e.g. financial services don’t want to lose any debits or credits). Now, with much more data from machines and sensors, there is greater transactionality. The metadata is as important as the data itself. Metadata must be transactional.
- Non-volatile: Persists in the face of a power outage
- Any data stored in a way that it stays stored for an extended period versus in-memory data: Stored in the system, modeled, and structured to endure power outages. Data doesn’t change at all.
- Data considered durable at rest with the coming and going of hardware and devices: There’s a persistence layer at which you hold your data at risk.
- Data that is set and recoverable, whether in flash or memory-backed
- With persistent data, there is reasonable confidence that changes will not be lost, and the data will be available later. Depending on the requirements, in-cloud or in-memory systems can qualify. We care most about the "data" part. If it’s data, we want to enable customers to read, query, transform, write, add value, etc.
- A way to persist data to disk or storage: There are multiple options to do so, with one replica across data centers in any combination, with and without persistence. Snapshot data to disk or snapshot changes. Write to disk every second or on every write. Users can choose between all options. Persistence is part of a high-availability suite that provides replication and instant failover. Registered over multiple clouds. Host thousands of instances over multiple data centers, with only two node failures per day. Users can choose between multiple data centers and multiple geographies. We are the company behind Redis. Others treat it as a cache and not a database. With multiple nodes, data is written to disks. You can’t do that with regular open source. If you don’t do high availability, as recommended, you can lose your data.
- Anything that goes to a relational database, a NoSQL database, or anything in between.
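Several of the answers above turn on when a write actually reaches disk: on every write, or roughly once per second. The following is a minimal sketch of that trade-off, assuming a toy append-only log. The class `AppendOnlyLog`, the policy names `always` and `everysec`, and the `replay` helper are all hypothetical names chosen for illustration, not any vendor's implementation.

```python
import os
import tempfile
import time


class AppendOnlyLog:
    """Toy append-only log. `policy` is 'always' or 'everysec' (illustrative names)."""

    def __init__(self, path: str, policy: str = "everysec") -> None:
        if policy not in ("always", "everysec"):
            raise ValueError(f"unknown policy: {policy}")
        self.path = path
        self.policy = policy
        self.fsync_count = 0  # how many times we forced data to the device
        self._file = open(path, "a", encoding="utf-8")
        self._last_sync = time.monotonic()

    def write(self, key: str, value: str) -> None:
        # Assumes keys contain no tabs and values contain no newlines.
        self._file.write(f"{key}\t{value}\n")
        self._file.flush()  # hand the bytes to the operating system
        due = time.monotonic() - self._last_sync >= 1.0
        if self.policy == "always" or due:
            self._sync()

    def _sync(self) -> None:
        os.fsync(self._file.fileno())  # ask the OS to flush its cache to storage
        self.fsync_count += 1
        self._last_sync = time.monotonic()

    def close(self) -> None:
        self._sync()
        self._file.close()


def replay(path: str) -> dict:
    """Rebuild the in-memory state by replaying the log from the start."""
    state = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            key, value = line.rstrip("\n").split("\t", 1)
            state[key] = value
    return state


# Usage: write two records, then recover them as a fresh process would.
workdir = tempfile.mkdtemp()
log_path = os.path.join(workdir, "store.log")
log = AppendOnlyLog(log_path, policy="always")
log.write("debit:1", "25.00")
log.write("credit:1", "40.00")
log.close()
recovered = replay(log_path)
```

With `always`, every write pays for a flush to the device, so nothing acknowledged is lost. With `everysec`, writes are cheaper, but up to about a second of changes can be lost in a crash. Note that this sketch only checks the timer when a write arrives; a real system would typically sync from a background task.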
So, how do you define persistent data?
2023 Update: Keeping Data Safe in an Ever-Changing Digital World
I wrote the content above seven years ago based on research I conducted for DZone’s Data Persistence Research Guide. A lot has changed since then. Here’s an update.
Data persistence refers to the ability to retain data in a durable and recoverable form, even as hardware, software, and devices change around it. As our world becomes increasingly digital, having reliable methods of data persistence is more crucial than ever.
In the past, data persistence mainly referred to storing data on physical disks that could survive power outages. While disks are still used today, there are now many more options for achieving data persistence thanks to advancements in cloud computing, flash storage, and in-memory databases.
Expanded Definitions of Data Persistence
The conventional definition of persistent data is "data that is non-volatile and stored in an actual format so it can be retrieved repeatedly." This is in contrast to transient in-memory data that disappears when a system is powered off.
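To make the contrast concrete, here is a minimal sketch, assuming a single JSON record and a local file system. The function names `save_durably` and `load` and the sample record are hypothetical, chosen only for illustration.

```python
import json
import os
import tempfile


def save_durably(path: str, record: dict) -> None:
    """Write `record` to `path` and ask the OS to flush it to stable storage."""
    tmp_path = path + ".tmp"
    with open(tmp_path, "w", encoding="utf-8") as f:
        json.dump(record, f)
        f.flush()             # push Python's buffer to the operating system
        os.fsync(f.fileno())  # ask the OS to push its cache to the device
    # Rename over the target so readers see the old or the new file, never half.
    os.replace(tmp_path, path)


def load(path: str) -> dict:
    """Read the record back, as a restarted program would."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


# Transient: this dictionary lives only in the memory of the running process.
in_memory = {"customer": "ACME", "balance": 100}

# Persistent: the file remains after the in-memory copy is gone.
workdir = tempfile.mkdtemp()
path = os.path.join(workdir, "customer.json")
save_durably(path, in_memory)
del in_memory          # simulate the process losing its state
restored = load(path)  # the data can be retrieved again and again
```

The in-memory dictionary disappears with the process, while the file can be read back repeatedly. How durable the file really is still depends on the storage underneath it.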
However, the meaning of data persistence has evolved to encompass:
- Data stored in durable media like disks, tape, and flash storage
- Replicated data stored across multiple servers or locations
- Data stored in non-relational (NoSQL) databases, some of which can run entirely in memory but maintain copies on durable storage for persistence
- Virtualized data that appears persistent but may rely on non-persistent infrastructure
In today's IT landscape, data persistence refers not just to static data saved to disk, but to a range of technologies that provide access to stable and recoverable data stores.
Importance of Persistent Data
Persistent data is the foundation for digital business activities today. It includes:
- Master data like customer info, product catalogs, and financial records
- Transaction data and records from operations
- Metadata that describes key business entities and data schemas
- Reference data like geographical info and codes
Without reliable persistence, this data could not be effectively processed, analyzed, or audited. Losing access to persistent data stores could bring business operations to a standstill.
Advancements in Data Persistence Technologies
While disks continue to store the bulk of critical business data, new technologies are addressing their limitations, such as latency, capacity constraints, and throughput:
- Flash storage offers much faster read/write performance than traditional disks
- Object stores allow storing massive amounts of unstructured data in the cloud
- Replication tools mirror data across distributed databases, either synchronously or asynchronously
- Virtualization abstracts physical storage into logical pools that applications can access and scale on demand
- High availability clusters eliminate single points of failure to maximize uptime
These technologies make it possible to persist data at scale in distributed environments and provide continuous access.
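Replication and failover, two of the ideas above, can be sketched in a few lines. This is a simplified, in-process illustration under stated assumptions: `Replica` and `ReplicatedStore` are hypothetical classes, the replicas are plain objects rather than networked database nodes, and `min_acks` is an assumed parameter for how many copies must confirm a write.

```python
from typing import Dict, List, Optional


class Replica:
    """Stand-in for a database node; a real one would sit behind a network call."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.data: Dict[str, str] = {}
        self.online = True

    def apply(self, key: str, value: str) -> bool:
        """Store the value and acknowledge, or refuse if the node is down."""
        if not self.online:
            return False
        self.data[key] = value
        return True


class ReplicatedStore:
    """Synchronously mirrors each write and reads from any surviving replica."""

    def __init__(self, replicas: List[Replica], min_acks: int) -> None:
        self.replicas = replicas
        self.min_acks = min_acks

    def put(self, key: str, value: str) -> bool:
        # Synchronous replication: the write counts as successful only if
        # enough replicas acknowledge it before we return.
        acks = sum(1 for replica in self.replicas if replica.apply(key, value))
        return acks >= self.min_acks

    def get(self, key: str) -> Optional[str]:
        # Failover: skip replicas that are down and use the first live copy.
        for replica in self.replicas:
            if replica.online and key in replica.data:
                return replica.data[key]
        return None


# Usage: two sites, one of which fails after the first write.
primary_site = Replica("on-prem")
cloud_site = Replica("cloud")
store = ReplicatedStore([primary_site, cloud_site], min_acks=2)

ok = store.put("order:42", "shipped")        # both sites acknowledge
primary_site.online = False                  # simulate a site outage
value_after_failure = store.get("order:42")  # served by the surviving site
degraded = store.put("order:43", "pending")  # only one acknowledgement now
```

Requiring every replica to acknowledge favors safety over availability: after the outage, reads still succeed, but new writes are reported as failed. Lowering `min_acks` to 1 would keep writes flowing at the cost of having only a single copy until the failed site returns.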
Case Studies on Persistent Data Strategies
- Financial Company X: Created a multi-site architecture for zero downtime by replicating databases between an on-premises data center and the cloud. If one site fails, the other ensures uninterrupted data access.
- Healthcare Provider Y: Uses a disk-based data warehouse for analytics and reporting. They added a flash storage tier to reduce query times from hours to seconds for time-sensitive insights.
- SaaS Company Z: Hosts databases in a cloud-based database-as-a-service to automatically scale capacity as their data grows without managing infrastructure.
Summary
As data volumes explode, data persistence is more critical than ever. New challenges require going beyond traditional disk-based storage to utilize advanced technologies like flash, distributed systems, and cloud storage. With careful planning, companies can implement data persistence strategies to protect their information assets today and into the future.