Data Safety in Cloud-Based Databases
Avoid data loss and achieve high availability with cloud databases.
One of the most common arguments I hear against cloud adoption is a lack of faith in the safety of online data stores. Oftentimes, individuals tasked with protecting an organization’s most vital asset, its unique set of knowledge, can be reactionary when it comes to risk management. Understandably so. It can feel like the best way to keep data safe is to keep it under your direct control, in your data center, on your servers. Yet, more and more organizations are trusting cloud vendors to store their corporate data.
Are these people just incredibly foolish, or is there more here than meets the eye?
This is an excerpt of DZone's 2020 Cloud Database Trend Report.
Causes of Data Breaches
If you research the causes of data breaches, there’s not much agreement on the most common cause; however, most sites will list the following:
- Weak and/or stolen credentials
- Application vulnerabilities from old or unpatched software
- Malicious insiders
- Employee error leading to loss
- Theft/Loss of data-carrying device
- Social engineering/phishing attacks
Except for malware attacks, the majority of attack vectors can be summed up as problems associated with internal systems and processes of the organizations that suffered a data loss. Outright hacking attacks, such as through a SQL Injection attack or other types of more sophisticated attacks, frequently don’t make the top 10 lists.
If you read the news, you often hear about public GitHub repositories filled with user logins and passwords, publicly accessible drives with unsecured and unencrypted data, and laptops with entire data sets either lost or stolen. In short, the risk to your data is extremely high on the local systems that you manage because of one or more of the reasons listed above. Yet, people see the cloud and cloud-based databases as a higher risk. Let me explain a few reasons why this is just not true.
Types of Cloud-Based Data Stores
Before we dive in, let me take a short moment to discuss the various types of data stores because not everything “in the cloud” is created equal. In this article, I’m going to focus on Platform-as-a-Service data offerings from different cloud vendors. Whether we’re talking about AWS with the RDS databases, Azure using Azure SQL Database and CosmosDB, Google’s Cloud Spanner, or any of the others, this is a different type of data store.
If we were talking to you about hosting your servers through virtual machines in the cloud and then installing a data management system from SQL Server to MongoDB to Elasticsearch, the security and management of those VMs is the responsibility of your organization, just like with your current on-premises systems. The same goes for some of the file-based data systems. Just because you’ve moved the location of your file from your local machine to a cloud platform, the security is still on you.
No, I’m talking about the Platform-as-a-Service (PaaS) offerings for data management because they fundamentally change the game. With these, you are no longer managing your own VM. Instead, you’re putting your cloud provider in charge of the fundamentals of data management, and frankly, they’re going to do a better job. The reason is that many of the most common attack vectors are removed: a lack of updates and patching on your software, misconfigured systems, poor practices, and the other issues listed above that can cause a data breach.
Security by Default
First, most PaaS databases are designed with security in mind. Security is a fundamental aspect of all that they do. If you’re getting set up with one of the cloud vendors, not only do you have to have a secure login and password (which, let’s face it, is still vulnerable to attack), but you have to white-list IP addresses through the built-in firewall. So even if you have poor choices in logins, people won’t be able to get direct access to your databases because there is a firewall blocking the access — unless they can also spoof your IP addresses.
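To make the layering concrete, here is a minimal sketch of how an IP allow-list sits in front of the login check. It is not any vendor's actual firewall; the names `ALLOWED_NETWORKS`, `is_allowed`, and `connect` are hypothetical, and the address ranges are reserved documentation networks chosen purely for illustration.

```python
import ipaddress

# Hypothetical allow-list. These ranges are reserved documentation networks,
# standing in for the office or application addresses you would register.
ALLOWED_NETWORKS = [
    ipaddress.ip_network("203.0.113.0/24"),    # an example office range
    ipaddress.ip_network("198.51.100.42/32"),  # a single example app server
]


def is_allowed(client_ip: str) -> bool:
    """Return True if client_ip falls inside any allow-listed network."""
    address = ipaddress.ip_address(client_ip)
    return any(address in network for network in ALLOWED_NETWORKS)


def connect(client_ip: str, password_ok: bool) -> str:
    """Simulate a connection attempt: firewall first, credentials second."""
    # The firewall check runs before the login check, so a stolen or weak
    # password alone does not get an outsider to the database.
    if not is_allowed(client_ip):
        return "blocked by firewall"
    if not password_ok:
        return "login failed"
    return "connected"
```

In this sketch, a caller with a valid password but an unlisted address never reaches the credential check at all, which is the point of the paragraph above.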
There are even data systems out there that intentionally have no security in order to make development easier. At the time of writing, Elasticsearch, by default, does not include security in its development version. It’s on you to put security in place before you release it to production. Yet, none of the PaaS offerings from the various vendors suffer from this problem.
One additional note: All cloud vendors I’m familiar with have both physical and plant security in mind. They build their data centers away from flood zones and wind zones. They have secondary power sources available but are also placing the data centers near a constant source of power, two of the best and cleanest being hydro-electric and nuclear. Security starts with the physical plant.
Continuous Patching and Updates
Different cloud vendors do this in different ways, but all of them have a strong set of upgrade and update requirements. Unlike your own systems, which can fall woefully behind and expose well-known and well-documented security issues that bad actors are happy to take advantage of, the cloud-based systems are patched regularly. Some of the cloud-based systems are even patched and updated continuously. The SLA warns you that your databases will be upgraded on a schedule set by the vendor, not by you.
While this can sound scary, remember that you’re used to patching or updating a few hundred servers, whereas they’re building systems that patch and update millions at a time. They’ve had a lot of practice and, as a result, are quite good at it.
Backups and Maintenance
One of the saddest things I see when I read about the constant drumbeat of data breaches is how often people don’t have backups in place. Sometimes, they have them, but they’re old and out of date. Or, they have them, but they don’t know how to restore them. Or, worst of all, they have them, but they were stored locally, using the same security that the malware just exploited, and now, those backups are lost.
Not so with the cloud-based PaaS offerings. All that I’ve studied and used have some form of backup in place. Further, depending on the type of data storage we’re talking about, they have the ability to perform a point-in-time recovery. This is something that many businesses struggle to get right.
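For readers who haven’t worked with point-in-time recovery, the idea can be sketched in a few lines: start from the latest full backup taken at or before the moment you want, then replay the logged changes up to that moment. This is a toy in-memory model under my own assumptions, not how any particular vendor implements it; `Backup`, `LogEntry`, and `restore_to` are hypothetical names, and timestamps are plain integers for simplicity.

```python
from dataclasses import dataclass


@dataclass
class Backup:
    taken_at: int   # time the full backup was taken
    data: dict      # snapshot of key -> value at that time


@dataclass
class LogEntry:
    at: int         # time of the change
    key: str
    value: object   # None means the key was deleted


def restore_to(target: int, backups: list[Backup], log: list[LogEntry]) -> dict:
    """Rebuild the state as of `target` from a full backup plus the change log."""
    candidates = [b for b in backups if b.taken_at <= target]
    if not candidates:
        raise ValueError("no backup exists at or before the requested time")
    # Start from the most recent full backup that is not newer than the target.
    base = max(candidates, key=lambda b: b.taken_at)
    state = dict(base.data)
    # Replay, in time order, only the changes made after that backup
    # and no later than the target time.
    for entry in sorted(log, key=lambda e: e.at):
        if base.taken_at < entry.at <= target:
            if entry.value is None:
                state.pop(entry.key, None)
            else:
                state[entry.key] = entry.value
    return state
```

Even in this simplified form, a restore only works if both the backups and the change log are retained and readable, which is the part many in-house setups never test.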
In addition to the backups, most of the PaaS database management systems are also performing constant and regular maintenance of the databases under their care. Once more, depending on the type of database and the cloud vendor, they may have additional protections under the covers so that they ensure your databases are available, even in the event of data corruption (it’s rare, but it happens to all of us sooner or later).
Encryption, at Rest and in Motion
Most PaaS offerings make a point of ensuring that your data is encrypted within their systems. They encrypt it when it’s stored within the databases, but they also encrypt the backups. Most major vendors ensure that they’re encrypting data while it’s in motion within their data centers. In the event of a breach, all this added protection helps to ensure that the breach is going to be contained at the lowest possible level, rather than expose not only multiple databases but multiple organizations to a single breach.
One last thing to point out is that most cloud vendors offer varying degrees of high availability within their systems. They typically offer some degree of built-in availability and disaster recovery at zero cost to you. However, you can pay for extra protection. So, with some additional overhead, you can ensure that not only is your data available in whatever data center you store it, but that you have automated processes that ensure it gets duplicated to secondary locations. Most of the major vendors can do this for locations all around the world.
This is a level of disaster protection that most businesses haven’t even attempted, let alone successfully implemented.
The fact of the matter is that you may have an excellent data center with secondary power systems, well-maintained patching, tested security, tested backups, and a tried-and-true high availability setup. On the other hand, you may have a need for these services. The most important things to think about are the common attack vectors. How many of these are increased by moving your data to the cloud? Very few, if any. How many of these are decreased by moving your data to the cloud? Most of them.
So, if you’re hesitant to move to the cloud because you worry data will be less safe, I strongly encourage you to look at how cloud vendors are handling security and how data breaches occur. Once you have this knowledge under your belt, you’re going to worry much less about cloud-based data systems.
- Most Common Causes of Data Breaches and How You Can Spot Them
- Top 6 Causes of Data Breaches and How to Plan Against Them
- Most Common Causes of Data Breaches
- 7 Major Causes of a Data Breach