Database Backup: A Conversation With an Expert
Read an interview with an expert about database backup and why it is so important.
Database backup is a hot topic. It may seem simple at first, but in practice, O&M personnel often encounter various problems when it comes to backing up databases. So, what typical challenges are presented and how can we build an effective backup system? Which solutions are applicable? To answer these questions, we interviewed Heng Tiegang, a database backup expert at Alibaba.
Heng Tiegang (nickname: Pei'en), an Alibaba database backup expert
Why Should Databases Be Backed Up?
I think the answer to this question is already obvious. So, rather than answering it, I would like to answer another question: what risks can database backup protect against? From the moment data is created, it is exposed to the risk of loss caused by natural disasters, power failures, network faults, hardware faults, software faults, and human error.
The point is, even if your database survives a hardware fault today, a lightning strike tomorrow, and a power failure the day after tomorrow, you may still delete data by mistake with a slip of the hand the day after that.
Which Challenges Are Presented by Database Backups?
The first challenge is taking stock of database assets. For an individual user, the database assets may be just one instance, and the user knows them well without any stocktaking. However, an enterprise user, especially in a large enterprise, can have many instances and various database types because of business diversity. In this case, O&M personnel need to know exactly how many databases there are, where they are located, what type each one is (production or core database), and what each one is used for.
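As a rough illustration of the stocktaking described above, the sketch below records each instance with its engine, tier, and purpose, and then counts instances per tier and per engine. The `DatabaseAsset` fields, the instance names, and the tier labels are assumptions made for this example and do not come from any particular inventory tool.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class DatabaseAsset:
    name: str      # instance identifier (hypothetical)
    engine: str    # database type, e.g. "mysql" or "postgresql"
    tier: str      # "test", "production", or "core"
    purpose: str   # what the business uses the database for


def summarize(assets):
    """Count instances overall, per tier, and per engine."""
    return {
        "total": len(assets),
        "by_tier": dict(Counter(a.tier for a in assets)),
        "by_engine": dict(Counter(a.engine for a in assets)),
    }


# Illustrative inventory; every entry here is made up.
inventory = [
    DatabaseAsset("orders-01", "mysql", "core", "order processing"),
    DatabaseAsset("reports-01", "postgresql", "production", "reporting"),
    DatabaseAsset("dev-shared", "mysql", "test", "R&D testing"),
]
summary = summarize(inventory)
```

Even a flat list like this makes it possible to ask, for each tier, whether every instance has a backup policy attached.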
The second challenge is the evaluation of the backup system. While backup is a basic and daily practice, people often find that it does not help during crunch times. The reason is that backup, as a basic task, does not promote the business, and as long as no problems occur, few people remember it. However, once a problem occurs, backup immediately becomes the target of public attention. Backups often do not help during emergencies mainly because people do not take backups seriously enough, so investment in backup is insufficient. Many enterprises claim that backups are a top priority but never implement them properly.
I recommend that you ask your technical team right away: Is your backup system really effective?
What Is an Effective Backup System?
Different databases can be used for different purposes, and the effectiveness of a backup system varies accordingly. According to their functions, databases can be classified as test databases, production databases, and core databases.
For test databases, you must judge the importance of the database by its intended use. If the test database is used for personal tests, in most cases data is imported and cleared without being backed up. If the test database is used for R&D, we recommend that you enable backup and do not underestimate its importance. This is because all development and testing personnel in the enterprise work on the test database, and a single data problem can immediately cause trouble for the entire team. In addition, a test database is likely to encounter more problems than a production database.
For a production database, first ensure that backup is enabled. Then, evaluate whether the backup cycle meets your requirements. With a full backup every day, for example, a failure costs you at most one day of new data. You also need to check whether the most recent backup has actually been test-restored and whether the backup data is valid.
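The backup-cycle evaluation above comes down to simple arithmetic: with periodic full backups only, the worst-case loss equals the backup interval. The sketch below shows that check under stated assumptions. The function names, the tolerated-loss figures, and the timestamps are hypothetical and chosen only for illustration.

```python
from datetime import datetime, timedelta


def meets_requirement(backup_interval, max_tolerated_loss):
    """With periodic full backups only, the worst-case data loss equals the
    backup interval, so the interval must not exceed the tolerated loss."""
    return backup_interval <= max_tolerated_loss


def backup_is_overdue(last_backup, now, backup_interval):
    """True if more than one full interval has passed since the last backup."""
    return now - last_backup > backup_interval


daily = timedelta(days=1)
now = datetime(2024, 1, 10, 12, 0)  # fixed so the example is reproducible

# A daily cycle is enough if losing one day of data is acceptable...
ok_for_reporting = meets_requirement(daily, timedelta(days=1))
# ...but not if the business tolerates only five minutes of loss.
ok_for_payments = meets_requirement(daily, timedelta(minutes=5))
# The last backup ran more than two days ago, so the daily job was missed.
overdue = backup_is_overdue(datetime(2024, 1, 8, 2, 0), now, daily)
```

A check like this only says whether the schedule is adequate on paper. It does not replace restoring the backup to confirm that it is valid.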
A core database is more important than a test or production database, so in addition to the preceding measures, you need to take some others. Real-time backup has become a must-have when an enterprise selects a database backup solution because it minimizes the amount of data lost when a fault occurs. Fast recovery also plays an increasingly significant role for the core database. Based on the risks of potential faults, you can select the optimal recovery solution, perform regular drills on the entire backup and recovery system, and sample the backup data to test the recovery function. I recommend that you develop a policy that automatically and regularly runs the entire recovery process and produces drill reports.
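A recovery drill of the kind recommended above could look roughly like the following sketch. It uses SQLite from the Python standard library purely as a stand-in for a real database and real backup storage. The `run_recovery_drill` function and the report fields are assumptions made for this example, and scheduling the drill (for instance with cron) is left out.

```python
import sqlite3
from datetime import datetime, timezone


def run_recovery_drill(source):
    """Back up `source`, restore it into a scratch database, and report."""
    backup = sqlite3.connect(":memory:")    # stands in for backup storage
    restored = sqlite3.connect(":memory:")  # disposable instance for the drill
    try:
        source.backup(backup)    # step 1: take the backup
        backup.backup(restored)  # step 2: restore it somewhere disposable

        # step 3: check the restored copy, not just the backup job's exit code
        integrity = restored.execute("PRAGMA integrity_check").fetchone()[0]
        tables = [row[0] for row in source.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
        mismatched = []
        for table in tables:
            query = f'SELECT COUNT(*) FROM "{table}"'
            if (source.execute(query).fetchone()[0]
                    != restored.execute(query).fetchone()[0]):
                mismatched.append(table)

        # step 4: produce the drill report
        return {
            "ran_at": datetime.now(timezone.utc).isoformat(),
            "integrity_check": integrity,
            "tables_checked": len(tables),
            "row_count_mismatches": mismatched,
            "passed": integrity == "ok" and not mismatched,
        }
    finally:
        backup.close()
        restored.close()


# Illustrative "production" database with made-up data.
prod = sqlite3.connect(":memory:")
prod.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL)")
prod.executemany("INSERT INTO orders (amount) VALUES (?)", [(9.5,), (20.0,)])
prod.commit()
report = run_recovery_drill(prod)
```

Row counts and an integrity check are a minimal bar. A real drill would also time the restore, since fast recovery is part of what is being tested.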
- Not verifying the validity of the backup data is even worse than not backing up the data. Imagine that all of your business data has been completely destroyed in a disaster. However, when you want to recover the data, you may find that the backup data is corrupted, the files that you backed up are incorrect, or some other terrible thing has happened. In this case, what can you do? A data backup solution without validation can be an even bigger disaster. You must validate the backup content to ensure that the data has been properly backed up and can be used for recovery. Don't wait until it is too late.
- Don't insist on a single, all-encompassing solution. Diverse requirements call for a variety of solutions. In particular, for the core database, the entire instance must be backed up regularly to guard against hardware failures and damage to instances. In addition, each table must be backed up in real time, which often reduces the data recovery time at crunch time by up to 90 percent.
- Data validation, whether manual or automatic, aims to verify the validity of the backup data used for recovery (also referred to as the recovery data). Verifying the integrity of the recovery data is quite challenging. In most cases, the recovery data and production data are sampled and compared with each other based on the business characteristics. Alternatively, the recovery database serves as the secondary database and is synchronized with the primary database to verify data integrity.
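The sampling comparison mentioned in the last point can be sketched as follows. This is a minimal illustration that assumes rows are available as dictionaries keyed by primary key. The `row_digest` and `sample_mismatches` helpers are hypothetical names, and a real check would pick the sample according to business characteristics rather than purely at random.

```python
import hashlib
import random


def row_digest(row):
    """Stable digest of one row, independent of column order."""
    canonical = repr(sorted(row.items()))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def sample_mismatches(production, recovered, sample_size, seed=0):
    """Compare a random sample of production rows against recovered rows.

    Both arguments map primary key -> row dict. Returns the sampled keys
    whose row is missing from, or differs in, the recovered data.
    """
    keys = sorted(production)
    rng = random.Random(seed)  # seeded so a drill can be repeated exactly
    sampled = rng.sample(keys, min(sample_size, len(keys)))
    return sorted(
        key for key in sampled
        if key not in recovered
        or row_digest(production[key]) != row_digest(recovered[key])
    )


# Made-up data: row 3 was corrupted and row 5 was lost in the recovery data.
production = {i: {"id": i, "amount": i * 10} for i in range(1, 6)}
recovered = {i: dict(row) for i, row in production.items() if i != 5}
recovered[3]["amount"] = 0

mismatches = sample_mismatches(production, recovered, sample_size=5)
```

Because only a sample is compared, a clean result lowers the risk of undetected corruption but does not rule it out. That is why the alternative of running the recovery database as a synchronized secondary is also worth considering.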
Published at DZone with permission of Leona Zhang. See the original article here.
Opinions expressed by DZone contributors are their own.