Binary Backups for PostgreSQL
Of course you want to back up your DB. But if it's large, with lots of commands, then it might take a while to load. Setting up PostgreSQL's binary backups is a solution.
Join the DZone community and get the full member experience.Join For Free
Creating a PostgreSQL backup is a simple task. You only need to run the below command, and save the result as your backup:
$ pg_dump --username=your-user-name --file=output-file dbname
The result is a file full of SQL commands that, when fed back to the server, will recreate the database in the same state as it was at the time of the dump. So to restore from that, you only need to run the below command:
$ psql -U your-user-name dbname < backup-file
I can't imagine a simpler process for the backup and restore of a database; however, as the backup file contains many many SQL commands, creating and restoring backups for a large database may take a very long time and that could make your backups useless — especially if you need to restore in a failover scenario.
The good news is there are some other backup methods you can use to have a faster backup/restore process.
File System Level Backup
This is the fastest method you can use for taking a PostgreSQL backup. If you directly copy the files that PostgreSQL uses to store the data in, you will have a file system level backup which can be restored by copy back those files.
Although this method is faster than creating a dump file, you can't use it while the PostgreSQL server is up and running, because the files you are trying to copy may change during the process and the result file-set will not be consistent.
Essentially this means that you can not use this method to backup a production PostgreSQL server as you are not able to shutdown your database every hour to take a backup.
PostgreSQL uses a mechanism called write ahead log (WAL) to records every change made to the database's data files. It means if you take a file system backup and re-play the WAL commands generated during the backup process on the result file-set, then you will have a consistent backup which can be used to restore PostgreSQL to the exact state the database was when the backup finished. We call these backups binary backups and these are the steps you need to follow to have one.
1. Configure Wal_level
There is an item in Postgresql configuration file called wal_level which determines how much information is written to the WAL. The default value is minimal, which writes only the information needed to recover from a crash or immediate shutdown. For binary backups, you need to set this value to hot_standby. Just open the PostgreSQL config file (postgresql.conf) and find the wal_level. Set hot_standby as its value and restart your PostgreSQL service.
2. Configure Max_wal_senders
max_wal_senders is the maximum number of simultaneously running WAL sender processes.
PostgreSQL uses WAL to sync changes with replica servers in Streaming Replication method.So if you have Streaming Replication in place, you need one WAL sender process for each slave. Also you need two WAL sender processes for creating a standalone hot backup. Based on the above information, you need to calculate the number of WAL sender processes and update the config. Open the PostgreSQL config file (postgresql.conf) and find the max_wal_senders. Set the calculated number as its value and restart your PostgreSQL service.
3. Configure Authentication
To be able to create Binary Backups you need to use a superuser or a user which has REPLICATION permissions. In PostgreSQL, the Client authentication is controlled by a configuration file called pg_hba.conf and is stored in the database cluster's data directory.
Open pg_hba.conf and add the below line to it (replace
your-postgresal-username with your username) and reload the PostgreSQL service :
host replication your-postgresal-username 127.0.0.1/32 md5
4. Use Pg_basebackup to Create a Backup
Pg_basebackup is used to take base backups of a running PostgreSQL database cluster. This tool has an option (-x) which if you use will include WAL files to created result.
Run the below command to create the backup (replace
your-username with your username):
$ pg_basebackup -U your-username -x -D /path/to/backup/folder
After pg_basebackup finishes its work, your binary backup is ready to use. You can use tar to create one archive file and save it in a safe place to restore when needed.
Restore Binary Backups
Now that you have the backup folder, you can use it to restore PostgreSQL to the state the backup is finished. These are the steps you need to go through to restore a Binary Backup:
Prepare the Folders
Stop the PostgreSQL service and then delete your PostgreSQL data directory (note: You may want to move it somewhere else instead of deleting it for redundancy).
Then create a new empty directory as your PostgreSQL data directory, and use rsync to move the content of your backup folder to the newly created folder (replace
/path/to/postgresql/data/folder with correct values):
$ sudo rsync -az /path/to/backup/folder/ /path/to/postgresql/data/folder
Fix permissions of the PostgreSQL data folder:
$ sudo chown -R postgres:postgres /path/to/postgresql/data/folder
$ sudo chmod -R 0700 /path/to/postgresql/data/folder
Lastly, Start the PostgreSQL service. Now you should have PostgreSQL server with restored data! As you can see, the whole process of backup/restore for binary backups is more complicated that dealing with SQL dump files, but if you have a large database and restore time is important for you, this method could be a great replacement.
Published at DZone with permission of ShaB, DZone MVB. See the original article here.
Opinions expressed by DZone contributors are their own.