Over a million developers have joined DZone.
{{announcement.body}}
{{announcement.title}}

Audit and Log Database DML Changes in PostgreSQL With Cyan Audit

DZone's Guide to

Audit and Log Database DML Changes in PostgreSQL With Cyan Audit

Take a look at Cyan Audit, an extension that detects database changes and lets you keep neat, orderly files. Take a trip through Cyan Audit's features and see if it works for you.

Free Resource

Whether you work in SQL Server Management Studio or Visual Studio, Redgate tools integrate with your existing infrastructure, enabling you to align DevOps for your applications with DevOps for your SQL Server databases. Discover true Database DevOps, brought to you in partnership with Redgate.

Introduction

The database serves an important role. The security of your database systems is also becoming increasingly important, and an effective audit of your database can reduce the potential security risk.

Information about the periodic changes (DML — Insert, Update, Delete) of the intensity and structure of database workloads plays an important role in reducing the potential security risk, strengthening the security of database, and making it possible to trace the source when looking for problems and evidence of wrongdoing, such as fraud or records being modified without authorization.

In this article, I am going to share an interesting Postgres extension that does the DML logging job simply and seriously. Although there are many others way of doing that, in this article, I am going to use the Cyan Audit Postgres Extension.

Cyan Audit

This extension is responsible for detecting all database changes (INSERTs, UPDATEs, and DELETEs) and recording them in a separate 'audit' schema within the same Postgres database so that they can be reviewed using online inquiry screens.

By default, logging is TRUE for all current database tables and their columns, but it can be set to FALSE to disable logging on table and column levels.

This extension also provides a means to revert back to the correct values.

Let’s Get Started

I have created the database stackDB  with some tables on it.

Stack Db with Default Public Schema


I will use the customer table in this article for demonstration. For now, this table contains the record below:

Select * from customer order by id limit 5;
 

stackdb=# Select * from customer order by customer_id limit 5;
+-------------+----------+------------+-----------+-------------------------------------+------------+------------+-------------+-------------------------+--------+
| customer_id | store_id | first_name | last_name |                email                | address_id | activebool | create_date |       last_update       | active |
+-------------+----------+------------+-----------+-------------------------------------+------------+------------+-------------+-------------------------+--------+
|           1 |        1 | Mary       | Smith     | mary.smith@sakilacustomer.org       |          5 | t          | 2006-02-14  | 2013-05-26 14:49:45.738 |      1 |
|           2 |        1 | Patricia   | Johnson   | patricia.johnson@sakilacustomer.org |          6 | t          | 2006-02-14  | 2013-05-26 14:49:45.738 |      1 |
|           3 |        1 | Linda      | Williams  | linda.williams@sakilacustomer.org   |          7 | t          | 2006-02-14  | 2013-05-26 14:49:45.738 |      1 |
|           4 |        2 | Barbara    | Jones     | barbara.jones@sakilacustomer.org    |          8 | t          | 2006-02-14  | 2013-05-26 14:49:45.738 |      1 |
|           5 |        1 | Elizabeth  | Brown     | elizabeth.brown@sakilacustomer.org  |          9 | t          | 2006-02-14  | 2013-05-26 14:49:45.738 |      1 |
+-------------+----------+------------+-----------+-------------------------------------+------------+------------+-------------+-------------------------+--------+


Install and Enable the Cyan Audit Extenstion With stackDb Database

Download Cyan Audit from this link and unzip it.

$ cd -> cyanaudit-1.0.2
$ make install


If running above command gives the error "How To Resolve Pgxs.Mk: No Such File Or Directory Make" then run the following commands:

$ sudo apt-get install postgresql-server-dev-all
$ sudo apt-get install postgresql-common

Now that you're logged into to your Postgres database using Shell, run the following commands to enable the extension with your stackDB database.

postgres=# \connect stackdb;
stackdb=# create extension cyanaudit ;
CREATE EXTENSION
stackdb=# select cyanaudit.fn_create_event_trigger();
+-------------------------+
| fn_create_event_trigger |
+-------------------------+
| |
+-------------------------+
(1 row)

stackdb=# select cyanaudit.fn_update_audit_fields('public');
+------------------------+
| fn_update_audit_fields |
+------------------------+
| |
+------------------------+
(1 row)


Now you have enabled the Cyan Audit Extension with your default schema in your stackDB database, refresh the database. Now you will have two schemas in your database, as seen below:

Image title

Now, all the logging details of your tables under the public schema will be stored in tables under the "cyanaudit" schema. If you explore the cyanaudit schema, you will have four tables, three views, and 31 functions that do the job for you. You don't need to worry about these — you just need to know about 1 table, 1 view, and 1 function to do your job.

Now, let's play with the customer table by updating, deleting, and inserting snew record in this table. First, update the records of users with ID 1 in the customer table.

stackdb=# Select * from customer order by customer_id limit 5;
+-------------+----------+------------+-----------+-------------------------------------+------------+------------+-------------+-------------------------+--------+
| customer_id | store_id | first_name | last_name |                email                | address_id | activebool | create_date |       last_update       | active |
+-------------+----------+------------+-----------+-------------------------------------+------------+------------+-------------+-------------------------+--------+
|           1 |        1 | Mary       | Smith     | mary.smith@sakilacustomer.org       |          5 | t          | 2006-02-14  | 2013-05-26 14:49:45.738 |      1 |

stackdb=# update customer set address_id = 6 , activebool = false where customer_id = 1;
UPDATE 1
stackdb=# Select * from customer order by customer_id limit 5;
+-------------+----------+------------+-----------+-------------------------------------+------------+------------+-------------+----------------------------+--------+
| customer_id | store_id | first_name | last_name |                email                | address_id | activebool | create_date |        last_update         | active |
+-------------+----------+------------+-----------+-------------------------------------+------------+------------+-------------+----------------------------+--------+
|           1 |        1 | Mary       | Smith     | mary.smith@sakilacustomer.org       |          6 | f          | 2006-02-14  | 2016-09-14 22:08:19.330765 |      1 |


I have updated the address_id = 6 and activebool = false of customer 1. Remember the old value. Now, let's ask Cyan Audit to tell us the old value of these two columns of customer 1.

All the logged details are stored in the tbl_audit_event table.

Image title

As you can clearly see, the columns are there to store the logged data: audit_filed has the ID of the column in cyanaudit, pk_vals has the primary key of the record, recorded represents the timestamp at which the operation was performed, uid holds the user ID of Postgres, row_op  holds the type of operation like update, delete or insert, txid hold logged transaction ID — helpful in reverting the transaction, audit_transaction_type — old_value is old value of the record, and new_value is new value of the record.

To get our logged data, Cyan Audit provides us with a view built in view vw_audit_log. We need to pass the view to get log data by passing the table name and column name.

stackdb=# select txid, pk_vals, recorded, op,old_value,new_value from vw_audit_log where table_name = 'public.customer' and pk_vals = '{1}';
+------+---------+----------------------------+----+-------------------------+----------------------------+
| txid | pk_vals |          recorded          | op |        old_value        |         new_value          |
+------+---------+----------------------------+----+-------------------------+----------------------------+
| 2875 | {1}     | 2016-09-14 22:08:19.333788 | U  | true                    | false                      |
| 2875 | {1}     | 2016-09-14 22:08:19.333788 | U  | 5                       | 6                          |
| 2875 | {1}     | 2016-09-14 22:08:19.333788 | U  | 2013-05-26 14:49:45.738 | 2016-09-14 22:08:19.330765 |
+------+---------+----------------------------+----+-------------------------+----------------------------+


Now the logs tell us clearly when the update query was performed, what the old value was, and what the new value is.

Now, what if we want to revert this transaction? Not a problem. Cyan Audit has a built-in function, fn_undo_transaction() , to do this. What we need to do is pass the transaction ID of the log to this function, and the record will be reverted back in the customer table. Let's try:

stackdb=# select fn_undo_transaction('2875');
+----------------------------------------------------------------------------------------------------------------------------------+
|                                                       fn_undo_transaction                                                        |
+----------------------------------------------------------------------------------------------------------------------------------+
| UPDATE public.customer SET last_update = '2013-05-26 14:49:45.738',activebool = 'true',address_id = '5' WHERE customer_id = '1'; |
+----------------------------------------------------------------------------------------------------------------------------------+
(1 row)
stackdb=# Select * from customer order by customer_id limit 5;
+-------------+----------+------------+-----------+-------------------------------------+------------+------------+-------------+----------------------------+--------+
| customer_id | store_id | first_name | last_name |                email                | address_id | activebool | create_date |        last_update         | active |
+-------------+----------+------------+-----------+-------------------------------------+------------+------------+-------------+----------------------------+--------+
|           1 |        1 | Mary       | Smith     | mary.smith@sakilacustomer.org       |          5 | t          | 2006-02-14  | 2016-09-14 22:31:09.315708 |      1 |


Now if we explore our vw_audit_log, we will see it also records the undo transaction log, too:

stackdb=# select * from vw_audit_log where table_name = 'public.customer' and pk_vals = '{1}';
+----------------------------+-----+------------+------+------------------+-----------------+-------------+---------+----+----------------------------+----------------------------+
|          recorded          | uid | user_email | txid |   description    |   table_name    | column_name | pk_vals | op |         old_value          |         new_value          |
+----------------------------+-----+------------+------+------------------+-----------------+-------------+---------+----+----------------------------+----------------------------+
| 2016-09-14 22:31:09.318275 |   0 |            | 2877 | Undo transaction | public.customer | activebool  | {1}     | U  | false                      | true                       |
| 2016-09-14 22:31:09.318275 |   0 |            | 2877 | Undo transaction | public.customer | address_id  | {1}     | U  | 6                          | 5                          |
| 2016-09-14 22:31:09.318275 |   0 |            | 2877 | Undo transaction | public.customer | last_update | {1}     | U  | 2016-09-14 22:08:19.330765 | 2016-09-14 22:31:09.315708 |
| 2016-09-14 22:08:19.333788 |   0 |            | 2875 |                  | public.customer | activebool  | {1}     | U  | true                       | false                      |
| 2016-09-14 22:08:19.333788 |   0 |            | 2875 |                  | public.customer | address_id  | {1}     | U  | 5                          | 6                          |
| 2016-09-14 22:08:19.333788 |   0 |            | 2875 |                  | public.customer | last_update | {1}     | U  | 2013-05-26 14:49:45.738    | 2016-09-14 22:08:19.330765 |
+----------------------------+-----+------------+------+------------------+-----------------+-------------+---------+----+----------------------------+----------------------------+


It also records the Delete and Insert operations on the customer table.

Set Logging ON and OFF on Specific Columns

By default, logging is TRUE for all current database tables and their columns, but it can be set to FALSE to disable logging on the table and column levels.

You will notice that there is a table under the cyanaudit schema in our database named tb_audit_field, which contains the table and column name on which logging  is set on or off.

Let's explore this table, too:

stackdb=# select * from tb_audit_field where table_name = 'customer';
+-------------+--------------+------------+-------------+---------+----------+
| audit_field | table_schema | table_name | column_name | enabled | loggable |
+-------------+--------------+------------+-------------+---------+----------+
|           2 | public       | customer   | store_id    | t       | t        |
|           9 | public       | customer   | email       | t       | t        |
|          18 | public       | customer   | last_update | t       | t        |
|          25 | public       | customer   | customer_id | t       | t        |
|          45 | public       | customer   | first_name  | t       | t        |
|          49 | public       | customer   | activebool  | t       | t        |
|          54 | public       | customer   | active      | t       | t        |
|          62 | public       | customer   | last_name   | t       | t        |
|          66 | public       | customer   | address_id  | t       | t        |
|          68 | public       | customer   | create_date | t       | t        |
+-------------+--------------+------------+-------------+---------+----------+
(10 rows)

Now I will turn off the logging for the create_date column of the customer table. From now on, any operation on this column will not be logged by Cyan Audit. 

stackdb=# update tb_audit_field set enabled = false where audit_field = 68;
UPDATE 1
stackdb=# select * from tb_audit_field where table_name = 'customer';
+-------------+--------------+------------+-------------+---------+----------+
| audit_field | table_schema | table_name | column_name | enabled | loggable |
+-------------+--------------+------------+-------------+---------+----------+
|           2 | public       | customer   | store_id    | t       | t        |
|           9 | public       | customer   | email       | t       | t        |
|          18 | public       | customer   | last_update | t       | t        |
|          25 | public       | customer   | customer_id | t       | t        |
|          45 | public       | customer   | first_name  | t       | t        |
|          49 | public       | customer   | activebool  | t       | t        |
|          54 | public       | customer   | active      | t       | t        |
|          62 | public       | customer   | last_name   | t       | t        |
|          66 | public       | customer   | address_id  | t       | t        |
|          68 | public       | customer   | create_date | f       | t        |
+-------------+--------------+------------+-------------+---------+----------+
(10 rows)


This way, we can turn off the non-critical columns or tables in our database.

I also want to talk about querying this log from your web application or via exporting to some human readable form. And most of this logging is used for reporting purposes. Most of the languages or frameworks today provide facilities to connect with multiple databases or multiple connections within the same database. In our case, it's one database:

doctrine:
    dbal:
        default_connection: default
        connections:
            default:
                driver:   pdo_pgsql
                host:     '%database_host%'
                port:     '%database_port%'
                dbname:   '%database_name%'
                user:     '%database_user%'
                password: '%database_password%'
                charset:  UTF8
                options:
                   search_path: public
            cyanaudit:
                driver:   pdo_pgsql
                host:     '%database_host2%'
                port:     '%database_port2%'
                dbname:   '%database_name2%'
                user:     '%database_user2%'
                password: '%database_password2%'
                charset:  UTF8
                options:
                   search_path: cyanaudit

    orm:
        default_entity_manager: default
        entity_managers:
            default:
                connection: default
            cyanaudit:
                connection: cyanaudit


Or in Django:


DATABASES = {

    'default': {
            'ENGINE': 'django.db.backends.postgresql_psycopg2',
            'OPTIONS': {
                'options': '-c search_path=public'
            },
            'NAME': 'multi_schema_db',
            'USER': 'appuser',
            'PASSWORD': 'secret',
    },

    'cyan_audit': {
            'ENGINE': 'django.db.backends.postgresql_psycopg2',
            'OPTIONS': {
                'options': '-c search_path=cyan_audit'
            },
            'NAME': 'multi_schema_db',
            'USER': 'appuser',
            'PASSWORD': 'secret',
    },
}


Conclusion

In this DevOps world, developers are increasingly being asked (or sometimes pushed) to manage the database as well. So, with these small, free, open-source Postgres extensions, the chances of survival are high.

Let me know your thoughts in the comments section. I would love to hear your feedback!

It’s easier than you think to extend DevOps practices to SQL Server with Redgate tools. Discover how to introduce true Database DevOps, brought to you in partnership with Redgate

Topics:
postgresql 9.5 ,postgresql ,audit log ,sql databases

Opinions expressed by DZone contributors are their own.

THE DZONE NEWSLETTER

Dev Resources & Solutions Straight to Your Inbox

Thanks for subscribing!

Awesome! Check your inbox to verify your email so you can start receiving the latest in tech news and resources.

X

{{ parent.title || parent.header.title}}

{{ parent.tldr }}

{{ parent.urlSource.name }}