Sensitive Data Masking With MariaDB MaxScale

Data redaction obfuscates data, reducing unnecessary exposure of sensitive data while maintaining its usability. Learn how to redact data using a masking filter.



Protecting personal and sensitive data and complying with security and privacy regulations are high priorities for organizations. Such data includes personally identifiable information (PII), protected health information (PHI), payment card information (subject to PCI-DSS regulation), and intellectual property (subject to ITAR and EAR regulations). In many cases, if not most, this data needs to be redacted or masked when it is accessed, whether internally or externally.

Data redaction obfuscates all or part of the data, reducing unnecessary exposure of sensitive data while at the same time maintaining its usability. Various terms such as data masking, data obfuscation, and data anonymization are used to describe this functionality in databases. Data redaction allows an organization to:

  • Meet regulations.
  • Protect against insider threat.
  • Use production data for non-production use cases (for example, testing and training).

MariaDB TX and MariaDB AX are complete solutions for high-performance transactional and analytical workloads, respectively. Both include MariaDB MaxScale, a next-generation database proxy. In addition to load balancing and query routing, MaxScale provides enhanced security features such as encryption of data in flight and masking of sensitive end-user data.

In this blog post, we show you how to redact data using the masking filter in MariaDB MaxScale.

Data Masking

With the masking filter, the value of a particular column returned by a query can be obfuscated. For instance, suppose there is a table patients that, among other columns, contains the column ssn, where the social security number or national identification number of a patient is stored. With the masking filter, it is possible to specify that when the ssn column is queried, a masked value is returned instead of the actual SSN.

For example:

> SELECT name, ssn FROM patients; 

Without masking of the ssn column, the query result would be:

+-------+-------------+
| name  | ssn         |
+-------+-------------+
| Alice | 721-07-4426 |
| Bob   | 435-22-3267 |
...

With masking of the ssn column, the query result would be:

+-------+-------------+
| name  | ssn         |
+-------+-------------+
| Alice | XXXXXXXXXXX |
| Bob   | XXXXXXXXXXX |
...

An example masking filter section in the MaxScale configuration file looks like the following:

[MyMasking]
type=filter
module=masking
warn_type_mismatch=always
large_payload=abort
rules=masking_rules.json

[MyService]
type=service
...
filters=MyMasking

The masking_rules.json file looks like this:

{
   "rules": [
       {
           "replace": {
               "column": "ssn"
           },
           "with": {
               "fill": "X"
           }
       }
   ]
}
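
To make the effect of this rule concrete, here is a minimal Python sketch that applies a rules document of the same shape to a few in-memory rows. It is an illustration only, not how MaxScale is implemented: the helper names `fill_value` and `mask_rows` are hypothetical, the sample rows are the ones from the earlier example, and repeating the fill string to the length of the original value is an assumption based on the masked output shown in this article.

```python
import json

# Same shape as masking_rules.json above.
RULES_JSON = """
{
    "rules": [
        {
            "replace": {"column": "ssn"},
            "with": {"fill": "X"}
        }
    ]
}
"""


def fill_value(value, fill):
    """Repeat `fill` until it covers len(value), then cut to that length.

    Assumption: the masked value keeps the length of the original value.
    """
    repeats = len(value) // len(fill) + 1
    return (fill * repeats)[:len(value)]


def mask_rows(rows, rules):
    """Return copies of `rows` with every column named in a rule masked."""
    # Map each column named in a rule to its fill string.
    fills = {r["replace"]["column"]: r["with"]["fill"] for r in rules["rules"]}
    masked = []
    for row in rows:
        masked.append({
            col: fill_value(val, fills[col])
            if col in fills and isinstance(val, str) else val
            for col, val in row.items()
        })
    return masked


rules = json.loads(RULES_JSON)
rows = [
    {"name": "Alice", "ssn": "721-07-4426"},
    {"name": "Bob", "ssn": "435-22-3267"},
]
masked_rows = mask_rows(rows, rules)
for row in masked_rows:
    print(row["name"], row["ssn"])
```

The original rows are left untouched and only the copies carry the masked values, which mirrors the idea that the data stored in the database is unchanged while the client sees the redacted result.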

Now doing the following query will show masked data for the ssn column:

> select name, ssn from patients;
+---------------+--------------+
| name          | ssn          |
+---------------+--------------+
| John Doe      | XXXXXXXXXXX  |
| Jack Smith    | XXXXXXXXXXX  |
| Jane Richards | XXXXXXXXXXX  |
+---------------+--------------+
> select * from patients;
+------+---------------+-------------+--------+------+
| id   | name          | ssn         | gender | age  |
+------+---------------+-------------+--------+------+
|    1 | John Doe      | XXXXXXXXXXX | M      |   55 |
|    2 | Jack Smith    | XXXXXXXXXXX | M      |   38 |
|    3 | Jane Richards | XXXXXXXXXXX | F      |   48 |
+------+---------------+-------------+--------+------+
> select name, ssn as xyz from patients;
+---------------+--------------+
| name          | xyz          |
+---------------+--------------+
| John Doe      | XXXXXXXXXXX  |
| Jack Smith    | XXXXXXXXXXX  |
| Jane Richards | XXXXXXXXXXX  |
+---------------+--------------+
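
The last query is still masked even though the column is returned under the alias xyz. One plausible explanation, offered here as an assumption that this article does not confirm, is that the filter matches on the original column name rather than on the label the client sees; the MySQL protocol's column definition carries both. The sketch below illustrates that idea with a hypothetical `ColumnDef` type and `should_mask` helper.

```python
from dataclasses import dataclass


@dataclass
class ColumnDef:
    """Hypothetical stand-in for a result-set column definition."""
    name: str      # label the client sees (the alias, if one was given)
    org_name: str  # name of the underlying table column, if any


# Columns that a rule asks to mask (taken from the rule shown earlier).
MASKED_COLUMNS = {"ssn"}


def should_mask(column: ColumnDef) -> bool:
    """Decide on the original column name, so an alias does not hide it."""
    return column.org_name in MASKED_COLUMNS


aliased = ColumnDef(name="xyz", org_name="ssn")   # SELECT ssn AS xyz
plain = ColumnDef(name="name", org_name="name")   # SELECT name
computed = ColumnDef(name="concat(ssn)", org_name="")  # computed expression

print(should_mask(aliased), should_mask(plain), should_mask(computed))
```

Under the same assumption, a computed expression has no original column name to match against, so it would not be masked by a rule that names the column.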

We have additional enhancements planned for the masking filter in the next release (MariaDB MaxScale 2.2), including partial data masking. 

For details on how to set up the rules, please see the MariaDB MaxScale masking filter guide.

If you would like to prevent users from applying functions to columns that are to be masked, since the result of a function may not be recognized as the masked column, you can use the database firewall filter to block such queries. For details on how to configure blacklisting and whitelisting rules with the firewall filter, please see the MariaDB MaxScale Database Firewall filter guide.


Topics:
database, data masking, data security, tutorial, sensitive data

Published at DZone with permission of
