Using MariaDB MaxScale 2.1 Regex Filter for Migrations
Application code, even simple SQL, tends to have one or two database-specific constructs in them. That is what this blog is about.
Migrating applications from one database system to another is sometimes easy and sometimes not. But it is hardly ever effortless. Among the obvious issues are schema and data, migrating from one datatype to another, with slightly different behavior and semantics is one thing. Another is migrating the actual data; is it UTF8, and if so, how many bytes? What is the collation? What is the required accuracy of numeric types?
And on top of this are things such as triggers, stored procedures, and such — not to mention performance tuning and the optimal way to construct SQL statements.
Speaking of SQL statements, we have application code, as well. Yes, most databases have some kind of application running on them, often more than one, and these access the database using SQL over some kind of APIs such as JDBC, ODBC, or some proprietary driver. And application code, even simple SQL, tends to have one or two database-specific constructs in them. That is what this blog is about.
Before moving on to that though, a few words on MariaDB Server 10.2.6 which has been GA since May 23. MariaDB Server 10.2 does contain more than a few things that make migration from other database systems to MariaDB a lot easier. Among these features are:
CHECK
constraints. The syntax for these has been supported before, but in MariaDB Server 10.2 these are actually implementing proper constraints.DEFAULT
values. In MariaDB Server before version 10.2, these were several restrictions around whatDEFAULT
values could be used, and how, but in 10.2 these are lifted.- Multiple triggers per event. In MariaDB Server before 10.2 you could only have one trigger per DML event, i.e. several
BEFORE INSERT
triggers. This has two advantages, one is the obvious one that the database you are migrating from might be supporting multiple triggers per event. Another is that sometimes you want to add a trigger to implement some compatibility with the database system you are migrating from, and this feature makes doing this a lot easier.
With that said, let’s say you have migrated the schema and the procedures and what have you not, and also the data. Then you have replaced that ODBC driver from the existing database system with one from MariaDB, which means we are all set to try the application. And the application falls over on the first SQL statement because it uses some proprietary feature of the database we are migrating from. There are some ways of getting around that, with MariaDB there are two ways that have been used in the past:
- Use MariaDB compatibility features. As stated above, there are many new compatibility features in MariaDB Server 10.2.6 GA. In addition there are some settings for the
SQL_MODE
parameter for compatibility, such as thePIPES_AS_CONCAT
that ensures that theANSI
SQL concatenation operator, two pipes (||
), is interpreted as a MariaDBCONCAT
. - Develop procedures, functions and user defined functions that mimic procedures and functions in other database systems.
There is nothing wrong with the above means of migration, but they don’t cover all aspect of a migration. One more tool that is available now is the new MariaDB MaxScale 2.1.3 GA and there is a plugin that is particularly useful, the Regex one. What this allows us to do is to replace text in the SQL statement so that it matches something that MariaDB Server can work with, and a good example is the Oracle DECODE() function. This function is rather special, in a few ways:
- It takes a variable number of arguments, from 3 and up.
- The type of the return value depends on the type of the arguments.
The SQL Standard construct for this is the CASE
statement, which has the same attributes as above. We cannot solve the use of the DECODE
function by adding a STORED FUNCTION
. A UDF (User Defined Function) is possible as this can take any number and type of arguments. Also even though a UDF can only return a predefined type, this is not a big issue as MariaDB is loosely typed, so we can always, for numeric results, return a numeric string. A bigger issue though is that MariaDB already has a DECODE
function that does something else.
Also, we would really like to use the CASE
function and a way to deal with that is to use the MariaDB MaxScale Regex filter. Let me show you how. To begin with, we need to set up the Regex filter itself, and the way I do it here, I will use multiple filters, one for each of the number of arguments I pass to DECODE
. I guess there is some way of doing this in a smarter way, but here I am just showing the principle. Also note that the Regex filter use the PCRE2 regular expressions, not the Posix one. Let’s start with a couple of filter specification for a DECODE
with 3 and 4 arguments and define them in our MariaDB MaxScale configuration file:
[DecodeFilter3]
type=filter
module=regexfilter
options=ignorecase
match=DECODE\(([^,)]*),([^,)]*),([^,)]*)\)
replace=CASE $1 WHEN $2 THEN $3 END
[DecodeFilter4]
type=filter
module=regexfilter
options=ignorecase
match=DECODE\(([^,)]*),([^,)]*),([^,)]*),([^,)]*)\)
replace=CASE $1 WHEN $2 THEN $3 ELSE $4 END
As anyone can see, the above really isn’t perfect, things like strings with embedded commas and what have you not will not work, but in the general case, this should work reasonably well, which is not to say you would want to use this in production, but for a test or a proof-of-concept this is good enough. For DECODE
with 5, 6, or more arguments, you add these following the pattern above.
Before we show this in action, let me add one more useful filter for the Oracle SYSDATE
psedocolumn. In Oracle SQL, SYSDATE
is the same as NOW()
in MariaDB, so this is a simple replacement, but as SYSDATE
is a pseudo-column and not a function, like NOW()
, we cannot write a simple Stored Function to handle it, but using a MaraiaDB MaxScale filter should do the trick, like this:
[sysdate]
type=filter
module=regexfilter
options=ignorecase
match=([^[:alpha:]])SYSDATE
replace=$1NOW()
With this, it is now time to enable these filters, and that is done by adding them to the Service in MariaDB MaxScale which we will use:
[Read-Write Service]
type=service
router=readwritesplit
servers=srv1
user=rwuser
passwd=rwpwd
max_slave_connections=100%
filters=DecodeFilter3|DecodeFilter4|sysdate
Assuming you have your MariaDB MaxScale correctly configured in any other place, let’s see if this works as expected. First, we have to restart MariaDB MaxScale and then when we connect to MariaDB and do a call to DECODE
the way it looks like in Oracle and see what is returned:
As can be seen, the translation from DECODE
to a CASE
statement seems to work as expected. Let’s also try with SYSDATE
:
MariaDB> SELECT DECODE('Today', 'Today', SYSDATE, 'Some other day') FROM dual;
+-------------------------------------------------------------------+
| CASE 'Today' WHEN 'Today' THEN NOW() ELSE 'Some other day' END |
+-------------------------------------------------------------------+
| 2017-05-19 18:51:22 |
+-------------------------------------------------------------------+
As we see here, not only does SYSDATE
work as expected, we can handle both DECODE
andSYSDATE
conversions as the filters are piped to each other. Using MariaDB MaxScale with the Regex filter is yet another tool for migrating applications.
Comments