Over a million developers have joined DZone.
{{announcement.body}}
{{announcement.title}}

Testing N1QL: SQL for JSON (Part 5)

DZone's Guide to

Testing N1QL: SQL for JSON (Part 5)

Now that we've learned how to use the MySQL Random Query Generator, let's use it to debug SQL and N1QL to find out where errors come from and how to fix them.

· Database Zone
Free Resource

Learn how to move from MongoDB to Couchbase Server for consistent high performance in distributed environments at any scale.

Welcome back to Testing N1QL! If you're just getting in, check out part onepart twopart three, and part four. Now that you're familiar with MySQL's RQG, let's take a look at the execution environment.

Our test case assumes that we have MySQL and the Python connector for MySQL installed on the machine where we want to run our test framework: testrunner. Links to our GitHub repository of testrunner are provided below. The test case can run for single or multiple tables in MySQL. Since it is hard to store expected results for millions of queries, we want to compute it on the fly. This requires the installation of MySQL.

The test creates the database in MySQL as specified by the database schema. MySQL tables get created and data is populated. Various data types considered here are Datetime, String, Numeric, Boolean.

We load the MySQL database created into Couchbase buckets. Each MySQL table is loaded into a Couchbase bucket. Each row in the MySQL table is converted into JSON docs in the bucket. The primary key value of a MySQL table, which is unique, defines the unique key of each Couchbase document.

Query template files are loaded and unzipped. Then we build the primary index on Couchbase buckets and MySQL tables. A few tests build different secondary indexes on Couchbase buckets by taking different possible combinations of fields. The tests using secondary indexes verify for both views and GSI indexes.

We vary the number of queries  to be generated for our testing purposes as below:

Sanity Runs

Number of Queries

Run Frequency

Dev Sanity

1000

Upon introduction of a new feature or optimization of an existing feature by Dev.

Daily QE Sanity

10000

Every 24 hours with a new development build by QE.

BiWeekly Full RQG

550K

Every 2 weeks.


Though there is no limitation, we validate up to 550K queries.

                    Figure3: Visualization of Framework

Image title

The figure above gives us an idea of how we compare actual JSON results generated by N1QL queries with expected results from SQL queries after inserting the exact same data in both a MySQL database and Couchbase buckets — then running the same queries respectively.

Once we have the respective queries, we run SQL queries against MySQL and generate the expected results. Then, we run N1QL queries against Couchbase and generate the actual results. Now, in order to compare results, we need to normalize the data, since MySQL generates resultsets in rows/columns while N1QL generates JSON.

What Is the Meaning of Normalization Here?

We add an order by clause in the respective SQL and N1QL query so that results are ordered based on the key, and then we compare them.

  • sql_query becomes sql_query order by <primary_key_id>

  • n1ql_query becomes n1ql_query order by <document key>

First, we compare the lengths of actual and expected results. Then we convert SQL results to JSON and compare them with N1QL JSON results.

For tests with multiple indexes, we specify N1QL queries without index hints and see if the correct index is being used in explain clause.

The difference in results is displayed very clearly for debugging purposes as shown below.

There is an AssertionError giving an analysis of pass percentage, total queries run, and distribution of failure.

Example of results from a failed run:

AssertionError:  Total Queries Run = 10000, Pass = 9997, Fail = 3, Pass Percentage = 99 %

[ KEYWORD FAILURE DISTRIBUTION ]

AND :: 0%                                 

BETWEEN :: 0%

FALSE :: 0%

DISTINCT :: 0%

IS :: 0%

IN :: 0%

 >=  :: 0%

 =  :: 0%

KEY :: 0%

 <  :: 0%

NOT :: 1%

 >  :: 0%

NULL :: 0%

TRUE :: 0%

OR :: 0%

 <=  :: 0%


In this example, 0% mentioned with keywords in [ KEYWORD FAILURE DISTRIBUTION ] indicates that N1QL queries with these keywords gave correct results while those with the NOT keyword gave in 1% of results different results from SQL queries. This is how we identify the pattern where failures occurred.

The test also displays the exact SQL and N1QL query, which gave different results in the end.

Example:

sql_query

n1ql_query

SELECT primary_key_id,
      varchar_field1
FROM   simple_table
WHERE  (( ( decimal_field1 <= 4937 )
         AND ( int_field1 < 4947 )
         AND varchar_field1 <= "zrkkqjrlpo"
         AND bool_field1 = true ))
      AND (( decimal_field1 != 4937
              OR primary_key_id NOT BETWEEN "1" AND "999" ))


SELECT primary_key_id,
      varchar_field1
FROM   simple_table_db_6815_simple_table
WHERE  (( ( decimal_field1 <= 4937 )
         AND ( int_field1 < 4947 )
         AND varchar_field1 <= "zrkkqjrlpo"
         AND bool_field1 = true ))
      AND (( decimal_field1 != 4937
              OR primary_key_id NOT BETWEEN "1" AND "999" ))


Example of Results From a Successful Run

2017-01-16 19:23:40 | INFO | MainProcess | test_thread | [test_rqg._test_result_analysis]  Total Queries Run = 10000, Pass = 10000, Fail = 0, Pass Pecentage = 100 %

2017-01-16 19:23:40 | INFO | MainProcess | test_thread | [test_rqg.test_rqg_crud_ops]

ok

Ran 1 test in 16045.208s

OK

MySQLCursor: DROP SCHEMA IF EXISTS crud_db_..

summary so far suite rqg.test_rqg.RQGTests , pass 1 , fail 0


This helps the QE and developers to analyze the differences and figure out the reason for incorrect results in an efficient way.

We caught a lot of bugs related to incorrect results in N1QL last year using our rqg testrunner framework.

The following table shows some of the bugs caught by this RQG framework:

Issue #

Description

Comments

MB-15877

With secondary indexing Having clause with group by returns inconsistent result as compared to MySQL


On comparing the explain it was seen that the spans array was different in both cases.



MB-22001

Difference in the number of results seen for few aggregate queries


This error was seen as part of intersect scan optimizations being made in N1QL.

MB-19456

Merge into and delete queries reporting wrong results


This error was introduced as a result of a race condition in merge queries.

MB-19171

Mismatch in Queries seen for count(*) and min functions



MB-16291


Secondary indexes without index hints not giving correct results for sum,average,count


Problem / Solution:

If an index scan produces multiple spans that might overlap, the results of the index scan must be de-duped.

An additional optimization would be to eliminate duplicate spans.



Details on an Example bug: MB-19171

This bug shows how QE daily RQG run caught the regression which was introduced as part of our N1QL optimizations.

The following queries gave different results, and the mismatch in results shows the expected results from a particular SQL query and actual results from the same n1ql query.

sql_query :: SELECT MIN( int_field1 ) FROM simple_table_1 WHERE char_field1 >= "N"

n1ql_query :: SELECT MIN( int_field1 ) FROM simple_table_1 WHERE char_field1 >= "N"

mismatch in results :: expected :: [{u'MIN( int_field1 )': 34}], actual :: []

sql_query :: SELECT COUNT(*) FROM simple_table_1 WHERE (int_field1 > 5042) AND (decimal_field1 > 5201)

n1ql_query :: SELECT COUNT(*) FROM simple_table_1 WHERE (int_field1 > 5042) AND (decimal_field1 > 5201)


mismatch in results :: expected :: [236], actual :: [500]

sql_query :: SELECT MIN( int_field1 ) FROM simple_table_1 WHERE varchar_field1 > "aYTUOeXhmw"

n1ql_query :: SELECT MIN( int_field1 ) FROM simple_table_1 WHERE varchar_field1 > "aYTUOeXhmw"


mismatch in results :: expected :: [{u'MIN( int_field1 )': 69}], actual :: []

sql_query :: SELECT MIN( int_field1 ) FROM simple_table_1 WHERE NOT (varchar_field1 < "aYTUOeXhmw")

n1ql_query :: SELECT MIN( int_field1 ) FROM simple_table_1 WHERE NOT (varchar_field1 < "aYTUOeXhmw")


mismatch in results :: expected :: [{u'MIN( int_field1 )': 69}], actual :: []


Dev could identify the patterns in the queries where we saw the mismatch and fix the issue quickly.

Advantages of This Framework

The main advantage of this framework is to catch regressions in N1QL, since the random field combinations are not picked by functional or unit tests. When developers change N1QL code, they can be at ease as most of the issues will be immediately caught by RQG.

Installation of RQG

If you want to install RQG on your system,here are the instructions:

  1. Install Perl: Make sure it is Perl 5, version 18, subversion 2 (v5.18.2) built for darwin-thread-multi-2level (with two registered patches, see perl -V for more detail)

  2. Install MySQL and then link MySQL

  3. Install DBI:

    1. Install cpanm with this command: sudo cpan App::cpanminus.

    2. Install DBI: sudo cpanm DBI.

  4. Install DBD::MySQL:

    1. Get the necessary files from CPAN. Open the CPAN terminal:

    2. sudo perl -MCPAN -e 'shell'.

    3. Then, run these commands: get DBD::MySQL and exit.

    4. Before compiling you have to create some alias because MySQL

    5. It's installed differently on a Mac than on Linux.

  5. cd /usr/local
    sudo mkdir lib

  6. cd lib: sudo ln -s /usr/local/MySQL/lib/*.dylib

  7. Finally, you must install and compile the library:

    1.  cd ~/.cpan/build/DBD*/ 

    2.  sudo perl Makefile.PL --testuser='yourMySQLuser' --testpassword='
      yourMySQLpassword'
       

    3.  sudo make 
    4.  sudo make test 

    5.  sudo make install 

Pre-Requisites for RQG

MySQL, Python MySQL connector (Python lib that talks to MySQL).

Relevant GitHub Links

RQG for MySQL:

N1QL Documentation:

If you have any feedback or questions on this article, please email me at prerna.manaktala@couchbase.com

Acknowledgements:

  • Raju Suravarjjala <raju@couchbase.com>

  • Parag Agarwal  <>

  • Keshav Murthy <keshav@couchbase.com>

Want to deliver a whole new level of customer experience? Learn how to make your move from MongoDB to Couchbase Server.

Topics:
database ,n1ql ,mysql ,database debugging ,tutorial

Opinions expressed by DZone contributors are their own.

THE DZONE NEWSLETTER

Dev Resources & Solutions Straight to Your Inbox

Thanks for subscribing!

Awesome! Check your inbox to verify your email so you can start receiving the latest in tech news and resources.

X

{{ parent.title || parent.header.title}}

{{ parent.tldr }}

{{ parent.urlSource.name }}