Over a million developers have joined DZone.
{{announcement.body}}
{{announcement.title}}

Schema Sharding with MariaDB MaxScale 2.1

DZone's Guide to

Schema Sharding with MariaDB MaxScale 2.1

Learn how you can mix SchemaRouter and ReadWriteSplit to have a robust solution for sharding databases among a number of sharding servers without changing app code.

· Database Zone
Free Resource

Learn how to create flexible schemas in a relational database using SQL for JSON.

Most of the time, when you start a database design, you don’t imagine how your applications need to scale. Sometimes, you need to shard your databases among some different hosts and then on each shard, you want to split reads and writes between master and slaves. This blog is about MariaDB MaxScale being able to handle different databases across shards, and splitting up reads and writes into each of the shards to achieve the maximum level of horizontal scalability with MariaDB Server.

After reading this blog you will be able to:

  • Know how MariaDB MaxScale handles shards with the SchemaRouter router.
  • Know how MariaDB MaxScale handles the split of reads and writes with the ReadWriteSplit router.
  • Know how we can combine both mentioned router in order to have a scalable environment.

Introduction and Scenario

I have worked with customers and their environments before when it was possible to see customers just duplicating the whole database structure in order to make their systems work. Instead of having a unique database and clients with an ID and their data, some companies have systems in which, for each of their clients, they just add a new database with empty tables and start the new client on the system; tables are all the same.

This way, you can imagine that after some time, database servers will have thousands of databases schemas with the same structure to attend different clients. As well, after some time, the capacity problem can arise and a known resolution for this problem is to start sharding the database schemas on more servers.

So each database server is now a source of databases from a subset of clients and so forth. This is very interesting because the main benefit of using shards is to mitigate the pressure on just one server and spread the world on more servers, while client A’s database is located on shard_a and client B’s database is located on shard_b (for sure, I’m considering an environment with more than 600 databases per server and as said before, I saw customers using up to six to eight shards, sharding clients databases this way). This is not just the number of databases or schemas but also the number of tables each of these schema has (I’m talking about when querying I_S becomes a complete nightmare and can really crash a database server).

Upon this, we know very well that one of the magic things related to scalability in MySQL world is to split reads and writes — at least, it’s a good start when it can be done. It makes the writes just on the master server while all reads go to the slave or slaves in case capacity to just one slave is hit. With that, we can say that each of the shards we’re talking about until here should have at least one slave which will be used for reads, while writes go to the master, as said previously. With this in mind, I present the following scenario which I will be working with on this blog post:

maxscale - 192.168.50.100 (MaxScale Server) 

#: SHARD A
box01 - 192.168.50.11 (Shard_A’s master) 
| 
+——— box02 - 192.168.50.12 (Shard_A’s slave) 

#: SHARD B
box03 - 192.168.50.13 (Shard_B’s master) 
|
+——— box04 -192.168.50.14 (Shard_B’s slave)

Above, you can see, I set up one server to run MaxScale and four others to run two shards having different databases or schemas in it. The nodes box01 through box04 are running MariaDB 10.1.21 (you can also run this on MariaDB 10.2.6, the most recent release of MariaDB Server), set up from MariaDB YUM repository for CentOS 7. You can set up CentOS repository, as well as download MaxScale rpm package. I would like to highlight that just the newer version of MariaDB MaxScale is able to combine modules as we’re going to present here: SchemaRouter and ReadWriteSplit.

The replication between servers on each shard is something that you know you need to set up by yourself. Once servers are all up — I’m talking about database nodes from box01 to box04 — you can create a replication user on box01 and box03 and enter the CHANGE MASTER TO, as shown below:

#: create replication user on shard_a 
sudo mysql -e "grant replication client, replication slave on *.* to repl@'192.168.%' identified by ‘XXXXXX';" 

#: create replication user on shard_b 
sudo mysql -e "grant replication client, replication slave on *.* to repl@'192.168.%' identified by ‘XXXXXX’;" 

#: configure replication for shard_a on box02 
sudo mysql -e "change master to master_host='192.168.50.11',master_user='repl',master_password='XXXXXX',master_use_gtid=slave_pos,master_connect_retry=5;” 

sudo mysql -e "start slave;"

#: configure replication for shard_b on box04
sudo mysql -e "change master to master_host='192.168.50.13',master_user='repl',master_password='XXXXXX',master_use_gtid=slave_pos,master_connect_retry=5;" 

 sudo mysql -e "start slave;" 

MariaDB MaxScale 2.1.3 Setup

Talking about MaxScale, we’re running the version 2.1.3, which supports the combination ofSchemaRouter and ReadWriteSplit  routers. To set up MaxScale on CentOS 7, you can download the rpm package from MariaDB repository and set up the package as below:

#: setup maxscale
sudo rpm -Uvih https://downloads.mariadb.com/MaxScale/2.1.3/rhel/7server/x86_64/maxscale-2.1.3-1.rhel.7.x86_64.rpm

Create the encrypted password using maxkeys and maxpasswd if you want to encrypt the password for the user that access the database servers. See here for help.

#: generate keys
[root@maxscale ~]# maxkeys 
Generating .secrets file in /var/lib/maxscale. 

#: generate the hexadecimal encrypted password to add to the maxscale config file 
[root@maxscale ~]# maxpasswd /var/lib/maxscale/ XXXXXX 
DF5822F1038A154FEB68E667740B1160 

#: change .secrets file ownership to permit maxscale user to access it 
[root@maxscale log]# chown maxscale:maxscale /var/lib/maxscale/.secrets 

After this, add the following configuration file that will combine modules, considering the above servers arrangement, to make SchemaRouter and ReadWriteSplit happen:

#: maxscale configuration file - /etc/maxscale.cnf 
[maxscale] 
threads=16 
skip_permission_checks=true 

# Shard-A 

[Shard-A-Monitor] 
type=monitor
module=mysqlmon 
servers=server1,server2 
user=maxmon 
passwd=DF5822F1038A154FEB68E667740B1160 
monitor_interval=10000 

[Shard-A-Router] 
type=service 
router=readwritesplit 
servers=server1,server2 
user=maxuser 
passwd=DF5822F1038A154FEB68E667740B1160 

[Shard-A-Listener] 
type=listener
service=Shard-A-Router 
protocol=MySQLClient 
port=4006 

[server1] 
type=server 
address=192.168.50.11 
port=3306 
protocol=MySQLBackend

[server2] 
type=server 
address=192.168.50.12
port=3306 
protocol=MySQLBackend

# Shard-B

[Shard-B-Monitor]
type=monitor 
module=mysqlmon 
servers=server3,server4 
user=maxmon 
passwd=DF5822F1038A154FEB68E667740B1160
monitor_interval=10000 

[Shard-B-Router] 
type=service 
router=readwritesplit 
servers=server3,server4 
user=maxuser 
passwd=DF5822F1038A154FEB68E667740B1160 

[Shard-B-Listener] 
type=listener 
service=Shard-B-Router 
protocol=MySQLClient 
port=4007 

[server3] 
type=server 
address=192.168.50.13
port=3306
protocol=MySQLBackend 

[server4]
type=server 
address=192.168.50.14 
port=3306 
protocol=MySQLBackend 

# The two services abstracted as servers 

[Shard-A]
type=server 
address=127.0.0.1
port=4006 
protocol=MySQLBackend 

[Shard-B]
type=server 
address=127.0.0.1
port=4007
protocol=MySQLBackend 

# The sharding service

[Sharding-Service] 
type=service
router=schemarouter 
servers=Shard-A,Shard-B
user=maxuser 
passwd=DF5822F1038A154FEB68E667740B1160 
router_options=refresh_interval=10 

[Sharding-Listener]
type=listener 
service=Sharding-Service 
protocol=MySQLClient 
port=3306

# Command line interface used by Maxadmin 

[CLI]
type=service router=cli 

[CLI-Listener] 
type=listener 
service=CLI 
protocol=maxscaled 
socket=default 

Of course, some other variables and configurations that should be considered as part of the monitor's sections, as well as if you want to setup an automatic failover on a failed event on the backend servers. But this is part of a future blog (for more info about how to mix together MariaDB MaxScale and Replication Manager for automatic failover, click here). 

Additionally, continuing with the configurations, make sure you create the user for MaxScale to be able to access database servers when it is started up. The below users are created on both masters, box01 and box03, to make the users to exists on those two shards:

sudo mysql -e "grant all on *.* to maxmon@'%' identified by 'XXXXXX’;" 
sudo mysql -e "grant all on *.* to maxuser@'%' identified by 'XXXXXX’;" 

sudo mysql -e "grant all on *.* to maxmon@'127.0.0.1' identified by 'XXXXXX’;" 
sudo mysql -e "grant all on *.* to maxuser@'127.0.0.1' identified by 'XXXXXX’;"

sudo mysql -e "grant all on *.* to maxmon@'192.168.50.100' identified by 'XXXXXX’;" 
sudo mysql -e "grant all on *.* to maxuser@'192.168.50.100' identified by 'XXXXXX’;" 

With this, you can start MaxScale:

[root@maxscale log]# systemctl start maxscale 
... 
MariaDB MaxScale /var/log/maxscale/maxscale.log Mon Mar 6 14:04:04 2017 
---------------------------------------------------------------------------- 
2017-03-06 14:04:04 notice : Working directory: /var/log/maxscale 
2017-03-06 14:04:04 notice : MariaDB MaxScale 2.1.3 started 
2017-03-06 14:04:04 notice : MaxScale is running in process 4203 
2017-03-06 14:04:04 notice : Configuration file: /etc/maxscale.cnf 
2017-03-06 14:04:04 notice : Log directory: /var/log/maxscale 
2017-03-06 14:04:04 notice : Data directory: /var/lib/maxscale 
2017-03-06 14:04:04 notice : Module directory: /usr/lib64/maxscale 
2017-03-06 14:04:04 notice : Service cache: /var/cache/maxscale 
2017-03-06 14:04:04 notice : Loading /etc/maxscale.cnf. 
2017-03-06 14:04:04 notice : /etc/maxscale.cnf.d does not exist, not reading. 
2017-03-06 14:04:04 notice : [cli] Initialise CLI router module 
2017-03-06 14:04:04 notice : Loaded module cli: V1.0.0 from /usr/lib64/maxscale/libcli.so 
2017-03-06 14:04:04 notice : [schemarouter] Initializing Schema Sharding Router. 
2017-03-06 14:04:04 notice : Loaded module schemarouter: V1.0.0 from /usr/lib64/maxscale/libschemarouter.so 
2017-03-06 14:04:04 notice : [readwritesplit] Initializing statement-based read/write split router module. 
2017-03-06 14:04:04 notice : Loaded module readwritesplit: V1.1.0 from /usr/lib64/maxscale/libreadwritesplit.so 
2017-03-06 14:04:04 notice : [mysqlmon] Initialise the MySQL Monitor module. 
2017-03-06 14:04:04 notice : Loaded module mysqlmon: V1.5.0 from /usr/lib64/maxscale/libmysqlmon.so 
2017-03-06 14:04:04 notice : Loaded module MySQLBackend: V2.0.0 from /usr/lib64/maxscale/libMySQLBackend.so 
2017-03-06 14:04:04 notice : Loaded module MySQLBackendAuth: V1.0.0 from /usr/lib64/maxscale/libMySQLBackendAuth.so 
2017-03-06 14:04:04 notice : Loaded module maxscaled: V2.0.0 from /usr/lib64/maxscale/libmaxscaled.so 
2017-03-06 14:04:04 notice : Loaded module MaxAdminAuth: V2.1.3 from /usr/lib64/maxscale/libMaxAdminAuth.so 
2017-03-06 14:04:04 notice : Loaded module MySQLClient: V1.1.0 from /usr/lib64/maxscale/libMySQLClient.so 
2017-03-06 14:04:04 notice : Loaded module MySQLAuth: V1.1.0 from /usr/lib64/maxscale/libMySQLAuth.so 
2017-03-06 14:04:04 notice : No query classifier specified, using default 'qc_sqlite'.
2017-03-06 14:04:04 notice : Loaded module qc_sqlite: V1.0.0 from /usr/lib64/maxscale/libqc_sqlite.so 
2017-03-06 14:04:04 notice : Using encrypted passwords. Encryption key: '/var/lib/maxscale/.secrets'. 
2017-03-06 14:04:04 notice : [MySQLAuth] [Shard-A-Router] Loaded 4 MySQL users for listener Shard-A-Listener. 
2017-03-06 14:04:04 notice : Listening connections at 0.0.0.0:4006 with protocol MySQL 
2017-03-06 14:04:04 notice : [MySQLAuth] [Shard-B-Router] Loaded 4 MySQL users for listener Shard-B-Listener. 
2017-03-06 14:04:04 notice : Listening connections at 0.0.0.0:4007 with protocol MySQL 
2017-03-06 14:04:04 notice : [schemarouter] Authentication data is fetched from all servers. To disable this add 'auth_all_servers=0' to the service. 
2017-03-06 14:04:04 notice : Server changed state: server3[192.168.50.13:3306]: new_master. [Running] -> [Master, Running] 
2017-03-06 14:04:04 notice : Server changed state: server4[192.168.50.14:3306]: new_slave. [Running] -> [Slave, Running] 
2017-03-06 14:04:04 notice : [mysqlmon] A Master Server is now available: 192.168.50.13:3306 
2017-03-06 14:04:04 notice : Server changed state: server1[192.168.50.11:3306]: new_master. [Running] -> [Master, Running] 
2017-03-06 14:04:04 notice : Server changed state: server2[192.168.50.12:3306]: new_slave. [Running] -> [Slave, Running] 
2017-03-06 14:04:04 notice : [mysqlmon] A Master Server is now available: 192.168.50.11:3306 
2017-03-06 14:04:10 notice : Listening connections at 0.0.0.0:3306 with protocol MySQL 
2017-03-06 14:04:10 notice : Listening connections at /tmp/maxadmin.sock with protocol MaxScale Admin
2017-03-06 14:04:10 notice : MaxScale started with 1 server threads. 
2017-03-06 14:04:10 notice : Started MaxScale log flusher. 

An additional user needs to be created to be used by the application, which will connect to MaxScale and MaxScale will do the router based on modules. I suggest you create a user like the one below (if the user of your app does more than the privileges below, add the privileges you need or even, create a ROLE and give the ROLE privilege to the user:

#: create the user for the app to connect to db nodes through MaxScale 
sudo mysql -e "grant all on *.* to appuser@'%' identified by 'XXXXXX’;" 
sudo mysql -e "grant all on *.* to appuser@'127.0.0.1' identified by 'XXXXXX’;" 
sudo mysql -e "grant all on *.* to appuser@'192.168.50.100' identified by 'XXXXXX’;" 

At this point, we can use MaxAdmin on MaxScale server to exhibit the configuration we have until now:

#: configured services 
[root@maxscale ~]# maxadmin list services
Services. 
--------------------------+-------------------+--------+----------------+------------------- 
Service Name  | Router Module  | #Users | Total Sessions | Backend databases 
--------------------------+-------------------+--------+----------------+------------------- 
Shard-A-Router       | readwritesplit   | 1   | 1 | server1, server2 
Shard-B-Router   | readwritesplit   | 1     | 1| server3, server4 
Sharding-Service       | schemarouter   | 1     | 1| Shard-A, Shard-B 
CLI  | cli  | 2   | 2| 
--------------------------+-------------------+--------+----------------+------------------- 

#: database nodes and their status on shards 
[root@maxscale ~]# maxadmin list servers 
Servers. 
-------------------+-----------------+-------+-------------+--------------------
Server   | Address | Port | Connections | Status 
-------------------+-----------------+-------+-------------+-------------------- 
server1   | 192.168.50.11  | 3306  | 0   | Master, Running 
server2   | 192.168.50.12   | 3306  | 0    | Slave, Running 
server3   | 192.168.50.13   | 3306  | 0   | Master, Running 
server4   | 192.168.50.14   | 3306  | 0    | Slave, Running 
Shard-A   | 192.168.50.11 | 3306  | 0   | Running 
Shard-B   | 192.168.50.13  | 3306  | 0    | Running 
-------------------+-----------------+-------+-------------+-------------------- 

Maxadmin can tell us the statistics about the modules and the all the work they have done until now, this is just to register that we haven't yet tested the services as it’s going to be done on the next section. Below you can see the statistics of the configured services:

#: Shard-A-Router (readwritesplit) 
[root@maxscale log]# maxadmin show service "Shard-A-Router" 
 Service: Shard-A-Router 
     Router: readwritesplit 
     State: Started 

     use_sql_variables_in: all 
     slave_selection_criteria: LEAST_CURRENT_OPERATIONS
     master_failure_mode: fail_instantly 
     max_slave_replication_lag: -1 
     retry_failed_reads: true 
     strict_multi_stmt: true 
     disable_sescmd_history: true 
     max_sescmd_history: 0 
     master_accept_reads: false 

     Number of router sessions:  1 
     Current no. of router sessions:  1 
     Number of queries forwarded:   2 
     Number of queries forwarded to master:   0 (0.00%) 
     Number of queries forwarded to slave:  2 (100.00%)
     Number of queries forwarded to all:   1 (50.00%)
     Started:   Mon Mar 6 01:37:27 2017 
     Root user access:   Disabled
     Backend databases: 
  192.168.50.11:3306Protocol: MySQLBackendName: server1 
          192.168.50.12:3306Protocol: MySQLBackendName: server2 
     Total connections: 2 
     Currently connected: 1 

#: Shard-B-Router (readwritesplit) 
[root@maxscale log]# maxadmin show service "Shard-B-Router" 
 Service:  Shard-B-Router 
     Router:   readwritesplit
     State:   Started 

     use_sql_variables_in:all 
     slave_selection_criteria:  LEAST_CURRENT_OPERATIONS 
     master_failure_mode: fail_instantly 
     max_slave_replication_lag: -1 
     retry_failed_reads:true 
     strict_multi_stmt:true
     disable_sescmd_history:true 
     max_sescmd_history:0 
     master_accept_reads:false 

     Number of router sessions: 1
     Current no. of router sessions:  1
     Number of queries forwarded:    2 
     Number of queries forwarded to master:   0 (0.00%) 
     Number of queries forwarded to slave:  2 (100.00%) 
     Number of queries forwarded to all:  1 (50.00%) 
     Started: Mon Mar 6 01:37:27 2017 
     Root user access:Disabled 
     Backend databases:
 192.168.50.13:3306Protocol: MySQLBackendName: server3 
         192.168.50.14:3306Protocol: MySQLBackendName: server4 
     Total connections:   2 
     Currently connected:  1 

#: Sharding-Service 
[root@maxscale log]# maxadmin show service "Sharding-Service" 
 Service:   Sharding-Service 
     Router:  schemarouter 
     State:  Started 

Session Commands 
Total number of queries: 0 
Percentage of session commands: 0.00 
Longest chain of stored session commands: 0 
Session command history limit exceeded: 0 times 
Session command history: enabled 
Session command history limit: unlimited 
Shard map cache hits: 0 
Shard map cache misses: 0 

 Started:  Mon Mar 6 01:37:27 2017 
     Root user access:   Disabled 
     Backend databases: 
  192.168.50.11:3306Protocol: MySQLBackendName: Shard-A 
          192.168.50.13:3306Protocol: MySQLBackendName: Shard-B 
     Total connections: 1 
     Currently connected: 1 

Conclusion

The main goal of this first part is to show that we can mix together both routers SchemaRouter and ReadWriteSplit to have a robust solution for sharding databases among a number of sharding servers initially without changing any code on the application side. We show that the whole job can be done with MariaDB MaxScale after configuring it the way I showed in this blog. The second part will focus on showing the mechanics of the routing, while going through the details of how it works.

Create flexible schemas using dynamic columns for semi-structured data. Learn how.

Topics:
database ,tutorial ,schema ,sharding ,mariadb ,maxscale

Published at DZone with permission of Wagner Bianchi, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.

{{ parent.title || parent.header.title}}

{{ parent.tldr }}

{{ parent.urlSource.name }}