Over a million developers have joined DZone.
{{announcement.body}}
{{announcement.title}}

ProxySQL and Percona XtraDB Cluster (Galera) Integration

DZone's Guide to

ProxySQL and Percona XtraDB Cluster (Galera) Integration

Learn everything you wanted to know about getting ProxySQL playing nicely with Percona's XtraDB. See the integration in fine detail.

· Database Zone
Free Resource

Download the Guide to Open Source Database Selection: MySQL vs. MariaDB and see how the side-by-side comparison of must-have features will ease the journey. Brought to you in partnership with MariaDB.

In this post, we’ll discuss how an integrated ProxySQL and Percona XtraDB Cluster (Galera) helps manage node states and failovers.

ProxySQL is designed to not perform any specialized operation in relation to the servers with which it communicates. Instead, it uses an event scheduler to extend functionalities and cover any special needs.

Given that specialized products like Percona XtraDB Cluster are not managed by ProxySQL, they require the design and implementation of good/efficient extensions.

In this article, I will illustrate how Percona XtraDB Cluster/Galera can be integrated with ProxySQL to get the best from both.

Brief Digression

Before discussing their integration, we need to review a couple of very important concepts in ProxySQL. ProxySQL has a very important logical component: Hostgroup(s) (HG).

A hostgroup is a relation of:

+-----------+       +------------------------+
|Host group +------>|Server (1:N)            |
+-----------+       +------------------------+

In ProxySQL, QueryRules (QR) can be directly mapped to an HG. Using QRs, you can define a specific user to ONLY go to that HG. For instance, you may want to have user app1_user go only on servers A-B-C. Simply set a QR that says app1_user has the destination hostgroup 5, where HG 5 has the servers A-B-C:

INSERT</code> <code class="sql keyword">INTO</code> <code class="sql plain">mysql_servers (hostname,hostgroup_id,port,weight) </code><code class="sql keyword">VALUES</code> <code class="sql plain">(</code><code class="sql string">'192.168.1.5'</code><code class="sql plain">,5,3306,10);
INSERT</code> <code class="sql keyword">INTO</code> <code class="sql plain">mysql_servers (hostname,hostgroup_id,port,weight) </code><code class="sql keyword">VALUES</code> <code class="sql plain">(</code><code class="sql string">'192.168.1.6'</code><code class="sql plain">,5,3306,10);
INSERT</code> <code class="sql keyword">INTO</code> <code class="sql plain">mysql_servers (hostname,hostgroup_id,port,weight) </code><code class="sql keyword">VALUES</code> <code class="sql plain">(</code><code class="sql string">'192.168.1.7'</code><code class="sql plain">,5,3306,10);
INSERT</code> <code class="sql keyword">INTO</code> <code class="sql plain">mysql_query_rules (username,destination_hostgroup,active) </code><code class="sql keyword">values</code><code class="sql plain">(</code><code class="sql string">'app1_user'</code><code class="sql plain">,5,1);

Easy isn’t it?

Another important concept in ProxySQL also related to HG is ReplicationHostgroup(s) (RHG). This is a special HG that ProxySQL uses to automatically manage the nodes that are connected by replication and configured in Write/Read and Read_only mode.

What does this mean? Let’s say you have four nodes A-B-C-D, connected by standard asynchronous replication. A is the master and B-C-D are the slaves. What you want is to have your application pointing writes to server A, and reads to B-C (keeping D as a backup slave). Also, you don’t want to have any reads go to B-C if the replication delay is more than two seconds.

RHG, in conjunction with HG, ProxySQL can manage all this for you. Simply instruct the proxy to:

  1. Use RHG.
  2. Define the value of the maximum latency.

Using the example above:

INSERT</code> <code class="sql keyword">INTO</code> <code class="sql plain">mysql_servers (hostname,hostgroup_id,port,weight,max_replication_lag) </code><code class="sql keyword">VALUES</code> <code class="sql plain">(</code><code class="sql string">'192.168.1.5'</code><code class="sql plain">,5,3306,10,2);
</code><code class="sql keyword">INSERT</code> <code class="sql keyword">INTO</code> <code class="sql plain">mysql_servers (hostname,hostgroup_id,port,weight,max_replication_lag) </code><code class="sql keyword">VALUES</code> <code class="sql plain">(</code><code class="sql string">'192.168.1.6'</code><code class="sql plain">,5,3306,10,2);
INSERT</code> <code class="sql keyword">INTO</code> <code class="sql plain">mysql_servers (hostname,hostgroup_id,port,weight,max_replication_lag) </code><code class="sql keyword">VALUES</code> <code class="sql plain">(</code><code class="sql string">'192.168.1.7'</code><code class="sql plain">,5,3306,10,2);
INSERT</code> <code class="sql keyword">INTO</code> <code class="sql plain">mysql_servers (hostname,hostgroup_id,port,weight,max_replication_lag) </code><code class="sql keyword">VALUES</code> <code class="sql plain">(</code><code class="sql string">'192.168.1.8'</code><code class="sql plain">,10,3306,10,2);
INSERT</code> <code class="sql keyword">INTO</code> <code class="sql plain">mysql_query_rules (username,destination_hostgroup,active) </code><code class="sql keyword">values</code><code class="sql plain">(</code><code class="sql string">'app1_user'</code><code class="sql plain">,5,1);
</code><code class="sql keyword">INSERT</code> <code class="sql keyword">INTO</code> <code class="sql plain">mysql_query_rules (username,destination_hostgroup,active) </code><code class="sql keyword">values</code><code class="sql plain">(</code><code class="sql string">'app1_user'</code><code class="sql plain">,6,1);
INSERT</code> <code class="sql keyword">INTO</code> <code class="sql plain">mysql_replication_hostgroups </code><code class="sql keyword">VALUES</code> <code class="sql plain">(5,6);

From now on ProxySQL will split the R/W using the RHG and the nodes defined in HG 5.

The flexibility introduced by using HGs is obviously not limited to what I mention here. It will play a good part in the integration of Percona XtraDB Cluster and ProxySQL, as I illustrate below.

Percona XtraDB Cluster/Galera Integration

In an XtraDB cluster, a node has many different states and conditions that affect if and how your application operates on the node.

The most common one is when a node become a DONOR. If you’ve ever installed Percona XtraDB Cluster (or any Galera implementation), you’ve faced the situation when a node become a DONOR it changes state to DESYNC. If the node is under a heavy load, the DONOR process might affect the node itself.

But that is just one of the possible node states:

  • A node can be JOINED but not synced
  • It can have wsrep_rejectqueries, wsrep_donorrejectqueries, wsrep_ready (off)
  • It can be in a different segment
  • The number of nodes per segment is relevant.

To show what can be done and how, we will use the following setup:

  • Five nodes
  • Two segments
  • Applications requiring R/W split

And two options:

  • Single writer node
  • Multiple writers node

We’ll analyze how the proxy behaves under the use of a script run by the ProxySQL scheduler.

The use of a script is necessary for ProxySQL to respond correctly to Percona XtraDB Cluster state modifications. ProxySQL comes with two scripts for Galera, both of them are too basic and don’t consider a lot of relevant conditions. I’ve written a more complete script.

This script is a prototype and requires QA and debugging, but is still more powerful than the default ones.

The script is designed to manage X number of nodes that belong to a given HG. The script works by HG, and as such it will perform isolated actions/checks by the HG. It is not possible to have more than one check running on the same HG. The check will create a lock file {proxysql_galera_check_${hg}.pid} that will be used to prevent duplicates. galera_check will connect to the ProxySQL node and retrieve all the information regarding the nodes/proxysql configuration. It will then check in parallel each node and will retrieve the status and configuration. galera_check analyzes and manages the following node states:

  • read_only
  • wsrep_status
  • wsrep_rejectqueries
  • wsrep_donorrejectqueries
  • wsrep_connected
  • wsrep_desinccount
  • wsrep_ready
  • wsrep_provider
  • wsrep_segment
  • Number of nodes in by segment
  • Retry loop

As mentioned, the number of nodes inside a segment is relevant. If a node is the only one in a segment, the check behaves accordingly. For example, if a node is the only one in the MAIN segment, it will not put the node in OFFLINE_SOFT when the node becomes a donor, to prevent the cluster from becoming unavailable for applications.

The script allows you to declare a segment as MAIN — quite useful when managing production and DR sites, as the script manages the segment acting as main in a more conservative way. The check can be configured to perform retries after a given interval, where the interval is the time define in the ProxySQL scheduler. As such, if the check is set to have two retries for UP and three for DOWN, it will loop that number before doing anything.

Percona XtraDB Cluster/Galera performs some actions under the hood, some of them not totally correct. This feature is useful in some uncommon circumstances, where Galera behaves weirdly. For example, whenever a node is set to READ_ONLY=1, Galera desyncs and resyncs the node. A check that doesn’t take this into account sets the node to OFFLINE and back for no reason.

Another important differentiation for this check is that it use special HGs for maintenance, all in the range of 9000. So if a node belongs to HG 10, and the check needs to put it in maintenance mode, the node will be moved to HG 9010. Once all is normal again, the node will be put back on its original HG.

This check does NOT modify any node states. This means it will NOT modify any variables or settings in the original node. It will ONLY change node states in ProxySQL.

Multi-Writer Mode

The recommended way to use Galera is in multi-writer mode. You can then play with the weight to have a node act as MAIN node and prevent/reduce certification failures and Brutal force Abort from Percona XtraDB Cluster. Use this configuration:

Delete from mysql_replication_hostgroups where writer_hostgroup=500 ;
delete from mysql_servers where hostgroup_id in (500,501);
INSERT INTO mysql_servers (hostname,hostgroup_id,port,weight) VALUES ('192.168.1.5',500,3306,1000000000);
INSERT INTO mysql_servers (hostname,hostgroup_id,port,weight) VALUES ('192.168.1.5',501,3306,100);
INSERT INTO mysql_servers (hostname,hostgroup_id,port,weight) VALUES ('192.168.1.6',500,3306,1000000);
INSERT INTO mysql_servers (hostname,hostgroup_id,port,weight) VALUES ('192.168.1.6',501,3306,1000000000);
INSERT INTO mysql_servers (hostname,hostgroup_id,port,weight) VALUES ('192.168.1.7',500,3306,100);
INSERT INTO mysql_servers (hostname,hostgroup_id,port,weight) VALUES ('192.168.1.7',501,3306,1000000000);
INSERT INTO mysql_servers (hostname,hostgroup_id,port,weight) VALUES ('192.168.1.8',500,3306,1);
INSERT INTO mysql_servers (hostname,hostgroup_id,port,weight) VALUES ('192.168.1.8',501,3306,1);
INSERT INTO mysql_servers (hostname,hostgroup_id,port,weight) VALUES ('192.168.1.9',500,3306,1);
INSERT INTO mysql_servers (hostname,hostgroup_id,port,weight) VALUES ('192.168.1.9',501,3306,1);
LOAD MYSQL SERVERS TO RUNTIME; SAVE MYSQL TO DISK;

In this test, we will NOT use Replication HostGroup. We will do that later when testing a single writer. For now, we’ll focus on multi-writer.

Segment 1 covers HG 500 and 501, while segment two only covers 501. Weight for the servers in HG 500 is progressive from 1 to 1 billion, in order to reduce the possible random writes on the non-main node.

As such nodes:

  • HG 500S1 192.168.1.5 – 1.000.000.000
    • S1 192.168.1.6 – 1.000.000
    • S1 192.168.1.7 – 100
    • S2 192.168.1.8 – 1
    • S2 192.168.1.9 – 1
  • HG 501S1 192.168.1.5 – 100
    • S1 192.168.1.6 – 1000000000
    • S1 192.168.1.7 – 1000000000
    • S2 192.168.1.8 – 1
    • S2 192.168.1.9 – 1

The following command shows what ProxySQL is doing:

watch -n 1 'mysql -h 127.0.0.1 -P 3310 -uadmin -padmin -t -e "select * from stats_mysql_connection_pool where hostgroup in (500,501,9500,9501) order by hostgroup,srv_host ;" -e " select hostgroup_id,hostname,status,weight,comment from mysql_servers where hostgroup_id in (500,501,9500,9501)  order by hostgroup_id,hostname ;"'

Download the check from GitHub (https://github.com/Tusamarco/proxy_sql_tools) and activate it in ProxySQL. Be sure to set the parameters that match your installation:

delete from scheduler where id=10;
INSERT  INTO scheduler (id,active,interval_ms,filename,arg1) values (10,0,2000,"/var/lib/proxysql/galera_check.pl","-u=admin -p=admin -h=192.168.1.50 -H=500:W,501:R -P=3310 --execution_time=1 --retry_down=2 --retry_up=1 --main_segment=1 --debug=0  --log=/var/lib/proxysql/galeraLog");
LOAD SCHEDULER TO RUNTIME;SAVE SCHEDULER TO DISK;


If you want to activate it:

update scheduler set active=1 where id=10;
LOAD SCHEDULER TO RUNTIME;


The following is the kind of scenario we have:

+-----------+-------------+----------+--------+----------+----------+--------+---------+----------+-----------------+-----------------+------------+
| hostgroup | srv_host    | srv_port | status | ConnUsed | ConnFree | ConnOK | ConnERR | Queries  | Bytes_data_sent | Bytes_data_recv | Latency_ms |
+-----------+-------------+----------+--------+----------+----------+--------+---------+----------+-----------------+-----------------+------------+
| 500       | 192.168.1.9 | 3306     | ONLINE | 0        | 0        | 0      | 0       | 0        | 0               | 0               | 413        |
| 500       | 192.168.1.8 | 3306     | ONLINE | 0        | 0        | 0      | 0       | 0        | 0               | 0               | 420        |
| 500       | 192.168.1.7 | 3306     | ONLINE | 0        | 0        | 0      | 0       | 0        | 0               | 0               | 227        |
| 500       | 192.168.1.6 | 3306     | ONLINE | 0        | 10       | 10     | 0       | 12654    | 1016975         | 0               | 230        |
| 500       | 192.168.1.5 | 3306     | ONLINE | 0        | 9        | 29     | 0       | 107358   | 8629123         | 0               | 206        |
| 501       | 192.168.1.9 | 3306     | ONLINE | 0        | 4        | 6      | 0       | 12602425 | 613371057       | 34467286486     | 413        |
| 501       | 192.168.1.8 | 3306     | ONLINE | 0        | 6        | 7      | 0       | 12582617 | 612422028       | 34409606321     | 420        |
| 501       | 192.168.1.7 | 3306     | ONLINE | 0        | 6        | 6      | 0       | 18580675 | 905464967       | 50824195445     | 227        |
| 501       | 192.168.1.6 | 3306     | ONLINE | 0        | 6        | 14     | 0       | 18571127 | 905075154       | 50814832276     | 230        |
| 501       | 192.168.1.5 | 3306     | ONLINE | 0        | 1        | 10     | 0       | 169570   | 8255821         | 462706881       | 206        |
+-----------+-------------+----------+--------+----------+----------+--------+---------+----------+-----------------+-----------------+------------+


To generate a load, use the following commands (or whatever you like, but use a different one for read-only and reads/writes):

Write
sysbench --test=/usr/share/doc/sysbench/tests/db/oltp.lua --mysql-host=192.168.1.50 --mysql-port=3311 --mysql-user=stress_RW --mysql-password=test --mysql-db=test_galera --db-driver=mysql --oltp-tables-count=50 --oltp-tablesize=50000 --max-requests=0 --max-time=9000 --oltp-point-selects=5 --oltp-read-only=off --oltp-dist-type=uniform --oltp-reconnect-mode=transaction --oltp-skip-trx=off --num-threads=10 --report-interval=10 --mysql-ignore-errors=all run
Read only
sysbench --test=/usr/share/doc/sysbench/tests/db/oltp.lua --mysql-host=192.168.1.50 --mysql-port=3311 --mysql-user=stress_RW --mysql-password=test --mysql-db=test_galera --db-driver=mysql --oltp-tables-count=50 --oltp-tablesize=50000 --max-requests=0 --max-time=9000 --oltp-point-selects=5 --oltp-read-only=on --num-threads=10 --oltp-reconnect-mode=query --oltp-skip-trx=on --report-interval=10 --mysql-ignore-errors=all run


The most common thing that could happen to a cluster node is to become a donor. This is a planned activity for Percona XtraDB Cluster and is suppose to be managed in a less harmful way.

We’re going to simulate crashing a node and forcing it to elect our main node as DONOR (the one with the highest WEIGHT).

To do so, we need to have the parameter wsrep_sst_donor set.

show global variables like 'wsrep_sst_donor';
+-----------------+-------+
| Variable_name   | Value |
+-----------------+-------+
| wsrep_sst_donor | node1 | <---
+-----------------+-------+


Activate the check if not already done:

update scheduler set active=1 where id=10;


And now run traffic. Check load:

select * from stats_mysql_connection_pool where hostgroup in (500,501,9500,9501) order by hostgroup,srv_host ;
+-----------+-------------+----------+--------+----------+----------+--------+---------+----------+-----------------+-----------------+------------+
| hostgroup | srv_host    | srv_port | status | ConnUsed | ConnFree | ConnOK | ConnERR | Queries  | Bytes_data_sent | Bytes_data_recv | Latency_ms |
+-----------+-------------+----------+--------+----------+----------+--------+---------+----------+-----------------+-----------------+------------+
| 500       | 192.168.1.5 | 3306     | ONLINE | 10       | 0        | 30     | 0       | 112662   | 9055479         | 0               | 120        | <--- our Donor
| 500       | 192.168.1.6 | 3306     | ONLINE | 0        | 10       | 10     | 0       | 12654    | 1016975         | 0               | 111        |
| 500       | 192.168.1.7 | 3306     | ONLINE | 0        | 0        | 0      | 0       | 0        | 0               | 0               | 115        |
| 500       | 192.168.1.8 | 3306     | ONLINE | 0        | 0        | 0      | 0       | 0        | 0               | 0               | 316        |
| 500       | 192.168.1.9 | 3306     | ONLINE | 0        | 0        | 0      | 0       | 0        | 0               | 0               | 329        |
| 501       | 192.168.1.5 | 3306     | ONLINE | 0        | 1        | 10     | 0       | 257271   | 12533763        | 714473854       | 120        |
| 501       | 192.168.1.6 | 3306     | ONLINE | 0        | 10       | 18     | 0       | 18881582 | 920200116       | 51688974309     | 111        |
| 501       | 192.168.1.7 | 3306     | ONLINE | 3        | 6        | 9      | 0       | 18927077 | 922317772       | 51794504662     | 115        |
| 501       | 192.168.1.8 | 3306     | ONLINE | 0        | 1        | 8      | 0       | 12595556 | 613054573       | 34447564440     | 316        |
| 501       | 192.168.1.9 | 3306     | ONLINE | 1        | 3        | 6      | 0       | 12634435 | 614936148       | 34560620180     | 329        |
+-----------+-------------+----------+--------+----------+----------+--------+---------+----------+-----------------+-----------------+------------+


Now on one of the nodes:

  1. Kill MySQL.
  2. Remove the content of the data directory.
  3. Restart the node.

The node will go in SST and our galera_check script will manage it:

+--------------+-------------+--------------+------------+--------------------------------------------------+
| hostgroup_id | hostname    | status       | weight     | comment                                          |
+--------------+-------------+--------------+------------+--------------------------------------------------+
| 500          | 192.168.1.5 | OFFLINE_SOFT | 1000000000 | 500_W_501_R_retry_up=0;500_W_501_R_retry_down=0; | <---- the donor
| 500          | 192.168.1.6 | ONLINE       | 1000000    |                                                  |
| 500          | 192.168.1.7 | ONLINE       | 100        |                                                  |
| 500          | 192.168.1.8 | ONLINE       | 1          |                                                  |
| 500          | 192.168.1.9 | ONLINE       | 1          |                                                  |
| 501          | 192.168.1.5 | OFFLINE_SOFT | 100        | 500_W_501_R_retry_up=0;500_W_501_R_retry_down=0; |
| 501          | 192.168.1.6 | ONLINE       | 1000000000 |                                                  |
| 501          | 192.168.1.7 | ONLINE       | 1000000000 |                                                  |
| 501          | 192.168.1.8 | ONLINE       | 1          |                                                  |
| 501          | 192.168.1.9 | ONLINE       | 1          |                                                  |
+--------------+-------------+--------------+------------+--------------------------------------------------+


We can also check the galera_check log and see what happened:

2016/09/02 16:13:27.298:[WARN] Move node:192.168.1.5;3306;500;3010 SQL: UPDATE mysql_servers SET status='OFFLINE_SOFT' WHERE hostgroup_id=500 AND hostname='192.168.1.5' AND port='3306'
2016/09/02 16:13:27.303:[WARN] Move node:192.168.1.5;3306;501;3010 SQL: UPDATE mysql_servers SET status='OFFLINE_SOFT' WHERE hostgroup_id=501 AND hostname='192.168.1.5' AND port='3306'


The node will remain in OFFLINE_SOFT while the other node (192.168.1.6 with the 2nd WEIGHT) serves the writes, until the node is in DONOR state.

All as expected, the node was set in OFFLINE_SOFT state, which mean the existing connections finished, while the node was not accepting any NEW connections.

As soon the node stops sending data to the Joiner, it was moved back and traffic restarted:

2016/09/02 16:14:58.239:[WARN] Move node:192.168.1.5;3306;500;1000 SQL: UPDATE mysql_servers SET status='ONLINE' WHERE hostgroup_id=500 AND hostname='192.168.1.5' AND port='3306'
2016/09/02 16:14:58.243:[WARN] Move node:192.168.1.5;3306;501;1000 SQL: UPDATE mysql_servers SET status='ONLINE' WHERE hostgroup_id=501 AND hostname='192.168.1.5' AND port='3306'


+-----------+-------------+----------+--------+----------+----------+--------+---------+----------+-----------------+-----------------+------------+
| hostgroup | srv_host    | srv_port | status | ConnUsed | ConnFree | ConnOK | ConnERR | Queries  | Bytes_data_sent | Bytes_data_recv | Latency_ms |
+-----------+-------------+----------+--------+----------+----------+--------+---------+----------+-----------------+-----------------+------------+
| 500       | 192.168.1.5 | 3306     | ONLINE | 6        | 1        | 37     | 0       | 153882   | 12368557        | 0               | 72         | <---
| 500       | 192.168.1.6 | 3306     | ONLINE | 1        | 9        | 10     | 0       | 16008    | 1286492         | 0               | 42         |
| 500       | 192.168.1.7 | 3306     | ONLINE | 1        | 0        | 1      | 0       | 1398     | 112371          | 0               | 96         |
| 500       | 192.168.1.8 | 3306     | ONLINE | 0        | 0        | 24545  | 791     | 24545    | 122725          | 0               | 359        |
| 500       | 192.168.1.9 | 3306     | ONLINE | 1        | 0        | 1      | 0       | 15108    | 1214366         | 0               | 271        |
| 501       | 192.168.1.5 | 3306     | ONLINE | 1        | 0        | 11     | 0       | 2626808  | 128001112       | 7561278884      | 72         |
| 501       | 192.168.1.6 | 3306     | ONLINE | 5        | 7        | 20     | 0       | 28629516 | 1394974468      | 79289633420     | 42         |
| 501       | 192.168.1.7 | 3306     | ONLINE | 2        | 8        | 10     | 0       | 29585925 | 1441400648      | 81976494740     | 96         |
| 501       | 192.168.1.8 | 3306     | ONLINE | 0        | 0        | 16779  | 954     | 12672983 | 616826002       | 34622768228     | 359        |
| 501       | 192.168.1.9 | 3306     | ONLINE | 0        | 4        | 6      | 0       | 13567512 | 660472589       | 37267991677     | 271        |
+-----------+-------------+----------+--------+----------+----------+--------+---------+----------+-----------------+-----------------+------------+


This was easy, and more or less managed by the standard script. But what would happen if my donor was set to DO NOT serve query when in the DONOR state?

Wait, what?? Yes, Percona XtraDB Cluster (and Galera in general) can be set to refuse any query when the node goes in DONOR state. If not managed this can cause issues as the node will simply reject queries (but ProxySQL sees the node as alive).

Let me show you:

show global variables like 'wsrep_sst_donor_rejects_queries';
+---------------------------------+-------+
| Variable_name                   | Value |
+---------------------------------+-------+
| wsrep_sst_donor_rejects_queries | ON    |
+---------------------------------+-------+


For the moment, let’s deactivate the check. Then, do the same stop and delete of the data dir, then restart the node. SST takes place.

Sysbench will report:

ALERT: mysql_drv_query() returned error 2013 (Lost connection to MySQL server during query) for query 'BEGIN'
FATAL: failed to execute function `event': 3
ALERT: mysql_drv_query() returned error 2013 (Lost connection to MySQL server during query) for query 'BEGIN'
FATAL: failed to execute function `event': 3


But ProxySQL?

+-----------+-------------+----------+---------+----------+----------+--------+---------+----------+-----------------+-----------------+------------+
| hostgroup | srv_host    | srv_port | status  | ConnUsed | ConnFree | ConnOK | ConnERR | Queries  | Bytes_data_sent | Bytes_data_recv | Latency_ms |
+-----------+-------------+----------+---------+----------+----------+--------+---------+----------+-----------------+-----------------+------------+
| 500       | 192.168.1.5 | 3306     | ONLINE  | 0        | 0        | 101    | 0       | 186331   | 14972717        | 0               | 118        | <-- no writes in wither HG
| 500       | 192.168.1.6 | 3306     | ONLINE  | 0        | 9        | 10     | 0       | 20514    | 1648665         | 0               | 171        |  |
| 500       | 192.168.1.7 | 3306     | ONLINE  | 0        | 1        | 3      | 0       | 5881     | 472629          | 0               | 134        |  |
| 500       | 192.168.1.8 | 3306     | ONLINE  | 0        | 0        | 205451 | 1264    | 205451   | 1027255         | 0               | 341        |  |
| 500       | 192.168.1.9 | 3306     | ONLINE  | 0        | 1        | 2      | 0       | 15642    | 1257277         | 0               | 459        |  -
| 501       | 192.168.1.5 | 3306     | ONLINE  | 1        | 0        | 13949  | 0       | 4903347  | 238627310       | 14089708430     | 118        |
| 501       | 192.168.1.6 | 3306     | ONLINE  | 2        | 10       | 20     | 0       | 37012174 | 1803380964      | 103269634626    | 171        |
| 501       | 192.168.1.7 | 3306     | ONLINE  | 2        | 11       | 13     | 0       | 38782923 | 1889507208      | 108288676435    | 134        |
| 501       | 192.168.1.8 | 3306     | SHUNNED | 0        | 0        | 208452 | 1506    | 12864656 | 626156995       | 34622768228     | 341        |
| 501       | 192.168.1.9 | 3306     | ONLINE  | 1        | 3        | 6      | 0       | 14451462 | 703534884       | 39837663734     | 459        |
+-----------+-------------+----------+---------+----------+----------+--------+---------+----------+-----------------+-----------------+------------+
mysql> select * from mysql_server_connect_log where hostname in ('192.168.1.5','192.168.1.6','192.168.1.7','192.168.1.8','192.168.1.9')  order by time_start_us desc limit 10;
+-------------+------+------------------+-------------------------+--------------------------------------------------------------------------------------------------------+
| hostname    | port | time_start_us    | connect_success_time_us | connect_error                                                                                          |
+-------------+------+------------------+-------------------------+--------------------------------------------------------------------------------------------------------+
| 192.168.1.9 | 3306 | 1472827444621954 | 1359                    | NULL                                                                                                   |
| 192.168.1.8 | 3306 | 1472827444618883 | 0                       | Can't connect to MySQL server on '192.168.1.8' (107)                                                   |
| 192.168.1.7 | 3306 | 1472827444615819 | 433                     | NULL                                                                                                   |
| 192.168.1.6 | 3306 | 1472827444612722 | 538                     | NULL                                                                                                   |
| 192.168.1.5 | 3306 | 1472827444606560 | 473                     | NULL                                                                                                   | <-- donor is seen as up
| 192.168.1.9 | 3306 | 1472827384621463 | 1286                    | NULL                                                                                                   |
| 192.168.1.8 | 3306 | 1472827384618442 | 0                       | Lost connection to MySQL server at 'handshake: reading inital communication packet', system error: 107 |
| 192.168.1.7 | 3306 | 1472827384615317 | 419                     | NULL                                                                                                   |
| 192.168.1.6 | 3306 | 1472827384612241 | 415                     | NULL                                                                                                   |
| 192.168.1.5 | 3306 | 1472827384606117 | 454                     | NULL                                                                                                   | <-- donor is seen as up
+-------------+------+------------------+-------------------------+--------------------------------------------------------------------------------------------------------+
select * from mysql_server_ping_log where hostname in ('192.168.1.5','192.168.1.6','192.168.1.7','192.168.1.8','192.168.1.9')  order by time_start_us desc limit 10;
+-------------+------+------------------+----------------------+------------------------------------------------------+
| hostname    | port | time_start_us    | ping_success_time_us | ping_error                                           |
+-------------+------+------------------+----------------------+------------------------------------------------------+
| 192.168.1.9 | 3306 | 1472827475062217 | 311                  | NULL                                                 |
| 192.168.1.8 | 3306 | 1472827475060617 | 0                    | Can't connect to MySQL server on '192.168.1.8' (107) |
| 192.168.1.7 | 3306 | 1472827475059073 | 108                  | NULL                                                 |
| 192.168.1.6 | 3306 | 1472827475057281 | 102                  | NULL                                                 |
| 192.168.1.5 | 3306 | 1472827475054188 | 74                   | NULL                                                 | <-- donor is seen as up
| 192.168.1.9 | 3306 | 1472827445061877 | 491                  | NULL                                                 |
| 192.168.1.8 | 3306 | 1472827445060254 | 0                    | Can't connect to MySQL server on '192.168.1.8' (107) |
| 192.168.1.7 | 3306 | 1472827445058688 | 53                   | NULL                                                 |
| 192.168.1.6 | 3306 | 1472827445057124 | 131                  | NULL                                                 |
| 192.168.1.5 | 3306 | 1472827445054015 | 98                   | NULL                                                 | <-- donor is seen as up
+-------------+------+------------------+----------------------+------------------------------------------------------+


As you can see, all seems OK. Let’s turn on galera_check and see what happens when we run some read and write loads.

And now let me do the stop-delete-restart-SST process again:

kill -9 <mysqld_safe_pid> <mysqld_pid>; rm -fr data/*;rm -fr logs/*;sleep 2;./start


As soon as the node goes down, ProxySQL shuns the node.

+-----------+-------------+----------+---------+----------+----------+--------+---------+---------+-----------------+-----------------+------------+
| hostgroup | srv_host    | srv_port | status  | ConnUsed | ConnFree | ConnOK | ConnERR | Queries | Bytes_data_sent | Bytes_data_recv | Latency_ms |
+-----------+-------------+----------+---------+----------+----------+--------+---------+---------+-----------------+-----------------+------------+
| 500       | 192.168.1.5 | 3306     | ONLINE  | 7        | 3        | 34     | 0       | 21570   | 1733833         | 0               | 146        |
| 500       | 192.168.1.6 | 3306     | ONLINE  | 1        | 8        | 12     | 0       | 9294    | 747063          | 0               | 129        |
| 500       | 192.168.1.7 | 3306     | ONLINE  | 1        | 0        | 4      | 0       | 3396    | 272950          | 0               | 89         |
| 500       | 192.168.1.8 | 3306     | SHUNNED | 0        | 0        | 1      | 6       | 12      | 966             | 0               | 326        | <-- crashed
| 500       | 192.168.1.9 | 3306     | ONLINE  | 1        | 0        | 2      | 0       | 246     | 19767           | 0               | 286        |
| 501       | 192.168.1.5 | 3306     | ONLINE  | 0        | 1        | 2      | 0       | 772203  | 37617973        | 2315131214      | 146        |
| 501       | 192.168.1.6 | 3306     | ONLINE  | 9        | 3        | 12     | 0       | 3439458 | 167514166       | 10138636314     | 129        |
| 501       | 192.168.1.7 | 3306     | ONLINE  | 1        | 12       | 13     | 0       | 3183822 | 155064971       | 9394612877      | 89         |
| 501       | 192.168.1.8 | 3306     | SHUNNED | 0        | 0        | 1      | 6       | 11429   | 560352          | 35350726        | 326        | <-- crashed
| 501       | 192.168.1.9 | 3306     | ONLINE  | 0        | 1        | 1      | 0       | 312253  | 15227786        | 941110520       | 286        |
+-----------+-------------+----------+---------+----------+----------+--------+---------+---------+-----------------+-----------------+------------+


Immediately after,  galera_check identifies the node is requesting the SST, and that the DONOR is our writer (given it is NOT the only writer in the HG, and it has the variable wsrep_sst_donor_rejects_queries active), it cannot be set to OFFLINE_SOFT. We do not want ProxySQL to consider it OFFLINE_HARD (because it is not).

As such, the script moves it to a special HG:

2016/09/04 16:11:22.091:[WARN] Move node:192.168.1.5;3306;500;3001 SQL: UPDATE mysql_servers SET hostgroup_id=9500 WHERE hostgroup_id=500 AND hostname='192.168.1.5' AND port='3306'
2016/09/04 16:11:22.097:[WARN] Move node:192.168.1.5;3306;501;3001 SQL: UPDATE mysql_servers SET hostgroup_id=9501 WHERE hostgroup_id=501 AND hostname='192.168.1.5' AND port='3306'
+--------------+-------------+------+--------+------------+-------------+-----------------+---------------------+---------+----------------+--------------------------------------------------+
| hostgroup_id | hostname    | port | status | weight     | compression | max_connections | max_replication_lag | use_ssl | max_latency_ms | comment                                          |
+--------------+-------------+------+--------+------------+-------------+-----------------+---------------------+---------+----------------+--------------------------------------------------+
| 500          | 192.168.1.6 | 3306 | ONLINE | 1000000    | 0           | 1000            | 0                   | 0       | 0              |                                                  |
| 500          | 192.168.1.7 | 3306 | ONLINE | 100        | 0           | 1000            | 0                   | 0       | 0              |                                                  |
| 500          | 192.168.1.8 | 3306 | ONLINE | 1          | 0           | 1000            | 0                   | 0       | 0              | 500_W_501_R_retry_up=0;500_W_501_R_retry_down=0; |
| 500          | 192.168.1.9 | 3306 | ONLINE | 1          | 0           | 1000            | 0                   | 0       | 0              | 500_W_501_R_retry_up=0;500_W_501_R_retry_down=0; |
| 501          | 192.168.1.6 | 3306 | ONLINE | 1000000000 | 0           | 1000            | 0                   | 0       | 0              |                                                  |
| 501          | 192.168.1.7 | 3306 | ONLINE | 1000000000 | 0           | 1000            | 0                   | 0       | 0              |                                                  |
| 501          | 192.168.1.9 | 3306 | ONLINE | 1          | 0           | 1000            | 0                   | 0       | 0              | 500_W_501_R_retry_up=0;500_W_501_R_retry_down=0; |
| 9500         | 192.168.1.5 | 3306 | ONLINE | 1000000000 | 0           | 1000            | 0                   | 0       | 0              | 500_W_501_R_retry_up=0;500_W_501_R_retry_down=0; | <-- Special HG
| 9501         | 192.168.1.5 | 3306 | ONLINE | 100        | 0           | 1000            | 0                   | 0       | 0              | 500_W_501_R_retry_up=0;500_W_501_R_retry_down=0; | <-- Special HG
+--------------+-------------+------+--------+------------+-------------+-----------------+---------------------+---------+----------------+--------------------------------------------------+


The Donor continues to serve the Joiner, but applications won’t see it.

What applications see is also very important. Applications doing WRITEs will see:

[ 10s] threads: 10, tps: 9.50, reads: 94.50, writes: 42.00, response time: 1175.77ms (95%), errors: 0.00, reconnects: 0.00
...
[ 40s] threads: 10, tps: 2.80, reads: 26.10, writes: 11.60, response time: 3491.45ms (95%), errors: 0.00, reconnects: 0.10
[ 50s] threads: 10, tps: 4.80, reads: 50.40, writes: 22.40, response time: 10062.13ms (95%), errors: 0.80, reconnects: 351.60 <--- Main writer moved to another HG
[ 60s] threads: 10, tps: 5.90, reads: 53.10, writes: 23.60, response time: 2869.82ms (95%), errors: 0.00, reconnects: 0.00
...


When one node shifts to another, the applications will have to manage the RE-TRY, but it will only be a short time and will cause limited impact on the production flow.

Application readers see no errors:

[ 10s] threads: 10, tps: 0.00, reads: 13007.31, writes: 0.00, response time: 9.13ms (95%), errors: 0.00, reconnects: 0.00
[ 50s] threads: 10, tps: 0.00, reads: 9613.70, writes: 0.00, response time: 10.66ms (95%), errors: 0.00, reconnects: 0.20 <-- just a glitch in reconnect
[ 60s] threads: 10, tps: 0.00, reads: 10807.90, writes: 0.00, response time: 11.07ms (95%), errors: 0.00, reconnects: 0.20
[ 70s] threads: 10, tps: 0.00, reads: 9082.61, writes: 0.00, response time: 23.62ms (95%), errors: 0.00, reconnects: 0.00
...
[ 390s] threads: 10, tps: 0.00, reads: 13050.80, writes: 0.00, response time: 8.97ms (95%), errors: 0.00, reconnects: 0.00


When the Donor ends providing SST, it comes back and the script manages it. Then  galera_check puts it in the right HG:

2016/09/04 16:12:34.266:[WARN] Move node:192.168.1.5;3306;9500;1010 SQL: UPDATE mysql_servers SET hostgroup_id=500 WHERE hostgroup_id=9500 AND hostname='192.168.1.5' AND port='3306'
2016/09/04 16:12:34.270:[WARN] Move node:192.168.1.5;3306;9501;1010 SQL: UPDATE mysql_servers SET hostgroup_id=501 WHERE hostgroup_id=9501 AND hostname='192.168.1.5' AND port='3306'


The crashed node is restarted by the SST process, and the node goes up. But if the level of load in the cluster is mid/high, it will remain in the JOINED state for sometime, becoming visible by the ProxySQL again. ProxySQL will not, however, correctly recognize the state.

2016-09-04 16:17:15 21035 [Note] WSREP: 3.2 (node4): State transfer from 1.1 (node1) complete.
2016-09-04 16:17:15 21035 [Note] WSREP: Shifting JOINER -> JOINED (TO: 254515)


To avoid this issue, the script will move it to a special HG, allowing it to recovery without interfering with a real load.

+-----------+-------------+----------+--------+----------+----------+--------+---------+---------+-----------------+-----------------+------------+
| hostgroup | srv_host    | srv_port | status | ConnUsed | ConnFree | ConnOK | ConnERR | Queries | Bytes_data_sent | Bytes_data_recv | Latency_ms |
+-----------+-------------+----------+--------+----------+----------+--------+---------+---------+-----------------+-----------------+------------+
| 500       | 192.168.1.5 | 3306     | ONLINE | 6        | 2        | 15     | 0       | 3000    | 241060          | 0               | 141        |
| 500       | 192.168.1.6 | 3306     | ONLINE | 1        | 9        | 13     | 0       | 13128   | 1055268         | 0               | 84         |
| 500       | 192.168.1.7 | 3306     | ONLINE | 1        | 0        | 4      | 0       | 3756    | 301874          | 0               | 106        |
| 500       | 192.168.1.9 | 3306     | ONLINE | 1        | 0        | 2      | 0       | 4080    | 327872          | 0               | 278        |
| 501       | 192.168.1.5 | 3306     | ONLINE | 1        | 0        | 2      | 0       | 256753  | 12508935        | 772048259       | 141        |
| 501       | 192.168.1.6 | 3306     | ONLINE | 4        | 8        | 12     | 0       | 5116844 | 249191524       | 15100617833     | 84         |
| 501       | 192.168.1.7 | 3306     | ONLINE | 2        | 11       | 13     | 0       | 4739756 | 230863200       | 13997231724     | 106        |
| 501       | 192.168.1.9 | 3306     | ONLINE | 0        | 1        | 1      | 0       | 496524  | 24214563        | 1496482104      | 278        |
| 9500      | 192.168.1.8 | 3306     | ONLINE | 0        | 0        | 0      | 0       | 0       | 0               | 0               | 331        |<-- Joined not Sync
| 9501      | 192.168.1.8 | 3306     | ONLINE | 0        | 0        | 0      | 0       | 0       | 0               | 0               | 331        |<-- Joined not Sync
+-----------+-------------+----------+--------+----------+----------+--------+---------+---------+-----------------+-----------------+------------+


Once the node fully recovers, galera_check puts it back in the original HG, ready serve requests:

+-----------+-------------+----------+--------+----------+----------+--------+---------+---------+-----------------+-----------------+------------+
| hostgroup | srv_host    | srv_port | status | ConnUsed | ConnFree | ConnOK | ConnERR | Queries | Bytes_data_sent | Bytes_data_recv | Latency_ms |
+-----------+-------------+----------+--------+----------+----------+--------+---------+---------+-----------------+-----------------+------------+
| 500       | 192.168.1.5 | 3306     | ONLINE | 0        | 1        | 15     | 0       | 3444    | 276758          | 0               | 130        |
| 500       | 192.168.1.6 | 3306     | ONLINE | 0        | 9        | 13     | 0       | 13200   | 1061056         | 0               | 158        |
| 500       | 192.168.1.7 | 3306     | ONLINE | 0        | 0        | 4      | 0       | 3828    | 307662          | 0               | 139        |
| 500       | 192.168.1.8 | 3306     | ONLINE | 0        | 0        | 0      | 0       | 0       | 0               | 0               | 0          |<-- up again
| 500       | 192.168.1.9 | 3306     | ONLINE | 0        | 0        | 2      | 0       | 4086    | 328355          | 0               | 336        |
| 501       | 192.168.1.5 | 3306     | ONLINE | 0        | 1        | 2      | 0       | 286349  | 13951366        | 861638962       | 130        |
| 501       | 192.168.1.6 | 3306     | ONLINE | 0        | 12       | 12     | 0       | 5239212 | 255148806       | 15460951262     | 158        |
| 501       | 192.168.1.7 | 3306     | ONLINE | 0        | 13       | 13     | 0       | 4849970 | 236234446       | 14323937975     | 139        |
| 501       | 192.168.1.8 | 3306     | ONLINE | 0        | 0        | 0      | 0       | 0       | 0               | 0               | 0          |<-- up again
| 501       | 192.168.1.9 | 3306     | ONLINE | 0        | 1        | 1      | 0       | 507910  | 24768898        | 1530841172      | 336        |
+-----------+-------------+----------+--------+----------+----------+--------+---------+---------+-----------------+-----------------+------------+


A summary of the logical steps is:

                +---------+
                |  Crash  |
                +----+----+
                     |
                     v
            +--------+-------+
            |  ProxySQL      |
            |  shun crashed  |
            |      node      |
            +--------+-------+
                     |
                     |
                     v
   +-----------------+-----------------+
   |  Donor has one of the following?  |
   |  wsrep_sst_dono _rejects_queries  |
   |  OR                               |
   |  wsrep_reject_queries             |
   +-----------------------------------+
      |No                            |Yes
      v                              v
+-----+----------+       +-----------+----+
| Galera_check   |       | Galera_check   |
| put the donor  |       | put the donor  |
| in OFFLINE_SOFT|       | in special HG  |
+---+------------+       +-----------+----+
    |                                |
    |                                |
    v                                v
+---+--------------------------------+-----+
|            Donor SST ends                |
+---+---------------+----------------+-----+
    |               |                |
    |               |                |
+---+------------+  |    +-----------+----+
| Galera_check   |  |    | Galera_check   |
| put the donor  |  |    | put the donor  |
| ONLINE         |  |    | in Original HG |
+----------------+  |    +----------------+
                    |
                    |
+------------------------------------------+
|           crashed SST ends               |
+-------------------+----------------------+
                    |
                    |
       +------------+-------------+
       |  Crashed node back but   +<------------+
       |  Not Sync?               |             |
       +--------------------------+             |
          |No                   |Yes            |
          |                     |               |
          |                     |               |
+---------+------+       +------+---------+     |
| Galera_check   |       | Galera_check   |     |
| put the node   |       | put the node   +-----+
| back orig. HG  |       | Special HG     |
+--------+-------+       +----------------+
         |
         |
         |
         |      +---------+
         +------>   END   |
                +---------+


As mentioned, galera_check can manage several node states.

Another case is when we can’t have the node accept ANY queries. We might need that for several reasons, including preparing the node for maintenance (or whatever).

In Percona XtraDB Cluster (and other Galera implementations) we can set the value of wsrep_reject_queries to:

  • NONE
  • ALL
  • ALL_KILL

Let see how it works. Run some load, then on the main writer node (192.168.1.5):

set global wsrep_reject_queries=ALL;


This blocks any new query execution until the run is complete. Do a simple select on the node:

(root@localhost:pm) [test]>select * from tbtest1;
ERROR 1047 (08S01): WSREP has not yet prepared node for application use


ProxySQL won’t see these conditions:

+-------------+------+------------------+----------------------+------------+
| hostname    | port | time_start_us    | ping_success_time_us | ping_error |
+-------------+------+------------------+----------------------+------------+
| 192.168.1.5 | 3306 | 1473005467628001 | 35                   | NULL       | <--- ping ok
| 192.168.1.5 | 3306 | 1473005437628014 | 154                  | NULL       |
+-------------+------+------------------+----------------------+------------+
+-------------+------+------------------+-------------------------+---------------+
| hostname    | port | time_start_us    | connect_success_time_us | connect_error |
+-------------+------+------------------+-------------------------+---------------+
| 192.168.1.5 | 3306 | 1473005467369575 | 246                     | NULL          | <--- connect ok
| 192.168.1.5 | 3306 | 1473005407369441 | 353                     | NULL          |
+-------------+------+------------------+-------------------------+---------------+


The script galera_check will instead manage it:

+-----------+-------------+----------+--------------+----------+----------+--------+---------+---------+-----------------+-----------------+------------+
| hostgroup | srv_host    | srv_port | status       | ConnUsed | ConnFree | ConnOK | ConnERR | Queries | Bytes_data_sent | Bytes_data_recv | Latency_ms |
+-----------+-------------+----------+--------------+----------+----------+--------+---------+---------+-----------------+-----------------+------------+
| 500       | 192.168.1.5 | 3306     | OFFLINE_SOFT | 0        | 0        | 8343   | 0       | 10821   | 240870          | 0               | 93         | <--- galera check put it OFFLINE
| 500       | 192.168.1.6 | 3306     | ONLINE       | 10       | 0        | 15     | 0       | 48012   | 3859402         | 0               | 38         | <--- writer
| 500       | 192.168.1.7 | 3306     | ONLINE       | 0        | 1        | 6      | 0       | 14712   | 1182364         | 0               | 54         |
| 500       | 192.168.1.8 | 3306     | ONLINE       | 0        | 1        | 2      | 0       | 1092    | 87758           | 0               | 602        |
| 500       | 192.168.1.9 | 3306     | ONLINE       | 0        | 1        | 4      | 0       | 5352    | 430152          | 0               | 238        |
| 501       | 192.168.1.5 | 3306     | OFFLINE_SOFT | 0        | 0        | 1410   | 0       | 197909  | 9638665         | 597013919       | 93         |
| 501       | 192.168.1.6 | 3306     | ONLINE       | 2        | 10       | 12     | 0       | 7822682 | 380980455       | 23208091727     | 38         |
| 501       | 192.168.1.7 | 3306     | ONLINE       | 0        | 13       | 13     | 0       | 7267507 | 353962618       | 21577881545     | 54         |
| 501       | 192.168.1.8 | 3306     | ONLINE       | 0        | 1        | 1      | 0       | 241641  | 11779770        | 738145270       | 602        |
| 501       | 192.168.1.9 | 3306     | ONLINE       | 1        | 0        | 1      | 0       | 756415  | 36880233        | 2290165636      | 238        |
+-----------+-------------+----------+--------------+----------+----------+--------+---------+---------+-----------------+-----------------+------------+


In this case, the script will put the node in OFFLINE_SOFT, given the set global wsrep_reject_queries=ALL means do not accept NEW and complete the existing as OFFLINE_SOFT.

The script also manages the case of set global wsrep_reject_queries=ALL_KILL;. From the ProxySQL point of view, these do not exist either:

+-------------+------+------------------+----------------------+------------+
| hostname    | port | time_start_us    | ping_success_time_us | ping_error |
+-------------+------+------------------+----------------------+------------+
| 192.168.1.5 | 3306 | 1473005827629069 | 59                   | NULL       |<--- ping ok
| 192.168.1.5 | 3306 | 1473005797628988 | 57                   | NULL       |
+-------------+------+------------------+----------------------+------------+
+-------------+------+------------------+-------------------------+---------------+
| hostname    | port | time_start_us    | connect_success_time_us | connect_error |
+-------------+------+------------------+-------------------------+---------------+
| 192.168.1.5 | 3306 | 1473005827370084 | 370                     | NULL          | <--- connect ok
| 192.168.1.5 | 3306 | 1473005767369915 | 243                     | NULL          |
+-------------+------+------------------+-------------------------+---------------+
+-----------+-------------+----------+--------+----------+----------+--------+---------+---------+-----------------+-----------------+------------+
| hostgroup | srv_host    | srv_port | status | ConnUsed | ConnFree | ConnOK | ConnERR | Queries | Bytes_data_sent | Bytes_data_recv | Latency_ms |
+-----------+-------------+----------+--------+----------+----------+--------+---------+---------+-----------------+-----------------+------------+
| 9500      | 192.168.1.5 | 3306     | ONLINE | 0        | 0        | 0      | 0       | 0       | 0               | 0               | 0          |<--- galera check put it in special HG
| 9501      | 192.168.1.5 | 3306     | ONLINE | 0        | 0        | 0      | 0       | 0       | 0               | 0               | 0          |
+-----------+-------------+----------+--------+----------+----------+--------+---------+---------+-----------------+-----------------+------------+


The difference here is that the script moves the node to the special HG to isolate it, instead leaving it in the original HG.

The integration between ProxySQL and Percona XtraDB Custer (Galera) works perfectly for multi-writer if you have a script likegalera_check that correctly manages the different Percona XtraDB Custer/Galera states.

ProxySQL and PXC using Replication HostGroup

Sometimes we might need to have 100% of the write going to only one node at a time. As explained above, ProxySQL uses weight to redirect a % of the load to a specific node.

In most cases, it will be enough to set the weight in the main writer to a very high value (like 10 billion) and one thousand on the next node to almost achieve a single writer.

But this is not 100% effective, it still allows ProxySQL to send a query once every X times to the other node(s). To keep it consistent with the ProxySQL logic, the solution is to use replication Hostgroups.

Replication HGs are special HGs that the proxy sees as connected for R/W operations. ProxySQL analyzes the value of the READ_ONLY variables and assigns to the READ_ONLY HG the nodes that have it enabled.

The node having READ_ONLY=0 resides in both HGs. As such the first thing we need to modify is to tell ProxySQL that HG 500 and 501 are replication HGs.

INSERT INTO mysql_replication_hostgroups VALUES (500,501,'');
LOAD MYSQL SERVERS TO RUNTIME; SAVE MYSQL SERVERS TO DISK;
select * from mysql_replication_hostgroups ;
+------------------+------------------+---------+
| writer_hostgroup | reader_hostgroup | comment |
+------------------+------------------+---------+
| 500              | 501              |         |
+------------------+------------------+---------+


Now whenever I set the value of READ_ONLY on a node, ProxySQL will move the node accordingly. Let see how. Current:

+-----------+-------------+----------+--------+----------+----------+--------+---------+----------+-----------------+-----------------+------------+
| hostgroup | srv_host    | srv_port | status | ConnUsed | ConnFree | ConnOK | ConnERR | Queries  | Bytes_data_sent | Bytes_data_recv | Latency_ms |
+-----------+-------------+----------+--------+----------+----------+--------+---------+----------+-----------------+-----------------+------------+
| 500       | 192.168.1.5 | 3306     | ONLINE | 6        | 1        | 7      | 0       | 16386    | 1317177         | 0               | 97         |
| 500       | 192.168.1.6 | 3306     | ONLINE | 1        | 9        | 15     | 0       | 73764    | 5929366         | 0               | 181        |
| 500       | 192.168.1.7 | 3306     | ONLINE | 1        | 0        | 6      | 0       | 18012    | 1447598         | 0               | 64         |
| 500       | 192.168.1.8 | 3306     | ONLINE | 1        | 0        | 2      | 0       | 1440     | 115728          | 0               | 341        |
| 501       | 192.168.1.5 | 3306     | ONLINE | 0        | 1        | 1      | 0       | 1210029  | 58927817        | 3706882671      | 97         |
| 501       | 192.168.1.6 | 3306     | ONLINE | 1        | 11       | 12     | 0       | 16390790 | 798382865       | 49037691590     | 181        |
| 501       | 192.168.1.7 | 3306     | ONLINE | 1        | 12       | 13     | 0       | 15357779 | 748038558       | 45950863867     | 64         |
| 501       | 192.168.1.8 | 3306     | ONLINE | 0        | 1        | 1      | 0       | 1247662  | 60752227        | 3808131279      | 341        |
| 501       | 192.168.1.9 | 3306     | ONLINE | 0        | 1        | 1      | 0       | 1766309  | 86046839        | 5374169120      | 422        |
+-----------+-------------+----------+--------+----------+----------+--------+---------+----------+-----------------+-----------------+------------+


Set global READ_ONLY=1 on the following nodes: 192.168.1.6/7/8/9.

After:

+-----------+-------------+----------+--------+----------+----------+--------+---------+----------+-----------------+-----------------+------------+
| hostgroup | srv_host    | srv_port | status | ConnUsed | ConnFree | ConnOK | ConnERR | Queries  | Bytes_data_sent | Bytes_data_recv | Latency_ms |
+-----------+-------------+----------+--------+----------+----------+--------+---------+----------+-----------------+-----------------+------------+
| 500       | 192.168.1.5 | 3306     | ONLINE | 10       | 0        | 20     | 0       | 25980    | 2088346         | 0               | 93         |
| 501       | 192.168.1.5 | 3306     | ONLINE | 0        | 1        | 1      | 0       | 1787979  | 87010074        | 5473781192      | 93         |
| 501       | 192.168.1.6 | 3306     | ONLINE | 4        | 8        | 12     | 0       | 18815907 | 916547402       | 56379724890     | 79         |
| 501       | 192.168.1.7 | 3306     | ONLINE | 1        | 12       | 13     | 0       | 17580636 | 856336023       | 52670114510     | 131        |
| 501       | 192.168.1.8 | 3306     | ONLINE | 0        | 1        | 1      | 0       | 15324    | 746109          | 46760779        | 822        |
| 501       | 192.168.1.9 | 3306     | ONLINE | 0        | 1        | 1      | 0       | 16210    | 789999          | 49940867        | 679        |
+-----------+-------------+----------+--------+----------+----------+--------+---------+----------+-----------------+-----------------+------------+


If in this scenario a reader node crashes, the application will not suffer at all given the redundancy.

But if the writer is going to crash, THEN the issue exists because there will be NO node available to manage the failover. The solution is to either do the node election manually or to have the script elect the node with the lowest read weight in the same segment as the new writer.

Below is what happens when a node crashes (bird-eye view):

                         +---------+
                         |  Crash  |
                         +----+----+
                              |
                              v
                     +--------+-------+
                     |  ProxySQL      |
                     |  shun crashed  |
                     |      node      |
                     +--------+-------+
                              |
                              |
                              v
            +-----------------+-----------------+
+----------->   HostGroup has another active    |
|           |   Node in HG writer?              |
|           +--+--------------+---------------+-+
|              |              |               |
|              |              |               |
|              |No            |               |Yes
|              |              |               |
|        +-----v----------+   |   +-----------v----+
|        |ProxySQL will   |   |   |ProxySQL will   |
|        |stop serving    |   |   |redirect load   >--------+
|        |writes          |   |   |there           |        |
|        +----------------+   |   +----------------+        |
|                             |                             |
|                             v                             |
|                     +-------+--------+                    |
|                     |ProxySQL checks |                    |
|                     |READ_ONLY on    |                    |
|                     |Reader HG       |                    |
|                     |                |                    |
|                     +-------+--------+                    |
|                             |                             |
|                             v                             |
|                     +-------+--------+                    |
|                     |Any Node with   |                    |
|                     |READ_ONLY = 0 ? |                    |
|                     +----------------+                    |
|                      |No            |Yes                  |
|                      |              |                     |
|           +----------v------+    +--v--------------+      |
|           |ProxySQL will    |    |ProxySQL will    |      |
|           |continue to      |    |Move node to     |      |
+<---------<+do not serve     |    |Writer HG        |      |
|           |Writes           |    |                 |      |
|           +-----------------+    +--------v--------+      |
|                                           |               |
+-------------------------------------------+               |
                         +---------+                        |
                         |   END   <------------------------+
                         +---------+


The script should act immediately after the ProxySQL SHUNNED the node step, just replacing the READ_ONLY=1 with READ_ONLY=0 on the reader node with the lowest READ WEIGHT.

ProxySQL will do the rest, copying the node into the WRITER HG, keeping low weight, such that WHEN/IF the original node will comeback the new node will not compete for traffic.

Since it included that special function in the check, the feature allows automatic fail-over. This experimental feature is active only if explicitly set in the parameter that the scheduler passes to the script. To activate it add --active_failover in the scheduler. My recommendation is to have two entries in the scheduler and activate the one with --active_failover for testing, and remember to deactivate the other one.

Let see the manual procedure first:

The process is:

1 Generate some load
2 Kill the writer node
3 Manually elect a reader as writer
4 Recover crashed node

Current load:

+-----------+-------------+----------+--------+----------+----------+--------+---------+---------+-----------------+-----------------+------------+
| hostgroup | srv_host    | srv_port | status | ConnUsed | ConnFree | ConnOK | ConnERR | Queries | Bytes_data_sent | Bytes_data_recv | Latency_ms |
+-----------+-------------+----------+--------+----------+----------+--------+---------+---------+-----------------+-----------------+------------+
| 500       | 192.168.1.5 | 3306     | ONLINE | 10       | 0        | 10     | 0       | 30324   | 2437438         | 0               | 153        |
| 501       | 192.168.1.5 | 3306     | ONLINE | 0        | 1        | 1      | 0       | 1519612 | 74006447        | 4734427711      | 153        |
| 501       | 192.168.1.6 | 3306     | ONLINE | 4        | 8        | 12     | 0       | 7730857 | 376505014       | 24119645457     | 156        |
| 501       | 192.168.1.7 | 3306     | ONLINE | 2        | 10       | 12     | 0       | 7038332 | 342888697       | 21985442619     | 178        |
| 501       | 192.168.1.8 | 3306     | ONLINE | 1        | 0        | 1      | 0       | 612523  | 29835858        | 1903693835      | 337        |
| 501       | 192.168.1.9 | 3306     | ONLINE | 1        | 0        | 1      | 0       | 611021  | 29769497        | 1903180139      | 366        |
+-----------+-------------+----------+--------+----------+----------+--------+---------+---------+-----------------+-----------------+------------+


Kill the main node 192.168.1.5:

+-----------+-------------+----------+---------+----------+----------+--------+---------+---------+-----------------+-----------------+------------+
| hostgroup | srv_host    | srv_port | status  | ConnUsed | ConnFree | ConnOK | ConnERR | Queries | Bytes_data_sent | Bytes_data_recv | Latency_ms |
+-----------+-------------+----------+---------+----------+----------+--------+---------+---------+-----------------+-----------------+------------+
| 501       | 192.168.1.5 | 3306     | SHUNNED | 0        | 0        | 1      | 11      | 1565987 | 76267703        | 4879938857      | 119        |
| 501       | 192.168.1.6 | 3306     | ONLINE  | 1        | 11       | 12     | 0       | 8023216 | 390742215       | 25033271548     | 112        |
| 501       | 192.168.1.7 | 3306     | ONLINE  | 1        | 11       | 12     | 0       | 7306838 | 355968373       | 22827016386     | 135        |
| 501       | 192.168.1.8 | 3306     | ONLINE  | 1        | 0        | 1      | 0       | 638326  | 31096065        | 1984732176      | 410        |
| 501       | 192.168.1.9 | 3306     | ONLINE  | 1        | 0        | 1      | 0       | 636857  | 31025014        | 1982213114      | 328        |
+-----------+-------------+----------+---------+----------+----------+--------+---------+---------+-----------------+-----------------+------------+
+-------------+------+------------------+----------------------+------------------------------------------------------+
| hostname    | port | time_start_us    | ping_success_time_us | ping_error                                           |
+-------------+------+------------------+----------------------+------------------------------------------------------+
| 192.168.1.5 | 3306 | 1473070640798571 | 0                    | Can't connect to MySQL server on '192.168.1.5' (107) |
| 192.168.1.5 | 3306 | 1473070610798464 | 0                    | Can't connect to MySQL server on '192.168.1.5' (107) |
+-------------+------+------------------+----------------------+------------------------------------------------------+
+-------------+------+------------------+-------------------------+------------------------------------------------------+
| hostname    | port | time_start_us    | connect_success_time_us | connect_error                                        |
+-------------+------+------------------+-------------------------+------------------------------------------------------+
| 192.168.1.5 | 3306 | 1473070640779903 | 0                       | Can't connect to MySQL server on '192.168.1.5' (107) |
| 192.168.1.5 | 3306 | 1473070580779977 | 0                       | Can't connect to MySQL server on '192.168.1.5' (107) |
+-------------+------+------------------+-------------------------+------------------------------------------------------+


When the node is killed ProxySQL, shun it and report issues with the checks (connect and ping). During this time frame the application will experience issues if it is not designed to manage the retry and eventually a queue, and it will crash.

Sysbench reports the errors:

Writes

[  10s] threads: 10, tps: 6.70, reads: 68.50, writes: 30.00, response time: 1950.53ms (95%), errors: 0.00, reconnects:  0.00
...
[1090s] threads: 10, tps: 4.10, reads: 36.90, writes: 16.40, response time: 2226.45ms (95%), errors: 0.00, reconnects:  1.00  <-+ killing the node
[1100s] threads: 10, tps: 0.00, reads: 0.00, writes: 0.00, response time: 0.00ms (95%), errors: 1.00, reconnects:  0.00         |
[1110s] threads: 10, tps: 0.00, reads: 0.00, writes: 0.00, response time: 0.00ms (95%), errors: 1.00, reconnects:  0.00         |
[1120s] threads: 10, tps: 0.00, reads: 0.00, writes: 0.00, response time: 0.00ms (95%), errors: 1.00, reconnects:  0.00         |
[1130s] threads: 10, tps: 0.00, reads: 0.00, writes: 0.00, response time: 0.00ms (95%), errors: 1.00, reconnects:  0.00         |-- Gap waiting for a node to become
[1140s] threads: 10, tps: 0.00, reads: 0.00, writes: 0.00, response time: 0.00ms (95%), errors: 1.00, reconnects:  0.00         |   READ_ONLY=0
[1150s] threads: 10, tps: 0.00, reads: 0.00, writes: 0.00, response time: 0.00ms (95%), errors: 1.00, reconnects:  0.00         |
[1160s] threads: 10, tps: 0.00, reads: 0.00, writes: 0.00, response time: 0.00ms (95%), errors: 1.00, reconnects:  0.00         |
[1170s] threads: 10, tps: 4.70, reads: 51.30, writes: 22.80, response time: 80430.18ms (95%), errors: 0.00, reconnects:  0.00 <-+
[1180s] threads: 10, tps: 8.90, reads: 80.10, writes: 35.60, response time: 2068.39ms (95%), errors: 0.00, reconnects:  0.00
...
 [1750s] threads: 10, tps: 5.50, reads: 49.80, writes: 22.80, response time: 2266.80ms (95%), errors: 0.00, reconnects:  0.00 -- No additional errors


I decided to promote node 192.168.1.6 given the weight for readers was equal and as such no difference in this setup.

(root@localhost:pm) [(none)]>set global read_only=0;
Query OK, 0 rows affected (0.00 sec)


Checking ProxySQL:

+-----------+-------------+----------+---------+----------+----------+--------+---------+---------+-----------------+-----------------+------------+
| hostgroup | srv_host    | srv_port | status  | ConnUsed | ConnFree | ConnOK | ConnERR | Queries | Bytes_data_sent | Bytes_data_recv | Latency_ms |
+-----------+-------------+----------+---------+----------+----------+--------+---------+---------+-----------------+-----------------+------------+
| 500       | 192.168.1.6 | 3306     | ONLINE  | 10       | 0        | 10     | 0       | 1848    | 148532          | 0               | 40         |
| 501       | 192.168.1.5 | 3306     | SHUNNED | 0        | 0        | 1      | 72      | 1565987 | 76267703        | 4879938857      | 38         |
| 501       | 192.168.1.6 | 3306     | ONLINE  | 2        | 10       | 12     | 0       | 8843069 | 430654903       | 27597990684     | 40         |
| 501       | 192.168.1.7 | 3306     | ONLINE  | 1        | 11       | 12     | 0       | 8048826 | 392101994       | 25145582384     | 83         |
| 501       | 192.168.1.8 | 3306     | ONLINE  | 1        | 0        | 1      | 0       | 725820  | 35371512        | 2259974847      | 227        |
| 501       | 192.168.1.9 | 3306     | ONLINE  | 1        | 0        | 1      | 0       | 723582  | 35265066        | 2254824754      | 290        |
+-----------+-------------+----------+---------+----------+----------+--------+---------+---------+-----------------+-----------------+------------+


As the READ_ONLY value is modified, ProxySQL moves it to the writer HG, and writes can take place again. At this point in time production activities are recovered.

Reads had just a minor glitch:

Reads

[  10s] threads: 10, tps: 0.00, reads: 20192.15, writes: 0.00, response time: 6.96ms (95%), errors: 0.00, reconnects:  0.00
...
[ 410s] threads: 10, tps: 0.00, reads: 16489.03, writes: 0.00, response time: 9.41ms (95%), errors: 0.00, reconnects:  2.50
...
[ 710s] threads: 10, tps: 0.00, reads: 18789.40, writes: 0.00, response time: 6.61ms (95%), errors: 0.00, reconnects:  0.00


The glitch happened when node 192.168.1.6 was copied over to HG 500, but with no interruptions or errors. At this point let us put back the crashed node, which comes back elect Node2 (192.168.1.6) as Donor.

This was a Percona XtraDB Cluster/Galera choice, and we have to accept and manage it.

Note that the other basic scripts put the node in OFFLINE_SOFT, given the node will become a DONOR.
Galera_check will recognize that Node2 (192.168.1.6) is the only active node in the segment for that specific HG (writer), while is not the only present for the READER HG.

As such, it will put the node in OFFLINE_SOFT only for the READER HG, trying to reduce the load on the node, but it will keep it active in the WRITER HG, to prevent service interruption.

Restart the node and ask for a donor:

2016-09-05 12:21:43 8007 [Note] WSREP: Flow-control interval: [67, 67]
2016-09-05 12:21:45 8007 [Note] WSREP: Member 1.1 (node1) requested state transfer from '*any*'. Selected 0.1 (node2)(SYNCED) as donor.
2016-09-05 12:21:46 8007 [Note] WSREP: (ef248c1f, 'tcp://192.168.1.8:4567') turning message relay requesting off
2016-09-05 12:21:52 8007 [Note] WSREP: New cluster view: global state: 234bb6ed-527d-11e6-9971-e794f632b140:324329, view# 7: Primary, number of nodes: 5, my index: 3, protocol version 3


Galera_check  sets OFFLINE_SOFT 192.168.1.6 only for the READER HG, and ProxySQL uses the others to serve reads.

+-----------+-------------+----------+--------------+----------+----------+--------+---------+----------+-----------------+-----------------+------------+
| hostgroup | srv_host    | srv_port | status       | ConnUsed | ConnFree | ConnOK | ConnERR | Queries  | Bytes_data_sent | Bytes_data_recv | Latency_ms |
+-----------+-------------+----------+--------------+----------+----------+--------+---------+----------+-----------------+-----------------+------------+
| 500       | 192.168.1.6 | 3306     | ONLINE       | 10       | 0        | 10     | 0       | 7746     | 622557          | 0               | 86         |
| 501       | 192.168.1.5 | 3306     | ONLINE       | 0        | 0        | 1      | 147     | 1565987  | 76267703        | 4879938857      | 38         |
| 501       | 192.168.1.6 | 3306     | OFFLINE_SOFT | 0        | 0        | 12     | 0       | 9668944  | 470878452       | 30181474498     | 86         | <-- Node offline
| 501       | 192.168.1.7 | 3306     | ONLINE       | 9        | 3        | 12     | 0       | 10932794 | 532558667       | 34170366564     | 62         |
| 501       | 192.168.1.8 | 3306     | ONLINE       | 0        | 1        | 1      | 0       | 816599   | 39804966        | 2545765089      | 229        |
| 501       | 192.168.1.9 | 3306     | ONLINE       | 0        | 1        | 1      | 0       | 814893   | 39724481        | 2541760230      | 248        |
+-----------+-------------+----------+--------------+----------+----------+--------+---------+----------+-----------------+-----------------+------------+


When the SST donor task is over,  galera_check moves the 192.168.1.6 back ONLINE as expected. But at the same time, it moves the recovering node to the special HG to avoid to have it included in any activity until ready.

2016-09-05 12:22:36 27352 [Note] WSREP: 1.1 (node1): State transfer from 0.1 (node2) complete.
2016-09-05 12:22:36 27352 [Note] WSREP: Shifting JOINER -> JOINED (TO: 325062)
+-----------+-------------+----------+--------+----------+----------+--------+---------+----------+-----------------+-----------------+------------+
| hostgroup | srv_host    | srv_port | status | ConnUsed | ConnFree | ConnOK | ConnERR | Queries  | Bytes_data_sent | Bytes_data_recv | Latency_ms |
+-----------+-------------+----------+--------+----------+----------+--------+---------+----------+-----------------+-----------------+------------+
| 500       | 192.168.1.6 | 3306     | ONLINE | 10       | 0        | 10     | 0       | 1554     | 124909          | 0               | 35         |
| 501       | 192.168.1.6 | 3306     | ONLINE | 2        | 8        | 22     | 0       | 10341612 | 503637989       | 32286072739     | 35         |
| 501       | 192.168.1.7 | 3306     | ONLINE | 3        | 9        | 12     | 0       | 12058701 | 587388598       | 37696717375     | 13         |
| 501       | 192.168.1.8 | 3306     | ONLINE | 1        | 0        | 1      | 0       | 890102   | 43389051        | 2776691164      | 355        |
| 501       | 192.168.1.9 | 3306     | ONLINE | 0        | 1        | 1      | 0       | 887994   | 43296865        | 2772702537      | 250        |
| 9500      | 192.168.1.5 | 3306     | ONLINE | 0        | 0        | 0      | 0       | 0        | 0               | 0               | 57         | <-- Special HG for recover
| 9501      | 192.168.1.5 | 3306     | ONLINE | 0        | 0        | 0      | 0       | 0        | 0               | 0               | 57         | <-- Special HG for recover
+-----------+-------------+----------+--------+----------+----------+--------+---------+----------+-----------------+-----------------+------------+


Once finally the node is in SYNC with the group, it is put back online in the READER HG and in the writer HG:

2016-09-05 12:22:36 27352 [Note] WSREP: 1.1 (node1): State transfer from 0.1 (node2) complete.
2016-09-05 12:22:36 27352 [Note] WSREP: Shifting JOINER -> JOINED (TO: 325062)


+-----------+-------------+----------+--------+----------+----------+--------+---------+----------+-----------------+-----------------+------------+
| hostgroup | srv_host    | srv_port | status | ConnUsed | ConnFree | ConnOK | ConnERR | Queries  | Bytes_data_sent | Bytes_data_recv | Latency_ms |
+-----------+-------------+----------+--------+----------+----------+--------+---------+----------+-----------------+-----------------+------------+
| 500       | 192.168.1.5 | 3306     | ONLINE | 0        | 0        | 0      | 0       | 0        | 0               | 0               | 0          | <-- Back on line
| 500       | 192.168.1.6 | 3306     | ONLINE | 10       | 0        | 10     | 0       | 402      | 32317           | 0               | 68         |
| 501       | 192.168.1.5 | 3306     | ONLINE | 1        | 0        | 1      | 0       | 6285     | 305823          | 19592814        | 312        | <-- Back on line
| 501       | 192.168.1.6 | 3306     | ONLINE | 4        | 6        | 22     | 0       | 10818694 | 526870710       | 33779586475     | 68         |
| 501       | 192.168.1.7 | 3306     | ONLINE | 0        | 12       | 12     | 0       | 12492316 | 608504039       | 39056093665     | 26         |
| 501       | 192.168.1.8 | 3306     | ONLINE | 1        | 0        | 1      | 0       | 942023   | 45924082        | 2940228050      | 617        |
| 501       | 192.168.1.9 | 3306     | ONLINE | 1        | 0        | 1      | 0       | 939975   | 45834039        | 2935816783      | 309        |
+-----------+-------------+----------+--------+----------+----------+--------+---------+----------+-----------------+-----------------+------------+
+--------------+-------------+------+--------+------------+
| hostgroup_id | hostname    | port | status | weight     |
+--------------+-------------+------+--------+------------+
| 500          | 192.168.1.5 | 3306 | ONLINE | 100        |
| 500          | 192.168.1.6 | 3306 | ONLINE | 1000000000 |
| 501          | 192.168.1.5 | 3306 | ONLINE | 100        |
| 501          | 192.168.1.6 | 3306 | ONLINE | 1000000000 |
| 501          | 192.168.1.7 | 3306 | ONLINE | 1000000000 |
| 501          | 192.168.1.8 | 3306 | ONLINE | 1          |
| 501          | 192.168.1.9 | 3306 | ONLINE | 1          |
+--------------+-------------+------+--------+------------+


But given it is coming back with its READER WEIGHT, it will NOT compete with the previously elected WRITER.

The recovered node will stay on “hold” waiting for a DBA to act and eventually put it back, or be set as READ_ONLY and as such be fully removed from the WRITER HG.

Let see the automatic procedure now:

For the moment, we just stick to the MANUAL failover process. The process is:

  1. Generate some load
  2. Kill the writer node
  3. Script will do auto-failover
  4. Recover crashed node

Check our scheduler config:

+----+--------+-------------+-----------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+------+------+------+------+---------+
| id | active | interval_ms | filename | arg1 | arg2 | arg3 | arg4 | arg5 | comment |
+----+--------+-------------+-----------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+------+------+------+------+---------+
| 10 | 1 | 2000 | /var/lib/proxysql/galera_check.pl | -u=admin -p=admin -h=192.168.1.50 -H=500:W,501:R -P=3310 --execution_time=1 --retry_down=2 --retry_up=1 --main_segment=1 --active_failover --debug=0 --log=/var/lib/proxysql/galeraLog | NULL | NULL | NULL | NULL | | <--- Active
| 20 | 0 | 1500 | /var/lib/proxysql/galera_check.pl | -u=admin -p=admin -h=192.168.1.50 -H=500:W,501:R -P=3310 --execution_time=1 --retry_down=2 --retry_up=1 --main_segment=1 --debug=0 --log=/var/lib/proxysql/galeraLog | NULL | NULL | NULL | NULL | |
+----+--------+-------------+-----------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+------+------+------+------+---------+


The active one is the one with auto-failover. Start load and check current load:

+-----------+-------------+----------+--------+----------+----------+--------+---------+---------+-----------------+-----------------+------------+
| hostgroup | srv_host    | srv_port | status | ConnUsed | ConnFree | ConnOK | ConnERR | Queries | Bytes_data_sent | Bytes_data_recv | Latency_ms |
+-----------+-------------+----------+--------+----------+----------+--------+---------+---------+-----------------+-----------------+------------+
| 500       | 192.168.1.5 | 3306     | ONLINE | 10       | 0        | 10     | 0       | 952     | 76461           | 0               | 0          |
| 501       | 192.168.1.5 | 3306     | ONLINE | 1        | 0        | 1      | 0       | 53137   | 2587784         | 165811100       | 167        |
| 501       | 192.168.1.6 | 3306     | ONLINE | 5        | 5        | 11     | 0       | 283496  | 13815077        | 891230826       | 109        |
| 501       | 192.168.1.7 | 3306     | ONLINE | 3        | 7        | 10     | 0       | 503516  | 24519457        | 1576198138      | 151        |
| 501       | 192.168.1.8 | 3306     | ONLINE | 1        | 0        | 1      | 0       | 21952   | 1068972         | 68554796        | 300        |
| 501       | 192.168.1.9 | 3306     | ONLINE | 0        | 1        | 1      | 0       | 21314   | 1038593         | 67043935        | 289        |
+-----------+-------------+----------+--------+----------+----------+--------+---------+---------+-----------------+-----------------+------------+


Kill the main node 192.168.1.5:

+-----------+-------------+----------+---------+----------+----------+--------+---------+---------+-----------------+-----------------+------------+
| hostgroup | srv_host    | srv_port | status  | ConnUsed | ConnFree | ConnOK | ConnERR | Queries | Bytes_data_sent | Bytes_data_recv | Latency_ms |
+-----------+-------------+----------+---------+----------+----------+--------+---------+---------+-----------------+-----------------+------------+
| 500       | 192.168.1.6 | 3306     | ONLINE  | 10       | 0        | 10     | 0       | 60      | 4826            | 0               | 0          |
| 501       | 192.168.1.5 | 3306     | SHUNNED | 0        | 0        | 1      | 11      | 177099  | 8626778         | 552221651       | 30         |
| 501       | 192.168.1.6 | 3306     | ONLINE  | 3        | 7        | 11     | 0       | 956724  | 46601110        | 3002941482      | 49         |
| 501       | 192.168.1.7 | 3306     | ONLINE  | 2        | 8        | 10     | 0       | 1115685 | 54342756        | 3497575125      | 42         |
| 501       | 192.168.1.8 | 3306     | ONLINE  | 0        | 1        | 1      | 0       | 76289   | 3721419         | 240157393       | 308        |
| 501       | 192.168.1.9 | 3306     | ONLINE  | 1        | 0        | 1      | 0       | 75803   | 3686067         | 236382784       | 231        |
+-----------+-------------+----------+---------+----------+----------+--------+---------+---------+-----------------+-----------------+------------+


When the node is killed the node is SHUNNED, but this time the script already set the new node 192.168.1.6 to ONLINE. See script log:

2016/09/08 14:04:02.494:[INFO] END EXECUTION Total Time:102.347850799561
2016/09/08 14:04:04.478:[INFO] This Node Try to become a WRITER set READ_ONLY to 0 192.168.1.6:3306:HG501
2016/09/08 14:04:04.479:[INFO] This Node NOW HAS READ_ONLY = 0 192.168.1.6:3306:HG501
2016/09/08 14:04:04.479:[INFO] END EXECUTION Total Time:71.8140602111816


More importantly, let’s look at the application experience:

Writes

[  10s] threads: 10, tps: 9.40, reads: 93.60, writes: 41.60, response time: 1317.41ms (95%), errors: 0.00, reconnects:  0.00
[  20s] threads: 10, tps: 8.30, reads: 74.70, writes: 33.20, response time: 1350.96ms (95%), errors: 0.00, reconnects:  0.00
[  30s] threads: 10, tps: 8.30, reads: 74.70, writes: 33.20, response time: 1317.81ms (95%), errors: 0.00, reconnects:  0.00
[  40s] threads: 10, tps: 7.80, reads: 70.20, writes: 31.20, response time: 1407.51ms (95%), errors: 0.00, reconnects:  0.00
[  50s] threads: 10, tps: 6.70, reads: 60.30, writes: 26.80, response time: 2259.35ms (95%), errors: 0.00, reconnects:  0.00
[  60s] threads: 10, tps: 6.60, reads: 59.40, writes: 26.40, response time: 3275.78ms (95%), errors: 0.00, reconnects:  0.00
[  70s] threads: 10, tps: 5.70, reads: 60.30, writes: 26.80, response time: 1492.56ms (95%), errors: 0.00, reconnects:  1.00 <-- just a reconnect experience
[  80s] threads: 10, tps: 6.70, reads: 60.30, writes: 26.80, response time: 7959.74ms (95%), errors: 0.00, reconnects:  0.00
[  90s] threads: 10, tps: 6.60, reads: 59.40, writes: 26.40, response time: 2109.03ms (95%), errors: 0.00, reconnects:  0.00
[ 100s] threads: 10, tps: 6.40, reads: 57.60, writes: 25.60, response time: 1883.96ms (95%), errors: 0.00, reconnects:  0.00
[ 110s] threads: 10, tps: 5.60, reads: 50.40, writes: 22.40, response time: 2167.27ms (95%), errors: 0.00, reconnects:  0.00


With no errors and no huge delay, our application (managing to reconnect) had only a glitch, and had to reconnect.

Read had no errors or reconnects.

The connection errors were managed by ProxySQL, and given it found five in one second it SHUNNED the node. Thegalera_script was able to promote a reader, and given it is a failover, no delay with retry loop. The whole thing was done in such brief time that application barely saw it.

Obviously, an application with thousands of connections/sec will experience larger impact, but the time-window will be very narrow. Once the failed node is ready to come back, either we choose to start it with READ_ONLY=1, and it will come back as the reader.
Or we will keep it as it is and it will come back as the writer.

No matter what, the script manages the case as it had done in the previous (manual) exercise.

Conclusions

ProxySQL and galera_check, when working together, are quite efficient in managing the cluster and its different scenarios. When using the single-writer mode, solving the manual part of the failover dramatically improves the efficiency in production state recovery performance — going from few minutes to seconds or less.

The multi-writer mode remains the preferred and most recommended way to use ProxySQL/Percona XtraDB Cluster given it performs failover without the need of additional scripts or extensions. It’s also the preferred method if a script is required to manage the integration with ProxySQL.

In both cases, the use of a script can identify the multiple states of Percona XtraDB Cluster and the mutable node scenario. It is a crucial part of the implementation, without which ProxySQL might not behave correctly.

Interested in reducing database costs by moving from Oracle Enterprise to open source subscription?  Read the total cost of ownership (TCO) analysis. Brought to you in partnership with MariaDB.

Topics:
proxysql ,xtradb ,mysql

Published at DZone with permission of Marco Tusa, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.

THE DZONE NEWSLETTER

Dev Resources & Solutions Straight to Your Inbox

Thanks for subscribing!

Awesome! Check your inbox to verify your email so you can start receiving the latest in tech news and resources.

X

{{ parent.title || parent.header.title}}

{{ parent.tldr }}

{{ parent.urlSource.name }}