ProxySQL Rules: Applying and Chaining the Rules
ProxySQL's rules, as necessary as they can be, can lead to performance problems if they aren't used with care. Linking them together can give you a leg up.
Join the DZone community and get the full member experience.
Join For FreeIn this post, I am going to show you how you can minimize the performance impact of ProxySQL rules by using some finesse.
Apply Test
In my previous post, we could see the effect of the rules on ProxySQL performance. As we could also see, the “apply” option does not help with 1000 tables. Are we sure about this? Let’s consider: if we know 90% of our traffic won’t match any rules, it doesn’t matter if we have 10 or 500 rules – it has to check all of them. And this is going to have a serious effect on performance. How can we avoid that?
Let’s insert rule number ONE, which matches all queries, like this:
insert into mysql_query_rules (username,destination_hostgroup,active,retries,match_digest,apply) values('testuser_rw',600,1,3,'(from|into|update|into table) sbtest([1-9]d{3,}|[1-9][0-9][1-9])b',1);
This rule matches all queries where table names > sbtest100. But again, this logic also can be applied on “userids” or any other keys. We just have to know our application and our query distribution.
With this rule, the 90% of the queries have to check only one rule (the first one):
Now we have 101 rules, but the performance is almost the same as when we had only ten rules! As we can see, creating the rules based on our query distribution has a huge impact!
But what if we don’t know which queries are the busiest, or every query has the same amount of hits? Can we do anything? Yes, we can.
Chaining
In my previous post, I mentioned the “flagIN”, “flagOUT” options. With these options, we can chain the rules. But why is that good for us?
If we have 100 rules and 100 tables, even with applying, on average ProxySQL has to check 50 rules. But if we write rules like these:
insert into mysql_query_rules (flagin,flagout,username,active,retries,match_digest,apply) VALUES
(0,1000,'testuser_rw',1,3,'(from|into|update|into table) sbtest.b',0),
(0,1100,'testuser_rw',1,3,'(from|into|update|into table) sbtest1.b',0),
(0,1200,'testuser_rw',1,3,'(from|into|update|into table) sbtest2.b',0),
(0,1300,'testuser_rw',1,3,'(from|into|update|into table) sbtest3.b',0),
(0,1400,'testuser_rw',1,3,'(from|into|update|into table) sbtest4.b',0),
(0,1500,'testuser_rw',1,3,'(from|into|update|into table) sbtest5.b',0),
(0,1600,'testuser_rw',1,3,'(from|into|update|into table) sbtest6.b',0),
(0,1700,'testuser_rw',1,3,'(from|into|update|into table) sbtest7.b',0),
(0,1800,'testuser_rw',1,3,'(from|into|update|into table) sbtest8.b',0),
(0,1900,'testuser_rw',1,3,'(from|into|update|into table) sbtest9.b',0);
insert into mysql_query_rules (flagin,destination_hostgroup,active,match_digest,apply) VALUES
(1100,600,1,'(from|into|update|into table) sbtest11b',1),
(1100,600,1,'(from|into|update|into table) sbtest12b',1),
(1100,600,1,'(from|into|update|into table) sbtest13b',1),
(1100,600,1,'(from|into|update|into table) sbtest14b',1),
(1100,600,1,'(from|into|update|into table) sbtest15b',1),
(1100,600,1,'(from|into|update|into table) sbtest16b',1),
(1100,600,1,'(from|into|update|into table) sbtest17b',1),
(1100,600,1,'(from|into|update|into table) sbtest18b',1),
(1100,600,1,'(from|into|update|into table) sbtest19b',1);
...
We are going to have more than 100 rules, but first, we match on the first digit after the second and then go on. With this approach, ProxySQL has to only check 15 rules on average.
Let’s see the results:
As we can see, even with more rules, chaining is way faster than without chaining.
Tips
Hits
ProxySQL keeps statistics about a rule’s hits. When you add a rule you can see how many queries it applied to:
select * from stats_mysql_query_rules;
+---------+------+
| rule_id | hits |
+---------+------+
| 2 | 6860 |
| 3 | 6440 |
| 4 | 6880 |
| 5 | 6610 |
| 6 | 6850 |
| 7 | 7130 |
| 8 | 6620 |
| 9 | 7300 |
| 10 | 6750 |
| 11 | 7050 |
| 12 | 7280 |
| 13 | 6780 |
| 14 | 6670 |
...
Query_Processor_time_nsec
ProxySQL does not record how much time it spends on a rule (not yet, anyway: https://github.com/sysown/proxysql/issues/966), but it has a global stat:
select * from stats_mysql_global where Variable_name="Query_Processor_time_nsec";
+---------------------------+----------------+
| Variable_Name | Variable_Value |
+---------------------------+----------------+
| Query_Processor_time_nsec | 3184114671740 |
+---------------------------+----------------+
You can monitor this statistic, and if you see a huge increase after you added a rule, you might want to review it again.
Conclusion
ProxySQL can handle many rules, and of course, they have some costs. But if you design your rules based on your workload and your query distribution, you can minimize this cost a lot.
Published at DZone with permission of Tibor Korocz, DZone MVB. See the original article here.
Opinions expressed by DZone contributors are their own.
Comments