Over a million developers have joined DZone.
{{announcement.body}}
{{announcement.title}}

When Optimal MySQL Isn't Best

DZone's Guide to

When Optimal MySQL Isn't Best

When it comes to MySQL query construction, what happens when ''optimal'' isn't ''best?'' See how one developer is trying to overcome that roadblock.

· Database Zone
Free Resource

Download the Guide to Open Source Database Selection: MySQL vs. MariaDB and see how the side-by-side comparison of must-have features will ease the journey. Brought to you in partnership with MariaDB.

The MySQL optimal query plan strategy isn’t always the best option. In this post, I’ll show you how to help MySQL choose the best strategy for maximum performance.

As an example, I will discuss a real life use case when we needed to select some amount of the latest news (let’s say 25 news articles) by tags. Suppose we have 1 billion records in the news table and 10,000 records in table tags. The natural relation between news and tags will be many-to-many using bridge table tags_in_news as shown below:

And the SQL query will look like:

select n.* from news n, tags_in_news tin
    where 'getNewsByTags' = 'getNewsByTags'
    and n.deleted = false
    and n.id = tin.news_id
    and tin.deleted = false 
    and tin.uk_tag_id in (0, 1119, 2228, 4031, 1792, 1117, 3954, 2150, 3919, 4087)
    and n.modified_date <= Now()
    order by n.modified_date desc limit 25


After the first run of this query, its execution time is not as fast as I would like it to be. It's because of the order by sorting construction. So, I created index for the modified_date field:

create index ix_modified_date on news (modified_date)


But even after that, MySQL did not choose the query plan strategy using this index — it chose the optimal strategy using another index, which was not the best strategy. So I had to force MySQL to use this index. Fortunately, we can use force index construction, and now our query will look like this:

select n.* from news n force index (ix_modified_date), tags_in_news tin
    where 'getNewsByTags' = 'getNewsByTags'
    and n.deleted = false and n.id = tin.news_id
    and tin.deleted = false
    and tin.uk_tag_id in (0, 1119, 2228, 4031, 1792, 1117, 3954, 2150, 3919, 4087)
    and n.modified_date <= Now()
    order by n.modified_date desc limit 25


Bingo! Query time decreased from more than a second to less than 50 ms. I discovered the best query execution strategy. So the problem was resolved... Until one “fine” morning when our DB server fell down. After a slow_query_log investigation, I found out that for some tag values, our query executes dramatically slowly.

And the cause was really strange: the total number of records returned from the query was less than the limit value. In such cases, it’s better not to use force index and let MySQL decide how to execute the query. Hmm... looks like the optimal strategy is better than the best strategy? So, I had two options: to give up and use the optimal strategy or to fight and find a solution to get rid of this issue. I’ve chosen the second one. But more on it in my next article. Stay tuned!

Interested in reducing database costs by moving from Oracle Enterprise to open source subscription?  Read the total cost of ownership (TCO) analysis. Brought to you in partnership with MariaDB.

Topics:
mysql ,query ,tags ,java 8 ,mysql database

Published at DZone with permission of Ivan Shulak. See the original article here.

Opinions expressed by DZone contributors are their own.

THE DZONE NEWSLETTER

Dev Resources & Solutions Straight to Your Inbox

Thanks for subscribing!

Awesome! Check your inbox to verify your email so you can start receiving the latest in tech news and resources.

X

{{ parent.title || parent.header.title}}

{{ parent.tldr }}

{{ parent.urlSource.name }}