DZone
Thanks for visiting DZone today,
Edit Profile
  • Manage Email Subscriptions
  • How to Post to DZone
  • Article Submission Guidelines
Sign Out View Profile
  • Post an Article
  • Manage My Drafts
Over 2 million developers have joined DZone.
Log In / Join
Please enter at least three characters to search
Refcards Trend Reports
Events Video Library
Refcards
Trend Reports

Events

View Events Video Library

Zones

Culture and Methodologies Agile Career Development Methodologies Team Management
Data Engineering AI/ML Big Data Data Databases IoT
Software Design and Architecture Cloud Architecture Containers Integration Microservices Performance Security
Coding Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks
Culture and Methodologies
Agile Career Development Methodologies Team Management
Data Engineering
AI/ML Big Data Data Databases IoT
Software Design and Architecture
Cloud Architecture Containers Integration Microservices Performance Security
Coding
Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance
Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks

Modernize your data layer. Learn how to design cloud-native database architectures to meet the evolving demands of AI and GenAI workkloads.

Secure your stack and shape the future! Help dev teams across the globe navigate their software supply chain security challenges.

Releasing software shouldn't be stressful or risky. Learn how to leverage progressive delivery techniques to ensure safer deployments.

Avoid machine learning mistakes and boost model performance! Discover key ML patterns, anti-patterns, data strategies, and more.

Related

  • Enhancing Performance: Optimizing Complex MySQL Queries for Large Datasets
  • Common Mistakes to Avoid When Writing SQL Code
  • DuckDB Optimization: A Developer's Guide to Better Performance
  • Cost Optimization Strategies for Managing Large-Scale Open-Source Databases

Trending

  • Segmentation Violation and How Rust Helps Overcome It
  • Streamlining Event Data in Event-Driven Ansible
  • Key Considerations in Cross-Model Migration
  • Hybrid Cloud vs Multi-Cloud: Choosing the Right Strategy for AI Scalability and Security
  1. DZone
  2. Data Engineering
  3. Databases
  4. How to Optimize MySQL UNION for High Speed

How to Optimize MySQL UNION for High Speed

By 
Sean Hull user avatar
Sean Hull
·
Jun. 17, 13 · Interview
Likes (0)
Comment
Save
Tweet
Share
23.2K Views

Join the DZone community and get the full member experience.

Join For Free

There are two ways to speedup UNIONs in a MySQL database. First use UNION ALL if at all possible, and second try to push down your conditions.

1. UNION ALL is much faster than UNION

How does a UNION work? Imagine you have two tables for shirts. The short_sleeve table looks like this:

blue
green
gray
black

And long_sleeve another that looks like this:

red
green
yellow
blue

Related: Why Generalists are Better at Scaling the Web

If you UNION those two tables, first MySQL will sort the combined set into a temp table like this:

black
blue
blue
gray
green
green
red
yellow

Once it’s done this sort, it can easily remove the duplicate blue & duplicate green for this resulting set:

black
blue
gray
green
red
yellow

See also: Mythical MySQL DBA – the talent drought.

Why does it do this? UNION is defined that way in SQL. Duplicates must be removed and this is an efficient way for the MySQL engine to remove them. Combine results, sort, remove duplicates and return the set.

Queries with UNION can be accelerated in two ways. Switch to UNION ALL or try to push ORDER BY, LIMIT and WHERE conditions inside each subquery. You’ll be glad you did!

What if we did UNION ALL? The result would look like this:

blue
green
gray
black
red
green
yellow
blue

Read this: MySQL DBA Interview & Hiring Guide.

It doesn’t have to sort, and doesn’t have to remove duplicates. If you imagine combining two 10 million row tables, and don’t have to sort, this speedup can be HUGE.

2. Use Push-down Conditions to speedup UNION in MySQL

Imagine with our example above the shirts have a design date, the year they were released. Yes we’re keeping this example very simple to illustrate the concept.

Here is the short_sleeve table:

blue		2013
green		2013
green		2012
gray		2011
black		2009
black		2011

And long_sleeve table looks like this:

red		2012
red		2013
green		2011
yellow	2010
blue		2011

For 2013 designs could combine them like this:

(SELECT type, release FROM short_sleeve)
UNION
(SELECT type, release FROM long_sleeve);
WHERE release >=2013;

See also: 5 More Things Deadly to Scalability and the original 5 Things Toxic to Scalability..

Here the WHERE clause works on this 11 record temp table:

black		2009
black		2011
blue		2011
blue		2013
gray		2011
green		2013
green		2012
green		2011
red		2012
red		2013
yellow	2010

But it would be much faster to move the WHERE inside each subquery like this:

(SELECT type, release FROM short_sleeve WHERE release >=2013)
UNION
(SELECT type, release FROM long_sleeve WHERE release >=2013);

That would be operating on a combined 3 record table. Faster to sort & remove duplicates. Smaller result sets cache better too, providing a pay forward dividend. That’s what performance optimization is all about!

Read this: RDS or MySQL – 10 Use Cases.

Remember multi-million row sets in each part of this query will quickly illustrate the optimization. We’re using very small results to make visualizing easier.

You can also use this optimization for ORDER BY and for LIMIT conditions. By reducing the number of records returned by EACH PART of the UNION, you reduce the work that happens at the stage where they are all combined.

If you’re seeing some UNION queries in your slow query log, I suggest you try this optimization out and see if you can tweak it.

Database MySQL optimization

Published at DZone with permission of Sean Hull. See the original article here.

Opinions expressed by DZone contributors are their own.

Related

  • Enhancing Performance: Optimizing Complex MySQL Queries for Large Datasets
  • Common Mistakes to Avoid When Writing SQL Code
  • DuckDB Optimization: A Developer's Guide to Better Performance
  • Cost Optimization Strategies for Managing Large-Scale Open-Source Databases

Partner Resources

×

Comments
Oops! Something Went Wrong

The likes didn't load as expected. Please refresh the page and try again.

ABOUT US

  • About DZone
  • Support and feedback
  • Community research
  • Sitemap

ADVERTISE

  • Advertise with DZone

CONTRIBUTE ON DZONE

  • Article Submission Guidelines
  • Become a Contributor
  • Core Program
  • Visit the Writers' Zone

LEGAL

  • Terms of Service
  • Privacy Policy

CONTACT US

  • 3343 Perimeter Hill Drive
  • Suite 100
  • Nashville, TN 37211
  • support@dzone.com

Let's be friends:

Likes
There are no likes...yet! 👀
Be the first to like this post!
It looks like you're not logged in.
Sign in to see who liked this post!