This post will be short. It's about a lesson I learned while tuning SQL that loads items from unrelated tables using the same WHERE clause. The tuning was done on Oracle 11g, but I am fairly confident that it applies to most SQL databases.
Applying the "WHERE" clause after the whole "UNION" has been performed is significantly slower than applying the "WHERE" clause inside the inner selects.
This, provided the inner WHERE clauses guarantee that there are no duplicates...
(SELECT * FROM TABLE_1 WHERE COL > 1) UNION ALL (SELECT * FROM TABLE_2 WHERE COL > 1);
... is better than the following ...
(SELECT * FROM TABLE_1 WHERE COL > 1) UNION (SELECT * FROM TABLE_2 WHERE COL > 1);
... which is better than the following ...
SELECT * FROM ((SELECT * FROM TABLE_1) UNION (SELECT * FROM TABLE_2)) WHERE COL > 1;
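To check that the three variants really return the same rows when the inner WHERE clauses leave no duplicates, here is a minimal sketch using Python's built-in sqlite3 module. Everything in it is an assumption made for illustration: SQLite stands in for Oracle, the NAME column and the sample rows are made up, and the parentheses around the inner selects are dropped because SQLite does not accept them in a compound select. It only compares results; it says nothing about timing.

```python
import sqlite3

# In-memory SQLite database standing in for Oracle (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TABLE_1 (COL INTEGER, NAME TEXT);
    CREATE TABLE TABLE_2 (COL INTEGER, NAME TEXT);
    -- Made-up rows; no row appears in both tables.
    INSERT INTO TABLE_1 VALUES (1, 'a'), (2, 'b'), (3, 'c');
    INSERT INTO TABLE_2 VALUES (1, 'x'), (4, 'y'), (5, 'z');
""")

QUERIES = {
    "inner WHERE, UNION ALL": """
        SELECT * FROM TABLE_1 WHERE COL > 1
        UNION ALL
        SELECT * FROM TABLE_2 WHERE COL > 1
    """,
    "inner WHERE, UNION": """
        SELECT * FROM TABLE_1 WHERE COL > 1
        UNION
        SELECT * FROM TABLE_2 WHERE COL > 1
    """,
    "outer WHERE, UNION": """
        SELECT * FROM (
            SELECT * FROM TABLE_1
            UNION
            SELECT * FROM TABLE_2
        ) WHERE COL > 1
    """,
}

def run(sql):
    """Execute a query and return its rows sorted, so row order does not matter."""
    return sorted(conn.execute(sql).fetchall())

results = {label: run(sql) for label, sql in QUERIES.items()}

for label, rows in results.items():
    print(label, rows)
```

All three queries should print the same four rows here. The difference between them is in how much work the engine does to get there, which is best measured on your own data and database.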
- UNION ALL is faster than UNION because plain UNION assumes the two combined datasets may contain duplicates and has to find and remove them. If you can ensure (by the inner WHERE clauses) that there will be no duplicates, it's far better to use UNION ALL and let the database engine optimize the inner selects.
- Applying a WHERE clause to the combined result is expensive because the engine builds more intermediate rows than you need before it filters them. The engine also has less room to optimize, since the filter is applied to the merged result of unrelated tables rather than to each table separately.
- See this.
- And this.
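The duplicate handling that makes plain UNION more expensive can be seen in a small sketch, again using Python's sqlite3 module as a stand-in for Oracle. The tables T1 and T2 and their rows are hypothetical; the value 3 is deliberately placed in both tables so that the inner WHERE clauses do not rule out duplicates.

```python
import sqlite3

# Hypothetical tables; the value 3 appears in both on purpose.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE T1 (COL INTEGER);
    CREATE TABLE T2 (COL INTEGER);
    INSERT INTO T1 VALUES (2), (3);
    INSERT INTO T2 VALUES (3), (4);
""")

# UNION has to detect and drop the duplicate row.
union_rows = sorted(conn.execute(
    "SELECT COL FROM T1 WHERE COL > 1 UNION SELECT COL FROM T2 WHERE COL > 1"
).fetchall())

# UNION ALL simply concatenates both results, duplicates included.
union_all_rows = sorted(conn.execute(
    "SELECT COL FROM T1 WHERE COL > 1 UNION ALL SELECT COL FROM T2 WHERE COL > 1"
).fetchall())

print(len(union_rows), len(union_all_rows))
```

So UNION ALL is only a safe replacement when you know the filtered inputs cannot overlap, or when duplicates in the output are acceptable.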