There are already only very few real-world use-cases for
FULL [ OUTER ] JOIN, but maybe, you have run into this beast in the past. But when was the last time you’ve seen a
NATURAL JOIN? Right.
A quick reminder from our article about JOINs.
FULL JOIN is a type of
OUTER JOIN that retains data from both sides of the
JOIN operation, regardless if the
JOIN predicate matches or not. For example, querying the Sakila database:
SELECT first_name, last_name, title FROM actor FULL JOIN film_actor USING (actor_id) FULL JOIN film USING (film_id)
The result will look something like this:
first_name last_name title --[ LEFT JOIN ] ------------------------- Jon Doe <-- Actors that didn't play in any film Claire Miller --[ INNER JOIN ] ------------------------ Samuel Jackson Django Unchained <-- Actors played in many films Samuel Jackson Pulp Fiction John Travolta Pulp Fiction <-- Films featured several actors --[ RIGHT JOIN ]------------------------- Meh <-- Films that don't have actors
Note the special “markers” that delimit the parts that would have also been contributed by
RIGHT JOIN. In a way, a
FULL JOIN combines them all.
This one is less obviously useful. It looks useful at first, because if we carefully design our schemas as in the Sakila database used above, then we can simplify our joins as such:
SELECT first_name, last_name, title FROM actor NATURAL JOIN film_actor NATURAL JOIN film
Which is just syntax sugar for this:
SELECT first_name, last_name, title FROM actor JOIN film_actor USING (<all common columns>) JOIN film USING (<all common columns>)
NATURAL JOIN will join two tables using all the columns they have in common. You’d think that this would be the useful outcome:
SELECT first_name, last_name, title FROM actor JOIN film_actor USING (actor_id) JOIN film USING (film_id)
Unfortunately, the outcome looks more like this:
SELECT first_name, last_name, title FROM actor JOIN film_actor USING (actor_id, last_update) JOIN film USING (film_id, last_update)
In other words: It makes no sense at all. In real world situations (and the Sakila database is already close enough to the real world), we’ll always have accidentally conflicting column names that will be considered for
NATURAL JOIN, when they hsould not be.
OK, So Why NATURAL FULL JOIN?
One is very rarely needed, and the other is pretty useless in the real world. Not so fast!
I recently helped a customer who had two very similar tables that both contained payments. Let’s call them:
STANDING_ORDER, the latter being a
PAYMENT template which generates actual
Now, there was some PL/SQL logic that operated on payments, which after some change request, needed to operate both on payments and/or on standing orders. If this was Java, we’d just (ab)use subtype polymorphism for the quick win, or factor out common logic into common component types for the thorough solution. For instance, these kinds of columns certainly appear in both cases:
ID— the surrogate key.
CREATED_AT— audit info.
MODIFIED_AT— audit info.
DELETED— soft deletion.
There are many more common columns. There are other columns that are specific to one or the other type:
STANDING_ORDER_ID— Only payments reference the standing order that may have created them.
Standing Order Only
PERIODICITY— This is a standing order’s core feature.
Now, in order to refactor that PL/SQL logic, I needed a row type that combines all these columns easily. I didn’t want to manually track the exact columns of both tables, I wanted this to always work, even if the maintainers of the tables add/remove/rename columns.
NATURAL FULL JOIN to the Rescue
I created a view like this:
CREATE VIEW like_a_payment SELECT * FROM payment NATURAL FULL JOIN standing_order WHERE 1 = 0
This view itself doesn’t do anything. It just sits there and produces the row type I need. Why? Because the
NATURAL JOIN will join by ALL the common columns, namely
ID, CREATED_AT, MODIFIED_AT, DELETED, AMOUNT, CURRENCY and then adds the columns from the left table, namely
STANDING_ORDER_ID, and then from the right table, namely
I can now use this type in PL/SQL:
DECLARE paym like_a_payment%ROWTYPE; BEGIN -- Do things... END;
But Why FULL?
True, I didn’t need
FULL yet, because I’m not producing any rows anyway. Correct. But observe how I will populate the
PAYM local variable in the PL/SQL block:
DECLARE l_paym_id like_a_payment.id%TYPE; l_paym like_a_payment%ROWTYPE; BEGIN l_paym_id := 1; SELECT * INTO l_paym FROM ( SELECT * FROM payment WHERE id = l_paym_id ) NATURAL FULL JOIN ( SELECT * FROM standing_order WHERE id = l_paym_id ); END;
And I’m done! This could not have been done any more easily!
- I’m using
NATURALbecause now I have to use it all the time to populate the record (or select individual columns, but meh).
- I’m using
FULLbecause only one of the two queries will produce a result, and the remaining columns will be
Of course, this is kind of a hack. Please don’t abuse this too often. Here’s some hints to follow:
- The refactoring would have been substantial. I needed a quick win.
- The logic actually needs almost all the columns from the payments. Otherwise, you might want to review that
- The derived tables are necessary in Oracle for performance reasons.
- The actual code contained two distinct queries, each one joining “one side” with an empty substitute query, as we know that the ID is either a payment or a standing order.
Having said so, now you know one of the very few real world use-cases of
NATURAL FULL JOIN. Put this in your code base and impress your coworkers!