Math and SQL Part 5: Projection and Selection
Join the DZone community and get the full member experience.
Join For Free
In "Relational Theory and SQL," Chris Date says that tuples are unordered. Tuples are strongly ordered, but what Date is getting at is that a relation has no natural ordering of columns, and therefore from any relation with a given ordering, a new relation can be created with a different ordering. This process is called 'projection' and it is entirely implicit in SQL. In the interest of being concise, in that book, Date does not stick with his typical approach of starting from clear definitions and discussing what these mean.
This being said, understanding projection, selection, and the difference between them makes understanding relational databases far easier in my view.
Because SQL operates on bags, I will use both SQL (for bag examples) and Set::Relation (for set examples) in order to discuss the math differences. I will finally offer an idea of what a clearer English rendition of SQL's math semantics would be.
I: Projection  Transform One Relation into Vertical Subset
Projection, represented by a Pi character (π) creates a set of ordered tuples with an ordering based on the operation. What projection does, essentially, is take a subset (or the whole set) from each tuple, possibly reordering it in the process, in order to transform one relation into a derived relation.
In relational algebra, projection takes a set of tuples and returns a set of tuples with a subset of fields (possibly rearranged).
For example, consider a relation R, where the tuple structure of R is (employee_id, first_name, last_name, badge_number, office_id, tax_id)
(  employee_id  first_name  last_name  office_id  tax_id  ) 
(  1  Eric  Young  1  123456789  ) 
(  2  Thomas  Young  1  324553334  ) 
(  3  Julie  James  1  443456234  ) 
(  4  Jared  Young  2  533445442  ) 
(  5  Todd  Kals  2  532442332  ) 
Then we could write π_{(last_name, office_id)}(R), we would get:
(  last_name  office_id  ) 
(  Young  1  ) 
(  James  1  ) 
(  Young  2  ) 
(  Kals  2  ) 
Note that there is one fewer row in the output table because in both cases, you have a wellformed set.
The equivalent of the previous expression in SQL would be:
SELECT DISTINCT last_name, office_id FROM R;
From these examples a few things should be fairly clear. First tuples are strongly ordered and we give humanreadable labels as a nicety, but that projection can be used to reorder the tuple elements arbitrarily. Consequently tuples are consistently ordered within a relation, but that order is largely arbitrary. Consequently tuples cannot be said to have a natural order. Instead they derive their structure and order from the relation they are a part of.
Unfortunately SQL makes all kinds of implicit tuple transformations either by implicit projection or by some sort of reverse operation (I will call that "overlay" for the moment). Because projection is entirely implicit in SQL, it is never stated. The SQL statement above in fact doesn't perform a selection operation but merely performs a projection.
II: Selection  Turn a Relation into a Horizontal Subset
Selection takes a relation and provides a subset of it based on arbitrary criteria. The structure of the relation is unchanged, but a subset of the correspondences are in the new relation. In other words, it selects tuples for further processing, and since the relation is a set, it is guaranteed to give a subset out. It is represented by the sigma character (σ) and the subscript gives the equality condition.
Given the same relation R, we can perform σ_{office_id=1}(R) and get:
(  employee_id  first_name  last_name  office_id  tax_id  ) 
(  1  Eric  Young  1  123456789  ) 
(  2  Thomas  Young  1  324553334  ) 
(  3  Julie  James  1  443456234  ) 
Simple right?
In SQL terms, a plain selection is what is found in the where clause. So the equivalent in SQL is simply:
SELECT * FROM R WHERE office_id = 1;
As you can see from these examples, selection and projection are transformed in SQL in some relatively nonintuitive ways. At the same time, understanding relational databases demands a basic understanding of these two concepts.
Some Notes on Set::Relation
In the course of putting this together I spent a little time with Set::Relation. I found that the module seems to behave relatively well. The projection method combined with members() outputs a set. This was as far as I took it in my quick spin. As per the discussion below, Set::Relation operates under the assumption that relations are wellformed sets instead of bags. This can be very helpful in operating in a highly formal relational environment.
Selection could be done using the semijoin method, or using members() and grep.
Next: On the Great Null Debate
Published at DZone with permission of Chris Travers, DZone MVB. See the original article here.
Opinions expressed by DZone contributors are their own.
Trending

Send Email Using Spring Boot (SMTP Integration)

Database Integration Tests With Spring Boot and Testcontainers

How Web3 Is Driving Social and Financial Empowerment

Knowing and Valuing Apache Kafka’s ISR (InSync Replicas)
Comments