Database Fundamentals #23: Filtering Data
In this article, explore just a little bit more about how the WHERE clause works to help you filter data so you avoid some of the bigger issues.
Join the DZone community and get the full member experience.Join For Free
If you've been reading these Database Fundamentals posts, you've already seen the WHERE clause because of your use of it when manipulating data with DELETE and UPDATE statements. It's also been used several times earlier in this series to limit the values returned from a SELECT statement.
The primary places where people run into trouble with T-SQL is in the JOIN criteria and the WHERE clause criteria. This occurs because they don't understand well enough what the filters and operators they're using will do. They end up returning too much data because they didn't us the WHERE clause or misapplied it. They also filter too much data out.
Just remember, there are even more functions than we go over here in this series. While these basic operators answer most needs, they won't answer all. Don't be afraid to open the Microsoft SQL Server documentation and learn new functions. With that in mind, we'll explore just a little bit more about how the WHERE clause works to help you filter data so you avoid some of the bigger issues.
Operators in the WHERE Clause
Almost all the operators you've used in the WHERE clause so far have been equality operators. You did experiment with the greater than operator in a previous blog post. There are a lot of different operators available to you. You also can combine operations using logic to further enhance your ability to control exactly what kind of information is returned.
For example, what if you wanted to only find people who's first name started with the letter E. You could just select all the information and use an ORDER BY on the first name, scroll down to where E ought to be and see what was there, but you've moved a lot of data for nothing. Instead, you could use a new operator called LIKE.
LIKE allows you to do a comparison in a similar way to using equals:
FROM Personnel.Person as p
WHERE p.PersonID LIKE 1;
This query will work exactly the same as if you had used the equals. But that's not what LIKE is for. To go back to finding everyone with a first name starting with the letter E. You can use a wild card, %, with the LIKE operator:
FROM Personnel.Person as p
WHERE p.FirstName LIKE 'E%';
This will match all rows that have the letter E and then, anything else.
There are a whole bunch of other functions and wild cards that work with the LIKE operator to allow you to do all different types of pattern matching.
Another way to filter information in the WHERE clause is to use the IN clause. This is used when you have a small list of values and if data matches any of them, you want to return it. This query will look through all the values in the IN operator to find any matches:
FROM Finance.FinancialTransaction AS ft
WHERE PersonID IN (2,3,4)
For the current data set, it will find one row in the Finance.FinancialTransaction table. This is a useful operator, but you should only use it with small lists, of less than 5-7 values. Otherwise you may see serious performance issues. It's also possible to use another operator, NOT, to select all values except the ones that make up the IN criteria list.
Another way to get at specific information is to use the BETWEEN operator. It's frequently used to find dates or numbers because you can look for a range of values by using BETWEEN to define upper and lower limits like this:
FROM Personnel.Person AS p
WHERE DateOfBirth BETWEEN '1/1/1975' and '1/1/1985';
If you run this against the current data set it will return two rows of information. The key is to know that these are inclusive ranges. Meaning, any value that matches or exceeds the first value and any value that matches or is less than the second will result in a match. If you want exclusive ranges you need to use the greater than in combination with less than. You can also apply the NOT operator to reverse the selection to only get values that are outside the defined range.
With all these operators you can mix and match them using logical constructs consisting of AND, OR and the use of parenthesis. For example, to select a specific company that is in the Aerospace industry and starts with the letter S, you could use a query like this:
FROM Management.Company AS c
WHERE c.CompanyDesc = 'Aerospace & Defense'
AND c.CompanyName LIKE 'S%';
This will limit the result to only those values that meet both criteria. You can change the logic though. You can go for combinations of data using the OR clause. If you wanted to find the people in the system who's first name started with the letter E or the letter S your query would look like this:
FROM Personnel.Person AS p
WHERE p.FirstName LIKE 'E%'
OR p.FirstName LIKE 'S%';
You can mix and match these as necessary and break them up logically using parenthesis to change the way all the filtering takes place.
All of this barely scratches the surface of what is possible when filtering data within T-SQL. The next blog post in the series will add to the concepts and functions in a WHERE clause.
Published at DZone with permission of Grant Fritchey, DZone MVB. See the original article here.
Opinions expressed by DZone contributors are their own.
Seven Steps To Deploy Kedro Pipelines on Amazon EMR
SRE vs. DevOps
Tech Hiring: Trends, Predictions, and Strategies for Success
Effortlessly Streamlining Test-Driven Development and CI Testing for Kafka Developers