Better Perl: Using map and grep
Once you understand map and grep, you'll start seeing lists everywhere and the opportunity to make your code more succinct and expressive at the same time.
As a Perl developer, you're probably aware of the language's strengths as a text-processing language and how many computing tasks can be broken down into those types of tasks. You might not realize, though, that Perl is also a world-class list processing language and that many problems can be expressed in terms of lists and their transformations.
Chief among Perl's tools for list processing are the functions map and grep. I can't count how many times in my twenty-five years as a developer I've run into code that could have been simplified if only the author had been familiar with these two functions. Once you understand map and grep, you'll start seeing lists everywhere and the opportunity to make your code more succinct and expressive at the same time.
What Are Lists?
Before we get into functions that manipulate lists, we need to understand what they are. A list is an ordered group of elements, and those elements can be any kind of data you can represent in the language: numbers, strings, objects, regular expressions, references, etc., as long as they're stored as scalars. You might think of a list as the thing that an array stores, and in fact Perl is fine with using an array where a list can go.
Consider assigning the list of numbers from 1 to 3 to the array @foo. The difference between the array and the list is that the list is a fixed collection, while arrays and their elements can be modified by various operations.
perlfaq4 has a great discussion on the differences between the two.
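As a minimal sketch of that distinction (the values and the variable names other than @foo are illustrative assumptions, not taken from the original listing):

```perl
use strict;
use warnings;

# Assign a list of three numbers to an array.
my @foo = (1, 2, 3);

# The array can be modified after the fact...
push @foo, 4;      # @foo is now (1, 2, 3, 4)
$foo[0] = 10;      # @foo is now (10, 2, 3, 4)

# ...and it can stand in wherever a list is expected.
my $joined = join ',', @foo;    # "10,2,3,4"

# A literal list can be indexed in place, but there is no
# container to push onto or assign into.
my $second = (1, 2, 3)[1];      # 2

print "$joined\n$second\n";
```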
Lists Are Everywhere, Man!
Ever wanted to sort some data? You were using a list. Wanted to join a bunch of things together into a string? List again. Heck, even the humble print and say take a list (and an optional filehandle) as arguments; it's why you can treat Perl as an upscale AWK and feed it scalars to output with a field separator.
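A short sketch of these list consumers, assuming a made-up @langs array; the AWK-like output relies on Perl's output field separator ($,) and output record separator ($\):

```perl
use strict;
use warnings;

my @langs = ('perl', 'awk', 'sed');

# sort takes a list and returns a new, sorted list.
my @sorted = sort @langs;        # ('awk', 'perl', 'sed')

# join takes a separator followed by a list.
my $csv = join ', ', @sorted;    # "awk, perl, sed"

# print takes a list too. Localizing $, and $\ makes it emit
# tab-separated fields followed by a newline.
{
    local $, = "\t";
    local $\ = "\n";
    print @sorted;
}
```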
map: The List Transformer
The map function is devious in its simplicity: it takes two inputs, an expression or block of code and a list to run it on. For every item in the list, it aliases $_ to that item and then returns zero, one, or many items in a list based on what happens in the expression or code block. You can call it with an expression followed by a comma (map EXPR, LIST) or with a code block and no comma (map BLOCK LIST).
We're going to ignore the first way, though, because Conway (Perl Best Practices, 2005) tells us that when you specify the first argument as an expression, it's harder to tell it apart from the remaining arguments, especially if that expression uses a built-in function where the parentheses are optional. So always use a code block!
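To make the two calling conventions concrete, here is a small sketch using an assumed @words array. Both calls compute the same result; only the syntax differs:

```perl
use strict;
use warnings;

my @words = ('apple', 'banana', 'cherry');

# Expression form: a comma separates the expression from the list,
# so the expression is easy to misread as just another argument.
my @lengths_expr = map length($_), @words;

# Block form: the braces mark where the transformation ends
# and the list begins. No comma follows the block.
my @lengths_block = map { length($_) } @words;

print "@lengths_expr\n@lengths_block\n";
```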
You should always turn to map (and not, say, a foreach loop) when generating a new list from an old list. Lowercasing every string in an array, for example, is a one-line map.
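The following sketch contrasts the two approaches; the array names and contents are assumptions chosen for illustration:

```perl
use strict;
use warnings;

my @mixed_case = ('Alice', 'BOB', 'cArOl');

# The foreach version needs a temporary array and an explicit push.
my @lowered_loop;
foreach my $name (@mixed_case) {
    push @lowered_loop, lc $name;
}

# The map version states the transformation directly.
my @lowered = map { lc $_ } @mixed_case;    # ('alice', 'bob', 'carol')

print "@lowered\n";
```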
When paired with a lookup table, map is also an efficient way to tell if a member of a list equals a string, especially if that list is static and is tested more than once. The technique uses map's ability to return multiple items per source element (here, a key and a value) to generate a constant hash, and then tests membership in that hash.
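A minimal sketch of the lookup-table idiom, assuming a hypothetical static list of image extensions:

```perl
use strict;
use warnings;

# Hypothetical static list of recognized image extensions.
my @extensions = ('jpg', 'jpeg', 'png', 'gif');

# Each element yields two items, a key and a value, so the list
# returned by map can initialize a hash directly.
my %is_image = map { $_ => 1 } @extensions;

# Membership is now a single hash lookup instead of a scan.
my $found   = exists $is_image{png} ? 1 : 0;    # 1
my $missing = exists $is_image{txt} ? 1 : 0;    # 0

print "png: $found, txt: $missing\n";
```

Building the hash costs one pass over the list, so the approach pays off when the same list is consulted repeatedly.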
grep: The List Filter
Perl, of course, is really good at regular expressions, but its grep function goes beyond them and enables you to match using any expression or code block. Think of it as a partner to map: where map uses a code block to transform a list, grep uses one to filter it down. In fact, other languages typically call this function filter.
You can, of course, use regular expressions with grep, especially because a regexp match in Perl defaults to matching on the $_ variable and grep happens to provide that to its code block argument. A block containing nothing but a pattern therefore keeps exactly the elements that match it.
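For instance, this sketch (with an assumed @files list) keeps only the names ending in .pm, and also shows that grep returns a count in scalar context:

```perl
use strict;
use warnings;

my @files = ('Foo.pm', 'README', 'Bar.pm', 'bar.t');

# The bare pattern matches against $_, which grep sets for each element.
my @modules = grep { /\.pm$/ } @files;    # ('Foo.pm', 'Bar.pm')

# In scalar context grep returns the number of matching elements.
my $module_count = grep { /\.pm$/ } @files;    # 2

print "@modules ($module_count)\n";
```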
grep really comes into its own when used for its general filtering capabilities; for instance, making sure that you don't accidentally try to compare an undefined value.
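As a sketch of that kind of guard, assuming a made-up list of readings in which some entries are undef:

```perl
use strict;
use warnings;

my @readings = (3, undef, 0, 7, undef, 12);

# Keep only defined values. Note that 0 is defined, so it survives.
my @defined = grep { defined($_) } @readings;    # (3, 0, 7, 12)

# Guarding the comparison avoids "uninitialized value" warnings.
my @over_five = grep { defined($_) && $_ > 5 } @readings;    # (7, 12)

print "@defined\n@over_five\n";
```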
It is equally useful when the test is a complicated function that returns true or false depending on its arguments.
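Here is one way that might look. The is_prime function is a hypothetical stand-in for whatever predicate your program needs:

```perl
use strict;
use warnings;

# Hypothetical predicate: true if $n is a prime number.
sub is_prime {
    my ($n) = @_;
    return 0 if $n < 2;
    for my $divisor (2 .. int(sqrt $n)) {
        return 0 if $n % $divisor == 0;
    }
    return 1;
}

# grep keeps the elements for which the function returns true.
my @primes = grep { is_prime($_) } 1 .. 20;    # (2, 3, 5, 7, 11, 13, 17, 19)

print "@primes\n";
```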
You might even consider chaining map and grep together. For example, you can get the JPEG images out of a file list with grep and then lowercase the results with map.
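A sketch of such a chain, using an assumed list of file names. The pipeline reads right to left: grep filters first, then map transforms what is left.

```perl
use strict;
use warnings;

my @files = ('Holiday.JPG', 'notes.txt', 'Cat.jpeg', 'logo.PNG', 'dog.jpg');

# Filter to names ending in .jpg or .jpeg (case-insensitively),
# then lowercase each surviving name.
my @jpegs = map { lc $_ } grep { /\.jpe?g$/i } @files;
# ('holiday.jpg', 'cat.jpeg', 'dog.jpg')

print "@jpegs\n";
```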
'Side Effects May Include...'
When introducing map above, I noted that it aliases $_ to every element in the list. I used that term deliberately because modifications to $_ will modify the original element itself, and that is usually an error. Programmers call that a 'side effect,' and side effects can lead to unexpected behavior or at least difficult-to-maintain code. Consider a grep over @pm_files whose block substitutes .pod for the .pm suffix in $_ and then checks that the resulting file does not exist.
The intent may have been to find files ending in .pm that don't have a corresponding .pod file, but the actual behavior is replacing the .pm suffix with .pod, then checking whether that filename exists. If it doesn't, the (now modified) name is passed through to the result list. Meanwhile, @pm_files has had its contents modified.
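The pitfall described above can be reproduced with a sketch like the following. The file names and the @missing_pod array are assumptions for illustration, and none of the files needs to exist on disk:

```perl
use strict;
use warnings;

# Hypothetical module list.
my @pm_files = ('Foo.pm', 'Bar.pm');

# Buggy: s/// rewrites $_, which is an alias to the element of
# @pm_files, so the original array is changed as a side effect.
my @missing_pod = grep { s/\.pm$/.pod/ && !-e $_ } @pm_files;

# @pm_files now holds ('Foo.pod', 'Bar.pod'), not the original names.
print "@pm_files\n";
```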
If you really do need to modify a copy of each element, assign $_ to a lexical variable within your code block and modify that variable instead.
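A sketch of the copy-first approach, reusing the same assumed file names:

```perl
use strict;
use warnings;

my @pm_files = ('Foo.pm', 'Bar.pm');

my @missing_pod = grep {
    my $file = $_;    # work on a copy, not the alias
    $file =~ s/\.pm$/.pod/ && !-e $file;
} @pm_files;

# @pm_files is untouched, and grep returns the original .pm names.
print "@pm_files\n";
```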
But at that point you should probably refactor your multi-line block as a separate function.
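One possible refactoring is sketched below. The function name lacks_pod is hypothetical; because it copies its argument out of @_, the caller's data stays untouched:

```perl
use strict;
use warnings;

# Hypothetical helper: true if the given .pm file has no sibling .pod file.
sub lacks_pod {
    my ($pm_file) = @_;
    my $pod_file = $pm_file;
    return 0 unless $pod_file =~ s/\.pm$/.pod/;    # not a .pm file
    return !-e $pod_file;
}

my @pm_files    = ('Foo.pm', 'Bar.pm');
my @missing_pod = grep { lacks_pod($_) } @pm_files;

print "@pm_files\n";
```

The grep call now reads almost like the sentence describing it, and the helper can be tested on its own.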
And if you do need side effects, just use a foreach loop; future code maintainers (i.e., you in six months) will thank you.
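For example, if renaming the entries in place really is the goal, a loop makes that intent explicit. This is a sketch with the same assumed file names:

```perl
use strict;
use warnings;

my @pm_files = ('Foo.pm', 'Bar.pm');

# The loop variable aliases each element, so the substitution
# deliberately edits @pm_files in place.
foreach my $file (@pm_files) {
    $file =~ s/\.pm$/.pod/;
}

print "@pm_files\n";    # Foo.pod Bar.pod
```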
Taking You Higher
map and grep are examples of higher-order functions, since they take a function (in the form of a code block) as an argument. So congratulations, you just significantly leveled up your knowledge of Perl and computer science. If you're interested in more such programming techniques, I recommend Mark Jason Dominus' Higher-Order Perl (2005), available for free online.
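You can write higher-order functions of your own by accepting a code reference. The helper below, first_match, is a hypothetical example written for this sketch and is not a Perl built-in:

```perl
use strict;
use warnings;

# Hypothetical helper: returns the first item for which the
# supplied code reference returns true, or nothing if none does.
sub first_match {
    my ($test, @items) = @_;
    for my $item (@items) {
        return $item if $test->($item);
    }
    return;
}

my $first_even = first_match(sub { $_[0] % 2 == 0 }, 3, 5, 8, 9, 10);    # 8

print "$first_even\n";
```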
Published at DZone with permission of Mark Gardner, DZone MVB.