Over a million developers have joined DZone.
{{announcement.body}}
{{announcement.title}}

More on Haskell/FP function organization; lazy evaluation

DZone's Guide to

More on Haskell/FP function organization; lazy evaluation

· Java Zone
Free Resource

Learn how to troubleshoot and diagnose some of the most common performance issues in Java today. Brought to you in partnership with AppDynamics.

After my last post scrolled off the bottom of the page, I realized I missed a couple of opportunities: one related to some additional code optimization, and one related to the topic of lazy (or nonstrict) evaluation.

First, let me review what I was doing. I was processing a data structure I was using to represent events generated from a process monitor. The structure includes a timestamp, a class name and source line number, a text message, and a set of properties. Displaying one of these in ghci yields the following:


Event {timestamp = 1320513130333, className = "com.adamsresearch.jarview.JarView",
 lineNumber = 388, message = "fileNotFound", properties = 
[Property {key = "userId", value = "adams"},Property {key = "sessionId", value = "EFGH5678"}]}
My goal was to write -- in authentic FP style -- some code to extract, from a List of these events, events whose userId property value matched a particular user ID. Here is the code (minus type definitions, etc.) I ended up using:
eventGeneratedByUser :: String -> Event -> Bool
eventGeneratedByUser id event =
 if (containsUserPropertyWithValue (properties event) id)
   then True
   else False
 where
   containsUserPropertyWithValue (x:xs) id =
     if (isUserWithId x id)
       then True
       else containsUserPropertyWithValue xs id
     where
       isUserWithId prop id =
         if (key prop == "userId" && value prop == id)
           then True
           else False
   containsUserPropertyWithValue [] _ = False

Finally, in ghci, I retrieved the list of events I wanted with the following:
    *Main> let filterPred = eventGeneratedByUser "adams"
    *Main> let eventsForAdams = filter filterPred eventList

One theme object-oriented developers hear about repeatedly in FP is to stop thinking in loops, and start thinking in terms of recursion. An additional theme is (as best as I can state it) to avoid recursion when you can use a mapping-style function, that is, a function which structurally is defined to operate on all the elements of a list. When you set up a recursive function to process a list, a large part of what you are building is boilerplate. Boilerplate is the developer's curse: some live with it (care to pound out the boilerplate for Java's GridBagLayout and GridBagConstraints, anyone?), while others replace it with "frameworks" that the rest of us usually find even less appealing than writing the boilerplate code ourselves.

As you can tell from the above source code, I used recursion to create the boolean function which determines if an event is associated with a user. However, when I actually filtered the list, instead of using a recursive function I used Haskell's filter, passing it my partial function. For that last step, I feel I finally was writing in the spirit of Haskell and FP in general, but now let's take a look at that source code again and see if there's room for improvement.

Note first my innermost nested function, isUserWithId:
isUserWithId prop id =
 if (key prop == "userId" && value prop == id)
   then True
   else False

Note from my old post that I used Haskell record syntax to define the Property data type and that Haskell has used type inference to recognize that I am passing a Property structure to the isUserWithId function. All this function does is look at a key-value pair and return True if the key is userId and the value matches the parameter I pass in. I think this function is fine "as is".

The enclosing function, found in the where clause of the function eventGeneratedByUser, is a recursive function. Let's repeat the whole function:
eventGeneratedByUser :: String -> Event -> Bool
eventGeneratedByUser id event =
 if (containsUserPropertyWithValue (properties event) id)
   then True
   else False
 where
   containsUserPropertyWithValue (x:xs) id =
     if (isUserWithId x id)
       then True
       else containsUserPropertyWithValue xs id
     where
       isUserWithId prop id =
         if (key prop == "userId" && value prop == id)
           then True
           else False
   containsUserPropertyWithValue [] _ = False

What I am doing with this function is essentially iterating over the Property elements of an Event, returning True as soon as I find a Property that matches the desired condition.

Can this be converted from a recursive function to something a little more compact? As it turns out, yes, easily. Haskell contains a function called any which acts on a List and returns True if any element of the List is True. To use this function in my example, I would rearrange my code to calculate the isUserWithId value for each Property in the data structure, wrap those boolean values in a List, and apply the Haskell function any to the List. If you are wondering if I will be wasting cycles calculating values I may not need, bear with me for a few minutes.

The first question is how to generate a List of boolean values, the results of applying isUserWithId to a List of properties. To start, let me promote isUserWithId to a standalone function, but before we do that, let's do the same thing we did in my last post -- plan ahead to be able to use it as a partial function. By this I mean we'll want to apply this function to a list of Property values, where the "user ID" will be built-in, so to speak, to the function definition. I'll redefine the function to take the user ID first, so I can create a partial function later which I can apply directly to a Property (note this involves re-ordering the arguments in the isUserWithId equation, also):
isUserWithId :: String -> Property -> Bool
isUserWithId id prop =
 if (key prop == "userId" && value prop == id)
   then True
   else False

We can verify the function, both in its "full form", where we pass in both the user ID and the Property and in its partial form, where we first create a partial function with the user ID specified, and then pass in only a Property:
    *Main> prop4
    Property {key = "userId", value = "smith"}
    *Main> isUserWithId "smith" prop4
    True
    *Main> let containsSmith = isUserWithId "smith"
    *Main> containsSmith prop4
    True

Next, I'll take the existing three, short Property Lists from my example and concatenate them into a larger list:
    *Main> let bigPropList = concat [propList1,propList2,propList3]
    *Main> bigPropList
    [Property {key = "userId", value = "smith"},Property {key = "sessionId", value = "ABCD1234"},
Property {key = "userId", value = "adams"},
Property {key = "sessionId", value = "EFGH5678"},Property {key = "userId", value = "jobim"},Property {key = "sessionId", value = "WXYZ9876"}]
Now let's use any to return True if any item in the list matches our partial function, which is testing to see if a Property has user ID "smith":
    *Main> any containsSmith bigPropList
    True

and that is literally all we had to do to get our result. Note the interesting invocation of any: our partial function takes a Property, not a List of Property values. The any function steps through the elements of the list and returns True as soon as the invocation of containsSmith against a single Property returns True.

To integrate this code into our new version of eventGeneratedByUser, we could do the following:
eventGeneratedByUser :: String -> Event -> Bool
eventGeneratedByUser id event =
 any (isUserWithId id) (properties event)

and then test this in ghci:
    *Main> head eventList
    Event {timestamp = 1320512200548, className = "java.lang.String", lineNumber = 1293, message = "NPE in substring()", properties = [Property {key = "userId", value = "smith"},Property {key = "sessionId", value = "ABCD1234"}]}
    *Main> eventGeneratedByUser "smith" (head eventList)
    True

Note the code is considerably more compact, and (unlike in some languages where "terse" = "dense") actually more readable. In fact, things are so clean we can go a step farther than we did before and provide a function to return a subset of the List, something we just did interactively in ghci the last time:
getEventsGeneratedByUser :: String -> [Event] -> [Event]
getEventsGeneratedByUser id events =
 filter (eventGeneratedByUser id) events

and, testing in ghci:
    *Main> getEventsGeneratedByUser "jobim" eventList
    [Event {timestamp = 1320512200699, className = "javax.swing.JPanel", lineNumber = 388, message = "initialized", 
properties = [Property {key = "userId", value = "jobim"},Property {key = "sessionId", value = "WXYZ9876"}]},Event
{timestamp = 1320512203699, className = "javax.swing.JList", lineNumber = 1255, message = "model replaced", properties =
[Property {key = "userId", value = "jobim"},Property {key = "sessionId", value = "WXYZ9876"}]},Event {timestamp = 1320512259333,
className = "javax.lang.Integer", lineNumber = 133, message = "number format exception", properties = [Property {key = "userId",
value = "jobim"},Property {key = "sessionId", value = "WXYZ9876"}]}]
Note how compact and readable our resulting code is. We have one function which determines if a Property contains a user ID of a specified value -- 5 lines of code, including the type declaration. We have two additional functions, one of which returns True if an Event is generated by a specified user, and one which returns a subset of a List of Events which satisfy this condition -- each 3 lines of code, including the type declaration. And as I mentioned earlier, in this case, compact actual means readable.

Lazy Evaluation

In the above code, you might have been wondering "What is the performance impact of looping through a List of, say, 5000 properties and returning True if only one of them meets the test? Especially if the first property meets the test?". Haskell evaluates expressions only when they are needed. Technically in Haskell this is called nonstrict evaluation, and it allows you to do some things that would truly look odd in Java. For example, in Haskell you can define a list as a range of elements using ..; the following defines a list of integer values from 1 through 9:
    *Main> [1..9]
    [1,2,3,4,5,6,7,8,9]

While this may look normal, what do you think of this?
    Prelude> [1..]
    [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20
,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40
,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60
,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80
,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100
,101,102,103,104,105,106,107,108,109,110,111,112,113,114,115,116,117,118,119,120
,121,122,123,124,125,126,127,128,129,130,131,132,133,134,135
    ...

where I have added the line breaks to make it easier to read. Big surprise: this is an infinite list starting with the integer 1, and ghci will happily continue printing this output to the screen forever, or until Something Really Bad Happens, whichever comes first. I had to hit Control-C to stop the output. But, you can assign this infinite list to a variable:
    Prelude> let infiniteList = [1..]
    Prelude>

with no problems. The reason is that infiniteList has not yet been evaluated (populated). If you were to try to show this in ghci, you'd be back to the infinite output again. Nonstrict evaluation allows us to do some interesting things. For example, you can take the first 11 items of an infinite list:
    Prelude> take 11 infiniteList
    [1,2,3,4,5,6,7,8,9,10,11]
    Prelude>

This feature allows us to write code that (for example) returns True if "any" element in a List satisfies some predicate. Haskell isn't going to sift through the entire List, evaluating every expression, and then step through the List and apply the predicate. So it doesn't matter how many properties my events have; I'll return a True as soon as I get a match. Note that there are cases where all the elements will indeed be evaluated; for example, if you are summing the elements of a List (of course, you didn't expect Haskell to give you the sum of the elements of an infinite List, right?).

Nonstrict evaluation is already one of my favorite features of Haskell, right up there with standard library functions that operate on all the elements of a List. Next post, I expect to discuss I/O in Haskell, continuing along with the process-event-monitoring example I've been using to date.

 

From http://wayne-adams.blogspot.com/2011/11/more-on-haskellfp-function-organization.html

Understand the needs and benefits around implementing the right monitoring solution for a growing containerized market. Brought to you in partnership with AppDynamics.

Topics:

Opinions expressed by DZone contributors are their own.

THE DZONE NEWSLETTER

Dev Resources & Solutions Straight to Your Inbox

Thanks for subscribing!

Awesome! Check your inbox to verify your email so you can start receiving the latest in tech news and resources.

X

{{ parent.title || parent.header.title}}

{{ parent.tldr }}

{{ parent.urlSource.name }}