The World's Shortest MapReduce App

We often say that GridGain is very simple to use, but we tend to forget that many of our users don't actually exploit the full power of GridGain's functional APIs. In this blog post I want to demonstrate several GridGain features that are at your fingertips as soon as you download GridGain.

1. Broadcasting
A simple app can broadcast execution of a closure to all participating nodes. In our scenario the closure just prints out a string.

The "F.println()" method returns a simple closure that prints out the argument passed to it. GridGain then takes this closure and executes it on all available nodes.
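To make the broadcast idea concrete, here is a minimal plain-JDK sketch. It does not use the GridGain API: the `Node` class, the `println` factory and the `broadcast` helper are all hypothetical stand-ins, and the "nodes" are just in-memory objects. It only illustrates the shape of the call: build one closure, then run that same closure once on every node.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

class BroadcastSketch {
    /** Hypothetical stand-in for a grid node; it records everything it prints. */
    static final class Node {
        final String name;
        final List<String> output = new ArrayList<>();

        Node(String name) {
            this.name = name;
        }
    }

    /** Returns a closure that prints the given message on whichever node runs it. */
    static Consumer<Node> println(String msg) {
        return node -> {
            node.output.add(msg);
            System.out.println("[" + node.name + "] " + msg);
        };
    }

    /** Runs the same closure once on every node (assumption: sequential, in-process). */
    static void broadcast(List<Node> nodes, Consumer<Node> closure) {
        for (Node node : nodes) {
            closure.accept(node);
        }
    }

    /** Builds three simulated nodes and broadcasts one message to all of them. */
    static List<Node> demo() {
        List<Node> nodes = new ArrayList<>();
        for (int i = 1; i <= 3; i++) {
            nodes.add(new Node("node-" + i));
        }
        broadcast(nodes, println("Hello, grid!"));
        return nodes;
    }

    public static void main(String[] args) {
        demo();
    }
}
```

In a real grid the closure would be serialized and shipped to remote JVMs; the sketch leaves that out on purpose.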

2. Splitting Execution
Here is a simple app that splits a given phrase into words, creates closures that print individual words, and executes them on different nodes:

F.yield("Splitting This Message Into Words".split(" "), F.println())

What happens here is that the initial string is split into words using the standard JDK split method. The "F.yield()" method then takes each word, binds it to the "F.println()" closure, and returns a collection of "F.println()" closures, each curried with a predefined argument (one per word). GridGain then takes this collection of closures and executes them on the grid, sending each closure to a different node in round-robin fashion.
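The currying and round-robin steps described above can be sketched in plain Java. This is an illustration only, not GridGain code: `bindEach` is a hypothetical stand-in for what "F.yield()" is described as doing, and `roundRobin` is an assumed, simplified model of how closures could be dealt out to nodes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

class YieldSketch {
    /** Binds each argument to the closure, producing one zero-argument job per argument. */
    static <T> List<Runnable> bindEach(T[] args, Consumer<T> closure) {
        List<Runnable> jobs = new ArrayList<>();
        for (T arg : args) {
            jobs.add(() -> closure.accept(arg));
        }
        return jobs;
    }

    /** Deals jobs out to nodeCount buckets in round-robin order: job i goes to node i % nodeCount. */
    static List<List<Runnable>> roundRobin(List<Runnable> jobs, int nodeCount) {
        List<List<Runnable>> perNode = new ArrayList<>();
        for (int i = 0; i < nodeCount; i++) {
            perNode.add(new ArrayList<>());
        }
        for (int i = 0; i < jobs.size(); i++) {
            perNode.get(i % nodeCount).add(jobs.get(i));
        }
        return perNode;
    }

    public static void main(String[] args) {
        String[] words = "Splitting This Message Into Words".split(" ");
        List<Runnable> jobs = bindEach(words, System.out::println);
        List<List<Runnable>> perNode = roundRobin(jobs, 2);
        // Each simulated node runs the jobs assigned to it, one after another.
        for (int node = 0; node < perNode.size(); node++) {
            System.out.println("-- node " + node + " --");
            for (Runnable job : perNode.get(node)) {
                job.run();
            }
        }
    }
}
```

The helper is deliberately not named `yield`, because newer Java versions reject an unqualified call to a method with that name.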

3. The World's Shortest MapReduce App
Now let's get a little more sophisticated and execute an app that splits a phrase into words, has individual grid nodes count the letters in each word, and then, in the reduction step, adds up all the counts. Here is what this app looks like:

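int letterCnt = G.grid().forkjoin(
    F.yield("Counting Letters In This Phrase".split(" "),
        // Map step: each closure returns the length of one word.
        new C1<String, Integer>() {
            @Override public Integer apply(String word) {
                return word.length();
            }
        }),
    // Reduce step: sum the per-word counts on the local node.
    F.sumIntReducer());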

A few things to note. C1 is a convenience alias for the GridClosure class, which in our case takes a string and returns the number of characters in it. The collection of closures, each bound to a different word, is distributed to individual grid nodes, and each closure returns the number of characters in the word it received. The local node then uses "F.sumIntReducer()", which simply adds up all the character counts it is given.
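For readers without a grid at hand, the same map and reduce steps can be approximated with the plain JDK. This is a sketch under stated assumptions, not the GridGain implementation: a local thread pool plays the role of the grid nodes, and the class `LetterCountSketch` and its `countLetters` method are hypothetical names introduced here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class LetterCountSketch {
    /** Splits the phrase on spaces, counts letters per word in parallel, then sums the counts. */
    static int countLetters(String phrase, int workers) {
        // Map step: one job per word, each returning that word's length.
        List<Callable<Integer>> jobs = new ArrayList<>();
        for (String word : phrase.split(" ")) {
            jobs.add(word::length);
        }

        // The thread pool stands in for the grid nodes (assumption: local threads only).
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            // Reduce step: the calling thread adds up every partial result.
            int sum = 0;
            for (Future<Integer> result : pool.invokeAll(jobs)) {
                sum += result.get();
            }
            return sum;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while counting", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a counting job failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // 8 + 7 + 2 + 4 + 6 letters, so this prints 27.
        System.out.println(countLetters("Counting Letters In This Phrase", 3));
    }
}
```

The reduction runs on the calling thread, mirroring the description above in which the local node sums the counts returned by the remote closures.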

And to top it all off, none of the code above requires any deployment. You simply start up a few bare-bones GridGain nodes, write your code, hit the Run button, and your code just executes on the grid (it uses GridGain's peer-class-loading mechanism underneath). There are no Ant or Maven scripts to execute; just write your code and run it. If you need to change your code, just change it and run it again.

I doubt there is another product out there that will let you execute fairly complex MapReduce applications in such a concise and elegant manner ;-)

From http://gridgain.blogspot.com/2010/10/worlds-shortest-mapreduce-app.html

