MapReduce Without Splitting


I frequently get a question along these lines: since GridGain is all about MapReduce-style processing, what if I don’t (or can’t) logically split my task into multiple sub-tasks?

There are two facets to this question:

  1. Non-splitting is a perfectly fine use case of MapReduce: you simply split into one sub-task. That allows you to move the entire task execution onto the grid.
  2. By not splitting and simply putting the entire task on the grid for execution, you gain scalability (but usually not performance): the grid can run more tasks at once, yet each individual task takes about as long as before.
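To make the "split into one sub-task" idea concrete, here is a minimal sketch in plain Java. It does not use the GridGain API: the `Task` interface, the `unsplit` helper, and the `execute` method are hypothetical names invented for this illustration, and a local thread pool stands in for the grid nodes.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

public class SingleSplitSketch {
    /** Hypothetical stand-in for a grid task: split into jobs, then reduce their results. */
    interface Task<A, R> {
        List<Callable<R>> split(A arg);
        R reduce(List<R> results);
    }

    /** A "non-splitting" task: split yields exactly one job, reduce passes its result through. */
    static <A, R> Task<A, R> unsplit(Function<A, R> logic) {
        return new Task<A, R>() {
            @Override public List<Callable<R>> split(A arg) {
                Callable<R> wholeTask = () -> logic.apply(arg);
                return Collections.singletonList(wholeTask);
            }
            @Override public R reduce(List<R> results) {
                return results.get(0);
            }
        };
    }

    /** Runs every job on the pool (our stand-in for remote nodes) and reduces the results. */
    static <A, R> R execute(Task<A, R> task, A arg, ExecutorService nodes) throws Exception {
        List<Future<R>> futures = nodes.invokeAll(task.split(arg));
        List<R> results = new ArrayList<>();
        for (Future<R> future : futures) {
            results.add(future.get());
        }
        return task.reduce(results);
    }

    public static void main(String[] args) throws Exception {
        ExecutorService nodes = Executors.newFixedThreadPool(4);
        try {
            Task<String, Integer> task = unsplit(String::length);
            System.out.println(execute(task, "whole task", nodes)); // prints 10
        } finally {
            nodes.shutdown();
        }
    }
}
```

The only point of the sketch is that nothing in the split/reduce contract requires more than one job, so a task that cannot be split still fits the model unchanged.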

There are edge cases where you can gain performance even then:

  • If the computers on the grid are more powerful or less busy than the calling machine (given the right collision resolution policy)
  • If you run your tasks locally on a multi-core CPU (assuming the original processing was sequential, so that spreading tasks across threads lets you use the additional cores)

Non-splitting is an extremely important use case because it lets you gain scalability with minimum effort. In fact, with GridGain you achieve that with just one @Gridify annotation in most cases:

@Gridify
public void someBusiness(Object arg) throws SomeException {
    // Some business logic.
}

Next time you call this method, its entire execution will be moved onto the grid.
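For readers wondering how an annotation alone can redirect a method call, the following is a rough sketch of the general interception idea in plain Java. It is not how GridGain implements @Gridify: the `@Offload` annotation, the `Business` interface, and the `offloading` helper are all hypothetical, a JDK dynamic proxy stands in for whatever interception mechanism the product uses, and a single named thread plays the role of a remote node.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class OffloadSketch {
    /** Hypothetical marker annotation; it only mimics the role @Gridify plays. */
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface Offload {}

    interface Business {
        @Offload
        String someBusiness(Object arg);
    }

    static class BusinessImpl implements Business {
        @Override public String someBusiness(Object arg) {
            // Report which thread ("node") actually ran the business logic.
            return arg + " handled on " + Thread.currentThread().getName();
        }
    }

    /** Wraps target so that calls to @Offload methods run on the pool instead of the caller. */
    static <T> T offloading(Class<T> iface, T target, ExecutorService grid) {
        Object proxy = Proxy.newProxyInstance(
            iface.getClassLoader(), new Class<?>[] {iface},
            (self, method, args) -> {
                if (!method.isAnnotationPresent(Offload.class)) {
                    return method.invoke(target, args); // not annotated: run in place
                }
                Callable<Object> wholeCall = () -> method.invoke(target, args);
                try {
                    return grid.submit(wholeCall).get(); // caller blocks until the "node" replies
                } catch (ExecutionException e) {
                    Throwable cause = e.getCause();
                    throw (cause instanceof InvocationTargetException) ? cause.getCause() : cause;
                }
            });
        return iface.cast(proxy);
    }

    public static void main(String[] args) {
        ExecutorService grid = Executors.newSingleThreadExecutor(r -> new Thread(r, "grid-node-1"));
        try {
            Business business = offloading(Business.class, new BusinessImpl(), grid);
            System.out.println(business.someBusiness("order-42"));
        } finally {
            grid.shutdown();
        }
    }
}
```

The calling code stays the same as before: it invokes `someBusiness` as an ordinary method, and the whole invocation, unsplit, is executed elsewhere.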

I’ve seen a number of pilot projects where, within several hours of downloading GridGain, a team had a 6-8 node grid running and was offloading task execution onto it, gaining an instant 6-8 times scalability improvement (!). And I repeat: within several hours…



