Collocation: The First Rule of Distributed Programming
I am pleased to announce that we recently released GridGain 3.0.4. The last couple of releases have focused, among other things, on convenient and effective collocation of computations with data, and on grouping data that is usually accessed together onto the same nodes. Sending computations exactly to the nodes where the accessed data resides is one of the key components in achieving better scalability. Without collocation, nodes fetch data from other nodes, hold it briefly to perform what is often a quick computation, and discard it almost immediately afterwards. This creates unnecessary data traffic, a.k.a. data noise, and can at times bring a server to its knees.
In my previous blog post I showed how to collocate computations and data directly through the API, using the GridCache.mapKeyToNode(..) method. We have also added analogous methods to the Grid API, so that data affinity can be determined on nodes that do not cache any data themselves. In the latest 3.0.4 release we have also added a very convenient way to provide collocation: the @GridCacheAffinityMapped annotation.
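To make the idea of mapping a key to its node concrete, here is a minimal sketch in plain Java. It is not GridGain's implementation: the class KeyToNodeSketch, its node names, and the simple hash-modulo placement are all assumptions made for illustration. The point it shows is that any node, including one that caches nothing, can compute where a key lives and send the computation there.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a grid's key-to-node mapping; NOT GridGain's API.
final class KeyToNodeSketch {
    private final List<String> nodes;

    KeyToNodeSketch(List<String> nodes) {
        if (nodes.isEmpty()) {
            throw new IllegalArgumentException("at least one node is required");
        }
        this.nodes = new ArrayList<>(nodes);
    }

    // Deterministic placement (assumed: hash modulo node count).
    // Every caller computes the same owner for the same key,
    // so finding the owner requires no data transfer.
    String mapKeyToNode(Object key) {
        int index = Math.floorMod(key.hashCode(), nodes.size());
        return nodes.get(index);
    }
}
```

A caller would ask the mapper for the owner of a key and ship the job to that node, rather than pulling the cached value over the network. A real grid would use a placement scheme that tolerates nodes joining and leaving; hash-modulo is used here only because it is short.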
Say you have two types of objects, Person and Company. Multiple persons can work for the same company, so you will generally want to access Person objects together with the Company for which they work. To do that in a scalable fashion, you can ensure that all people working for the same company are cached on the same node. This way you can send computations to that node and access multiple people from the same company locally. In GridGain, this is what the @GridCacheAffinityMapped annotation is for.
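The Person and Company example can be sketched in self-contained Java as follows. This is an illustration under stated assumptions, not GridGain code: the @AffinityField annotation is a hypothetical stand-in for the role @GridCacheAffinityMapped plays, PersonKey and AffinitySketch are invented names, and placement is again a simple hash-modulo. The sketch assumes the annotation marks the part of a cache key that decides placement, so that a person is placed by company ID rather than by person ID.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;

// Hypothetical marker annotation; a stand-in, NOT GridGain's annotation.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface AffinityField {}

// Hypothetical cache key for a Person: identified by personId, placed by companyId.
final class PersonKey {
    final String personId;

    @AffinityField
    final String companyId;

    PersonKey(String personId, String companyId) {
        this.personId = personId;
        this.companyId = companyId;
    }
}

final class AffinitySketch {
    // Returns the value of the annotated field if the key has one,
    // otherwise the key itself.
    static Object affinityKey(Object key) {
        for (Field field : key.getClass().getDeclaredFields()) {
            if (field.isAnnotationPresent(AffinityField.class)) {
                try {
                    field.setAccessible(true);
                    return field.get(key);
                } catch (IllegalAccessException e) {
                    throw new IllegalStateException(e);
                }
            }
        }
        return key;
    }

    // Places a key on one of nodeCount nodes by hashing its affinity key
    // (assumed placement scheme, chosen for brevity).
    static int nodeIndex(Object key, int nodeCount) {
        return Math.floorMod(affinityKey(key).hashCode(), nodeCount);
    }
}
```

With this arrangement, two PersonKey instances that share a companyId get the same node index, and so does the company when it is cached under the bare company ID. A computation sent to that node can then read the company and all of its people locally.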
From http://gridgain.blogspot.com/2011/01/collocation-first-rule-of-distributed.html