Say Hello To GridGain Data Grid
I have been thinking about how a HelloWorld example should look for a data
grid. After checking some other products, I noticed that the most
popular approach for a HelloWorld app on a data grid is to create an
example with two counterparts: a client and a server. The client
example generally prints out the operation on the cache, and the server
usually prints out the same operation whenever the data ends up on the
remote server. This way users can see that the value stored in the cache
actually does get distributed to remote nodes.
After looking at such examples, it occurred to me that this client/server approach can be implemented much more simply in GridGain using zero deployment and basic event subscription.
All we need to do is make sure that cache operations get printed out on
remote nodes so we can visualize what's going on. However, for that we
don't need to create a separate server app - we can do it all from our
client example code.
So, let's make sure that events are printed out. For that we will
execute a closure on all grid nodes which will subscribe to cache events
and print them. This closure can be executed directly from example code
and will be automatically deployed on remote nodes. Here is what the code looks like:
// Execute this runnable on all grid nodes, local and remote.
G.grid().run(BROADCAST, new Runnable() {
    @Override public void run() {
        // Event listener which will print out cache events, so we
        // can visualize what happens on remote nodes.
        GridLocalEventListener lsnr = new GridLocalEventListener() {
            @Override public void onEvent(GridEvent e) {
                System.out.println("Event '" + e.type() + "' for key: " +
                    ((GridCacheEvent)e).key());
            }
        };
        // GridNodeLocal is a ConcurrentMap attached to every grid node.
        Object prev = G.grid().nodeLocal().putIfAbsent("lsnr", lsnr);
        // Make sure that we only subscribe once regardless
        // of how many times we run the example.
        if (prev == null)
            G.grid().addLocalEventListener(lsnr,
                EVT_CACHE_OBJECT_PUT,
                EVT_CACHE_OBJECT_READ,
                EVT_CACHE_OBJECT_REMOVED);
    }
});
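The subscribe-once step above hinges on the atomic `putIfAbsent` call of a `ConcurrentMap`. As a minimal sketch of just that idiom, the snippet below uses a plain JDK `ConcurrentHashMap` as a stand-in for the node-local map; the class name `SubscribeOnceSketch` and the method `subscribeOnce` are hypothetical and are not part of GridGain.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Toy model only: a plain ConcurrentMap stands in for the node-local map.
class SubscribeOnceSketch {
    // Stores the listener under the key unless one is already there.
    // Returns true only for the caller that actually stored it, which is
    // the one caller that should go on to register the listener.
    static boolean subscribeOnce(ConcurrentMap<String, Object> nodeLocal,
                                 String key, Object lsnr) {
        Object prev = nodeLocal.putIfAbsent(key, lsnr);
        return prev == null;
    }

    public static void main(String[] args) {
        ConcurrentMap<String, Object> nodeLocal = new ConcurrentHashMap<>();
        // First run of the example: the listener gets stored.
        System.out.println("First run subscribes: "
            + subscribeOnce(nodeLocal, "lsnr", new Object()));
        // Second run of the example: the key is taken, so nothing happens.
        System.out.println("Second run subscribes: "
            + subscribeOnce(nodeLocal, "lsnr", new Object()));
    }
}
```

Because `putIfAbsent` is atomic, the check and the store cannot interleave between two concurrent runs, so at most one of them sees `null` and subscribes.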
Note how easy it is in GridGain to execute any kind of code on all grid nodes (or any subset of nodes) without actually having to deploy anything. Now let's play with some basic cache operations and see what happens:
// Create strongly typed cache projection to avoid casting.
final GridCacheProjection<Integer, String> cache =
    G.grid().cache().projection(Integer.class, String.class);
// Store some values in cache.
for (int i = 0; i < 10; i++)
    cache.put(i, "value-" + i);
// Note that size may differ depending on whether cache
// is replicated or partitioned.
System.out.println("Cache size: " + cache.size());
// Visit every cache element stored on local node.
// Note that 'CI1' is just a type alias for 'GridInClosure' type.
cache.forEach(new CI1<GridCacheEntry<Integer, String>>() {
    @Override public void apply(GridCacheEntry<Integer, String> e) {
        // Peek at locally cached values.
        System.out.println("Visited locally cached entry: " + e.peek());
    }
});
// Collocate computations and data.
for (int i = 0; i < 10; i++) {
    final int key = i;
    // Find primary node for a key.
    final UUID nodeId = cache.mapKeyToNode(key);
    // Execute your computations on nodes where the data is cached to avoid a
    // potentially heavy operation of bringing data to the local node.
    // This is called Collocation of Computations and Data.
    G.grid().node(nodeId).run(UNICAST, new Runnable() {
        @Override public void run() {
            System.out.println("Collocating computations and data on node: " + nodeId);
            // Usually you would do something more complex than this :)
            System.out.println("Cached value: " + cache.peek(key));
        }
    });
}
// The 'get' operation will bring values from remote nodes
// even if they are not cached on local node. Generally,
// you would want to avoid it, if possible, as it may
// create unnecessary data traffic.
for (int i = 0; i < 10; i++)
    System.out.println("Cached value: " + cache.get(i));
The example above is just a small sample of what you can do with the GridGain data grid. Note that if the cache is configured to be replicated (which is the default), then data will be replicated to all nodes and every node will get the same copy. If the cache is partitioned, then only a designated primary node (and also backup nodes, if any) will get to cache a specific key-value pair.
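To make the difference between the two modes concrete, here is a toy placement model in plain Java. It is only a sketch under stated assumptions: nodes are numbered 0..N-1, the primary is picked by hash modulo node count, and backups are simply the next nodes in order. None of this is GridGain's actual affinity function, and `CachePlacementSketch` and its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: nodes are plain indexes, not real grid nodes.
class CachePlacementSketch {
    // Replicated mode: every node holds a copy of every entry.
    static List<Integer> replicatedOwners(int nodeCount) {
        List<Integer> owners = new ArrayList<>();
        for (int n = 0; n < nodeCount; n++)
            owners.add(n);
        return owners;
    }

    // Partitioned mode: one primary node plus up to 'backups' other nodes.
    // Assumption: primary = hash mod nodeCount, backups = following nodes.
    static List<Integer> partitionedOwners(Object key, int nodeCount, int backups) {
        int primary = Math.floorMod(key.hashCode(), nodeCount);
        List<Integer> owners = new ArrayList<>();
        for (int i = 0; i <= Math.min(backups, nodeCount - 1); i++)
            owners.add((primary + i) % nodeCount);
        return owners;
    }

    public static void main(String[] args) {
        for (int key = 0; key < 5; key++) {
            System.out.println("key " + key
                + " replicated on " + replicatedOwners(3)
                + ", partitioned on " + partitionedOwners(key, 3, 1));
        }
    }
}
```

In this model a three-node replicated cache stores every entry three times, while a partitioned cache with one backup stores it twice, which is why the reported cache size can differ between the two modes.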
Also note how easily we brought our computations to the nodes where the
data is cached, as opposed to bringing the data to the computations.
Performing computations without any unnecessary data movement (a.k.a.
data noise) is one of the most important elements in achieving better
scalability.
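The effect of collocation on data movement can be illustrated with a small simulation. This is a sketch, not a measurement: each simulated node is just a `Map`, keys are assigned to nodes by modulo (an assumption, not GridGain's affinity), and the cost of shipping the closure itself is ignored. The class `CollocationSketch` and its methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model only: each Map is the local store of one simulated node.
class CollocationSketch {
    // Assumption: the primary node of a key is key mod nodeCount.
    static int primaryNode(int key, int nodeCount) {
        return Math.floorMod(key, nodeCount);
    }

    // Spreads keys 0..keyCount-1 over the simulated nodes.
    static List<Map<Integer, String>> buildNodes(int nodeCount, int keyCount) {
        List<Map<Integer, String>> nodes = new ArrayList<>();
        for (int n = 0; n < nodeCount; n++)
            nodes.add(new HashMap<>());
        for (int k = 0; k < keyCount; k++)
            nodes.get(primaryNode(k, nodeCount)).put(k, "value-" + k);
        return nodes;
    }

    // Data to computation: one node reads every key, so each value that
    // is not stored locally has to cross the network.
    static int transfersWithGet(List<Map<Integer, String>> nodes,
                                int localNode, int keyCount) {
        int transfers = 0;
        for (int k = 0; k < keyCount; k++)
            if (!nodes.get(localNode).containsKey(k))
                transfers++;
        return transfers;
    }

    // Computation to data: the job for each key runs on that key's
    // primary node and only looks at the local store.
    static int transfersWithCollocation(List<Map<Integer, String>> nodes,
                                        int keyCount) {
        int transfers = 0;
        for (int k = 0; k < keyCount; k++) {
            Map<Integer, String> local = nodes.get(primaryNode(k, nodes.size()));
            if (!local.containsKey(k))
                transfers++;
        }
        return transfers;
    }

    public static void main(String[] args) {
        List<Map<Integer, String>> nodes = buildNodes(3, 10);
        System.out.println("Values moved with get: "
            + transfersWithGet(nodes, 0, 10));
        System.out.println("Values moved with collocation: "
            + transfersWithCollocation(nodes, 10));
    }
}
```

With three nodes and ten keys, the node that reads everything has to pull in the six values it does not own, while the collocated jobs move none. The gap widens as values get larger or nodes are added.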
To run this example, start up a few standalone GridGain nodes by executing the GRIDGAIN_HOME/bin/ggstart.{sh|bat} script and watch what happens.
From http://gridgain.blogspot.com/2011/01/say-hello-to-gridgain-data-grid.html