Getting Started With Clustering

If you prefer video demos to reading, you can skip directly to the Screencast at the bottom of this post.

What do clustering frameworks really do? More often than not, they provide the ability to auto-discover servers on the network, share resources, and schedule tasks. Some also add distributed messaging and distributed event notification.

While there are some well-known clustering frameworks, like ZooKeeper or Mesos, they usually provide only rudimentary clustering capabilities. Often, however, on top of basic clustering you also need to perform MapReduce computations, distribute closures, or distribute data. In cases like these, Compute Grids (a.k.a. High Performance Computing Grids) or Data Grids become very useful.
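To make the MapReduce idea concrete, here is a minimal plain-Java sketch. It does not use any grid library: a local thread pool stands in for the cluster nodes, and the class and method names (MapReduceSketch, totalLength) are hypothetical names chosen only for this illustration. The task is split into one job per word (map), and the partial results are summed (reduce).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical illustration: a thread pool stands in for the cluster nodes.
class MapReduceSketch {
    // Counts the non-space characters in a phrase using a map step and a reduce step.
    static int totalLength(String phrase, int workers) {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            // Map: one job per word; each job returns the length of its word.
            List<Callable<Integer>> jobs = new ArrayList<>();
            for (String word : phrase.trim().split("\\s+")) {
                jobs.add(() -> word.length());
            }
            // Reduce: invokeAll waits for every job, then the partial results are summed.
            int sum = 0;
            for (Future<Integer> partial : pool.invokeAll(jobs)) {
                sum += partial.get();
            }
            return sum;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("job execution failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // 5 + 4 + 7 + 5 = 21
        System.out.println(totalLength("Hello Grid Enabled World", 4));
    }
}
```

On a real compute grid the jobs would be serialized and shipped to other machines rather than to local threads, but the split, execute, and aggregate shape stays the same.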

For those not familiar with the term "Data Grid", it is simply a distributed cache with more advanced features, such as distributed data querying and transactions.
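As a rough sketch of what "distributed cache plus querying" means, the plain-Java class below keeps one map per simulated node, routes each key to its owning partition by hash, and answers a query by scanning every partition and merging the matches. All names here (PartitionedCacheSketch, keysWhere) are hypothetical, nothing is sent over a network, and transactions are left out entirely.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical illustration: each map stands in for the cache partition held by one node.
class PartitionedCacheSketch {
    private final List<Map<String, Integer>> partitions = new ArrayList<>();

    PartitionedCacheSketch(int nodes) {
        for (int i = 0; i < nodes; i++) {
            partitions.add(new HashMap<>());
        }
    }

    // Route a key to its owning partition by hash (floorMod keeps the index non-negative).
    private Map<String, Integer> owner(String key) {
        return partitions.get(Math.floorMod(key.hashCode(), partitions.size()));
    }

    void put(String key, int value) {
        owner(key).put(key, value);
    }

    Integer get(String key) {
        return owner(key).get(key);
    }

    // Query: evaluate the filter on every partition and merge the matching keys, sorted.
    List<String> keysWhere(Predicate<Integer> filter) {
        List<String> result = new ArrayList<>();
        for (Map<String, Integer> partition : partitions) {
            for (Map.Entry<String, Integer> entry : partition.entrySet()) {
                if (filter.test(entry.getValue())) {
                    result.add(entry.getKey());
                }
            }
        }
        Collections.sort(result);
        return result;
    }

    public static void main(String[] args) {
        PartitionedCacheSketch cache = new PartitionedCacheSketch(3);
        cache.put("alice", 30);
        cache.put("bob", 17);
        cache.put("carol", 45);
        System.out.println(cache.keysWhere(age -> age >= 18)); // [alice, carol]
    }
}
```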

Compute Grids and Data Grids often provide advanced clustering APIs that are nonetheless simple to use. Here I will show some basic examples on top of the GridGain In-Memory Data Grid, which is open source and licensed under the Apache license.

GridGain clustering supports automatic node discovery, and it also lets you create arbitrary virtual sub-groups of grid nodes within the cluster, exchange messages between them, and receive remote event notifications. While I have blogged about it in more detail before, here is a simple example that demonstrates auto-discovery and distributed computations on the cluster:

try (Grid grid = GridGain.start()) {
   // Create sample runnable.
   Runnable r = new GridRunnable() {
      @Override public void run() {
         System.out.println("Hello World");
      }
   };
  
   // Broadcast to all grid nodes.
   grid.compute().broadcast(r).get();
  
   // Broadcast to remote nodes only.
   grid.forRemotes().compute().broadcast(r).get();
  
   // Unicast to some remote node picked by load balancer.
   grid.forRemotes().compute().run(r).get();
  
   // Unicast to some node with CPU load less than 50%.
   grid.forPredicate(new GridPredicate<GridNode>() {
      @Override public boolean apply(GridNode node) {
         return node.metrics().getCurrentCpuLoad() < 0.5;
      }
   }).compute().run(r).get();
}
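
If you want to see the sub-group idea without starting a grid, the following plain-Java sketch mimics it. This is not the GridGain API: ProjectionSketch and its Node class are hypothetical stand-ins, jobs run in the local JVM, and the "load balancer" is assumed to simply pick the node with the lowest CPU load. It only illustrates how a predicate narrows the set of nodes before a broadcast or unicast.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical illustration of virtual sub-groups; not the GridGain API.
class ProjectionSketch {
    // Stand-in for a grid node with a few metrics.
    static final class Node {
        final String id;
        final double cpuLoad;
        final boolean local;

        Node(String id, double cpuLoad, boolean local) {
            this.id = id;
            this.cpuLoad = cpuLoad;
            this.local = local;
        }
    }

    private final List<Node> nodes;

    ProjectionSketch(List<Node> nodes) {
        this.nodes = new ArrayList<>(nodes);
    }

    // Create a sub-group containing only the nodes accepted by the filter.
    ProjectionSketch forPredicate(Predicate<Node> filter) {
        List<Node> kept = new ArrayList<>();
        for (Node node : nodes) {
            if (filter.test(node)) {
                kept.add(node);
            }
        }
        return new ProjectionSketch(kept);
    }

    ProjectionSketch forRemotes() {
        return forPredicate(node -> !node.local);
    }

    // Broadcast: run the job once per node in the group; returns the ids it "ran on".
    List<String> broadcast(Runnable job) {
        List<String> ranOn = new ArrayList<>();
        for (Node node : nodes) {
            job.run();
            ranOn.add(node.id);
        }
        return ranOn;
    }

    // Unicast: run the job once, on the least-loaded node of the group (assumed policy).
    String run(Runnable job) {
        if (nodes.isEmpty()) {
            throw new IllegalStateException("projection is empty");
        }
        Node target = nodes.get(0);
        for (Node node : nodes) {
            if (node.cpuLoad < target.cpuLoad) {
                target = node;
            }
        }
        job.run();
        return target.id;
    }

    public static void main(String[] args) {
        ProjectionSketch grid = new ProjectionSketch(Arrays.asList(
            new Node("local", 0.2, true),
            new Node("r1", 0.7, false),
            new Node("r2", 0.4, false)));
        Runnable hello = () -> System.out.println("Hello World");
        System.out.println(grid.forRemotes().broadcast(hello));                     // [r1, r2]
        System.out.println(grid.forPredicate(n -> n.cpuLoad < 0.5).run(hello));     // local
    }
}
```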

Screencast

Here is a brief screencast showing how to get started with running computations on your cluster in under 5 minutes:

Published at DZone with permission of Dmitriy Setrakyan, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.
