In-Memory Data Grids… Explained.
Many companies that would not have considered using in-memory technology in the past due to its cost are now changing their core systems' architectures to accommodate it. They want to take advantage of the low-latency transaction processing that in-memory technology offers. With the price of 1 GB of RAM now less than one dollar, and RAM prices dropping roughly 30% every 18 months, it has become economically affordable to load entire operational datasets into memory and achieve dramatic performance improvements.
Companies are using this, for example, to perform calculations or create live dashboards that give management immediate insight into crucial operational data from their systems. Currently, users often have to wait until the end of a reporting period for batch jobs to process the accumulated data and generate the desired reports.
Modern in-memory technology connects to existing data stores such as Hadoop or traditional data warehouses and makes this data available in RAM, where it can then be queried or used in processing tasks with unprecedented performance. The power of such real-time insight lets companies react exponentially faster and more flexibly than current systems allow.
This paper is meant to help readers understand the key features of modern in-memory products and how they affect integration and performance. Two key components form the underlying basis for the core capabilities of in-memory technology: the in-memory compute grid and the in-memory data grid. This paper concentrates on in-memory data grids.
What Is an In-Memory Data Grid?
The goal of an in-memory data grid (IMDG) is to provide extremely low-latency access to, and high availability of, application data by keeping it in memory and doing so in a highly parallelized way. By loading terabytes of data into memory, an IMDG is able to support most big data processing requirements. At a very high level, an IMDG is a distributed key-value object store, similar in its interface to a typical concurrent hash map. You store and retrieve objects using keys.
Unlike systems where keys and values are limited to byte arrays or strings, an IMDG can have any application domain object as either a value or a key. This provides tremendous flexibility: exactly the same object your business logic is using can be kept in the data grid, without the extra step of marshaling and de-marshaling. It also simplifies the use of the data grid because, in most cases, you can interface with the distributed data store as you would with a simple hash map.
Being able to work with domain objects directly is one of the main differences between IMDGs and in-memory databases (IMDBs). With the latter, users still need to perform object-relational mapping (ORM), which typically adds significant performance overhead and complexity. With in-memory data grids, this is avoided.
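As a rough illustration of the programming model, the sketch below uses a plain java.util.concurrent.ConcurrentHashMap as a stand-in for the data grid. In a real IMDG the map would be partitioned across the cluster and the exact API varies by vendor, but the way application code stores and retrieves domain objects looks much the same.

import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class DomainObjectCacheSketch {
    // A plain domain object; in an IMDG the same class can be used as a cache value.
    static class Person {
        final String name;
        final int age;
        Person(String name, int age) { this.name = name; this.age = age; }
    }

    public static void main(String[] args) {
        // Stand-in for a distributed cache: same programming model as a
        // concurrent hash map, but an IMDG spreads entries across nodes.
        ConcurrentMap<String, Person> cache = new ConcurrentHashMap<>();

        // Store and retrieve the domain object directly, with no ORM layer
        // and no explicit marshaling step in application code.
        cache.put("p1", new Person("Alice", 34));
        Person p = cache.get("p1");
        System.out.println(p.name + " is " + p.age);
    }
}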
How Do In-Memory Data Grids Differ From Other Solutions?
An IMDG, in general, is significantly different from products such as NoSQL databases, IMDBs, or NewSQL databases. For example, here are just some of GridGain's IMDG features that make it unique:
- Distributed ACID transactions with an in-memory optimized 2PC protocol
- Data partitioning across a cluster (including full replication)
- Work with domain objects rather than with primitive types or “documents”
- Tight integration with the in-memory compute grid (IMCG)
- Zero deployment for both IMCG and IMDG
- Pluggable segmentation (a.k.a. “split-brain”) resolution
- Pluggable expiration policies (including built-in LRU, LIRS, random, and time-based)
- Read-through and write-through with a pluggable durable store
- Synchronous and asynchronous operations throughout
- Pluggable data overflow storage
- Master/master data replication and invalidation in both synchronous and asynchronous modes
- Write-behind cache store support
- Automatic, manual, and delayed pre-loading on topology changes
- Support for fully active replicas (backups)
- Support for structured and unstructured data
- Pluggable indexing support
Essentially, IMDGs in their purest form can be viewed as distributed hash maps with each key cached on a particular cluster node; the bigger the cluster, the more data you can cache. The trick to this architecture is to make sure that the processing occurs on those cluster nodes where the required data is cached. By doing this, all cache operations become local and there is no, or minimal, data movement within the cluster. In fact, when using a well-designed IMDG there should be absolutely no data movement on stable topologies. The only time some of the data is moved is when new nodes join or existing nodes leave, causing some data repartitioning within the cluster.
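To make the key-to-node mapping concrete, here is a minimal, purely illustrative sketch of hash-based partitioning. Production IMDGs use more sophisticated affinity functions, but the principle of deterministically assigning each key to a partition, and each partition to a node, is the same.

import java.util.List;

public class PartitionSketch {
    // Deterministically map a key to one of a fixed number of partitions.
    static int partitionFor(Object key, int partitions) {
        return (key.hashCode() & 0x7fffffff) % partitions;
    }

    // Assign each partition to a node; when nodes join or leave, some
    // partitions move, which is the repartitioning mentioned above.
    static String nodeFor(Object key, List<String> nodes, int partitions) {
        return nodes.get(partitionFor(key, partitions) % nodes.size());
    }

    public static void main(String[] args) {
        List<String> nodes = List.of("node-1", "node-2", "node-3");
        for (String key : List.of("k1", "k2", "k3")) {
            System.out.println(key + " -> " + nodeFor(key, nodes, 1024));
        }
    }
}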
The picture below shows a classic IMDG with a key set of {k1, k2, k3}, where each key belongs to a different node. The external database component is optional; if present, IMDGs will usually automatically read data from the database or write data to it (a.k.a. read-through and write-through logic).
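The read-through and write-through behavior can be sketched in a few lines. This is not GridGain's actual cache-store API, just a minimal illustration of the pattern where a cache miss falls through to the backing store and every write is propagated to it.

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class ReadWriteThroughSketch<K, V> {
    private final Map<K, V> cache = new ConcurrentHashMap<>();
    private final Map<K, V> database = new ConcurrentHashMap<>();   // stand-in for the external store

    // Read-through: on a cache miss, load the value from the backing store.
    public V get(K key) {
        return cache.computeIfAbsent(key, database::get);
    }

    // Write-through: update the backing store and the cache together.
    public void put(K key, V value) {
        database.put(key, value);
        cache.put(key, value);
    }
}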
Even though IMDGs usually share some common basic functionality, there are many features and implementation details that differ between vendors. When evaluating an IMDG product, pay attention, for example, to eviction policies, (pre)loading techniques, concurrent repartitioning, and memory overhead. Also pay attention to the ability to query data at runtime. Some IMDGs, such as GridGain, allow users to query in-memory data using standard SQL, including support for distributed joins, which is pretty rare.
The typical use of IMDGs is to partition data across the cluster and then send collocated computations to the nodes where the data is. Since computations are usually part of compute grids and have to be properly deployed, load-balanced, failed over, or scheduled, the integration between compute grids and IMDGs is very important for obtaining the best performance. When both the in-memory compute grid and the data grid are optimized to work together and utilize the same APIs, developers can more quickly deploy a system that reliably delivers the highest performance.
Distributed ACID Transactions
One of the distinguishing characteristics of IMDGs is support for distributed ACID transactions. Generally, a two-phase-commit (2PC) protocol is used to ensure data consistency within a cluster. Different IMDGs have different underlying locking mechanisms, but more advanced implementations provide concurrent locking mechanisms (like MVCC, multi-version concurrency control), reduce network chattiness to a minimum, and specifically optimize their main algorithms for in-memory processing, guaranteeing transactional ACID consistency with very high performance.
Guaranteed data consistency is one of the main differences between IMDGs and NoSQL databases.
NoSQL databases are usually designed with an eventual consistency (EC) approach, where data is allowed to be inconsistent for a period of time as long as it eventually becomes consistent. Generally, writes on EC-based systems are fast, but reads are slow (more precisely, consistent reads are only as fast as the writes). The latest IMDGs with an optimized 2PC protocol should at least match, if not outperform, EC-based systems on writes, and be significantly faster on reads. It is interesting to note that the industry has come full circle, moving from a then-slow 2PC approach to the EC approach, and now from EC back to an optimized 2PC, which is often significantly faster.
Different products have optimized the 2PC protocol in different ways, but generally the purpose of all optimizations is to increase concurrency, minimize network overhead, and reduce the number of locks a transaction requires to complete. As an example, Google's distributed global database, Spanner, is based on a transactional 2PC approach simply because 2PC provided a faster and more straightforward way to guarantee data consistency and high throughput. GridGain introduced its "HyperLocking" technology, which enables effective single and group distributed locking and is at the core of its transactional performance.
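For readers unfamiliar with the protocol, the skeleton of two-phase commit is easy to sketch. The Participant interface below is hypothetical, and the sketch omits the locking, logging, and failure recovery that a real IMDG layers on top.

import java.util.List;

public class TwoPhaseCommitSketch {
    // Hypothetical participant: each node holding part of the transaction's data.
    interface Participant {
        boolean prepare();   // acquire locks, validate, record intent; vote yes/no
        void commit();
        void rollback();
    }

    // Phase 1: ask every participant to prepare. Phase 2: commit only if all
    // voted yes, otherwise roll everyone back. Returns true if committed.
    static boolean runTransaction(List<Participant> participants) {
        boolean allPrepared = true;
        for (Participant p : participants) {
            if (!p.prepare()) {
                allPrepared = false;
                break;
            }
        }
        for (Participant p : participants) {
            if (allPrepared) p.commit(); else p.rollback();
        }
        return allPrepared;
    }
}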
Distributed data grid transactions in GridGain span data cached on local as well as remote nodes. While automatic enlisting into JEE/JTA transactions is supported, the GridGain data grid also allows users to create lighter-weight cache transactions, which are often more convenient to use. GridGain cache transactions support all the ACID properties that you would expect from any transaction, including support for optimistic and pessimistic concurrency modes and read-committed, repeatable-read, and serializable isolation levels. If a persistent data store is configured, then the transactions will also automatically span the data store.
Multiversion Concurrency Control (MVCC)
GridGain's in-memory data grid concurrency is based on an advanced implementation of MVCC (multiversion concurrency control), the same technology used by practically all database management systems. It provides practically lock-free concurrency management by maintaining multiple versions of data instead of using wide-scope locks. Thus, MVCC in GridGain provides a backbone for high performance and overall throughput for systems under load.
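The core MVCC idea can be illustrated with a small, self-contained sketch (this is not GridGain's internal design): writers append new versions under increasing version numbers, and readers resolve the newest version visible at their snapshot, so reads never block writes.

import java.util.Map;
import java.util.concurrent.ConcurrentNavigableMap;
import java.util.concurrent.ConcurrentSkipListMap;
import java.util.concurrent.atomic.AtomicLong;

public class MvccValueSketch<V> {
    // Versions of a single logical value, keyed by a monotonically increasing counter.
    private final ConcurrentNavigableMap<Long, V> versions = new ConcurrentSkipListMap<>();
    private final AtomicLong clock = new AtomicLong();

    // A write creates a new version; it never blocks readers.
    public long write(V value) {
        long version = clock.incrementAndGet();
        versions.put(version, value);
        return version;
    }

    // A reader first takes a snapshot version...
    public long snapshot() {
        return clock.get();
    }

    // ...and then sees the newest version at or below that snapshot.
    public V readAt(long snapshotVersion) {
        Map.Entry<Long, V> entry = versions.floorEntry(snapshotVersion);
        return entry == null ? null : entry.getValue();
    }
}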
In-Memory SQL Queries
What use would there be in caching all the data in memory if you could not query it? The in-memory platform should offer a variety of ways to query its data, such as standard SQL-based queries or Lucene-based text queries.
The JDBC driver implementation lets you query distributed data from the GridGain cache using standard SQL queries and the standard JDBC API. It will automatically fetch only the fields you actually need from the objects stored in the cache.
The GridGain SQL query type lets you perform distributed cache queries using standard SQL syntax. There are almost no restrictions on which SQL syntax can be used. All inner, outer, and full joins are supported, as well as a rich set of SQL grammar and functions. The ability to join different classes of objects stored in the cache or across different caches makes GridGain queries a very powerful tool. All indexes are usually kept in memory, resulting in very low latencies for query execution.
Text queries are available when you are working with unstructured text data. GridGain can index such data with the Lucene or H2 text engine to let you query large volumes of text efficiently.
If there is no need to return results to the caller, all query results can be visited directly on the remote nodes. Then all the logic is performed on the remotely queried nodes without sending any queried data to the caller. This way, analytics can be run directly on structured or unstructured data at in-memory speed and with low latencies. At the same time, GridGain provides applications and developers with a familiar way to retrieve and analyze the data.
Here's a quick example. Notice how the Java code looks 100% identical to code that talks to a standard SQL database, yet you are working with an in-memory data platform:
// Register JDBC driver.
Class.forName("org.gridgain.jdbc.GridJdbcDriver");

// Open JDBC connection.
conn = DriverManager.getConnection(
    "jdbc:gridgain://localhost/" + CACHE_NAME, configuration());

// Create prepared statement.
PreparedStatement stmt = conn.prepareStatement(
    "select name, age from Person where age >= ?");

// Configure prepared statement.
stmt.setInt(1, minAge);

// Get result set.
ResultSet rs = stmt.executeQuery();
BigMemory Support
Traditionally, the JVM has been very good at garbage collection (GC). However, when running with large amounts of memory available, GC pauses can get very long. This generally happens because the GC now has a lot more memory to manage and often cannot cope without stopping your application completely (a.k.a. stop-the-world pauses) to allow itself to catch up. In our internal tests with the heap size set to 60 GB or 90 GB, GC pauses were sometimes as long as 5 minutes. Traditionally, this problem was solved by starting multiple JVMs on the same physical box, but that does not always work very well, as some applications want to collocate large amounts of data in one JVM for faster processing.
To mitigate large GC pauses, GridGain supports BigMemory, with data allocated off-heap instead of on-heap. Thus, the JVM GC does not know about it and does not slow down. You can start your Java application with a relatively small heap, e.g. below 512 MB, and then let GridGain utilize hundreds of gigabytes of memory as an off-heap data cache. Whenever data is first accessed, it gets cached in on-heap memory. Then, after a certain period of non-use, it gets placed into the off-heap memory cache. If your off-heap memory gets full, the least used data can optionally be evicted to the disk overflow store, also called the swap store.
One of the distinguishing characteristics of GridGain off-heap memory is that the on-heap memory footprint is constant and does not grow with the size of your data. Also, an off-heap cache entry has very little overhead, which means that you can fit more data in memory. Another interesting feature of GridGain is that both primary and secondary indexes for SQL can optionally be kept in off-heap memory as well.
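The underlying mechanism can be illustrated with plain JDK facilities (this is only a conceptual sketch, not how GridGain manages its off-heap store): a direct ByteBuffer lives outside the Java heap, so the bytes it holds are not scanned or moved by the garbage collector.

import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class OffHeapSketch {
    public static void main(String[] args) {
        // Allocate 64 MB outside the Java heap; the GC does not manage this memory.
        ByteBuffer offHeap = ByteBuffer.allocateDirect(64 * 1024 * 1024);

        // Serialize a value into off-heap memory...
        byte[] value = "person:alice:34".getBytes(StandardCharsets.UTF_8);
        offHeap.putInt(value.length);
        offHeap.put(value);

        // ...and read it back on demand.
        offHeap.flip();
        byte[] read = new byte[offHeap.getInt()];
        offHeap.get(read);
        System.out.println(new String(read, StandardCharsets.UTF_8));
    }
}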
Datacenter Replication
When working with multiple data centers, it is important to make sure that if one data center goes down, another data center is fully capable of picking up its load and data. Data center replication is meant to solve exactly this problem. When data center replication is turned on, the GridGain data grid will automatically make sure that each data center is consistently backing up its data to other data centers (there can be one or more).
GridGain supports both active-active and active-passive modes for replication. In active-active mode, both data centers are fully operational online and act as a backup copy of each other. In active-passive mode, only one data center is active and the other data center serves only as a backup for the active one.
Data center replication can be either transactional or eventually consistent. In transactional mode, a data grid transaction is considered complete only when all the data has been replicated to the other data center; if the replication step fails, the whole transaction is rolled back on both data centers. In eventually consistent mode, the transaction will usually complete before the replication finishes. In this mode, the data is usually buffered on one data center and then flushed to the other data center either when the buffer fills up or when a certain time period elapses. Eventually consistent mode is generally a lot faster, but it also introduces a lag between updates on one data center and that data being replicated to the other.
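A minimal sketch of the eventually consistent path might look like the class below. The names and thresholds are illustrative only, not GridGain's API, but it shows the two flush triggers just described: buffer size and elapsed time.

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

public class BufferedReplicatorSketch<T> {
    private final List<T> buffer = new ArrayList<>();
    private final int maxBufferSize;
    private final Consumer<List<T>> remoteDataCenter;   // stand-in for the replication sink
    private final ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();

    public BufferedReplicatorSketch(int maxBufferSize, long flushPeriodMs,
                                    Consumer<List<T>> remoteDataCenter) {
        this.maxBufferSize = maxBufferSize;
        this.remoteDataCenter = remoteDataCenter;
        // Time-based flush: replicate whatever has accumulated since the last flush.
        timer.scheduleAtFixedRate(this::flush, flushPeriodMs, flushPeriodMs, TimeUnit.MILLISECONDS);
    }

    // Called on every local update; size-based flush once the buffer is full.
    public synchronized void record(T update) {
        buffer.add(update);
        if (buffer.size() >= maxBufferSize) {
            flush();
        }
    }

    public synchronized void flush() {
        if (buffer.isEmpty()) {
            return;
        }
        remoteDataCenter.accept(new ArrayList<>(buffer));
        buffer.clear();
    }
}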
If one of the data centers goes offline, the other will immediately take over responsibility for it. Whenever the crashed data center comes back online, it will receive all the updates it has missed from the other data center.
In-Memory Compute Grid Integration
Integration between the IMCG and IMDG is based on the idea of affinity routing. Affinity routing is one of the key concepts behind compute and data grid technologies (whether they are in-memory or disk based). In general, affinity routing allows you to co-locate a job and the data set this job needs to process.
The idea is pretty simple: if the job and data are not co-located, the job will arrive on some remote node and will have to fetch the necessary data from yet another node where the data is stored. Once processed, this data will most likely have to be discarded (since it is already stored and backed up elsewhere). This process incurs an expensive network trip plus all the associated marshaling and demarshaling. At scale, this behavior can bring almost any system to a halt.
Affinity co-location solves this problem by co-locating the job with its necessary data set. We say that there is an affinity between the processing (i.e., the job) and the data that this processing requires, and therefore we can route the job, based on this affinity, to the node where the data is stored, avoiding unnecessary network trips and extra marshaling and demarshaling.
GridGain provides advanced capabilities for affinity co-location: from a simple single-method call to sophisticated APIs supporting complex affinity keys and non-trivial topologies.
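To make the concept concrete without tying it to a specific product API, here is a purely hypothetical, single-process sketch: the job is handed to whichever node owns the key, so its data access stays local. Real IMDGs expose the same idea through a single affinityRun/affinityCall-style method.

import java.util.List;
import java.util.Map;
import java.util.function.Function;

public class AffinityRoutingSketch {
    // Hypothetical node abstraction: each node owns a slice of the cached data.
    interface ClusterNode {
        boolean ownsKey(Object key);
        Map<Object, Object> localCache();
    }

    // Route the job to the node that owns the key so the data access is local;
    // no data is shipped to the caller, only the (small) job and its result.
    static <T> T affinityCall(List<ClusterNode> nodes, Object key,
                              Function<Map<Object, Object>, T> job) {
        for (ClusterNode node : nodes) {
            if (node.ownsKey(key)) {
                return job.apply(node.localCache());
            }
        }
        throw new IllegalStateException("No node owns key " + key);
    }
}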
Summary
In-memory data grids are used throughout a wide spectrum of industries, in applications as diverse as risk analytics, trading systems, bioinformatics, ecommerce, and online gaming. Essentially, every project that struggles with scalability and performance can benefit from in-memory processing and an in-memory data grid architecture. When you evaluate different products, make sure they offer the advanced features outlined in this paper. This way you can find an optimal solution for your needs and ensure, right at the outset, that your solution will actually scale flawlessly in those critical moments when you need it to.
Published at DZone with permission of Nikita Ivanov, DZone MVB. See the original article here.