Infinispan for the Power-user: Cache Modes
This is the second part of an in-depth series of articles on Infinispan. In this article I'd like to talk about Infinispan's different cache modes, and why you'd use one over the other.
Read the other parts in this series:Part 1 - Remoting
Part 2 - Cache Modes
Part 3 - Event Notifications
Infinispan is primarily a peer-to-peer system where cache instances are expected to live within the same JVM of your application.
NOTE: As of Infinispan 4.1.0, Infinispan also supports client-server style of interaction, but that is outside the scope of this article and will be addressed in a separate article once Infinispan 4.1.0 is released.
Infinispan supports 3 broad clustered modes, and a single non-clustered mode. Further, the clustered modes could be configured to use a synchronous or asynchronous transport for network communications. Let's start with the local mode.
While Infinispan is particularly interesting in clustered mode, it also offers a very capable local mode, where it acts as a simple, in-memory data cache similar to JBoss Cache and EHCache. But why would one use a local cache rather than a map? Caches offer a lot of features over and above a simple map, including write-through and write-behind caching to persist data, eviction of entries to prevent running out of memory, and support for expirable entries. Infinispan, specifically, is built around a high-performance, read-biased data container which uses modern techniques like MVCC locking – which buys you non-blocking, thread-safe reads even when concurrent writes are taking place. Infinispan also makes heavy use of compare-and-swap and other lock-free algorithms, making it ideal for high-throughput, multi-CPU/multi-core environments. Further, Infinispan's Cache API extends the JDK's ConcurrentMap – making migration from a map to Infinispan trivial. Performance benchmarks in local mode are available in this blog article.
Replication is a simple clustered mode where cache instances automatically discover neighboring instances on other JVMs on the same local network, and form a cluster. Entries added to any of these cache instances will be replicated to all other cache instances in the cluster, and can be retrieved locally from any instance. This clustered mode provides a quick and easy way to share state across a cluster, however replication practically only performs well in small clusters (under 10 servers), due to the number of replication messages that need to happen – as the cluster size increases. Infinispan can be configured to use UDP multicast which mitigates this problem to some degree.
Invalidation is a clustered mode that does not actually share any data at all, but simply aims to remove data that may be stale from remote caches. This cache mode only makes sense if you have another, permanent store for your data such as a database and are only using Infinispan as an optimization in a read-heavy system, to prevent hitting the database every time you need some state.
Invalidation allows you to store state in the cache, but every time an entry is modified, all remote caches are notified that the entry has been modified and if the remote caches happen to have a cached copy, this cached copy is purged.
Distribution is a powerful clustering mode which allows Infinispan to scale linearly as more servers are added to the cluster. Distribution makes use of a consistent hash algorithm to determine where in a cluster entries should be stored. This algorithm is configured with the number of copies of each entry that should be maintained cluster-wide. This represents the tradeoff between performance and durability of data. The more copies you maintain, the lower performance will be, but also the lower the risk of losing data due to server outages. But regardless of how many copies are maintained, distribution still scales linearly and this is key to scalability. Another feature of the consistent hash algorithm is that it is deterministic in locating entries without resorting to multicasting requests or maintaining expensive metadata. This means that doing a PUT would result in at most num_copies remote calls, and doing a GET anywhere in the cluster would result in at most 1 remote call. In reality, num_copies remote calls are made even for a GET, but these are done in parallel and as soon as any one of these returns, the entry is passed back to the caller.
To prevent repeated remote calls when doing multiple GETs, L1 caching can be enabled. L1 caching places remotely received values in a near cache for a short period of time (configurable) so repeated lookups would not result in remote calls. In the above diagram, if L1 was enabled, a subsequent GET for the same key on Server3 would not result in any remote calls.
L1 caching is not free though. Enabling it comes at a cost, and this cost is that every time a key is updated, an invalidation message needs to be multicast to ensure nodes with the entry in L1 invalidates the entry. Further, this adds a memory overhead as L1 caches take up space. Is L1 caching right for you? The correct approach is to benchmark your application with and without L1 enabled and see what works best for your access pattern.
Synchronous or asynchronous?
Orthogonal, but still related to clustered modes discussed above, is whether remote calls made are synchronously or asynchronously. Synchronous simply means that any remote call made will block until the recipient applies the change and sends back an acknowledgement. Asynchronous mode is where remote calls are made, but the caller does not wait for an acknowledgement, but simply assumes the remote call was successful and returns. Choosing between synchronous and asynchronous is a tradeoff between performance and guarantees that remote calls succeed. Any of the clustered modes discussed above can be configured to be synchronous or asynchronous.