Scaling Databases With EclipseLink And Redis
Integrating EclipseLink with Redis using database record as cache entry.
EclipseLink has two types of caches: the shared cache (L2) holds objects read from the database, and the isolated cache (L1) holds objects for various operations during the lifecycle of a transaction. The L2 lifecycle is tied to a particular JVM and spans multiple transactions. Cache coordination between JVMs is off by default, but EclipseLink provides a distributed cache coordination feature that you can enable to keep data current in distributed applications. Both the L1 and L2 caches store domain objects.
“Redis is an open source (BSD licensed), in-memory data structure store, used as a database, cache, and message broker. Redis provides data structures such as strings, hashes, lists, sets, sorted sets with range queries, bitmaps, hyperloglogs, geospatial indexes, and streams.” — redis.io
This article is about EclipseLink and Redis, but the concept can be applied to any ORM and distributed cache library.
Unlike Hibernate, which supports L2 integration with Redis out of the box, EclipseLink has no equivalent support for integrating its L2 cache with a distributed cache.
EclipseLink does provide a CacheInterceptor class with several methods that developers can, in theory, implement to intercept operations on the EclipseLink cache. Unfortunately, these APIs are not well documented and are not easy to implement, so you won’t find any open source libraries supporting EclipseLink L2 integration with Redis yet.
The good news is that there is a much simpler way to integrate EclipseLink with Redis than going through CacheInterceptor. This approach uses the cache-aside pattern to read data and stores each database record as a cache entry. We’ve been using it in production at Intuit for QuickBooks Online Payroll to help scale our database and improve application performance. It has been a great success.
- When your application needs to read data from the database, it checks Redis (L3) first to see if the data is available
- If the data is available (a cache hit), the cached data is returned
- If the data is not available (a cache miss), the database is queried, the cache is populated, and the data is returned to the caller
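The read steps above can be sketched in a few lines of Java. This is a minimal illustration, not our production code: the class name CacheAsideSketch is hypothetical, a ConcurrentHashMap stands in for Redis, and a plain function stands in for the database query.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Cache-aside read path. Assumptions: the map is a stand-in for Redis and
// the "database" function is a stand-in for the real query.
class CacheAsideSketch {
    private final Map<String, String> cache = new ConcurrentHashMap<>();
    private final Function<String, Optional<String>> database;

    CacheAsideSketch(Function<String, Optional<String>> database) {
        this.database = database;
    }

    Optional<String> read(String key) {
        String cached = cache.get(key);
        if (cached != null) {
            return Optional.of(cached);                 // cache hit
        }
        Optional<String> row = database.apply(key);     // cache miss: query the database
        row.ifPresent(value -> cache.put(key, value));  // populate the cache for the next reader
        return row;
    }

    public static void main(String[] args) {
        AtomicInteger databaseCalls = new AtomicInteger();
        CacheAsideSketch reader = new CacheAsideSketch(key -> {
            databaseCalls.incrementAndGet();
            return Optional.of("row-for-" + key);
        });
        reader.read("Employee:42");
        reader.read("Employee:42");   // served from the cache
        System.out.println("database calls: " + databaseCalls.get());
    }
}
```

Rows that do not exist are not cached in this sketch, so a repeated lookup for a missing key goes to the database every time.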
DatabaseRecord is an object in EclipseLink that represents a database row as field-value pairs. A DatabaseRecord provides data to one or many domain objects, and EclipseLink has APIs to build domain objects from a DatabaseRecord.
Domain objects are used by the L2 cache. Using DatabaseRecord as the cache entry simplifies the implementation greatly because we don’t have to worry about maintaining relationships between domain objects. The primary key can be combined with the domain class name to create the cache key. At a conceptual level, a DatabaseRecord is similar to a database table row. The important point is that this approach caches data, not the object tree.
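To make the idea of caching a row rather than an object tree concrete, here is a small sketch. Everything in it is an assumption for illustration: a plain Map stands in for DatabaseRecord, Employee is a hypothetical domain class, and the "className:primaryKey" key format is just one reasonable choice.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class RecordKeySketch {
    // Hypothetical domain class, used only for illustration.
    static final class Employee {
        final long id;
        final String name;

        Employee(long id, String name) {
            this.id = id;
            this.name = name;
        }
    }

    // Assumed key format: domain class name, a colon, then the primary key.
    static String cacheKey(Class<?> domainClass, Object primaryKey) {
        return domainClass.getName() + ":" + primaryKey;
    }

    // Rebuild a domain object from the cached field-value pairs.
    // In EclipseLink this job is done by the ORM itself, not by hand.
    static Employee employeeFromRow(Map<String, Object> row) {
        long id = ((Number) row.get("ID")).longValue();
        String name = (String) row.get("NAME");
        return new Employee(id, name);
    }

    public static void main(String[] args) {
        // A flat row: no references to other objects, so nothing to reconstruct.
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("ID", 42L);
        row.put("NAME", "Ada");

        String key = cacheKey(Employee.class, row.get("ID"));
        Employee employee = employeeFromRow(row);
        System.out.println(key + " -> " + employee.name);
    }
}
```

Because the cached value is a flat set of columns, the entry carries no associations that would have to be rebuilt when it is read back.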
Here is the conceptual Read flow
For the implementation, we use AspectJ to hook into the EclipseLink lifecycle and intercept read and write operations so that we can populate and invalidate the cache.
Enough Talk, Show Me the Code
DatabaseRecordAspect.java: This aspect intercepts the selectOneRow and selectAllRows methods, which EclipseLink uses to read a single object and a collection of objects, respectively.
DatabaseRecordInterceptor.java: Responsible for intercepting ExpressionQueryMechanism.selectOneRow() and ExpressionQueryMechanism.selectAllRows() to cache each DatabaseRecord before the results are translated into EclipseLink objects.
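The interception idea can be illustrated without AspectJ or EclipseLink on the classpath. The sketch below is not the aspect described above: it swaps in a JDK dynamic proxy to play the role of around advice, and RowSource is a hypothetical interface standing in for the query mechanism (the real selectOneRow() takes no cache key argument). The point it shows is that callers and the wrapped reader stay unchanged while the row is cached on the way through.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class InterceptionSketch {
    // Hypothetical stand-in for the component that reads one row from the database.
    interface RowSource {
        Map<String, Object> selectOneRow(String cacheKey);
    }

    // Wraps a RowSource so that selectOneRow is answered from the cache when possible.
    // The cache map is a stand-in for Redis.
    @SuppressWarnings("unchecked")
    static RowSource withRowCache(RowSource target, Map<String, Map<String, Object>> cache) {
        InvocationHandler handler = (proxy, method, args) -> {
            if (!"selectOneRow".equals(method.getName())) {
                return method.invoke(target, args);     // pass everything else through
            }
            String key = (String) args[0];
            Map<String, Object> cached = cache.get(key);
            if (cached != null) {
                return cached;                          // hit: skip the database call
            }
            // Miss: run the original method (the analogue of proceed() in around advice).
            Map<String, Object> row = (Map<String, Object>) method.invoke(target, args);
            if (row != null) {
                cache.put(key, row);                    // cache the raw row, not a domain object
            }
            return row;
        };
        return (RowSource) Proxy.newProxyInstance(
                RowSource.class.getClassLoader(), new Class<?>[] {RowSource.class}, handler);
    }

    public static void main(String[] args) {
        int[] databaseReads = {0};
        RowSource database = key -> {
            databaseReads[0]++;
            return Map.of("ID", 42L, "NAME", "Ada");
        };
        RowSource cached = withRowCache(database, new ConcurrentHashMap<>());
        cached.selectOneRow("Employee:42");
        cached.selectOneRow("Employee:42");
        System.out.println("database reads: " + databaseReads[0]);
    }
}
```

With AspectJ the same logic would live in an around advice woven into the EclipseLink classes, so no wrapper object has to be created by hand.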
For cache invalidation, extend DescriptorEventAdapter, implement postUpdate(), postDelete() and postInsert(), and register the listener with the descriptor’s event manager. When EclipseLink performs a write, one of these methods is executed, and from there you can call your invalidators to delete or update the cache entry in Redis. Populate and invalidate the cache asynchronously to avoid blocking the application.
We use Lettuce as the client library to talk to Redis and Kryo for serialization.
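A rough sketch of asynchronous invalidation follows. It is illustrative only: the method names mirror the EclipseLink hooks, but here they take a plain cache key instead of a DescriptorEvent, the map stands in for Redis, and the class name is hypothetical.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class AsyncInvalidationSketch {
    private final Map<String, String> cache;   // stand-in for Redis
    private final Executor executor;           // runs invalidations off the caller's thread

    AsyncInvalidationSketch(Map<String, String> cache, Executor executor) {
        this.cache = cache;
        this.executor = executor;
    }

    // Simplified hooks: the real ones receive a DescriptorEvent, from which
    // the cache key would be derived.
    CompletableFuture<Void> postInsert(String cacheKey) { return invalidate(cacheKey); }
    CompletableFuture<Void> postUpdate(String cacheKey) { return invalidate(cacheKey); }
    CompletableFuture<Void> postDelete(String cacheKey) { return invalidate(cacheKey); }

    // Deletes the entry on the executor so the write path does not wait on the cache.
    private CompletableFuture<Void> invalidate(String cacheKey) {
        return CompletableFuture.runAsync(() -> cache.remove(cacheKey), executor);
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        Map<String, String> cache = new ConcurrentHashMap<>();
        cache.put("Employee:42", "cached-row");

        AsyncInvalidationSketch listener = new AsyncInvalidationSketch(cache, pool);
        listener.postUpdate("Employee:42").join();   // join only so the demo can print the result
        System.out.println("still cached: " + cache.containsKey("Employee:42"));
        pool.shutdown();
    }
}
```

Deleting the entry rather than rewriting it keeps the hook simple: the next read misses and repopulates the cache from the database.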
Integrating EclipseLink with Redis is much easier if you cache the data used to populate domain objects. Initially, we attempted to cache domain objects but ran into several issues with EclipseLink. A domain object maintains associations with other objects, so when you read an object back from Redis, you have to reconstruct the object tree. This gets complicated when there are lazy-loaded associations among objects. Data caching is simple, and the cache entry size is consistent (i.e., one database row). A predictable cache entry size helps us optimize the Redis cache size and makes serialization and deserialization faster. You don’t have to change much of your existing code to get this working. This has been a real game changer for us in reducing our database load and providing consistent performance.