Scaling Databases With EclipseLink And Redis
Integrating EclipseLink with Redis, using the database record as the cache entry.
Overview
EclipseLink has two types of caches: the shared cache (L2) maintains objects read from the database, and the isolated cache (L1) holds objects for various operations during the lifecycle of a transaction. The L2 cache's lifecycle is tied to a particular JVM and spans multiple transactions. Cache coordination between different JVMs is off by default; EclipseLink provides a distributed cache coordination feature that you can enable to ensure data in distributed applications remains current. Both the L1 and L2 caches store domain objects.
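These behaviors are controlled through standard persistence-unit properties. Here is a minimal sketch (the persistence unit name "my-pu" is a placeholder):

import java.util.HashMap;
import java.util.Map;
import javax.persistence.EntityManagerFactory;
import javax.persistence.Persistence;

public class CacheConfig {
    public static EntityManagerFactory create() {
        Map<String, Object> props = new HashMap<>();
        // The shared (L2) cache is on by default; this makes the setting explicit
        props.put("eclipselink.cache.shared.default", "true");
        // Cache coordination between JVMs stays off unless a protocol is
        // configured, e.g.:
        // props.put("eclipselink.cache.coordination.protocol", "jms");
        return Persistence.createEntityManagerFactory("my-pu", props);
    }
}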
“Redis is an open source (BSD licensed), in-memory data structure store, used as a database, cache, and message broker. Redis provides data structures such as strings, hashes, lists, sets, sorted sets with range queries, bitmaps, hyperloglogs, geospatial indexes, and streams.” — redis.io
This article is about EclipseLink and Redis, but the concept can be applied to any ORM and distributed cache library.
Challenges
Unlike Hibernate, which supports L2 integration with Redis out of the box, EclipseLink has no equivalent support for L2 integration with a distributed cache.
EclipseLink does provide a CacheInterceptor class with several APIs that developers can, in theory, implement to intercept various operations on the EclipseLink cache. Unfortunately, these APIs are not well documented and are not easy to implement, so you don’t yet see any open source libraries supporting EclipseLink L2 integration with Redis.
Solution
The good news is that there is a much easier and simpler approach to integrating EclipseLink with Redis than going through the CacheInterceptor interface. This approach uses the cache-aside pattern to read data and stores the database record as a cache entry. We’ve been using this approach in production at Intuit for QuickBooks Online Payroll to help scale our database and improve application performance, and it has been a great success.
Cache-aside
- When your application needs to read data from the database, it first checks Redis (L3) to see if the data is available
- If the data is available (a cache hit), the cached data is returned
- If the data is not available (a cache miss), the database is queried, the cache is populated, and the data is returned to the caller (see the sketch after this list)
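A minimal, generic sketch of this read path (the Cache interface and loadFromDatabase() here are illustrative placeholders, not EclipseLink APIs):

public class CacheAsideReader {

    // Illustrative cache abstraction, e.g. backed by Redis
    interface Cache {
        Object get(String key);
        void put(String key, Object value);
    }

    private final Cache cache;

    CacheAsideReader(Cache cache) {
        this.cache = cache;
    }

    Object read(String key) {
        Object value = cache.get(key);   // 1. check Redis (L3) first
        if (value != null) {
            return value;                // 2. cache hit: return cached data
        }
        value = loadFromDatabase(key);   // 3. cache miss: query the database
        if (value != null) {
            cache.put(key, value);       // 4. populate the cache
        }
        return value;                    // 5. return data to the caller
    }

    private Object loadFromDatabase(String key) {
        return null; // placeholder for the real database query
    }
}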
Database Record
DatabaseRecord is an object in EclipseLink that represents a database row as field-value pairs. A DatabaseRecord provides data to one or many domain objects, and EclipseLink has APIs to build domain objects from a DatabaseRecord.
Domain objects are what the L2 cache stores. Using DatabaseRecord as the cache entry simplifies the implementation greatly because we don’t have to worry about maintaining domain object relationships. The primary key can be combined with the domain class name to create a cache key. At a conceptual level, a DatabaseRecord is similar to a database table row. The important point is that this approach caches data, not the object tree.
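To make that concrete, here is a small sketch (the CUSTOMER row and its columns are hypothetical) of a DatabaseRecord and the cache key this approach produces:

import org.eclipse.persistence.sessions.DatabaseRecord;

public class DatabaseRecordExample {
    public static void main(String[] args) {
        // A DatabaseRecord is essentially a row of field-value pairs
        DatabaseRecord row = new DatabaseRecord();
        row.put("ID", 42L);
        row.put("NAME", "Alice");

        // Cache key = domain class name + primary key, matching the
        // makeCacheKey() helper shown later
        String cacheKey = "Customer" + "-" + row.get("ID");
        System.out.println(cacheKey); // Customer-42
    }
}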
The conceptual read flow follows the cache-aside steps described above.
For the implementation, we use AspectJ to hook into the EclipseLink lifecycle and intercept read/write operations for populating and invalidating the cache.
Enough Talk, Show Me the Code
DatabaseRecordAspect.java: This class intercepts the selectOneRow and selectAllRows methods that EclipseLink uses to read a single object and a collection of objects, respectively.
import java.util.Vector;

import org.aspectj.lang.ProceedingJoinPoint;
import org.aspectj.lang.annotation.Around;
import org.aspectj.lang.annotation.Aspect;
import org.aspectj.lang.annotation.Pointcut;
import org.eclipse.persistence.internal.queries.ExpressionQueryMechanism;
import org.eclipse.persistence.internal.sessions.AbstractRecord;

@Aspect
public class DatabaseRecordAspect {

    @Pointcut("execution(org.eclipse.persistence.internal.sessions.AbstractRecord org.eclipse.persistence.internal.queries.ExpressionQueryMechanism.selectOneRow())")
    public void selectOneRow() {
    }

    @Pointcut("execution(java.util.Vector org.eclipse.persistence.internal.queries.ExpressionQueryMechanism.selectAllRows())")
    public void selectAllRows() {
    }

    @Around("selectOneRow() && this(expressionQueryMechanism)")
    public AbstractRecord aroundSelectOneRow(ProceedingJoinPoint thisJoinPoint, ExpressionQueryMechanism expressionQueryMechanism) throws Throwable {
        return new DatabaseRecordInterceptor().handleSelectOneRow(thisJoinPoint, expressionQueryMechanism);
    }

    @Around("selectAllRows() && this(expressionQueryMechanism)")
    public Vector aroundSelectAllRows(ProceedingJoinPoint thisJoinPoint, ExpressionQueryMechanism expressionQueryMechanism) throws Throwable {
        return new DatabaseRecordInterceptor().handleSelectAllRows(thisJoinPoint, expressionQueryMechanism);
    }
}
DatabaseRecordInterceptor.java: Responsible for intercepting ExpressionQueryMechanism.selectOneRow() and ExpressionQueryMechanism.selectAllRows() to cache the DatabaseRecord before results are translated into EclipseLink objects.
import java.util.ArrayList;
import java.util.List;
import java.util.Vector;

import org.aspectj.lang.ProceedingJoinPoint;
import org.eclipse.persistence.descriptors.ClassDescriptor;
import org.eclipse.persistence.internal.queries.ExpressionQueryMechanism;
import org.eclipse.persistence.internal.sessions.AbstractRecord;
import org.eclipse.persistence.queries.ObjectLevelReadQuery;
import org.eclipse.persistence.queries.ReadObjectQuery;

public class DatabaseRecordInterceptor {

    // This could be Redis, Memcache, Apache Ignite...
    // (Cache, CacheFactory, CacheType, and NameValuePair are application-level
    // abstractions, not EclipseLink classes.)
    private final Cache cache;

    public DatabaseRecordInterceptor() {
        cache = CacheFactory.getInstance(CacheType.REDIS);
    }
    /**
     * Intercept ExpressionQueryMechanism.selectOneRow() to cache the AbstractRecord.
     *
     * @param thisJoinPoint Aspect join point
     * @param expressionQueryMechanism Query mechanism for a given query
     * @return AbstractRecord EclipseLink database record instance
     */
    public AbstractRecord handleSelectOneRow(ProceedingJoinPoint thisJoinPoint, ExpressionQueryMechanism expressionQueryMechanism) throws Throwable {
        ReadObjectQuery readObjectQuery = expressionQueryMechanism.getReadObjectQuery();
        AbstractRecord databaseRecord = null;
        String cachedKey;
        // Look up the cache first
        try {
            cachedKey = extractCacheKey(readObjectQuery);
            if (cachedKey != null) {
                databaseRecord = (AbstractRecord) cache.get(cachedKey);
            }
        } catch (Throwable t) {
            // If the cache lookup fails, fall back to the database
            return (AbstractRecord) thisJoinPoint.proceed();
        }
        if (databaseRecord == null) { // cache miss
            // Proceed with the database query
            databaseRecord = (AbstractRecord) thisJoinPoint.proceed();
            // then put the result into the cache
            try {
                if (databaseRecord != null) {
                    if (cachedKey == null) {
                        cachedKey = extractCacheKeyFromPrimaryKeyAndAbstractRecord(readObjectQuery, databaseRecord);
                    }
                    if (cachedKey != null) {
                        cache.put(cachedKey, databaseRecord);
                    }
                }
            } catch (Throwable t) {
                // A caching failure must not break the read; handle/log the exception
            }
        }
        return databaseRecord;
    }
    /**
     * Intercept ExpressionQueryMechanism.selectAllRows() to cache AbstractRecords. This is the case of ReadAllQuery.
     * EclipseLink doesn't cache the entire collection; it caches the individual objects in a collection by their
     * primary keys. We follow the same algorithm and cache each individual DatabaseRecord that has a primary key.
     * This is the same key as in L2, so that when L2 is updated, we can update the corresponding DatabaseRecord
     * correctly.
     *
     * @param thisJoinPoint Aspect join point
     * @param expressionQueryMechanism Query mechanism for a given query
     * @return list of AbstractRecord
     */
    public Vector handleSelectAllRows(ProceedingJoinPoint thisJoinPoint, ExpressionQueryMechanism expressionQueryMechanism) throws Throwable {
        Vector rows = (Vector) thisJoinPoint.proceed();
        if (rows != null && !rows.isEmpty()) {
            try {
                ObjectLevelReadQuery readObjectQuery = (ObjectLevelReadQuery) expressionQueryMechanism.getQuery();
                List<NameValuePair> keyAndObjects = new ArrayList<>();
                for (Object row : rows) {
                    AbstractRecord databaseRecord = (AbstractRecord) row;
                    String cachedKey = extractCacheKeyFromPrimaryKey(readObjectQuery,
                            extractPrimaryKeyFromRow(readObjectQuery, databaseRecord));
                    // cachedKey would be null if the row has no primary key
                    if (cachedKey != null) {
                        keyAndObjects.add(new NameValuePair(cachedKey, databaseRecord));
                    }
                }
                if (!keyAndObjects.isEmpty()) {
                    // This call is batched and asynchronous
                    cache.put(keyAndObjects);
                }
            } catch (Throwable t) {
                // A caching failure must not break the read; handle/log the exception
            }
        }
        return rows;
    }
    //
    // Helpers to generate a cache key from the EclipseLink primary key
    //
    private String extractCacheKey(ReadObjectQuery readObjectQuery) {
        Object primaryKey;
        if (readObjectQuery.isPrimaryKeyQuery()) { // query by id
            primaryKey = readObjectQuery.getSelectionId();
            if (primaryKey == null) {
                primaryKey = readObjectQuery.getDescriptor().getObjectBuilder().extractPrimaryKeyFromObject(readObjectQuery.getSelectionObject(), readObjectQuery.getSession());
            }
        } else {
            AbstractRecord translationRow = readObjectQuery.getTranslationRow();
            primaryKey = extractPrimaryKeyFromRow(readObjectQuery, translationRow);
        }
        return extractCacheKeyFromPrimaryKey(readObjectQuery, primaryKey);
    }

    private String extractCacheKeyFromPrimaryKey(ObjectLevelReadQuery readObjectQuery, Object primaryKey) {
        if (primaryKey != null) {
            return makeCacheKey(primaryKey, readObjectQuery.getDescriptor());
        } else {
            return null;
        }
    }

    private Object extractPrimaryKeyFromRow(ObjectLevelReadQuery readObjectQuery, AbstractRecord row) {
        return readObjectQuery.getDescriptor().getObjectBuilder().extractPrimaryKeyFromRow(row, readObjectQuery.getSession());
    }
    private String extractCacheKeyFromPrimaryKeyAndAbstractRecord(ObjectLevelReadQuery readObjectQuery, AbstractRecord row) {
        if (readObjectQuery != null && row != null) {
            // extractPrimaryKeyFromRow may return null when the row has no primary key
            return extractCacheKeyFromPrimaryKey(readObjectQuery, extractPrimaryKeyFromRow(readObjectQuery, row));
        } else {
            return null;
        }
    }
    private String makeCacheKey(final Object pk, ClassDescriptor classDescriptor) {
        return classDescriptor.getJavaClass().getSimpleName() + "-" + pk.toString();
    }
}
For cache invalidation, you just need to register your invalidators with a DescriptorEventAdapter and implement postUpdate(), postDelete(), and postInsert(). When an EclipseLink write occurs, one of these methods is executed, and you can call your invalidators to delete or update the cache entry in Redis. You should populate and invalidate the cache asynchronously to avoid blocking the application, as in the sketch below.
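Here is a minimal sketch of such an invalidator. It reuses the application-level Cache abstraction from above and assumes it exposes a delete() method (hypothetical); the key format matches the makeCacheKey() helper:

import org.eclipse.persistence.descriptors.ClassDescriptor;
import org.eclipse.persistence.descriptors.DescriptorEvent;
import org.eclipse.persistence.descriptors.DescriptorEventAdapter;
import org.eclipse.persistence.sessions.Session;

public class CacheInvalidationListener extends DescriptorEventAdapter {

    // Same application-level Cache abstraction as above; delete() is assumed
    private final Cache cache = CacheFactory.getInstance(CacheType.REDIS);

    @Override
    public void postInsert(DescriptorEvent event) {
        invalidate(event);
    }

    @Override
    public void postUpdate(DescriptorEvent event) {
        invalidate(event);
    }

    @Override
    public void postDelete(DescriptorEvent event) {
        invalidate(event);
    }

    private void invalidate(DescriptorEvent event) {
        ClassDescriptor descriptor = event.getDescriptor();
        Object pk = descriptor.getObjectBuilder()
                .extractPrimaryKeyFromObject(event.getObject(), event.getSession());
        // Key format matches makeCacheKey(): "<SimpleClassName>-<primaryKey>"
        String cacheKey = descriptor.getJavaClass().getSimpleName() + "-" + pk;
        // Should run asynchronously so invalidation never blocks the write path
        cache.delete(cacheKey);
    }

    // Register the listener for an entity, e.g. from a SessionCustomizer
    public static void register(Session session, Class<?> entityClass) {
        session.getClassDescriptor(entityClass).getEventManager()
                .addListener(new CacheInvalidationListener());
    }
}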
We use Lettuce as the client library to talk to Redis and Kryo for serialization.
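For completeness, here is a minimal sketch of that plumbing, with string keys and Kryo-serialized byte-array values (the Redis URL and buffer size are placeholders; a single Kryo instance is not thread-safe, so production code would pool or thread-confine it):

import com.esotericsoftware.kryo.Kryo;
import com.esotericsoftware.kryo.io.Input;
import com.esotericsoftware.kryo.io.Output;
import io.lettuce.core.RedisClient;
import io.lettuce.core.api.StatefulRedisConnection;
import io.lettuce.core.codec.ByteArrayCodec;
import io.lettuce.core.codec.RedisCodec;
import io.lettuce.core.codec.StringCodec;

public class RedisRecordStore {

    private final StatefulRedisConnection<String, byte[]> connection;
    private final Kryo kryo = new Kryo();

    public RedisRecordStore() {
        RedisClient client = RedisClient.create("redis://localhost:6379");
        // String keys, raw byte-array values
        connection = client.connect(RedisCodec.of(StringCodec.UTF8, ByteArrayCodec.INSTANCE));
        // In Kryo 5, classes must be registered unless this is disabled
        kryo.setRegistrationRequired(false);
    }

    public void put(String key, Object value) {
        Output output = new Output(4096, -1);
        kryo.writeClassAndObject(output, value);
        // Async write so caching never blocks the application
        connection.async().set(key, output.toBytes());
    }

    public Object get(String key) {
        byte[] bytes = connection.sync().get(key);
        return bytes == null ? null : kryo.readClassAndObject(new Input(bytes));
    }
}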
Conclusion
Integrating EclipseLink with Redis is much easier if you cache the data used to populate domain objects. Initially, we attempted to cache domain objects but ran into several issues with EclipseLink. A domain object maintains associations with other objects, so when you read an object back from Redis, you have to reconstruct the object tree; this gets complicated when there are lazy-loaded associations among objects. Data caching is simple, and the cache entry size is consistent (i.e., one database row). Predictable cache entry size helps us optimize the Redis cache size and makes serialization and deserialization faster. You don’t have to change much of your existing code to get this working. This has been a game changer for us in reducing our database load and providing consistent performance.