Caching Over MyBatis: The Widely Used Ehcache Implementation with MyBatis
This article presents the first Proof of Concept from the series described in the previous article, 4 Hands-On Approaches to Improve Your Data Access Layer Implementation. It shows how to implement Ehcache over MyBatis, how to achieve an optimal configuration for it, and the author's personal opinions about the chosen approach for the Data Access Layer.
Throughout my research on caching over MyBatis I have discovered that Ehcache is the first option among developers when they need to implement a cache mechanism over MyBatis using a 3rd party library. Ehcache is probably so popular because it is an open source, Java-based cache, available under an Apache 2 license. Also, it scales from in-process with one or more nodes through to a mixed in-process/out-of-process configuration with terabyte-sized caches. In addition, for those applications needing a coherent distributed cache, Ehcache uses the open source Terracotta Server Array. Last but not least, among its adopters is the Wikimedia Foundation, which uses Ehcache to improve the performance of its wiki projects.
Within this article, the following aspects will be addressed:
1. How will an application benefit from caching using Ehcache? Ehcache's features will be detailed in this section.
2. Hands-on implementation of the EhCachePOC project - in this section the key concepts of Ehcache will be explored through a hands-on implementation.
3. Summary - How has the application performance been improved after this implementation?
Code of all the projects that will be implemented can be found at https://github.com/ammbra/CacherPoc or if you are interested only in the current implementation, you can access it here: https://github.com/ammbra/CacherPoc/tree/master/EhCachePoc
How will an application benefit from caching using Ehcache?
The time taken for an application to process a request principally depends on the speed of the CPU and main memory. In order to "speed up" your application you can perform one or more of the following:
- improve the algorithm performance
- achieve parallelisation of the computations across multiple CPUs or multiple machines
- upgrade the CPU speed
As explained in the previous article, high availability applications should perform as few database operations as possible. Since the time taken to complete a computation depends principally on the rate at which data can be obtained, the application should be able to temporarily store results that may be reused. Caching can reduce the workload required, so a caching mechanism should be put in place.
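The idea of storing a result once and reusing it can be sketched in plain Java. This is only an illustration under assumptions: ReadThroughCache and its loader function are hypothetical names standing in for the expensive database call, and the class is not part of Ehcache or MyBatis.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

/** Hypothetical read-through cache: loads a value once per key, then reuses it. */
final class ReadThroughCache<K, V> {
    private final Map<K, V> store = new ConcurrentHashMap<>();
    private final Function<K, V> loader;              // stands in for the expensive database call
    private final AtomicInteger loads = new AtomicInteger();

    ReadThroughCache(Function<K, V> loader) {
        this.loader = loader;
    }

    V get(K key) {
        // computeIfAbsent invokes the loader only when the key is not cached yet
        return store.computeIfAbsent(key, k -> {
            loads.incrementAndGet();
            return loader.apply(k);
        });
    }

    /** Number of times the loader (the "database") was actually hit. */
    int loadCount() {
        return loads.get();
    }

    public static void main(String[] args) {
        ReadThroughCache<String, String> cache =
                new ReadThroughCache<>(key -> "rows for " + key);
        for (int i = 0; i < 10; i++) {
            cache.get("retrieveEmployees");
        }
        System.out.println("loader calls: " + cache.loadCount()); // prints "loader calls: 1"
    }
}
```

Ten reads of the same key trigger a single load; the other nine are served from memory, which is the effect the rest of this article tries to obtain with Ehcache.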
Ehcache is described as:
- Fast and Lightweight, having a simple API and requiring only a dependency on SLF4J
- Scalable to hundreds of nodes with the Terracotta Server Array, and also because it provides Memory and Disk stores for scalability into gigabytes
- Flexible because it supports Object or Serializable caching; it also provides LRU, LFU and FIFO cache eviction policies
- Standards Based, having a full implementation of the JSR107 JCACHE API
- Application Persistence Provider because it offers a persistent disk store which keeps data between VM restarts
- JMX Enabled
- Distributed Caching Enabler because it offers clustered caching via Terracotta and replicated caching via RMI, JGroups, or JMS
- Cache Server (RESTful, SOAP cache Server)
- Search Compatible, having a standalone and distributed search using a fluent query language
Hands-on implementation of the EhCachePOC project
The implementation of EhCachePoc will look as described in the diagram below:
In order to test Ehcache performance through a POC (proof of concept) project, the following project setup is performed:
1. Create a new Maven EJB Project from your IDE (NetBeans provides this kind of project out of the box; Eclipse users may need to follow a tutorial to set one up). In this article the project is named EhCachePOC.
2. Edit the project's pom by adding required jars :
<dependency>
    <groupId>org.mybatis</groupId>
    <artifactId>mybatis</artifactId>
    <version>3.2.6</version>
</dependency>
<dependency>
    <groupId>org.mybatis.caches</groupId>
    <artifactId>mybatis-ehcache</artifactId>
    <version>1.0.2</version>
</dependency>
<dependency>
    <groupId>log4j</groupId>
    <artifactId>log4j</artifactId>
    <version>1.2.17</version>
</dependency>
<dependency>
    <groupId>net.sf.ehcache</groupId>
    <artifactId>ehcache</artifactId>
    <version>2.7.0</version>
</dependency>
<dependency>
    <groupId>org.slf4j</groupId>
    <artifactId>slf4j-log4j12</artifactId>
    <version>1.7.5</version>
</dependency>
3. Add your database connection driver, in this case Apache Derby:
<dependency>
    <groupId>org.apache.derby</groupId>
    <artifactId>derbyclient</artifactId>
    <version>10.11.1.1</version>
</dependency>
4. Run mvn clean and mvn install commands on your project.
Now that the project setup is in place, let's go ahead with the MyBatis implementation:
1. Configure under resources/com/tutorial/ehcachepoc/xml folder the Configuration.xml file with :
<?xml version="1.0" encoding="UTF-8" ?>
<!DOCTYPE configuration PUBLIC "-//mybatis.org//DTD Config 3.0//EN" "http://mybatis.org/dtd/mybatis-3-config.dtd">
<configuration>
    <environments default="development">
        <environment id="development">
            <transactionManager type="JDBC"/>
            <dataSource type="UNPOOLED">
                <property name="driver" value="org.apache.derby.jdbc.ClientDriver"/>
                <property name="url" value="dburl"/>
                <property name="username" value="cruddy"/>
                <property name="password" value="cruddy"/>
            </dataSource>
        </environment>
    </environments>
    <mappers>
        <!--<mapper resource="com/tutorial/ehcachepoc/xml/EmployeeMapper.xml" />-->
    </mappers>
</configuration>
2. Create your own SQLSessionFactory implementation in Java. For example, create something similar to com.tutorial.ehcachepoc.config.SQLSessionFactory:
package com.tutorial.ehcachepoc.config;

import java.io.Reader;
import org.apache.ibatis.io.Resources;
import org.apache.ibatis.session.SqlSessionFactory;
import org.apache.ibatis.session.SqlSessionFactoryBuilder;

public class SQLSessionFactory {

    // built once when the class is loaded and shared by all callers
    private static final SqlSessionFactory FACTORY;

    static {
        try {
            Reader reader = Resources.getResourceAsReader("com/tutorial/ehcachepoc/xml/Configuration.xml");
            FACTORY = new SqlSessionFactoryBuilder().build(reader);
        } catch (Exception e) {
            throw new RuntimeException("Fatal Error. Cause: " + e, e);
        }
    }

    public static SqlSessionFactory getSqlSessionFactory() {
        return FACTORY;
    }
}
3. Create the necessary bean classes, those that will map to your SQL results, like Employee:
package com.tutorial.ehcachepoc.bean;

import java.io.Serializable;
import java.util.Date;

// Serializable so that cached instances can also be written to a disk store
public class Employee implements Serializable {

    private static final long serialVersionUID = 1L;

    private Integer id;
    private String firstName;
    private String lastName;
    private String adress;
    private Date hiringDate;
    private String sex;
    private String phone;
    private int positionId;
    private int deptId;

    public Employee() {
    }

    public Employee(Integer id) {
        this.id = id;
    }

    @Override
    public String toString() {
        return "com.tutorial.ehcachepoc.bean.Employee[ id=" + id + " ]";
    }
}
4. Create the IEmployeeDAO interface that will expose the EJB implementation when injected:
public interface IEmployeeDAO {
    public List<Employee> getEmployees();
}
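5. Implement the above interface and expose the implementation as a Stateless EJB (this kind of EJB keeps no conversational state for its clients, so there is no client state to preserve between calls):
package com.tutorial.ehcachepoc.dao;

import java.util.List;
import javax.annotation.PostConstruct;
import javax.ejb.Stateless;
import javax.ejb.TransactionManagement;
import javax.ejb.TransactionManagementType;
import org.apache.ibatis.session.SqlSession;
import org.apache.ibatis.session.SqlSessionFactory;
import org.apache.log4j.Logger;
import com.tutorial.ehcachepoc.bean.Employee;
import com.tutorial.ehcachepoc.config.SQLSessionFactory;

@Stateless(name = "ehcacheDAO")
@TransactionManagement(TransactionManagementType.CONTAINER)
public class EmployeeDAO implements IEmployeeDAO {

    private static Logger logger = Logger.getLogger(EmployeeDAO.class);
    private SqlSessionFactory sqlSessionFactory;

    @PostConstruct
    public void init() {
        sqlSessionFactory = SQLSessionFactory.getSqlSessionFactory();
    }

    @Override
    public List<Employee> getEmployees() {
        logger.info("Getting employees.....");
        SqlSession sqlSession = sqlSessionFactory.openSession();
        try {
            return sqlSession.selectList("retrieveEmployees");
        } finally {
            // always release the session, even if the query fails
            sqlSession.close();
        }
    }
}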
6. Create the EmployeeMapper.xml that contains the query named "retrieveEmployees", and make sure its mapper entry in Configuration.xml is not left commented out, otherwise MyBatis cannot resolve the statement:
<!DOCTYPE mapper PUBLIC "-//mybatis.org//DTD Mapper 3.0//EN" "http://mybatis.org/dtd/mybatis-3-mapper.dtd" >
<mapper namespace="com.tutorial.ehcachepoc.mapper.EmployeeMapper" >
    <resultMap id="results" type="com.tutorial.ehcachepoc.bean.Employee" >
        <id column="id" property="id" javaType="integer" jdbcType="BIGINT" />
        <result column="first_name" property="firstName" javaType="string" jdbcType="VARCHAR"/>
        <result column="last_name" property="lastName" javaType="string" jdbcType="VARCHAR"/>
        <result column="hiring_date" property="hiringDate" javaType="date" jdbcType="DATE" />
        <result column="sex" property="sex" javaType="string" jdbcType="VARCHAR" />
        <result column="dept_id" property="deptId" javaType="integer" jdbcType="BIGINT" />
    </resultMap>
    <select id="retrieveEmployees" resultMap="results" >
        select id, first_name, last_name, hiring_date, sex, dept_id from employee
    </select>
</mapper>
If you remember the CacherPOC setup from the previous article, you can test your implementation by adding the EhCachePOC project as a dependency and injecting the IEmployeeDAO inside the EhCacheServlet. Your CacherPOC pom.xml file should contain:
<dependency>
    <groupId>${project.groupId}</groupId>
    <artifactId>EhCachePoc</artifactId>
    <version>${project.version}</version>
</dependency>
and your servlet should look like:
@WebServlet("/EhCacheServlet")
public class EhCacheServlet extends HttpServlet {

    private static Logger logger = Logger.getLogger(EhCacheServlet.class);

    @EJB(beanName ="ehcacheDAO")
    IEmployeeDAO employeeDAO;

    private static final String LIST_USER = "/listEmployee.jsp";

    @Override
    protected void doGet(HttpServletRequest req, HttpServletResponse resp) throws ServletException, IOException {
        String forward = LIST_USER;
        List<Employee> results = new ArrayList<Employee>();
        // call the DAO 10 times, pausing between calls, to compare cached and non-cached runs
        for (int i = 0; i < 10; i++) {
            for (Employee emp : employeeDAO.getEmployees()) {
                logger.debug(emp);
                results.add(emp);
            }
            try {
                Thread.sleep(3000);
            } catch (Exception e) {
                logger.error(e, e);
            }
        }
        req.setAttribute("employees", results);
        RequestDispatcher view = req.getRequestDispatcher(forward);
        view.forward(req, resp);
    }
}
Run your CacherPoc implementation to check if your Data Access Layer with MyBatis is working or download the code provided at https://github.com/ammbra/CacherPoc
However, if a large number of employees is stored in the database, retrieving them 10 times (10 x the number of employees) represents a lot of workload for the database. It can also be noticed that the query in EmployeeMapper.xml retrieves data that almost never changes (id, first_name, last_name, hiring_date, sex cannot change; the only value that might change over time is dept_id), so a caching mechanism can be used.
Below is described how this can be achieved using EhCache:
1. Configure directly under the resources folder the ehcache.xml file with:
<?xml version="1.0" encoding="UTF-8"?>
<!-- caching configuration -->
<ehcache>
    <defaultCache eternal="true"
                  maxElementsInMemory="1000"
                  timeToIdleSeconds="3600"
                  timeToLiveSeconds="3600"
                  maxEntriesLocalHeap="1000"
                  maxEntriesLocalDisk="10000000"
                  memoryStoreEvictionPolicy="LRU"
                  statistics="true" />
</ehcache>
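This XML specifies that the Memory Store uses an LRU (Least Recently Used) eviction strategy and sets the limits for the number of elements allowed in storage, their time to idle and their time to live. Note that with eternal="true" elements never expire, so the two timeout values only take effect if eternal is set to false.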
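How the eternal, timeToLiveSeconds and timeToIdleSeconds attributes interact can be modelled in a few lines of Java. This is a simplified sketch and not Ehcache's own code: ExpiryRule is a hypothetical class, times are plain seconds, and a value of 0 is treated as "no limit", as in the disk store variant shown later in this article.

```java
/** Hypothetical, simplified model of the expiry attributes from ehcache.xml. */
final class ExpiryRule {
    private final boolean eternal;
    private final long timeToLiveSeconds;   // maximum age since creation, 0 = unlimited
    private final long timeToIdleSeconds;   // maximum time since last access, 0 = unlimited

    ExpiryRule(boolean eternal, long timeToLiveSeconds, long timeToIdleSeconds) {
        this.eternal = eternal;
        this.timeToLiveSeconds = timeToLiveSeconds;
        this.timeToIdleSeconds = timeToIdleSeconds;
    }

    /** All arguments are timestamps in seconds on the same clock. */
    boolean isExpired(long createdAt, long lastAccessedAt, long now) {
        if (eternal) {
            return false; // eternal elements ignore both timeouts
        }
        boolean tooOld = timeToLiveSeconds > 0 && now - createdAt >= timeToLiveSeconds;
        boolean tooIdle = timeToIdleSeconds > 0 && now - lastAccessedAt >= timeToIdleSeconds;
        return tooOld || tooIdle;
    }

    public static void main(String[] args) {
        ExpiryRule asConfigured = new ExpiryRule(true, 3600, 3600);
        ExpiryRule withTimeouts = new ExpiryRule(false, 3600, 3600);
        // element created and last read at t=0, checked two hours later
        System.out.println(asConfigured.isExpired(0, 0, 7200)); // prints "false"
        System.out.println(withTimeouts.isExpired(0, 0, 7200)); // prints "true"
    }
}
```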
The Memory Store strategy is often chosen because it is fast and thread safe for use by multiple concurrent threads, being backed by a LinkedHashMap. Also, all elements involved in the caching process are suitable for placement in the Memory Store.
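To make the LRU policy concrete, here is a minimal sketch of an LRU map built on the JDK's LinkedHashMap. It only illustrates the eviction idea and is not Ehcache's implementation: LruCache is a hypothetical class, and unlike the Memory Store it is not thread safe.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical LRU map: evicts the least recently used entry once the limit is exceeded. */
final class LruCache<K, V> extends LinkedHashMap<K, V> {
    private static final long serialVersionUID = 1L;
    private final int maxEntries;

    LruCache(int maxEntries) {
        super(16, 0.75f, true); // accessOrder = true: iteration order follows recency of access
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // called after every put; returning true drops the least recently used entry
        return size() > maxEntries;
    }

    public static void main(String[] args) {
        LruCache<String, Integer> cache = new LruCache<>(2);
        cache.put("a", 1);
        cache.put("b", 2);
        cache.get("a");       // "a" becomes the most recently used entry
        cache.put("c", 3);    // limit exceeded, "b" is evicted
        System.out.println(cache.keySet()); // prints "[a, c]"
    }
}
```

The maxEntries argument plays the role that maxEntriesLocalHeap plays in ehcache.xml.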
Another approach can be tried: storing cache on disk. This can be done by replacing the ehcache tag content with:
<diskStore path="F:\\cache" />
<defaultCache eternal="true"
              maxElementsInMemory="1000"
              overflowToDisk="true"
              diskPersistent="true"
              timeToIdleSeconds="0"
              timeToLiveSeconds="0"
              memoryStoreEvictionPolicy="LRU"
              statistics="true" />
Unlike the memory store, the disk store is suitable only for elements that are serializable; any non-serializable elements encountered are removed and a WARNING level log message is emitted. Eviction uses the LFU algorithm, which is not configurable or changeable. As for persistence, the diskPersistent attribute controls whether the cache survives restarts; if it is false or omitted, the disk store will not persist between CacheManager restarts.
2. Update EmployeeMapper.xml to use the previously implemented caching strategy:
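Whether a value could be written to a disk store at all can be checked with standard Java serialization. The helper below is a hypothetical sketch (DiskStoreCheck is not an Ehcache class); it only shows why beans such as Employee implement Serializable.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;

/** Hypothetical helper: tests whether a value survives Java serialization. */
final class DiskStoreCheck {

    static boolean canBeWrittenToDisk(Object value) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(value); // walks the whole object graph
            return true;
        } catch (NotSerializableException e) {
            return false;           // some object in the graph is not Serializable
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(canBeWrittenToDisk("an employee name")); // prints "true"
        System.out.println(canBeWrittenToDisk(new Object()));       // prints "false"
    }
}
```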
<?xml version="1.0" encoding="UTF-8" ?>
<!DOCTYPE mapper PUBLIC "-//mybatis.org//DTD Mapper 3.0//EN" "http://mybatis.org/dtd/mybatis-3-mapper.dtd" >
<mapper namespace="com.tutorial.ehcachepoc.mapper.EmployeeMapper" >
    <cache type="org.mybatis.caches.ehcache.EhcacheCache"/>
    <resultMap id="results" type="com.tutorial.ehcachepoc.bean.Employee" >
        <id column="id" property="id" javaType="integer" jdbcType="BIGINT" />
        <result column="first_name" property="firstName" javaType="string" jdbcType="VARCHAR"/>
        <result column="last_name" property="lastName" javaType="string" jdbcType="VARCHAR"/>
        <result column="hiring_date" property="hiringDate" javaType="date" jdbcType="DATE" />
        <result column="sex" property="sex" javaType="string" jdbcType="VARCHAR" />
        <result column="dept_id" property="deptId" javaType="integer" jdbcType="BIGINT" />
    </resultMap>
    <select id="retrieveEmployees" resultMap="results" useCache="true">
        select id, first_name, last_name, hiring_date, sex, dept_id from employee
    </select>
</mapper>
By adding the line <cache type="org.mybatis.caches.ehcache.EhcacheCache"/> and specifying on the query useCache="true" you are binding the ehcache.xml configuration to your DataAccessLayer implementation.
Clean, build and redeploy both the EhCachePOC and CacherPoc projects; now retrieve your employees twice in order to allow the in-memory cache to store your values. The first time you run your query, the application executes it on the database and retrieves the results. The second time you access the employee list, the application reads it from the in-memory storage.
Summary - How has the application performance been improved after this implementation?
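How much an application's performance improves through caching depends on a multitude of factors, mainly:
- how many times a cached piece of data can be, and actually is, reused by the application
- the proportion of the response time that is alleviated by caching
Amdahl's law can be used to estimate the resulting system speedup:
1 / ((1 - P) + P / S)
where P is the proportion of the response time that is sped up
and S is the speedup of that proportion.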
Let's take the application from this article as example and calculate the speed up.
When the application ran the query without caching, a JDBC transaction was performed each time and the log contained something similar to:
INFO: 2014-11-27 18:01:30,020 [EmployeeDAO] INFO com.tutorial.hazelcastpoc.dao.EmployeeDAO:38 - Getting employees.....
INFO: 2014-11-27 18:01:39,148 [JdbcTransaction] DEBUG org.apache.ibatis.transaction.jdbc.JdbcTransaction:98 - Setting autocommit to false on JDBC Connection [org.apache.derby.client.net.NetConnection40@1c374fd]
INFO: 2014-11-27 18:01:39,159 [retrieveEmployees] DEBUG com.tutorial.hazelcastpoc.mapper.EmployeeMapper.retrieveEmployees:139 - ==> Preparing: select id, first_name, last_name, hiring_date, sex, dept_id from employee
INFO: 2014-11-27 18:01:39,220 [retrieveEmployees] DEBUG com.tutorial.hazelcastpoc.mapper.EmployeeMapper.retrieveEmployees:139 - ==> Parameters:
INFO: 2014-11-27 18:01:39,316 [retrieveEmployees] DEBUG com.tutorial.hazelcastpoc.mapper.EmployeeMapper.retrieveEmployees:139 - <== Total: 13
while running the queries with Ehcache caching the JDBC transaction is performed only once (to initialize the cache) and after that the log will look like :
INFO: 2014-11-28 18:04:50,020 [EmployeeDAO] INFO com.tutorial.ehcachepoc.dao.EmployeeDAO:38 - Getting employees.....
INFO: 2014-11-28 18:04:50,020 [EhCacheServlet] DEBUG com.tutorial.cacherpoc.EhCacheServlet:41 - com.tutorial.crudwithjsp.model.Employee[ id=1 ]
Let's look at the time each series of 10 requests took:
- the first, non-cached series of 10 requests took about 57 seconds and 51 milliseconds,
- while the cached series took 27 seconds and 86 milliseconds.
In order to apply Amdahl's law to the system, the following input is needed:
- Un-cached page time: 60 seconds
- Database time : 58 seconds
- Cache retrieval time: 28 seconds
- Proportion: 96.6% (58/60) (P)
The expected system speedup is thus:
1 / ((1 - 0.966) + 0.966 / (58/28)) = 1 / (0.034 + 0.966/2.07) ≈ 2 times system speedup
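The same arithmetic can be reproduced with a small Java helper, which makes it easy to plug in other measurements. AmdahlSpeedup is a hypothetical class written for this article; the input values are the rounded timings listed above.

```java
/** Hypothetical helper applying Amdahl's law: 1 / ((1 - P) + P / S). */
final class AmdahlSpeedup {

    /**
     * @param p proportion of the response time that is sped up (0..1)
     * @param s speedup of that proportion
     */
    static double speedup(double p, double s) {
        return 1.0 / ((1.0 - p) + p / s);
    }

    public static void main(String[] args) {
        double uncachedPageSeconds = 60.0;
        double databaseSeconds = 58.0;
        double cacheRetrievalSeconds = 28.0;

        double p = databaseSeconds / uncachedPageSeconds;     // about 0.966
        double s = databaseSeconds / cacheRetrievalSeconds;   // about 2.07

        System.out.println("P = " + p);
        System.out.println("S = " + s);
        System.out.println("system speedup = " + speedup(p, s));
    }
}
```

With these inputs the result comes out at roughly 2, matching the hand calculation.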
This result can be improved of course, but the purpose of this article was to prove that caching using Ehcache over MyBatis offers a significant improvement to what used to be available before its implementation.