Using Hazelcast in Spring Boot Running on Kubernetes
A simple Spring Boot application with Data JPA (Hibernate) using embedded Hazelcast for the second-level cache with the support of the Kubernetes cluster.
Although Kubernetes is the target environment for the application, this article does not show how to declare the required services or deployments in the most optimal way. Basic Kubernetes configurations are enough for demonstration purposes. You can, however, use them as a starting point and improve them with the help of the Kubernetes documentation.
Spring's caching support is not fully covered either. If you are interested in the cache abstraction, please refer to the corresponding part of the Spring Framework documentation.
What You Will Learn
The main focus of the article is to show how to build a Spring Boot application with the components and configurations necessary:
- to work with Spring Data JPA based on Hibernate
- to use Hazelcast IMDG (in-memory data grid) in embedded mode
- to deploy an application on the Kubernetes cluster
What You Need
- JDK 11 (I think the code will work with Java 8, though it will require changes in pom.xml)
- Apache Maven
- Minikube to set up a local Kubernetes cluster.
- Docker CLI to build an application Docker image.
The code for the guide can be found in the repository Spring Data JPA application on Kubernetes with the caching on Hazelcast. The application was built and run on Microsoft Windows, but I believe it will work on other platforms as well after adjusting the path separators in the commands.
Application Structure
First of all, the application should work with the database via the JPA abstraction. As we are going to introduce the second-level cache, Hibernate is used as the ORM framework. To save some time, I have taken the ready-to-use code of Spring's guide Accessing data with MySQL. It implements a simple controller-repository flow with MySQL as storage. The code is almost the same as the original, except for small changes such as updated versions of Spring and the MySQL JDBC connector in pom.xml and a refactoring of packages.
I also added a couple of methods to the repository and the controller. These methods will be helpful later in showing the results of the caching.
// UserRepository: looks up a user's name by email with a native query
@Query(value = "SELECT name FROM user WHERE email = :email", nativeQuery = true)
Optional<String> findNameByEmail(String email);

// UserController: returns the user with the given id, or 404 if none exists
@GetMapping(path = "{id}")
public @ResponseBody ResponseEntity<User> getUser(@PathVariable Integer id) {
    return userRepository.findById(id)
            .map(user -> ResponseEntity.ok().body(user))
            .orElseGet(() -> ResponseEntity.notFound().build());
}
In general, the application provides the same functionality as described in the guide Accessing data with MySQL. There are two new endpoints: /users/{id} returns user information by identifier, and /users/email/{email} returns a user's name by email.
Spring Boot With Hazelcast
Spring Framework supports plenty of caching platforms and libraries, and Spring Boot makes it easy to use them in an application. Spring Boot finds caching providers on the classpath and auto-configures them using default settings. For details of this process, see the Hazelcast Cache Provider configuration section of Spring Boot's documentation.
As we are going to use Hazelcast, set the following properties explicitly:
spring.cache.type=hazelcast
spring.hazelcast.config=classpath:cache.yaml
And add Hazelcast into the application's dependencies:
<dependency>
    <groupId>com.hazelcast</groupId>
    <artifactId>hazelcast-all</artifactId>
    <version>4.2.5</version>
</dependency>
With this, Spring Boot creates a HazelcastInstance bean and adds it to the application context. Hazelcast is configured with the settings from cache.yaml.
After that, HazelcastInstance can be injected into other beans, for example, into the constructor of our UserController:
private static final String CACHED_NAMES = "nameByEmail";

private UserRepository userRepository;
private ConcurrentMap<String, String> nameByEmailMap;

public UserController(UserRepository userRepository, HazelcastInstance hazelcastInstance) {
    this.userRepository = userRepository;
    nameByEmailMap = hazelcastInstance.getMap(CACHED_NAMES);
}
In the fragment above, we are using Hazelcast's Distributed Map, which is exposed and used in the code as a ConcurrentMap. This gives a simple and flexible way to work with cached data as with an ordinary map. Hazelcast does all the heavy lifting of distributing the cached data throughout the cluster behind the scenes, so using the cache as a map is fully transparent for developers.
Hibernate Second Level Cache
Next, we will enable the second-level (L2) cache for Hibernate. In our application, this is done by setting up properties:
spring.jpa.properties.hibernate.cache.use_second_level_cache=true
spring.jpa.properties.hibernate.cache.use_query_cache=true
spring.jpa.properties.hibernate.cache.region.factory_class=com.hazelcast.hibernate.HazelcastCacheRegionFactory
spring.jpa.properties.hibernate.cache.hazelcast.instance_name=users-app
spring.jpa.properties.hibernate.cache.hazelcast.shutdown_on_session_factory_close=false
Here, along with the general second-level cache, we also enable the query cache. As we are using an embedded Hazelcast in a distributed environment, the factory class is set to com.hazelcast.hibernate.HazelcastCacheRegionFactory. A detailed description of the properties and settings for Hibernate caching and for the Hazelcast implementation of the second-level cache can be found in the respective documentation: Hibernate Caching and Hibernate Second Level Cache.
To make the application use Hazelcast as the second-level cache, we add the following dependency to pom.xml:
<dependency>
    <groupId>com.hazelcast</groupId>
    <artifactId>hazelcast-hibernate53</artifactId>
    <version>2.2.1</version>
</dependency>
Let me draw attention to several points. The second-level cache is not applied to Hibernate entities by default, so you need to explicitly define which entities you want to be cached. In our application, this is done in the User entity like this:
import org.hibernate.annotations.Cache;
import org.hibernate.annotations.CacheConcurrencyStrategy;
import javax.persistence.Cacheable;
import javax.persistence.Entity;

@Cacheable
@Cache(usage = CacheConcurrencyStrategy.NONSTRICT_READ_WRITE)
@Entity
public class User {
    // skipped
}
Next, Hibernate advises against using the query cache, as it adds overhead to the transaction flow and in most cases brings little benefit. Here is the quote from the documentation:
Caching of query results introduces some overhead in terms of your application's normal transactional processing. For example, if you cache the results of a query against a Person, Hibernate will need to keep track of when those results should be invalidated because changes have been committed against any Person entity.
That, coupled with the fact that most applications simply gain no benefit from caching query results, leads Hibernate to disable caching of query results by default.
In our application, though, the query cache is enabled for demonstration purposes only, and UserRepository overrides findAll() as follows to cache the results of the query:
import org.springframework.data.jpa.repository.QueryHints;
import javax.persistence.QueryHint;
import static org.hibernate.jpa.QueryHints.HINT_CACHEABLE;
import static org.hibernate.jpa.QueryHints.HINT_CACHE_REGION;

// some code is skipped

@QueryHints({
        @QueryHint(name = HINT_CACHEABLE, value = "true"),
        @QueryHint(name = HINT_CACHE_REGION, value = "query-cache-users")
})
@Override
Iterable<User> findAll();
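The bookkeeping mentioned in the quote above can be pictured with a toy model. The sketch below is a deliberately simplified illustration, not Hibernate's implementation: a cached query result is reused only if it was stored after the last recorded change to the table it depends on. The class QueryCacheSketch, its methods, and the logical clock are all hypothetical names introduced for this example.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;

// Simplified model of query-result caching with timestamp-based invalidation.
// All names are hypothetical; this is not Hibernate's implementation.
class QueryCacheSketch {
    private static final class CachedResult {
        final List<String> rows;
        final long cachedAt;

        CachedResult(List<String> rows, long cachedAt) {
            this.rows = rows;
            this.cachedAt = cachedAt;
        }
    }

    private final Map<String, CachedResult> results = new HashMap<>();
    private final Map<String, Long> lastUpdateByTable = new HashMap<>();
    private long tick;   // logical clock, advanced on every cache write or table update
    int dbQueries;       // how many times the "database" was actually queried

    List<String> query(String queryKey, String table, Supplier<List<String>> database) {
        CachedResult cached = results.get(queryKey);
        long lastUpdate = lastUpdateByTable.getOrDefault(table, -1L);
        // Reuse the cached rows only if the table has not changed since they were stored.
        if (cached != null && cached.cachedAt > lastUpdate) {
            return cached.rows;
        }
        dbQueries++;
        List<String> rows = database.get();
        results.put(queryKey, new CachedResult(rows, ++tick));
        return rows;
    }

    // Called whenever an entity stored in the given table is changed.
    void recordUpdate(String table) {
        lastUpdateByTable.put(table, ++tick);
    }

    public static void main(String[] args) {
        QueryCacheSketch cache = new QueryCacheSketch();
        Supplier<List<String>> findAll = () -> List.of("First", "Second");
        cache.query("findAll", "user", findAll);   // goes to the database
        cache.query("findAll", "user", findAll);   // served from the cache
        cache.recordUpdate("user");                // any change to the table invalidates the result
        cache.query("findAll", "user", findAll);   // goes to the database again
        System.out.println("Database queries: " + cache.dbQueries); // prints "Database queries: 2"
    }
}
```

Even in this toy form, every write has to touch the timestamp map and every cached read has to consult it, which is the kind of overhead the documentation warns about.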
Finally, it is possible to use Hazelcast as a pure caching framework separately from the Hibernate L2 cache support. By default, HazelcastCacheRegionFactory creates a new Hazelcast instance unless the property hibernate.cache.hazelcast.instance_name is set to the name of an existing instance. To reuse the Hazelcast instance that is created by Spring, we set the property like this:
spring.jpa.properties.hibernate.cache.hazelcast.instance_name=users-app
The instance name should be the same as the one defined in the Hazelcast configuration file cache.yaml:
hazelcast:
  instance-name: users-app
This avoids creating a second Hazelcast instance in the application, so the Hibernate L2 cache is set up with the same configuration from cache.yaml.
Deploying Hazelcast on Kubernetes
Hazelcast provides several mechanisms for automatically discovering cluster members. For instance, in a local (development) environment, multicast over UDP allows members to find each other. So, it is possible to run several instances of the application locally, and Hazelcast will be able to build a cluster using multicast auto-discovery. This type of discovery is enabled as follows:
hazelcast:
  network:
    join:
      multicast:
        enabled: true
For production environments, UDP multicast is not the best choice. Among other options, Hazelcast supports clusters deployed in a Kubernetes environment without multicast. For detailed information, see the documentation on Configuring Kubernetes. To activate discovery on Kubernetes, set the properties as follows:
hazelcast:
  instance-name: users-app
  cluster-name: users-app
  network:
    join:
      multicast:
        enabled: false
      kubernetes:
        enabled: true
        service-name: hazelcast
This disables multicast, enables Kubernetes discovery, and defines the name of the service, hazelcast, which will be used to search for cluster members via the Kubernetes API. Then, we need to create this service, which can be declared as follows (see also k8s/app-hazelcast-service.yaml):
apiVersion: v1
kind: Service
metadata:
  name: hazelcast
  labels:
    app: hazelcast
spec:
  ports:
    - port: 5701
      protocol: TCP
  selector:
    app: app-users
  type: ClusterIP
The key point here is to give the service the same name, hazelcast, that is used in the Hazelcast configuration.
To allow Hazelcast to use the service inside Kubernetes for discovery, we also need to grant certain permissions. An example of an RBAC configuration for the default namespace can be found in the Hazelcast documentation. The same YAML file is included in the code; see k8s\app-hazelcast-rbac.yaml.
Run the following commands to create the service and to grant the roles:
kubectl apply -f k8s\app-hazelcast-service.yaml
kubectl apply -f k8s\app-hazelcast-rbac.yaml
Then, build the application, and after that, build a Docker image, which will be stored into your local Docker repository:
mvn clean verify
docker build -f .\docker\Dockerfile -t rapp/appusers:1.0 .
After that, you can check that the image is available by running the command docker image ls, which should return something like this:
REPOSITORY      TAG   IMAGE ID       CREATED        SIZE
rapp/appusers   1.0   9c92dfd38bbf   24 hours ago   419MB
This means that you can use the image rapp/appusers:1.0 to deploy into the Kubernetes cluster. To do this, run the following commands, which deploy three pods with the application and create a load balancer for the pods:
kubectl apply -f k8s\app-k8s-deployment.yaml
kubectl apply -f k8s\app-k8s-service.yaml
You can check the result of the deployment by running kubectl get pods, which should show the state of the application pods similar to the following:
NAME                         READY   STATUS    RESTARTS   AGE
app-users-55d9b89d86-gstwz   1/1     Running   0          23h
app-users-55d9b89d86-l67dx   1/1     Running   0          23h
app-users-55d9b89d86-l766r   1/1     Running   0          23h
This means that all the pods are created, ready, and running. You can also check the logs of any pod by running kubectl logs <name-of-pod>. The logs contain the usual information about the startup of a Spring Boot application (creation of the context, Tomcat running, Hibernate instantiating, etc.). Apart from that, you can also see the Hazelcast activity: which discovery mechanism is activated, and how the discovery and the building of the cluster proceed. Finally, you can see the result of the Hazelcast cluster construction, similar to this:
Members {size:3, ver:3} [
    Member [172.17.0.4]:5701 - 8d7d80e4-d0e2-4bd6-8116-ab3a7b494a3f
    Member [172.17.0.5]:5701 - 9aa6f3f8-7da7-4037-8e36-761d8432ceea
    Member [172.17.0.6]:5701 - b21112b7-35a7-49fb-ad06-0f921b9823be this
]
So, a cache cluster of three members is built, i.e., all three pods are found and added to the cluster.
Testing
Kubernetes deploys pods into its own IP range and exposes services on declared ports that are mapped to different external ones. You therefore need either to determine Minikube's IP and the external port, or to run Minikube in tunnel mode, which allows using internal IPs and ports. To get Minikube's IP and the application port, run the following commands:
> minikube ip
172.20.164.53
> kubectl get service app-users
NAME        TYPE           CLUSTER-IP       EXTERNAL-IP   PORT(S)          AGE
app-users   LoadBalancer   10.104.184.147   <pending>     8080:31434/TCP   24h
So you can access the application at Minikube's IP and the external port, which in this output are 172.20.164.53 and 31434.
Another way to access the application is to run the minikube tunnel command in a separate console. Then, check which external IP is assigned to the application service:
> kubectl get service app-users
NAME        TYPE           CLUSTER-IP       EXTERNAL-IP      PORT(S)          AGE
app-users   LoadBalancer   10.104.184.147   10.104.184.147   8080:31434/TCP   24h
This means that you can open the application at the external IP and the service port, which in this output are 10.104.184.147 and 8080.
Using either of these approaches, you can test the endpoints declared in UserController in the same way as suggested in the guide Accessing data with MySQL. Additionally, there are two new endpoints, /users/{id} and /users/email/{email}, to check how entities are cached and how the Hazelcast map works.
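If you prefer to call the two new endpoints from code rather than from a browser or curl, a minimal sketch using the JDK 11 HTTP client could look like the following. The class name EndpointCheckSketch and its helper methods are hypothetical, the base URL, user id, and email must be supplied as arguments, and the email is assumed to contain only characters that are valid in a URL path. Each request is sent twice so that the effect of caching can be observed in the pod logs.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Hypothetical helper for calling the two added endpoints; not part of the article's code.
class EndpointCheckSketch {

    // GET <baseUrl>/users/{id}
    static HttpRequest userById(String baseUrl, int id) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/users/" + id)).GET().build();
    }

    // GET <baseUrl>/users/email/{email}; assumes the email needs no URL encoding
    static HttpRequest nameByEmail(String baseUrl, String email) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/users/email/" + email)).GET().build();
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 3) {
            System.out.println("Usage: <base-url> <user-id> <email>");
            return;
        }
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest[] requests = {
                userById(args[0], Integer.parseInt(args[1])),
                nameByEmail(args[0], args[2])
        };
        for (HttpRequest request : requests) {
            // Send each request twice: the second call should be answered from the cache.
            for (int attempt = 1; attempt <= 2; attempt++) {
                HttpResponse<String> response =
                        client.send(request, HttpResponse.BodyHandlers.ofString());
                System.out.println(request.uri() + " attempt " + attempt
                        + " -> " + response.statusCode() + " " + response.body());
            }
        }
    }
}
```

The base URL is whichever address you obtained for the service, that is, Minikube's IP with the external port, or the external IP with the service port when the tunnel is running.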
The first endpoint returns information about a user by identifier. As the application properties set spring.jpa.show-sql=true, the logs will contain the SQL statements generated by Hibernate. With the L2 cache enabled, repeated requests for the same user will not log queries to the database (of course, only within the configured time-to-live-seconds: 300).
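The effect of a time-to-live on repeated reads can be illustrated with a toy read-through cache. This is only a sketch of the general idea under simplifying assumptions (single-threaded, expiry counted from the moment of loading), not a description of how Hazelcast or Hibernate implement it. The class TtlCacheSketch, the injected clock, and the loader function are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;
import java.util.function.LongSupplier;

// Toy read-through cache with a time-to-live; illustration only, not thread-safe.
class TtlCacheSketch<K, V> {
    private static final class Entry<V> {
        final V value;
        final long expiresAtMillis;

        Entry(V value, long expiresAtMillis) {
            this.value = value;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final Map<K, Entry<V>> entries = new HashMap<>();
    private final long ttlMillis;
    private final LongSupplier clock;   // injected so that time can be controlled in a demo
    int loads;                          // how many times the loader (the "database") was called

    TtlCacheSketch(long ttlSeconds, LongSupplier clock) {
        this.ttlMillis = ttlSeconds * 1000;
        this.clock = clock;
    }

    V get(K key, Function<K, V> loader) {
        long now = clock.getAsLong();
        Entry<V> cached = entries.get(key);
        if (cached != null && now < cached.expiresAtMillis) {
            return cached.value;        // still fresh: no call to the loader
        }
        loads++;
        V value = loader.apply(key);
        entries.put(key, new Entry<>(value, now + ttlMillis));
        return value;
    }

    public static void main(String[] args) {
        long[] now = {0L};
        TtlCacheSketch<Integer, String> cache = new TtlCacheSketch<>(300, () -> now[0]);
        Function<Integer, String> loadUser = id -> "user-" + id;
        cache.get(1, loadUser);         // first read: loaded
        cache.get(1, loadUser);         // within 300 seconds: cached
        now[0] = 301_000L;              // 301 seconds later
        cache.get(1, loadUser);         // expired: loaded again
        System.out.println("Loads: " + cache.loads); // prints "Loads: 2"
    }
}
```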
The second added endpoint shows the usage of the Hazelcast map structure. As the endpoint triggers a custom query that is not cached by Hibernate, every call would otherwise produce a logged SQL statement.
@Query(value = "SELECT name FROM user WHERE email = :email", nativeQuery = true)
Optional<String> findNameByEmail(String email);
However, the result of the query is stored in the Hazelcast map after the data is retrieved from the database. So, if the map already holds a value for the given email, no additional request to the database is performed:
private Optional<String> getFromCache(String email) {
    return Optional.ofNullable(nameByEmailMap.computeIfAbsent(email,
            (e) -> userRepository.findNameByEmail(e).orElse(null)));
}
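One consequence of this pattern is worth spelling out: under the ConcurrentMap contract, computeIfAbsent stores nothing when the mapping function returns null, so lookups for unknown emails reach the database every time. The sketch below demonstrates this with a plain ConcurrentHashMap standing in for the Hazelcast map and a hypothetical in-memory lookup standing in for userRepository.findNameByEmail. Both stand-ins, the class name NameCacheSketch, and the sample data are assumptions made to keep the example self-contained.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;

// Stand-in for the controller's caching logic; no Hazelcast or Spring involved.
class NameCacheSketch {
    // Hypothetical "database" table: email -> name.
    private final Map<String, String> database = Map.of("ann@example.com", "Ann");
    // Stand-in for the distributed map obtained from hazelcastInstance.getMap(...).
    private final ConcurrentMap<String, String> nameByEmailMap = new ConcurrentHashMap<>();
    // Counts how often the "database" is queried.
    final AtomicInteger dbCalls = new AtomicInteger();

    private Optional<String> findNameByEmail(String email) {
        dbCalls.incrementAndGet();
        return Optional.ofNullable(database.get(email));
    }

    Optional<String> getFromCache(String email) {
        return Optional.ofNullable(nameByEmailMap.computeIfAbsent(email,
                e -> findNameByEmail(e).orElse(null)));
    }

    public static void main(String[] args) {
        NameCacheSketch sketch = new NameCacheSketch();
        sketch.getFromCache("ann@example.com");
        sketch.getFromCache("ann@example.com");
        // prints "DB calls after two hits: 1"
        System.out.println("DB calls after two hits: " + sketch.dbCalls.get());
        sketch.getFromCache("nobody@example.com");
        sketch.getFromCache("nobody@example.com");
        // prints "DB calls after two misses: 3"
        System.out.println("DB calls after two misses: " + sketch.dbCalls.get());
    }
}
```

Whether this matters depends on how often unknown emails are requested. If it does, one option is to cache an explicit marker value for "not found", at the cost of having to invalidate it when such a user is created later.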
Postscript
As mentioned at the beginning, this guide is not a definitive tutorial on the technologies and tools it uses. Here are some suggestions for improvements.
It is possible to use Spring's out-of-the-box caching support, which in our case relies on the fact that the HazelcastInstance is discovered by Spring and wrapped into a CacheManager. This allows using Spring's @EnableCaching and @Cacheable annotations.
In our simplified Kubernetes configurations, it is possible to replace service types such as NodePort with ClusterIP, which allows applications to run inside the cluster and discover services by name. This helps to avoid juggling hardcoded IPs. Additionally, it is highly recommended to avoid using the default namespace, so consider using a dedicated namespace in the application and Hazelcast configuration.
Finally, never store your secrets in the code. In our code, the database credentials are encoded and written directly into the YAML configuration files for demonstration and simplicity only. Do not repeat this in your applications. A well-adopted practice is to store secrets in environment variables or third-party vaults, from which they are retrieved or substituted by CI/CD pipelines during deployment.
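As a minimal sketch of the environment-variable approach, the class below reads database credentials from a map of environment variables and fails fast when one is missing. The variable names DB_USERNAME and DB_PASSWORD and the class DbCredentialsSketch are hypothetical and chosen only for this illustration; the article's code does not contain them.

```java
import java.util.Map;

// Hypothetical holder for credentials taken from the environment instead of source files.
class DbCredentialsSketch {
    final String username;
    final String password;

    private DbCredentialsSketch(String username, String password) {
        this.username = username;
        this.password = password;
    }

    // Accepts the environment as a map so that the logic can be exercised without real variables.
    static DbCredentialsSketch fromEnvironment(Map<String, String> env) {
        return new DbCredentialsSketch(require(env, "DB_USERNAME"), require(env, "DB_PASSWORD"));
    }

    private static String require(Map<String, String> env, String name) {
        String value = env.get(name);
        if (value == null || value.isBlank()) {
            throw new IllegalStateException("Missing environment variable " + name);
        }
        return value;
    }

    public static void main(String[] args) {
        try {
            DbCredentialsSketch credentials = fromEnvironment(System.getenv());
            // Print the user name only; never log the password.
            System.out.println("Connecting as " + credentials.username);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

In a Spring Boot application the same idea usually needs no custom code, because property placeholders such as ${DB_PASSWORD} in the application properties are resolved from environment variables.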