Using Hazelcast in Spring Boot Running on Kubernetes

A simple Spring Boot application with Spring Data JPA (Hibernate) that uses embedded Hazelcast as the second-level cache and runs on a Kubernetes cluster.

Ruslan Appazov

Sep. 22, 22 · Tutorial


Although Kubernetes is the target environment for the application, this article does not show how to declare the required services or deployments optimally. A few basic Kubernetes configurations are enough for demonstration purposes. You can, however, use them as a starting point and improve them with the help of the Kubernetes documentation.

Spring's caching support is not fully covered either. If you are interested in the cache abstraction, please refer to the corresponding part of the Spring Framework documentation.

What You Will Learn

The main focus of the article is to show how to build a Spring Boot application with the components and configuration necessary:

to work with Spring Data JPA based on Hibernate
to use Hazelcast IMDG (in-memory data grid) in embedded mode
to deploy the application on a Kubernetes cluster

What You Need

JDK 11 (I think the code will also work with Java 8, though that requires changes in pom.xml)
Apache Maven
Minikube to set up a local Kubernetes cluster.
Docker CLI to build an application Docker image.

The code for the guide can be found in the repository Spring Data JPA application on Kubernetes with the caching on Hazelcast. The application was built and run on Microsoft Windows, but I believe it will work on other platforms as well once the backslashes in paths are replaced with forward slashes.

Application Structure

First of all, the application should work with the database through the JPA abstraction. As we are going to introduce the second-level cache, Hibernate is used as the ORM framework. To save some time, I took the ready-to-use code from Spring's guide Accessing data with MySQL. It implements a simple controller-repository flow with MySQL as storage. The code is almost the same as the original, apart from small changes such as updated versions of Spring and the MySQL JDBC connector in pom.xml and refactored packages.

Also, I added a couple of methods to the repository and the controller. These methods will be helpful later in showing the results of the caching.

     Java 
   
    @Query(value = "SELECT name FROM user WHERE email = :email", nativeQuery = true)
    Optional<String> findNameByEmail(String email);

     Java 
   
 
 
    @GetMapping(path = "{id}")
    public @ResponseBody ResponseEntity<User> getUser(@PathVariable Integer id) {
        return userRepository.findById(id)
                .map(user -> ResponseEntity.ok().body(user))
                .orElseGet(() -> ResponseEntity.notFound().build());
    }
  

In general, the application provides the same functionality as described in the guide Accessing data with MySQL. Two endpoints are new: /users/{id} returns user information by identifier, and /users/email/{email} returns a user's name by email.

Spring Boot With Hazelcast

Spring Framework supports plenty of caching platforms and libraries, and Spring Boot makes it easy to use them: it finds caching providers on the classpath and auto-configures them with default settings. For the details of this auto-configuration, see Spring Boot's documentation on Hazelcast cache provider configuration.

As we are going to use Hazelcast, set the following properties explicitly:

     Properties files 
   
spring.cache.type=hazelcast
spring.hazelcast.config=classpath:cache.yaml

And add Hazelcast into the application's dependencies:

     XML 
   
 
 
<dependency>
  <groupId>com.hazelcast</groupId>
  <artifactId>hazelcast-all</artifactId>
  <version>4.2.5</version>
</dependency>
  

With this, Spring Boot creates a HazelcastInstance bean and registers it in the application context. Hazelcast is configured with the settings from cache.yaml.

After that, HazelcastInstance can be injected into other beans, for example, into the constructor of our UserController:

     Java 
   
 
 
private static final String CACHED_NAMES = "nameByEmail";
private UserRepository userRepository;
private ConcurrentMap<String, String> nameByEmailMap;

public UserController(UserRepository userRepository, HazelcastInstance hazelcastInstance) {
  this.userRepository = userRepository;
  nameByEmailMap = hazelcastInstance.getMap(CACHED_NAMES);
}
  

In the fragment above, we use Hazelcast's distributed map, which the code treats as a plain ConcurrentMap. This gives a simple and flexible way to work with cached data as with an ordinary map. Hazelcast does all the heavy lifting of distributing the cached data across the cluster behind the scenes, so using the cache as a map is fully transparent for developers.

Hibernate Second Level Cache

Next, we will enable the second-level (L2) cache for Hibernate. In our application, this is done by setting the following properties:

     Properties files 
   
 
 
spring.jpa.properties.hibernate.cache.use_second_level_cache=true
spring.jpa.properties.hibernate.cache.use_query_cache=true
spring.jpa.properties.hibernate.cache.region.factory_class=com.hazelcast.hibernate.HazelcastCacheRegionFactory
spring.jpa.properties.hibernate.cache.hazelcast.instance_name=users-app
spring.jpa.properties.hibernate.cache.hazelcast.shutdown_on_session_factory_close=false
  

Here, along with the general second-level cache, we also enable the query cache. As we use an embedded Hazelcast in a distributed environment, the region factory class is set to com.hazelcast.hibernate.HazelcastCacheRegionFactory. Detailed descriptions of the properties and settings can be found in the Hibernate documentation on Caching and in the Hazelcast documentation on Hibernate Second Level Cache, respectively.

To make the application use Hazelcast as the second-level cache, we add the following dependency to pom.xml:

     XML 
   
 
 
<dependency>
  <groupId>com.hazelcast</groupId>
  <artifactId>hazelcast-hibernate53</artifactId>
  <version>2.2.1</version>
</dependency>
  

A few points deserve attention. The second-level cache is not applied to Hibernate entities by default, so you need to mark explicitly which entities should be cached. In our application, this is done in the User entity like this:

     Java 
   
 
 
import org.hibernate.annotations.Cache;
import org.hibernate.annotations.CacheConcurrencyStrategy;

import javax.persistence.Cacheable;
import javax.persistence.Entity;

@Cacheable
@Cache(usage = CacheConcurrencyStrategy.NONSTRICT_READ_WRITE)
@Entity
public class User {
  // skipped
} 
  

Next, the Hibernate documentation advises against the query cache, as it adds overhead to transaction processing and brings little benefit in most cases. Here is the quote from the documentation:

Caching of query results introduces some overhead in terms of your application's normal transactional processing. For example, if you cache the results of a query against a Person, Hibernate will need to keep track of when those results should be invalidated because changes have been committed against any Person entity.

That, coupled with the fact that most applications simply gain no benefit from caching query results, leads Hibernate to disable caching of query results by default.

In our application, however, the query cache is enabled for demonstration purposes only, and UserRepository overrides findAll() as follows to cache the results of the query:

     Java 
   
 
 
import org.springframework.data.jpa.repository.QueryHints;
import javax.persistence.QueryHint;

import static org.hibernate.jpa.QueryHints.HINT_CACHEABLE;
import static org.hibernate.jpa.QueryHints.HINT_CACHE_REGION;

// some code is skipped

@QueryHints({
  @QueryHint(name = HINT_CACHEABLE, value = "true"),
  @QueryHint(name = HINT_CACHE_REGION, value = "query-cache-users")
})
@Override
Iterable<User> findAll(); 
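
The bookkeeping that the quoted passage describes can be pictured with a toy model. The sketch below is not Hibernate's implementation; it assumes a single entity type and a hypothetical QueryCacheSketch class with a logical clock, and only illustrates why every committed change forces cached query results to be discarded.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Toy model of query-result invalidation; all names are hypothetical.
class QueryCacheSketch {
    private static final class CachedResult {
        final List<String> rows;
        final long cachedAt;

        CachedResult(List<String> rows, long cachedAt) {
            this.rows = rows;
            this.cachedAt = cachedAt;
        }
    }

    private final Map<String, CachedResult> results = new HashMap<>();
    private long clock = 0;            // logical clock, advanced on every write
    private long lastEntityChange = 0; // logical time of the last committed entity change

    // Remember the rows returned by a query together with the time they were cached.
    void cacheResult(String query, List<String> rows) {
        results.put(query, new CachedResult(rows, ++clock));
    }

    // Called whenever a change to the entity is committed.
    void entityChanged() {
        lastEntityChange = ++clock;
    }

    // A cached result is usable only if no entity change happened after it was stored.
    Optional<List<String>> lookup(String query) {
        CachedResult cached = results.get(query);
        if (cached == null || cached.cachedAt < lastEntityChange) {
            results.remove(query);
            return Optional.empty();
        }
        return Optional.of(cached.rows);
    }
}
```

Every write pays for the timestamp update and every read pays for the comparison, which is the overhead the documentation warns about.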
  

Finally, Hazelcast can also be used as a general-purpose cache, independently of Hibernate's L2 cache support. By default, HazelcastCacheRegionFactory creates a new Hazelcast instance unless the property hibernate.cache.hazelcast.instance_name names an existing one. To reuse the Hazelcast instance created by Spring, we set the property like this:

     Properties files 
   
spring.jpa.properties.hibernate.cache.hazelcast.instance_name=users-app

The instance name must match the one defined in the Hazelcast configuration file cache.yaml:

     YAML 
   
hazelcast:
  instance-name: users-app

This avoids creating separate Hazelcast instances in the application, so the Hibernate L2 cache is set up with the same configuration from cache.yaml.

Deploying Hazelcast on Kubernetes

Hazelcast provides several mechanisms for auto-discovery of cluster members. For instance, in a local (development) environment, multicast over UDP allows members to find each other. So, you can run several instances of the application locally, and Hazelcast will build a cluster using multicast auto-discovery. This type of discovery is enabled as follows:

     YAML 
   
 
 
hazelcast:
  network:
    join:
      multicast:
        enabled: true
  

For production environments, UDP multicast is not the best choice. Among other options, Hazelcast supports clusters deployed in a Kubernetes environment without multicast. For detailed information, see the documentation on Configuring Kubernetes. To activate discovery on Kubernetes, set the properties as follows:

     YAML 
   
 
 
hazelcast:
  instance-name: users-app
  cluster-name: users-app
  network:
    join:
      multicast:
        enabled: false
      kubernetes:
        enabled: true
        service-name: hazelcast
  

This disables multicast, enables Kubernetes discovery, and sets the name of the service, hazelcast, that is used to look up cluster members via the Kubernetes API. Next, we need to create this service, which can be declared as follows (see also k8s/app-hazelcast-service.yaml):

     YAML 
   
 
 
apiVersion: v1
kind: Service
metadata:
  name: hazelcast
  labels:
    app: hazelcast
spec:
  ports:
    - port: 5701
      protocol: TCP
  selector:
    app: app-users
  type: ClusterIP
  

The key point here is that the name of the service, hazelcast, must match the service-name used in the Hazelcast configuration.

To allow Hazelcast to use the service for discovery inside Kubernetes, we also need to grant certain permissions. You can find an example RBAC configuration for the default namespace in the Hazelcast documentation. The same YAML file is included in the code; see k8s\app-hazelcast-rbac.yaml.

Run the following commands to create the service and to grant the roles:

     Plain Text 
   
kubectl apply -f k8s\app-hazelcast-service.yaml
kubectl apply -f k8s\app-hazelcast-rbac.yaml

Then, build the application, and after that, build a Docker image, which will be stored in your local Docker repository:

     Plain Text 
   
mvn clean verify
docker build -f .\docker\Dockerfile -t rapp/appusers:1.0 .

After that, you can check that the image is available by running the command docker image ls, which should return something like this:

     Plain Text 
   
REPOSITORY              TAG                    IMAGE ID       CREATED         SIZE
rapp/appusers           1.0                    9c92dfd38bbf   24 hours ago    419MB

This means that you can use the image rapp/appusers:1.0 to deploy into the Kubernetes cluster. To do this, run the following commands, which deploy three pods with the application and create a load balancer for the pods:

     Plain Text 
   
kubectl apply -f k8s\app-k8s-deployment.yaml
kubectl apply -f k8s\app-k8s-service.yaml

You can check the result of the deployment by running kubectl get pods, which should show the state of the application pods similar to the following:

     Plain Text 
   
NAME                            READY   STATUS    RESTARTS   AGE
app-users-55d9b89d86-gstwz      1/1     Running   0          23h
app-users-55d9b89d86-l67dx      1/1     Running   0          23h
app-users-55d9b89d86-l766r      1/1     Running   0          23h

This means that all the pods are created, ready, and running. You can also check the logs of any pod by running kubectl logs <name-of-pod>. The logs contain the usual information about Spring Boot startup (context creation, Tomcat startup, Hibernate initialization, etc.). Apart from that, you can see Hazelcast's activity: which discovery mechanism is activated and how discovery and cluster formation proceed. Finally, you can see the resulting Hazelcast cluster, similar to this:

     Plain Text 
   
 
 
Members {size:3, ver:3} [
        Member [172.17.0.4]:5701 - 8d7d80e4-d0e2-4bd6-8116-ab3a7b494a3f
        Member [172.17.0.5]:5701 - 9aa6f3f8-7da7-4037-8e36-761d8432ceea
        Member [172.17.0.6]:5701 - b21112b7-35a7-49fb-ad06-0f921b9823be this
]
  

So, a cache cluster of three members is built, i.e., all three pods are found and added to the cluster.

Testing

As Kubernetes deploys pods into its own IP range and exposes services on declared ports that are mapped to different external ones, you need either to determine Minikube's IP and the external port or to run Minikube in tunnel mode, which allows using the internal IPs and ports. To get Minikube's IP and the application port, run the following commands:

     Plain Text 
   
 
 
> minikube ip
172.20.164.53
> kubectl get service app-users
NAME        TYPE           CLUSTER-IP       EXTERNAL-IP   PORT(S)          AGE
app-users   LoadBalancer   10.104.184.147   <pending>     8080:31434/TCP   24h
  

So you can access the application at Minikube's IP and the mapped external port, here 172.20.164.53:31434.

Another way to access the application is to run the minikube tunnel command in a separate console. Then, check which external IP is assigned to the application service:

     Plain Text 
   
> kubectl get service app-users
NAME        TYPE           CLUSTER-IP       EXTERNAL-IP      PORT(S)          AGE
app-users   LoadBalancer   10.104.184.147   10.104.184.147   8080:31434/TCP   24h

This means that you can open the application at the external IP and the service port, here 10.104.184.147:8080.

Using either of these approaches, you can test the endpoints declared in UserController in the same way as suggested in the guide Accessing data with MySQL. Additionally, there are two new endpoints, /users/{id} and /users/email/{email}, to check how entities are cached and how the Hazelcast map works.
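
One way to call the endpoints from code is the HTTP client bundled with JDK 11. The following is a minimal sketch, assuming the sample Minikube IP and external port from the output above and a user with identifier 1; both are assumptions, so substitute your own values. The class and method names are hypothetical.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Hypothetical helper for calling the application's endpoints.
class EndpointCheckSketch {
    // Builds http://<host>:<port><path>. Host and port come from `minikube ip` and
    // `kubectl get service app-users`, or from the tunnel's external IP and port 8080.
    static URI endpoint(String host, int port, String path) {
        return URI.create("http://" + host + ":" + port + path);
    }

    // Sends a GET request and returns the status code followed by the response body.
    static String get(URI uri) throws IOException, InterruptedException {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(uri).GET().build();
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        return response.statusCode() + " " + response.body();
    }

    public static void main(String[] args) throws Exception {
        // Sample values only; the user id 1 is assumed to exist.
        URI byId = endpoint("172.20.164.53", 31434, "/users/1");
        System.out.println(get(byId));
    }
}
```

Calling the same URL twice and watching the pod logs is enough to see whether the second request reaches the database.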

The first endpoint returns information about a user by identifier. As the application properties set spring.jpa.show-sql=true, the logs contain the SQL statements generated by Hibernate. With the L2 cache enabled, repeated requests for the same user do not log requests to the database (of course, only within the configured time-to-live-seconds: 300).
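
The effect of the time-to-live setting can be pictured with a toy model. The sketch below is not how Hazelcast implements expiry; it assumes a hypothetical TtlCacheSketch class with an injectable clock and only shows why a repeated read within 300 seconds is served from the cache while a later one has to go back to the database.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.function.LongSupplier;

// Toy model of time-to-live expiry; all names are hypothetical.
class TtlCacheSketch<K, V> {
    private static final class Entry<V> {
        final V value;
        final long storedAtSeconds;

        Entry(V value, long storedAtSeconds) {
            this.value = value;
            this.storedAtSeconds = storedAtSeconds;
        }
    }

    private final Map<K, Entry<V>> entries = new HashMap<>();
    private final long ttlSeconds;
    private final LongSupplier clockSeconds;

    TtlCacheSketch(long ttlSeconds, LongSupplier clockSeconds) {
        this.ttlSeconds = ttlSeconds;
        this.clockSeconds = clockSeconds;
    }

    // Stores the value together with the time it was cached.
    void put(K key, V value) {
        entries.put(key, new Entry<>(value, clockSeconds.getAsLong()));
    }

    // Returns the value only while it is younger than the time-to-live.
    Optional<V> get(K key) {
        Entry<V> entry = entries.get(key);
        if (entry == null) {
            return Optional.empty();
        }
        if (clockSeconds.getAsLong() - entry.storedAtSeconds >= ttlSeconds) {
            entries.remove(key); // expired: the caller has to reload from the database
            return Optional.empty();
        }
        return Optional.of(entry.value);
    }
}
```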

The second added endpoint shows the usage of the Hazelcast map. As the endpoint triggers a custom (non-cached) database query, every call would otherwise produce a logged SQL request.

     Java 
   
@Query(value = "SELECT name FROM user WHERE email = :email", nativeQuery = true)
Optional<String> findNameByEmail(String email);

But the result of the query is stored in the Hazelcast map after it is retrieved from the database. So, if the map already holds a non-empty value for the given email, no additional request to the database is performed:

     Java 
   
private Optional<String> getFromCache(String email) {
  return Optional.ofNullable(nameByEmailMap.computeIfAbsent(email,
      (e) -> userRepository.findNameByEmail(e).orElse(null)));
}
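
The lookup above can be tried without a cluster. The sketch below is a stand-in, assuming a plain ConcurrentHashMap in place of the Hazelcast map and a hypothetical in-memory lookup function in place of userRepository.findNameByEmail; it counts how often the "database" is hit. Under the ConcurrentMap contract, computeIfAbsent records no mapping when the function returns null, so lookups for an unknown email reach the database every time.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Stand-in for the controller's cache lookup; all names here are hypothetical.
class NameCacheSketch {
    // Plays the role of the map obtained from hazelcastInstance.getMap(...).
    private final ConcurrentMap<String, String> nameByEmail = new ConcurrentHashMap<>();
    // Plays the role of userRepository.findNameByEmail(email).
    private final Function<String, Optional<String>> lookup;
    // Counts how many times the backing lookup was invoked.
    final AtomicInteger lookups = new AtomicInteger();

    NameCacheSketch(Map<String, String> table) {
        this.lookup = email -> {
            lookups.incrementAndGet();
            return Optional.ofNullable(table.get(email));
        };
    }

    Optional<String> getFromCache(String email) {
        // The mapping function runs only when the key is absent;
        // a null result stores nothing, so misses are not cached.
        return Optional.ofNullable(
                nameByEmail.computeIfAbsent(email, e -> lookup.apply(e).orElse(null)));
    }

    public static void main(String[] args) {
        NameCacheSketch cache = new NameCacheSketch(Map.of("ann@example.com", "Ann"));
        System.out.println(cache.getFromCache("ann@example.com"));    // hits the lookup
        System.out.println(cache.getFromCache("ann@example.com"));    // served from the map
        System.out.println(cache.getFromCache("nobody@example.com")); // unknown email
        System.out.println("lookups = " + cache.lookups.get());
    }
}
```

If repeated misses are a concern, one option is to cache a sentinel value for unknown emails, at the cost of having to invalidate it when such a user is created.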

Postscript

As mentioned in the foreword, this guide is not a definitive tutorial on the technologies and tools it touches. Here are some suggestions for improvements.

It is possible to use Spring's out-of-the-box caching support, which in our case relies on the fact that HazelcastInstance is discovered by Spring and wrapped into a CacheManager. This allows using Spring's @EnableCaching and @Cacheable annotations.

In our simplified Kubernetes configurations, it is possible to replace service types such as NodePort with ClusterIP, which might allow us to run applications inside the cluster and discover services by name. This can help to avoid juggling hardcoded IPs. Additionally, it is highly recommended to avoid using the default namespace, so consider using a dedicated namespace in the application and Hazelcast configuration.

Finally, never store your secrets in the code. In our code, the database credentials are encoded and written directly into YAML configuration files for demonstration and simplicity only. Do not repeat this in your applications. A well-established practice is to store secrets in environment variables or third-party vaults, from which CI/CD pipelines retrieve or substitute them during deployment.
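
On the application side, reading credentials from the environment can be as simple as the sketch below. The variable names DB_USERNAME and DB_PASSWORD and the class itself are hypothetical; the only point is to fail fast when a secret is missing instead of falling back to a hardcoded value.

```java
import java.util.Map;

// Hypothetical helper for reading credentials from environment variables.
class DbCredentialsSketch {
    // Assumed variable names; use whatever your deployment defines.
    static final String USER_VAR = "DB_USERNAME";
    static final String PASSWORD_VAR = "DB_PASSWORD";

    // Returns the value of the variable or fails if it is missing or empty.
    static String require(Map<String, String> env, String name) {
        String value = env.get(name);
        if (value == null || value.isEmpty()) {
            throw new IllegalStateException("Missing environment variable: " + name);
        }
        return value;
    }

    public static void main(String[] args) {
        Map<String, String> env = System.getenv();
        String user = require(env, USER_VAR);
        String password = require(env, PASSWORD_VAR);
        // Never print the secret itself; only confirm that it was found.
        System.out.println("Credentials loaded for user " + user
                + (password.isEmpty() ? "" : " (password present)"));
    }
}
```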
