Resilient Microservice Design – Bulkhead Pattern

The ability of the system to recover from the failure and remain functional makes the system more resilient. It also avoids any cascading failures.

By Vinoth Selvaraj · Nov. 04, 20 · Tutorial

Need For Resiliency:

Microservices are distributed by nature, with many components and moving parts. In a distributed architecture, handling unexpected failures is one of the biggest challenges. The failure could be a hardware failure, a network failure, and so on. A system that can recover from failure and remain functional is more resilient, and it avoids cascading failures.

Why Bulkhead?

A ship is divided into multiple small compartments using bulkheads. Bulkheads seal off parts of the ship so that flooding in one compartment does not sink the entire ship. Similarly, we should expect failures when we design software. The application should be split into multiple components, and resources should be isolated so that the failure of one component does not affect the others.

For example, assume there are two services, A and B, and some of A's APIs depend on B. For some reason, B is very slow. When A receives many concurrent requests that depend on B, A's performance suffers too, because those requests block A's threads. As a result, A might not be able to serve other requests that do NOT depend on B. The idea is to isolate resources by allocating only some of A's threads to calls to B, so that calls to B cannot consume all of A's threads and hang A for every request.
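The idea of reserving a limited set of A's threads for calls to B can be sketched in plain Java with a dedicated, bounded thread pool. This is only an illustration under stated assumptions: `DedicatedPoolBulkhead`, `trySubmit`, and the sizes used are hypothetical names and values, not part of the sample application or of any library.

```java
import java.util.Optional;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.Callable;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Future;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical helper for illustration only.
class DedicatedPoolBulkhead {

    private final ThreadPoolExecutor pool;

    // threads: how many threads may be tied up in calls to B.
    // queueCapacity (>= 1): how many further calls may wait for one of those threads.
    DedicatedPoolBulkhead(int threads, int queueCapacity) {
        // The default rejection policy (AbortPolicy) throws RejectedExecutionException
        // once all threads are busy and the queue is full.
        this.pool = new ThreadPoolExecutor(threads, threads, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(queueCapacity));
    }

    // Submits a call to B; returns Optional.empty() instead of blocking when the bulkhead is full.
    <T> Optional<Future<T>> trySubmit(Callable<T> callToB) {
        try {
            return Optional.of(pool.submit(callToB));
        } catch (RejectedExecutionException bulkheadFull) {
            return Optional.empty();
        }
    }

    void shutdown() {
        pool.shutdown();
    }

    public static void main(String[] args) {
        DedicatedPoolBulkhead bulkhead = new DedicatedPoolBulkhead(2, 1);
        CountDownLatch slowB = new CountDownLatch(1); // B "responds" only when released
        Callable<String> callToB = () -> { slowB.await(); return "response from B"; };

        for (int i = 1; i <= 4; i++) {
            boolean accepted = bulkhead.trySubmit(callToB).isPresent();
            System.out.println("call " + i + " accepted: " + accepted);
        }
        slowB.countDown();
        bulkhead.shutdown();
    }
}
```

With 2 threads and a queue of 1, the first three calls are accepted and the fourth is rejected immediately, so the caller can return a default response instead of waiting on B. The rest of A's threads are never touched by calls to B.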

Sample Application

We are going to use the same application which we had considered as part of the previous articles.

Source Code is here.

To understand the use of the bulkhead pattern, let's consider it in our application. Our product-service has 2 endpoints.

  • /product/{id} – an endpoint that gives more details about a specific product, including its ratings. It depends on the results of the rating-service. Users updating their ratings, leaving comments, and replying to comments all go through this endpoint.
  • /products – an endpoint that returns a list of products in our catalog based on some search criteria. It does not depend on any other service. Users can order products (add to cart) directly from the list.

Product-service is a typical web application with multiple threads. We are going to limit the number of threads for the application to 15, which means product-service can handle up to 15 concurrent requests. If all of those threads are busy serving users who are viewing product details, leaving comments, checking reviews, and so on, users who are searching for products and trying to order them might experience slowness. This is a problem.
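The thread exhaustion described above can be reproduced in a small plain-Java simulation. This is a sketch under assumptions, not the sample application: a fixed thread pool stands in for the web server's worker threads, a latch stands in for a slow rating-service, and `ThreadStarvationDemo` and `fastTaskIsStarved` are hypothetical names.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical simulation for illustration only.
class ThreadStarvationDemo {

    // Returns true if a fast "product list" task, submitted behind `slowCalls`
    // blocked "product details" tasks, could not finish within `waitMillis`
    // on a pool of `poolSize` threads.
    static boolean fastTaskIsStarved(int poolSize, int slowCalls, long waitMillis) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        CountDownLatch slowDependency = new CountDownLatch(1); // stands in for a slow rating-service
        try {
            for (int i = 0; i < slowCalls; i++) {
                pool.submit(() -> {
                    slowDependency.await(); // occupies a worker thread until released
                    return "product details";
                });
            }
            Future<String> fast = pool.submit(() -> "product list");
            try {
                fast.get(waitMillis, TimeUnit.MILLISECONDS);
                return false; // a free thread served the fast request
            } catch (TimeoutException e) {
                return true; // every thread was stuck on the slow dependency
            } catch (Exception e) {
                throw new IllegalStateException(e);
            }
        } finally {
            slowDependency.countDown(); // release the blocked tasks
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("15 slow calls: starved = " + fastTaskIsStarved(15, 15, 300)); // true
        System.out.println("10 slow calls: starved = " + fastTaskIsStarved(15, 10, 300)); // false
    }
}
```

When slow calls are allowed to take all 15 threads, the request that needs no dependency still has to wait. When they are capped at 10, five threads stay free for it.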

ProductController

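Java

import java.util.List;

import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;

// ProductDTO and ProductService come from the sample application.
@RestController
@RequestMapping("v1")
public class ProductController {

    @Autowired
    private ProductService productService;

    // Product details; depends on the rating-service.
    @GetMapping("/product/{id}")
    public ProductDTO getProduct(@PathVariable int id){
        return this.productService.getProduct(id);
    }

    // Product list; does not depend on any other service.
    @GetMapping("/products")
    public List<ProductDTO> getProducts(){
        return this.productService.getProducts();
    }

}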



ProductService internally calls the RatingService whose implementation is as shown below.

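Java

import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.stereotype.Service;
import org.springframework.web.client.RestTemplate;

// RatingService and ProductRatingDTO come from the sample application.
@Service
public class RatingServiceImpl implements RatingService {

    @Value("${rating.service.url}")
    private String ratingServiceUrl;

    @Autowired
    private RestTemplate restTemplate;

    @Override
    public ProductRatingDTO getRatings(int productId) {
        String url = this.ratingServiceUrl + "/" + productId;
        // Start with an empty rating so a failed call still returns a usable object.
        ProductRatingDTO productRatingDTO = new ProductRatingDTO();
        try{
            productRatingDTO = this.restTemplate.getForObject(url, ProductRatingDTO.class);
        }catch (Exception e){
            e.printStackTrace();
        }
        return productRatingDTO;
    }

}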



ProductService’s application.yaml is updated as shown below.

YAML

server:
  tomcat:
    max-threads: 15 # deprecated since Spring Boot 2.3 in favor of server.tomcat.threads.max



I ran a performance test using JMeter that simulates many users accessing specific product details while other users access the list of products. The results are shown here. Only 26 /products requests were completed, with an average response time of 3.6 seconds, even though that endpoint has no dependency on any other service.

[Image: performance test results without the bulkhead]

Let's see how bulkhead implementation can save us here!

Bulkhead Implementation:

  • I am using the Resilience4j library.
  • application.yaml changes
    • We allow a maximum of 10 concurrent requests to the rating service even though we have 15 threads.
    • maxWaitDuration applies when an additional request for the rating service arrives while all 10 concurrent calls are in progress: the request waits only 10 ms for a free slot and then fails.
YAML

server:
  tomcat:
    max-threads: 15
  port: 8082
rating:
  service:
    url: http://localhost:8081/v1/ratings
resilience4j.bulkhead:
  instances:
    ratingService:
      maxConcurrentCalls: 10
      maxWaitDuration: 10ms
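
To make the meaning of maxConcurrentCalls and maxWaitDuration concrete, here is a rough plain-Java sketch of the behavior these two settings describe, built on a counting semaphore. It is an approximation for illustration and should not be read as Resilience4j's implementation; `SemaphoreBulkheadSketch` and `call` are hypothetical names.

```java
import java.time.Duration;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

// Hypothetical sketch for illustration only.
class SemaphoreBulkheadSketch {

    private final Semaphore permits;
    private final Duration maxWait;

    SemaphoreBulkheadSketch(int maxConcurrentCalls, Duration maxWaitDuration) {
        this.permits = new Semaphore(maxConcurrentCalls, true);
        this.maxWait = maxWaitDuration;
    }

    // Runs protectedCall if a permit becomes free within maxWait; otherwise runs fallback.
    <T> T call(Supplier<T> protectedCall, Supplier<T> fallback) {
        boolean acquired;
        try {
            acquired = permits.tryAcquire(maxWait.toMillis(), TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            acquired = false;
        }
        if (!acquired) {
            return fallback.get(); // bulkhead is full: do not touch the slow dependency
        }
        try {
            return protectedCall.get();
        } finally {
            permits.release(); // always hand the permit back
        }
    }

    int availablePermits() {
        return permits.availablePermits();
    }

    public static void main(String[] args) {
        SemaphoreBulkheadSketch bulkhead = new SemaphoreBulkheadSketch(2, Duration.ofMillis(10));
        Supplier<String> fallback = () -> "default rating";
        // Nesting the calls keeps two permits held while the third call is attempted.
        Supplier<String> third = () -> bulkhead.call(() -> "real rating", fallback);
        Supplier<String> second = () -> bulkhead.call(third, fallback);
        System.out.println(bulkhead.call(second, fallback)); // prints "default rating"
        System.out.println(bulkhead.call(() -> "real rating", fallback)); // prints "real rating"
    }
}
```

The caller's own thread still runs the protected call, so a semaphore caps how many threads can be stuck on the dependency at once without needing a separate thread pool.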



RatingServiceImpl changes

  • @Bulkhead uses the instance we have defined in application.yaml.
  • fallbackMethod is optional. It is invoked when a call is rejected because 10 concurrent requests are already in progress.
Java

import io.github.resilience4j.bulkhead.annotation.Bulkhead;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.stereotype.Service;
import org.springframework.web.client.RestTemplate;

// RatingService and ProductRatingDTO come from the sample application.
@Service
public class RatingServiceImpl implements RatingService {

    @Value("${rating.service.url}")
    private String ratingServiceUrl;

    @Autowired
    private RestTemplate restTemplate;

    // "ratingService" refers to the bulkhead instance configured in application.yaml.
    @Override
    @Bulkhead(name = "ratingService", fallbackMethod = "getFallbackRatings", type = Bulkhead.Type.SEMAPHORE)
    public ProductRatingDTO getRatings(int productId) {
        String url = this.ratingServiceUrl + "/" + productId;
        ProductRatingDTO productRatingDTO = new ProductRatingDTO();
        try{
            productRatingDTO = this.restTemplate.getForObject(url, ProductRatingDTO.class);
        }catch (Exception e){
            e.printStackTrace();
        }
        return productRatingDTO;
    }

    // Same parameters as getRatings plus the exception; returns a default (empty) rating.
    public ProductRatingDTO getFallbackRatings(int productId, Exception e) {
        System.out.println("Falling back : " + productId);
        return new ProductRatingDTO();
    }

}



Now, after starting our services, running the same test produces the results below, which are very interesting.

  • The average response time for /products requests is 106 milliseconds, compared to 3.6 seconds without the bulkhead. This is because we no longer exhaust the resources of product-service.
  • Thanks to the fallback method, any additional requests for product/1 receive the default response.

[Image: performance test results with the bulkhead]

Summary:

Using the bulkhead pattern, we allocate resources for a specific component so that we do not consume all the resources of the application unnecessarily. Our application remains functional even under unexpected load.

Other design patterns can handle this even better when combined with the bulkhead pattern. Please take a look at these articles.

  • Resilient MicroService Design – Timeout Pattern
  • Resilient MicroService Design – Retry Pattern
  • Resilient MicroService Design – Circuit Breaker Pattern
  • Resilient MicroService Design – Rate Limiter Pattern