
Circuit Breaker, Fallback, and Load Balancing With Netflix OSS and Spring Cloud


Look at five scenarios for using Hystrix with other tools from the Netflix OSS stack like Feign and Ribbon and see how performance compares in each situation.


You probably already know what Hystrix is and what it is used for. Today, I would like to show you exactly how to use it and how to combine it with other tools from the Netflix OSS stack, such as Feign and Ribbon. I assume that you have basic knowledge of topics such as microservices, load balancing, and service discovery. If not, I suggest you read some articles about them, like my short introduction to microservices architecture. The code sample from that article is also used here. The sample source code is available on GitHub: for the sample described here, see the hystrix branch; for a basic sample, see the master branch.

Let’s look at some scenarios for using fallback and circuit breaker. We have Customer Service, which calls an API method from Account Service. There are two running instances of Account Service. Requests to the Account Service instances are load balanced 50/50 by the Ribbon client.


Scenario 1

Hystrix is disabled for the Feign client (1). The auto-retry mechanism is disabled for the Ribbon client on the local instance (2) and on other instances (3). The Ribbon read timeout is shorter than the maximum request processing time (4). This scenario also occurs with the default Spring Cloud configuration without Hystrix. When you call the customer test method, you sometimes receive a full response and sometimes an HTTP 500 error code (50/50).

ribbon:
  eureka:
    enabled: true
  MaxAutoRetries: 0 #(2)
  MaxAutoRetriesNextServer: 0 #(3)
  ReadTimeout: 1000 #(4)

feign:
  hystrix:
    enabled: false #(1)

Scenario 2

Hystrix is still disabled for the Feign client (1). The auto-retry mechanism is disabled for the Ribbon client on the local instance (2) but enabled once on other instances (3). You always receive a full response. If a request hits the delayed instance, it times out after one second and Ribbon then calls the other instance, which is not delayed. You can always set MaxAutoRetries to a positive value, but that gains nothing in this sample, because retrying the delayed instance would only time out again.

ribbon:
  eureka:
    enabled: true
  MaxAutoRetries: 0 #(2)
  MaxAutoRetriesNextServer: 1 #(3)
  ReadTimeout: 1000 #(4)

feign:
  hystrix:
    enabled: false #(1)
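
The difference between Scenario 1 and Scenario 2 can be reproduced with a toy model. The sketch below is not Ribbon's code: RetrySimulation, callWithRetries, and the fixed per-instance delays are hypothetical names and values chosen for illustration. It assumes strict round-robin over two instances, one delayed by 5000 ms and one answering immediately, and it models only the "retry on next server" setting.

```java
import java.util.List;

/** Toy model of round-robin load balancing with "retry on next server". Not Ribbon's implementation. */
final class RetrySimulation {

    private final List<Integer> instanceDelaysMs; // simulated processing time per instance
    private int next = 0;                         // round-robin cursor

    RetrySimulation(List<Integer> instanceDelaysMs) {
        this.instanceDelaysMs = instanceDelaysMs;
    }

    /** Picks the next instance in round-robin order and returns its simulated delay. */
    private int pickDelay() {
        int delay = instanceDelaysMs.get(next);
        next = (next + 1) % instanceDelaysMs.size();
        return delay;
    }

    /**
     * Returns true if some attempt finishes within readTimeoutMs.
     * maxRetriesNextServer plays the role of MaxAutoRetriesNextServer;
     * same-server retries (MaxAutoRetries) are fixed at 0, as in Scenarios 1 and 2.
     */
    boolean callWithRetries(int readTimeoutMs, int maxRetriesNextServer) {
        for (int attempt = 0; attempt <= maxRetriesNextServer; attempt++) {
            if (pickDelay() <= readTimeoutMs) {
                return true; // this instance answered in time
            }
            // Timed out: loop again to try the next server, if attempts remain.
        }
        return false; // every attempt timed out (the HTTP 500 case)
    }

    /** Counts successful calls out of the given number of requests. */
    static int successes(int requests, int readTimeoutMs, int maxRetriesNextServer) {
        // Assumed setup: one delayed instance (5000 ms) and one healthy instance (0 ms).
        RetrySimulation lb = new RetrySimulation(List.of(5000, 0));
        int ok = 0;
        for (int i = 0; i < requests; i++) {
            if (lb.callWithRetries(readTimeoutMs, maxRetriesNextServer)) {
                ok++;
            }
        }
        return ok;
    }

    public static void main(String[] args) {
        System.out.println("Scenario 1: " + successes(100, 1000, 0) + "/100 succeeded");
        System.out.println("Scenario 2: " + successes(100, 1000, 1) + "/100 succeeded");
    }
}
```

In this model, half of the calls fail without retries, and all of them succeed once a single retry on the next server is allowed. The price is that a retried call has already spent the full read timeout on the delayed instance.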

Scenario 3

Here is a not very elegant solution to the problem: we set ReadTimeout to a value greater than the delay inside the API method (5000 ms).

ribbon:
  eureka:
    enabled: true
  MaxAutoRetries: 0
  MaxAutoRetriesNextServer: 0
  ReadTimeout: 10000

feign:
  hystrix:
    enabled: false

Generally, the configurations from Scenarios 2 and 3 work: you always get the full response. But in some cases you will wait more than one second (Scenario 2) or more than five seconds (Scenario 3), and the delayed instance still receives 50% of the requests from the Ribbon client. Fortunately, there is the Hystrix circuit breaker.

Scenario 4

Let’s enable Hystrix just by removing the feign.hystrix.enabled property. (In Spring Cloud releases where Feign's Hystrix support is off by default, set the property to true instead.) There are no auto-retries for the Ribbon client (1), and its read timeout (2) is greater than the Hystrix timeout (3). 1000 ms is also the default value of the Hystrix timeoutInMilliseconds property. The Hystrix circuit breaker and fallback will work for the delayed instance of account service. For the first few requests, you receive a fallback response from Hystrix. Then the delayed instance is cut off from requests, and most of them are directed to the non-delayed instance.

ribbon:
  eureka:
    enabled: true
  MaxAutoRetries: 0 #(1)
  MaxAutoRetriesNextServer: 0
  ReadTimeout: 2000 #(2)

hystrix:
  command:
    default:
      execution:
        isolation:
          thread:
            timeoutInMilliseconds: 1000 #(3)
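
To make the circuit breaker idea more concrete, here is a minimal sketch in plain Java. It is not how Hystrix is implemented: Hystrix decides based on a rolling statistics window and an error percentage, while this toy version simply counts consecutive failures. SimpleBreaker, its constructor parameters, and the injected clock are all hypothetical names used only for illustration.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

/** Minimal circuit breaker sketch: opens after N consecutive failures, allows a trial call after a sleep window. */
final class SimpleBreaker {

    private final int failureThreshold; // consecutive failures before the circuit opens
    private final long sleepWindowMs;   // how long calls are short-circuited once open
    private final LongSupplier clock;   // injected millisecond clock, so the sketch is testable

    private int consecutiveFailures = 0;
    private boolean tripped = false;
    private long openedAt = 0;

    SimpleBreaker(int failureThreshold, long sleepWindowMs, LongSupplier clock) {
        this.failureThreshold = failureThreshold;
        this.sleepWindowMs = sleepWindowMs;
        this.clock = clock;
    }

    /** True while the circuit is tripped and the sleep window has not yet elapsed. */
    boolean isOpen() {
        return tripped && clock.getAsLong() - openedAt < sleepWindowMs;
    }

    /** Runs the primary call unless the circuit is open; returns the fallback on failure or short-circuit. */
    <T> T call(Supplier<T> primary, Supplier<T> fallback) {
        if (isOpen()) {
            return fallback.get(); // short-circuit: the remote service is not called at all
        }
        try {
            T result = primary.get();
            consecutiveFailures = 0; // a success closes the circuit again
            tripped = false;
            return result;
        } catch (RuntimeException e) {
            consecutiveFailures++;
            if (consecutiveFailures >= failureThreshold) {
                tripped = true;      // trip (or re-trip) the circuit
                openedAt = clock.getAsLong();
            }
            return fallback.get();
        }
    }
}
```

With a threshold of 3, the first three failing calls still reach the remote service and return the fallback; after that, calls return the fallback immediately until the sleep window has passed. This mirrors the behavior described above, where the first few requests get a fallback response before the failing path is cut off.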

Scenario 5

This scenario builds on Scenario 4. Now the Ribbon read timeout (2) is lower than the Hystrix timeout (3), and the auto-retry mechanism is enabled both for the local instance (1) and for other instances (4). The result is the same as in Scenarios 2 and 3: you receive a full response. The difference is that Hystrix is enabled and cuts the delayed instance off from future requests.

ribbon:
  eureka:
    enabled: true
  MaxAutoRetries: 3 #(1)
  MaxAutoRetriesNextServer: 1 #(4)
  ReadTimeout: 1000 #(2)

hystrix:
  command:
    default:
      execution:
        isolation:
          thread:
            timeoutInMilliseconds: 10000 #(3)
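
When retries and a Hystrix timeout are combined like this, it is worth checking that Hystrix does not give up before Ribbon has finished retrying. A rough upper bound, assuming every attempt runs into the read timeout and ignoring connection time, is ReadTimeout x (MaxAutoRetries + 1) x (MaxAutoRetriesNextServer + 1). The helper below is a hypothetical back-of-the-envelope calculation (TimeoutBudget is not part of any library):

```java
/** Back-of-the-envelope check that the Hystrix timeout covers all Ribbon attempts. */
final class TimeoutBudget {

    /** Upper bound on total time, assuming every attempt hits the read timeout (connect time ignored). */
    static long worstCaseRibbonMs(long readTimeoutMs, int maxAutoRetries, int maxAutoRetriesNextServer) {
        long attemptsPerServer = maxAutoRetries + 1L;        // first try plus same-server retries
        long serversTried = maxAutoRetriesNextServer + 1L;   // first server plus next servers
        return readTimeoutMs * attemptsPerServer * serversTried;
    }

    /** True if Hystrix waits at least as long as the worst-case Ribbon retry sequence. */
    static boolean hystrixCoversRetries(long hystrixTimeoutMs, long readTimeoutMs,
                                        int maxAutoRetries, int maxAutoRetriesNextServer) {
        return hystrixTimeoutMs >= worstCaseRibbonMs(readTimeoutMs, maxAutoRetries, maxAutoRetriesNextServer);
    }

    public static void main(String[] args) {
        // Scenario 5 values: ReadTimeout 1000, MaxAutoRetries 3, MaxAutoRetriesNextServer 1.
        System.out.println("Scenario 5 worst case: " + worstCaseRibbonMs(1000, 3, 1) + " ms");
        System.out.println("Covered by 10000 ms Hystrix timeout: " + hystrixCoversRetries(10000, 1000, 3, 1));
    }
}
```

For the Scenario 5 values, the bound is 8000 ms, which fits inside the 10000 ms Hystrix timeout. With the Scenario 4 values (ReadTimeout 2000, no retries, Hystrix timeout 1000), the check is false, which is intended there: Hystrix fires first and returns the fallback.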

I could imagine a few other scenarios, but the idea here was just to show how circuit breaker and fallback behavior changes when you modify the configuration properties for Feign, Ribbon, and Hystrix in application.yml.

Hystrix

Let’s take a closer look at the standard Hystrix circuit breaker and its usage as described in Scenario 4. To enable Hystrix in your Spring Boot application, you have to add the following dependencies to pom.xml. The second step is to add the @EnableCircuitBreaker annotation to the main application class, and also @EnableHystrixDashboard if you would like to have the UI dashboard available.

<dependency>
    <groupId>org.springframework.cloud</groupId>
    <artifactId>spring-cloud-starter-hystrix</artifactId>
</dependency>
<dependency>
    <groupId>org.springframework.cloud</groupId>
    <artifactId>spring-cloud-starter-hystrix-dashboard</artifactId>
</dependency>

The Hystrix fallback is set on the Feign client inside customer service.

@FeignClient(value = "account-service", fallback = AccountFallback.class)
public interface AccountClient {

    @RequestMapping(method = RequestMethod.GET, value = "/accounts/customer/{customerId}")
    List<Account> getAccounts(@PathVariable("customerId") Integer customerId);

}

The fallback implementation is really simple. In this case, I just return an empty list instead of the customer’s account list received from account service.

@Component
public class AccountFallback implements AccountClient {

    // Called instead of the remote account-service when the call fails,
    // times out, or is short-circuited by the circuit breaker.
    @Override
    public List<Account> getAccounts(Integer customerId) {
        return new ArrayList<>();
    }

}
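
Conceptually, the Hystrix timeout plus fallback behaves like a call that is abandoned after a deadline and replaced with a default value. The sketch below shows that idea with plain CompletableFuture from the JDK (Java 9 or later). It is only an illustration, not how Hystrix works internally, and TimedCall, callWithFallback, and sleepQuietly are hypothetical names.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

/** Sketch of "timeout, then fallback" using only the JDK. */
final class TimedCall {

    /**
     * Runs the call asynchronously and returns the fallback value if the call
     * throws or does not finish within timeoutMs. The abandoned call is not
     * cancelled; it keeps running in the background and its result is ignored.
     */
    static <T> T callWithFallback(Supplier<T> call, T fallback, long timeoutMs) {
        return CompletableFuture.supplyAsync(call)
                .completeOnTimeout(fallback, timeoutMs, TimeUnit.MILLISECONDS)
                .exceptionally(error -> fallback)
                .join();
    }

    /** Sleeps for the given time, restoring the interrupt flag if interrupted. */
    static void sleepQuietly(long ms) {
        try {
            Thread.sleep(ms);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        // Simulated slow remote call: takes 500 ms, but we only wait 100 ms.
        Supplier<List<String>> slowCall = () -> {
            sleepQuietly(500);
            return List.of("account-1");
        };
        List<String> accounts = callWithFallback(slowCall, List.of(), 100);
        System.out.println("Accounts returned: " + accounts.size());
    }
}
```

As with the empty list returned by AccountFallback, the caller always gets a usable value back, so the customer endpoint can still respond even when the account lookup is too slow.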

Now we can perform some tests. Let’s start the discovery service, two instances of account service on different ports (using the -DPORT VM argument during startup), and customer service. The endpoint for tests is /customers/{id}. The customer-service module also contains a JUnit test class, pl.piomin.microservices.customer.ApiTest, which sends multiple requests to this endpoint.

@RequestMapping("/customers/{id}")
public Customer findById(@PathVariable("id") Integer id) {
    logger.info(String.format("Customer.findById(%s)", id));
    Customer customer = customers.stream().filter(it -> it.getId().intValue()==id.intValue()).findFirst().get();
    List<Account> accounts =  accountClient.getAccounts(id);
    customer.setAccounts(accounts);
    return customer;
}

I enabled the Hystrix Dashboard on the account-service main class. To access it, open http://localhost:2222/hystrix and then enter the Hystrix stream address of customer-service: http://localhost:3333/hystrix.stream. When I ran a test that sends 1,000 requests to customer service, about 20 (2%) of them were forwarded to the delayed instance of account service; the rest went to the non-delayed instance. The Hystrix dashboard during that test is visible below. For more advanced Hystrix configuration, refer to its documentation.

[Image: Hystrix dashboard during the test]


Published at DZone with permission of Piotr Mińkowski, DZone MVB. See the original article here.

