Resilient Microservices With Istio Circuit Breaker
In this article, we take a look at how to protect services from an unexpected number of requests or a dependent service outage.
Reliability is key to a microservices architecture. Circuit breakers are a design pattern to create resilient microservices by limiting the impact of service failures and latencies. One of the primary goals of the Circuit Breaker pattern is to handle failures gracefully so that no cascading failures occur. In a microservice landscape, failing fast is critical. Circuit breakers do a great job of protecting the service from a heavy load.
If there are failures in your microservice ecosystem, you need to fail fast by opening the circuit. While the circuit is open, no additional calls are made to the failing service and callers receive an error immediately. The pattern also monitors the system for failures, and once things are back to normal, the circuit is closed to allow normal functionality again.
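The open and closed behaviour described above can be sketched in a few lines of Python. This is a minimal illustration of the pattern in general, not how Istio or Hystrix implement it; the class name, the `failure_threshold` and `reset_timeout` parameters, and the intermediate "half-open" trial state are assumptions made for the example.

```python
import time


class CircuitOpenError(Exception):
    """Raised when a call is rejected because the circuit is open."""


class CircuitBreaker:
    """Illustrative breaker: closed -> open -> half-open -> closed."""

    def __init__(self, failure_threshold=3, reset_timeout=20.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold  # consecutive failures before opening
        self.reset_timeout = reset_timeout          # seconds to stay open before a trial call
        self.clock = clock                          # injectable so the sketch is easy to test
        self.failures = 0
        self.opened_at = None                       # None means the circuit is closed

    @property
    def state(self):
        if self.opened_at is None:
            return "closed"
        if self.clock() - self.opened_at >= self.reset_timeout:
            return "half-open"  # allow one trial call through
        return "open"

    def call(self, func, *args, **kwargs):
        if self.state == "open":
            # Fail fast: the failing service is not called at all.
            raise CircuitOpenError("circuit is open")
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.state == "half-open" or self.failures >= self.failure_threshold:
                self.opened_at = self.clock()  # open (or re-open) the circuit
            raise
        # Success: close the circuit and reset the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```

A caller wraps each remote call as `breaker.call(fetch_orders)`, where `fetch_orders` stands for any hypothetical function that talks to the dependent service, and treats `CircuitOpenError` as the signal to fall back immediately.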
In my earlier blog post, I explained Outlier Detection, an Istio resiliency strategy that detects unusual host behavior and evicts unhealthy hosts from the set of load-balanced healthy hosts inside a cluster. Read more about it here: Istio Circuit Breaker with Outlier Detection.
Hystrix vs. Istio
- The Hystrix library, part of Netflix OSS, has been the leading circuit breaker tooling in the microservices world. Hystrix can be considered as a Whitebox Monitoring tool, whereas Istio can be considered as a Blackbox Monitoring tool, primarily because Istio monitors the system from the outside and does not know how the system works internally. On the other hand, Hystrix libraries are added to each of the individual services to capture the required data.
- You can configure and use advanced resiliency features from Istio without changing the application code. Hystrix implementation requires changing each of your services to include the Hystrix libraries.
- Istio improves the reliability and availability of services in the mesh. However, applications still need to handle the errors and take appropriate fallback actions. For example, when all instances in a load balancing pool have failed, Envoy will return HTTP 503. It is the responsibility of the application to implement any fallback logic that is needed to handle the HTTP 503 error code from an upstream service. Hystrix, on the other hand, provides a fallback mechanism, which is very helpful. A Hystrix fallback can return an error message, a single default value, or a cached value, or it can call another service.
- Envoy is completely transparent to the application. The Hystrix library has to be embedded in each of the service calls.
- Istio can be used as a circuit breaker in a polyglot landscape, whereas Hystrix is focused primarily on Java applications.
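Because Istio leaves fallback handling to the application, the calling code has to react to the HTTP 503 itself. The sketch below shows one way this might look in Python. The function names and the cached default are hypothetical, and the upstream call is replaced by a stand-in that returns a status code and a body so the example runs without a network.

```python
from typing import Callable, Tuple

HTTP_SERVICE_UNAVAILABLE = 503


def call_with_fallback(
    fetch: Callable[[], Tuple[int, str]],
    fallback: Callable[[], str],
) -> str:
    """Return the upstream body, or the fallback value when the answer is a 503."""
    status, body = fetch()
    if status == HTTP_SERVICE_UNAVAILABLE:
        # The proxy could not reach a healthy upstream host, so use the fallback.
        return fallback()
    return body


# Hypothetical stand-ins for a real HTTP call and a local cache lookup.
def fetch_recommendations_unavailable() -> Tuple[int, str]:
    return 503, "no healthy upstream"


def cached_recommendations() -> str:
    return "[]"


print(call_with_fallback(fetch_recommendations_unavailable, cached_recommendations))
# prints "[]"
```

The fallback here returns a cached default, but it could equally return an error message or call another service, which mirrors the options Hystrix offers inside the library.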
Resiliency and Fault Tolerance Capabilities
Istio adds fault tolerance to your applications without any changes to the code. Some resiliency features it supports are:
- Retries and Timeouts.
- Circuit breakers.
- Health checks.
- Outlier Detection.
- Fault injection.
Circuit Breaker Settings
Envoy provides a set of out-of-the-box opt-in failure recovery features that can be taken advantage of by the services in an application. You can place limits on the number of concurrent connections and requests to upstream services so that systems are not overwhelmed with a large number of requests.
- Maximum Connections: The maximum number of connections to a backend. Any excess connection will be pending in a queue. You can modify this number by changing the
- Maximum Pending Requests: The maximum number of pending requests to a backend. Any excess pending requests will be denied. You can modify this number by changing the
- Maximum Requests: The maximum number of requests in a cluster at any given time. You can modify this number by changing the
While creating a DestinationRule, you can specify the circuit breaker fields inside the TrafficPolicy section. Sample code for the DestinationRule is given below:
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: serviceC
spec:
  host: serviceC
  subsets:
  - name: serviceC-v1
    labels:
      version: v1
  - name: serviceC-v2
    labels:
      version: v2
  trafficPolicy:
    connectionPool:
      http:
        http1MaxPendingRequests: 10
        maxRequestsPerConnection: 1
      tcp:
        maxConnections: 1
    outlierDetection:
      baseEjectionTime: 20s
      consecutiveErrors: 1
      interval: 10s
      maxEjectionPercent: 100
The circuit breaker will short circuit any pending requests or connections that exceed the specified threshold. One of the primary goals of the circuit breaker is to fail fast.
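To make the thresholds concrete, the following Python sketch models the admission decision described in the settings above: a request gets a connection if one is available, waits in the pending queue if there is room, and is rejected otherwise. It is a simplified mental model under stated assumptions, not Envoy's actual implementation; `PoolLimits` and `admit` are hypothetical names, and the values mirror `maxConnections: 1` and `http1MaxPendingRequests: 10` from the sample DestinationRule.

```python
from dataclasses import dataclass


@dataclass
class PoolLimits:
    max_connections: int       # mirrors tcp.maxConnections in the sample rule
    max_pending_requests: int  # mirrors http.http1MaxPendingRequests in the sample rule


def admit(active_connections: int, pending_requests: int, limits: PoolLimits) -> str:
    """Decide what happens to one new request under the given limits."""
    if active_connections < limits.max_connections:
        return "connect"  # a connection slot is free
    if pending_requests < limits.max_pending_requests:
        return "queue"    # wait for a connection to free up
    return "reject"       # both limits exceeded: fail fast


limits = PoolLimits(max_connections=1, max_pending_requests=10)
print(admit(0, 0, limits))   # connect
print(admit(1, 3, limits))   # queue
print(admit(1, 10, limits))  # reject
```

With limits this tight, a small burst of concurrent requests is enough to trip the breaker, which makes the sample configuration convenient for experimenting with the behaviour.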
In this article, we looked at how you can protect your services from an unexpected number of requests or a dependent service outage. You can implement throttling logic that rejects incoming requests based on the Circuit Breaker configuration.
Published at DZone with permission of Samir Behara, DZone MVB. See the original article here.