Understanding the Circuit Breaker: A Key Design Pattern for Resilient Systems
The Circuit Breaker Pattern is a key design pattern for building resilient systems by preventing cascading failures and ensuring graceful degradation.
Reliability is critical, especially when services are interconnected and a failure in one component can cascade to other services. The Circuit Breaker Pattern is an important design pattern for building fault-tolerant, resilient systems, particularly in microservices architectures. This article explains the fundamentals of the circuit breaker pattern, its benefits, and how to implement it to protect your systems from failure.
What is the Circuit Breaker Pattern?
The Circuit Breaker Pattern is inspired by the electrical circuit breakers found in homes, which are designed to prevent damage by detecting faults and stopping the flow of electricity when problems occur. In software, the pattern monitors service interactions and prevents continuous calls or retries to a failing service, which could otherwise overload a service that is already struggling. By “breaking” the circuit between services, the pattern lets a system handle failures gracefully and avoid cascading problems.
How Does It Actually Work?
State diagram showing the different states of the circuit breaker pattern
The circuit breaker has three distinct states: Closed, Open, and Half-Open.
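Closed State: Normally, the circuit breaker is “closed,” meaning requests flow as usual between services. (In electrical terms, a closed circuit means the wires are connected and electricity can flow.) The breaker tracks failures in this state and trips to open once the failure threshold is reached.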
Open State: When the circuit breaker is open, it immediately rejects requests to the failing service, preventing further stress on the service and giving it time to recover. During this time, fallback mechanisms can be triggered, such as returning cached data or default responses.
Half-Open State: After a defined timeout, the circuit breaker switches to the half-open state and allows a limited number of trial requests through to determine whether the service has recovered. If those requests succeed, the circuit breaker closes again; otherwise it returns to the open state.
The main idea behind this design pattern is to prevent a failing service from pulling down the entire system and to provide a way for recovery once the service becomes healthy.
Electrical analogy to remember the open and close states
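The three states and the transitions between them can be written down as a small state machine. The following is only a sketch: the names State, Event, and CircuitBreakerStates are made up for illustration and do not come from any library.

```java
// Illustrative sketch; these types are hypothetical, not a library API.
enum State { CLOSED, OPEN, HALF_OPEN }

enum Event {
    FAILURE_THRESHOLD_REACHED, // too many failures while closed
    WAIT_TIMEOUT_ELAPSED,      // the open-state timeout has passed
    TRIAL_CALLS_SUCCEEDED,     // trial requests in half-open succeeded
    TRIAL_CALLS_FAILED         // trial requests in half-open failed
}

final class CircuitBreakerStates {
    private CircuitBreakerStates() {}

    /** Returns the next state, or the current one if the event does not apply to it. */
    static State next(State current, Event event) {
        switch (current) {
            case CLOSED:
                return event == Event.FAILURE_THRESHOLD_REACHED ? State.OPEN : current;
            case OPEN:
                return event == Event.WAIT_TIMEOUT_ELAPSED ? State.HALF_OPEN : current;
            case HALF_OPEN:
                if (event == Event.TRIAL_CALLS_SUCCEEDED) return State.CLOSED;
                if (event == Event.TRIAL_CALLS_FAILED) return State.OPEN;
                return current;
            default:
                throw new IllegalStateException("Unknown state: " + current);
        }
    }
}
```

Note that there is no direct path from Open to Closed in this sketch: the breaker always passes through Half-Open first, which is what makes recovery a tested decision rather than a blind retry.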
Why Use the Circuit Breaker Pattern?
In complex distributed systems, failures are unavoidable. Here are some real reasons why the circuit breaker pattern is essential:
Preventing Cascading Failures: In a microservices architecture, if one service fails and others depend on it, the failure can spread across the entire system. The circuit breaker stops this by isolating the faulty service.
Improving System Stability: By stopping requests to a failing service, the circuit breaker prevents resource exhaustion (threads, connections, and memory tied up in calls that will fail anyway) and lowers the load on dependent services, helping to stabilize the system.
Better UX: Instead of letting requests hang for too long or return unhandled errors, the circuit breaker allows for graceful degradation by serving fallback responses, improving the user experience even during failures.
Automated Recovery: The half-open state allows the system to automatically test the health of a service and recover without manual intervention.
How to Implement the Circuit Breaker Pattern
The implementation of the circuit breaker pattern depends on the specific stack you’re using, but the general approach remains the same. Below is a high-level overview of how to implement it:
- Set Failure Thresholds: Define the conditions under which the circuit breaker should open. This can be based on consecutive failures, error rates, or timeouts.
- Monitor Requests: Continuously track the success or failure of requests to a service. If the failure threshold is reached, trip the circuit breaker.
- Handle Open State: When the circuit breaker is open, reject further requests to the service and trigger fallback mechanisms.
- Implement Half-Open State: After a timeout, let a limited number of requests reach the service to test whether it has recovered. If they succeed, close the circuit breaker; otherwise, reopen it.
- Provide Fallback Mechanisms: During failures, fallback mechanisms can provide default responses, use cached data, or switch to alternate services.
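To make these steps concrete, here is a minimal hand-rolled sketch in plain Java. It assumes the simplest possible policy (open after a number of consecutive failures, allow a single trial call after the wait period), and the class name SimpleCircuitBreaker and its parameters are hypothetical. It is meant to illustrate the steps, not to replace a library.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

// Illustrative sketch only, not a library API.
final class SimpleCircuitBreaker<T> {
    enum State { CLOSED, OPEN, HALF_OPEN }

    private final int failureThreshold;  // consecutive failures that trip the breaker
    private final long openWaitMillis;   // how long to stay open before a trial call
    private final LongSupplier clock;    // injectable time source (milliseconds)
    private State state = State.CLOSED;
    private int consecutiveFailures = 0;
    private long openedAt = 0L;

    SimpleCircuitBreaker(int failureThreshold, long openWaitMillis, LongSupplier clock) {
        this.failureThreshold = failureThreshold;
        this.openWaitMillis = openWaitMillis;
        this.clock = clock;
    }

    synchronized State state() {
        return state;
    }

    // Synchronized for simplicity; this serializes calls, which a real breaker would avoid.
    synchronized T call(Supplier<T> service, Supplier<T> fallback) {
        if (state == State.OPEN) {
            if (clock.getAsLong() - openedAt < openWaitMillis) {
                return fallback.get();       // open: reject without calling the service
            }
            state = State.HALF_OPEN;         // timeout elapsed: allow one trial call
        }
        try {
            T result = service.get();
            consecutiveFailures = 0;         // success: reset the count and close
            state = State.CLOSED;
            return result;
        } catch (RuntimeException e) {
            consecutiveFailures++;
            if (state == State.HALF_OPEN || consecutiveFailures >= failureThreshold) {
                state = State.OPEN;          // threshold reached or trial failed: trip
                openedAt = clock.getAsLong();
            }
            return fallback.get();           // degrade gracefully
        }
    }
}
```

A caller might construct it as new SimpleCircuitBreaker<String>(3, 5000L, System::currentTimeMillis) and wrap each remote call in call(...). The clock is injected so that the timeout behavior can be exercised in tests without sleeping.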
The following example demonstrates how to implement a circuit breaker in Java using the widely adopted Resilience4j library:
Resilience4j is a Java library designed to help you implement resilience patterns such as Circuit Breaker, Rate Limiter, Retry, Bulkhead, and Time Limiter. One of its main advantages is its flexible and straightforward configuration. Configuring these patterns correctly allows developers to fine-tune their systems for fault tolerance, stability, and performance in the face of errors.
import io.github.resilience4j.circuitbreaker.CircuitBreaker;
import io.github.resilience4j.circuitbreaker.CircuitBreakerConfig;
import io.github.resilience4j.circuitbreaker.CircuitBreakerRegistry;
import io.vavr.control.Try; // Vavr; add it as a dependency if your Resilience4j version does not pull it in
import java.time.Duration;
import java.util.function.Supplier;

public class CircuitBreakerExample {

    // Stand-in for the remote call being protected; replace with your own client
    static String callMyService() {
        return "Response from myService";
    }

    public static void main(String[] args) {
        // Create a custom configuration for the Circuit Breaker
        CircuitBreakerConfig config = CircuitBreakerConfig.custom()
            .failureRateThreshold(50)
            .waitDurationInOpenState(Duration.ofSeconds(5))
            .permittedNumberOfCallsInHalfOpenState(5) // replaces ringBufferSizeInHalfOpenState from older releases
            .slidingWindowSize(20)                    // replaces ringBufferSizeInClosedState from older releases
            .build();

        // Create a CircuitBreakerRegistry with a custom global configuration
        CircuitBreakerRegistry registry = CircuitBreakerRegistry.of(config);

        // Get or create a CircuitBreaker from the CircuitBreakerRegistry
        CircuitBreaker circuitBreaker = registry.circuitBreaker("myService");

        // Decorate the service call with the circuit breaker
        Supplier<String> decoratedSupplier = CircuitBreaker
            .decorateSupplier(circuitBreaker, CircuitBreakerExample::callMyService);

        // Execute the decorated supplier and fall back if it throws
        Try<String> result = Try.ofSupplier(decoratedSupplier)
            .recover(throwable -> "Fallback response");

        System.out.println(result.get());
    }
}
In this example, the circuit breaker is configured to open when at least 50% of the recorded requests fail. It stays open for 5 seconds before entering the half-open state, during which it allows 5 requests through to test the service. If enough of those requests succeed to keep the failure rate below the threshold, the circuit breaker closes and normal operation resumes; otherwise it opens again.
Important Configuration Options for Circuit Breaker in Resilience4j
Resilience4j provides a flexible and robust implementation of the Circuit Breaker Pattern, allowing developers to configure various aspects to tailor the behavior to their application’s needs. Correct configuration is crucial to balancing fault tolerance, system stability, and recovery mechanisms. Below are the key configuration options for Resilience4j’s Circuit Breaker:
1. Failure Rate Threshold: The percentage of failed requests that causes the circuit breaker to transition from the Closed state (normal operation) to the Open state (where requests are blocked). It controls when the circuit breaker should stop forwarding requests to a failing service. For example, a threshold of 50% means the circuit breaker opens once at least half of the recorded requests have failed.
CircuitBreakerConfig config = CircuitBreakerConfig.custom()
    .failureRateThreshold(50) // Open the circuit when 50% of requests fail
    .build();
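The arithmetic behind the threshold is simple enough to spell out. The helper below is a hypothetical illustration, not part of Resilience4j; it assumes the breaker opens when the failure rate is equal to or greater than the threshold.

```java
// Hypothetical helper for illustration; not a Resilience4j class.
final class FailureRate {
    private FailureRate() {}

    /** Failure rate as a percentage of all recorded calls. */
    static double percent(int failedCalls, int totalCalls) {
        if (totalCalls <= 0) {
            throw new IllegalArgumentException("totalCalls must be positive");
        }
        return 100.0 * failedCalls / totalCalls;
    }

    /** True when the failure rate is at or above the threshold. */
    static boolean shouldOpen(int failedCalls, int totalCalls, double thresholdPercent) {
        return percent(failedCalls, totalCalls) >= thresholdPercent;
    }
}
```

With a 50% threshold, 10 failures out of 20 calls would trip the breaker, while 9 out of 20 (45%) would not.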
2. Wait Duration in Open State: The time the circuit breaker remains in the Open state before it transitions to the Half-Open state, where it starts allowing a limited number of requests to test if the service has recovered. This prevents retrying failed services immediately, allowing the downstream service time to recover before testing it again.
CircuitBreakerConfig config = CircuitBreakerConfig.custom()
    .waitDurationInOpenState(Duration.ofSeconds(30)) // Wait for 30 seconds before transitioning to Half-Open
    .build();
3. Ring Buffer Size in Closed State: The number of requests that the circuit breaker records while in the Closed state. This acts as a sliding window for error monitoring and helps the circuit breaker determine the failure rate based on recent requests. A larger ring buffer means more data points are considered when deciding whether to open the circuit. This is the name used by older Resilience4j releases; newer releases configure the same thing through the sliding window size (see item 5).
CircuitBreakerConfig config = CircuitBreakerConfig.custom()
    .ringBufferSizeInClosedState(50) // Legacy name: consider the last 50 requests to calculate the failure rate
    .build();
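The idea of recording only the most recent N outcomes can be sketched in a few lines of plain Java. The class CountBasedWindow below is a hypothetical illustration of a fixed-size window, not the data structure Resilience4j actually uses internally.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Illustrative fixed-size window over the most recent call outcomes.
final class CountBasedWindow {
    private final int size;
    private final Deque<Boolean> outcomes = new ArrayDeque<>(); // true = failed call
    private int failures = 0;

    CountBasedWindow(int size) {
        if (size <= 0) {
            throw new IllegalArgumentException("size must be positive");
        }
        this.size = size;
    }

    /** Records one outcome, evicting the oldest one when the window is full. */
    void record(boolean failed) {
        if (outcomes.size() == size) {
            boolean oldestFailed = outcomes.removeFirst();
            if (oldestFailed) {
                failures--;
            }
        }
        outcomes.addLast(failed);
        if (failed) {
            failures++;
        }
    }

    int recordedCalls() {
        return outcomes.size();
    }

    /** Failure rate over the calls currently in the window (0 when empty). */
    double failureRatePercent() {
        return outcomes.isEmpty() ? 0.0 : 100.0 * failures / outcomes.size();
    }
}
```

Because old outcomes fall out of the window as new ones arrive, a burst of failures stops influencing the failure rate once enough newer calls have been recorded.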
4. Ring Buffer Size in Half-Open State: The number of requests permitted in the Half-Open state before deciding whether to close the circuit or revert to the Open state. It determines how many trial requests are evaluated to decide whether the service is stable enough to close the circuit or is still failing. This is the name used by older Resilience4j releases; newer releases use the permitted number of calls in the Half-Open state (see item 7).
CircuitBreakerConfig config = CircuitBreakerConfig.custom()
    .ringBufferSizeInHalfOpenState(5) // Legacy name: test with 5 requests in Half-Open state
    .build();
5. Sliding Window Type and Size: Defines how failure rates are measured: either by a count-based sliding window or a time-based sliding window. This provides flexibility in how failure rates are computed. A count-based window is useful in high-traffic systems, whereas a time-based window works well in low-traffic environments.
CircuitBreakerConfig config = CircuitBreakerConfig.custom()
    .slidingWindowType(CircuitBreakerConfig.SlidingWindowType.COUNT_BASED)
    .slidingWindowSize(100) // Count-based: the last 100 requests (for TIME_BASED, the size is in seconds)
    .build();
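To show how a time-based window differs from a count-based one, here is a rough sketch that keeps individual timestamped outcomes and discards those older than the window. The class TimeBasedWindow is hypothetical, and this is a simplification: as far as I know, Resilience4j aggregates calls into per-second buckets rather than storing each call.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Illustrative time-based window; timestamps are passed in so the logic is easy to test.
final class TimeBasedWindow {
    private static final class Outcome {
        final long timestampMillis;
        final boolean failed;

        Outcome(long timestampMillis, boolean failed) {
            this.timestampMillis = timestampMillis;
            this.failed = failed;
        }
    }

    private final long windowMillis;
    private final Deque<Outcome> outcomes = new ArrayDeque<>();

    TimeBasedWindow(long windowMillis) {
        this.windowMillis = windowMillis;
    }

    void record(long nowMillis, boolean failed) {
        evictExpired(nowMillis);
        outcomes.addLast(new Outcome(nowMillis, failed));
    }

    /** Failure rate over calls recorded within the last windowMillis (0 when none). */
    double failureRatePercent(long nowMillis) {
        evictExpired(nowMillis);
        if (outcomes.isEmpty()) {
            return 0.0;
        }
        int failures = 0;
        for (Outcome o : outcomes) {
            if (o.failed) {
                failures++;
            }
        }
        return 100.0 * failures / outcomes.size();
    }

    // Drops outcomes whose age is at least the window length.
    private void evictExpired(long nowMillis) {
        while (!outcomes.isEmpty()
                && nowMillis - outcomes.peekFirst().timestampMillis >= windowMillis) {
            outcomes.removeFirst();
        }
    }
}
```

The practical difference is what "recent" means: a count-based window forgets a failure after N newer calls, while a time-based window forgets it after a fixed amount of time regardless of traffic.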
6. Minimum Number of Calls: Specifies the minimum number of requests required before the failure rate is evaluated. This prevents the circuit breaker from opening prematurely when there isn’t enough data to calculate a meaningful failure rate, especially during low traffic.
CircuitBreakerConfig config = CircuitBreakerConfig.custom()
    .minimumNumberOfCalls(20) // Require at least 20 calls before evaluating failure rate
    .build();
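The effect of this setting is easiest to see with numbers: three failures out of three calls is a 100% failure rate, but it is too small a sample to act on. The following sketch uses a hypothetical helper, MinimumCallsGate, to show the check.

```java
// Hypothetical helper for illustration; not a Resilience4j class.
final class MinimumCallsGate {
    private MinimumCallsGate() {}

    /** Evaluates the failure rate only once enough calls have been recorded. */
    static boolean shouldOpen(int failedCalls, int recordedCalls,
                              int minimumNumberOfCalls, double thresholdPercent) {
        if (recordedCalls < minimumNumberOfCalls) {
            return false; // not enough data yet, stay closed
        }
        return 100.0 * failedCalls / recordedCalls >= thresholdPercent;
    }
}
```

With a minimum of 20 calls and a 50% threshold, 3 failures out of 3 calls leaves the breaker closed, while 10 failures out of 20 calls opens it.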
7. Permitted Number of Calls in Half-Open State: The number of requests allowed to pass through in the Half-Open state to check whether the service has recovered. A smaller value reaches a decision faster, while a larger value makes it less likely that a single transient error reopens the circuit.
CircuitBreakerConfig config = CircuitBreakerConfig.custom()
    .permittedNumberOfCallsInHalfOpenState(5) // Test recovery with 5 requests
    .build();
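The half-open decision can be sketched as a small counter that waits for the permitted number of trial calls to complete and then compares their failure rate with the threshold. HalfOpenTrial and its Decision enum are hypothetical names, and the sketch assumes the same "at or above the threshold" rule as in the closed state.

```java
// Illustrative sketch of the half-open evaluation; not a Resilience4j class.
final class HalfOpenTrial {
    enum Decision { UNDECIDED, CLOSE, REOPEN }

    private final int permittedCalls;
    private final double failureRateThresholdPercent;
    private int completed = 0;
    private int failed = 0;

    HalfOpenTrial(int permittedCalls, double failureRateThresholdPercent) {
        this.permittedCalls = permittedCalls;
        this.failureRateThresholdPercent = failureRateThresholdPercent;
    }

    /** Records one trial call and returns the decision once all trial calls are in. */
    Decision record(boolean callFailed) {
        completed++;
        if (callFailed) {
            failed++;
        }
        if (completed < permittedCalls) {
            return Decision.UNDECIDED; // keep letting trial calls through
        }
        double failureRate = 100.0 * failed / completed;
        return failureRate >= failureRateThresholdPercent ? Decision.REOPEN : Decision.CLOSE;
    }
}
```

With 5 permitted calls and a 50% threshold, one failed trial call out of five (20%) still closes the circuit, whereas with only 1 permitted call a single failure would reopen it.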
8. Slow Call Duration Threshold: Defines the threshold for a slow call. Calls taking longer than this threshold are considered “slow” and count toward the slow call rate (see item 9).
CircuitBreakerConfig config = CircuitBreakerConfig.custom()
    .slowCallDurationThreshold(Duration.ofSeconds(2)) // Any call over 2 seconds is considered slow
    .build();
9. Slow Call Rate Threshold: The percentage of “slow” calls that will trigger the circuit breaker to open, similar to the failure rate threshold. This detects services that are degrading in performance before they fail outright, allowing systems to respond to performance issues early.
CircuitBreakerConfig config = CircuitBreakerConfig.custom()
    .slowCallRateThreshold(50) // Open the circuit when 50% of calls are slow
    .build();
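The two slow-call settings work together: the duration threshold classifies each call, and the rate threshold decides when too many calls are slow. The helper below, SlowCallRate, is a hypothetical illustration of that calculation and assumes a call is slow only when it takes strictly longer than the threshold.

```java
import java.time.Duration;
import java.util.List;

// Hypothetical helper for illustration; not a Resilience4j class.
final class SlowCallRate {
    private SlowCallRate() {}

    /** Percentage of calls whose duration exceeds the slow-call threshold. */
    static double percent(List<Duration> callDurations, Duration slowThreshold) {
        if (callDurations.isEmpty()) {
            return 0.0;
        }
        long slow = 0;
        for (Duration d : callDurations) {
            if (d.compareTo(slowThreshold) > 0) {
                slow++;
            }
        }
        return 100.0 * slow / callDurations.size();
    }
}
```

For example, with a 2-second threshold, calls taking 0.5 s, 2.5 s, 3 s, and exactly 2 s give a slow call rate of 50%, which would reach a 50% slow call rate threshold even though none of the calls failed.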
10. Automatic Transition from Open to Half-Open: Controls whether the circuit breaker moves from the Open state to the Half-Open state on its own once the wait duration has elapsed. When enabled, the transition happens automatically without any incoming call; when disabled, the breaker moves to Half-Open only when the next call arrives after the wait duration. Enabling it lets the system start testing the service as soon as the wait is over, without manual intervention.
CircuitBreakerConfig config = CircuitBreakerConfig.custom()
    .automaticTransitionFromOpenToHalfOpenEnabled(true) // Enable automatic transition
    .build();
11. Fallback Mechanism: Defines what to return when the circuit breaker is open and requests are blocked. Unlike the options above, the fallback is applied at the call site rather than through CircuitBreakerConfig. It helps prevent cascading failures and improves UX by serving cached data or default responses.
Try<String> result = Try.ofSupplier(
        CircuitBreaker.decorateSupplier(circuitBreaker, service::call)
    ).recover(throwable -> "Fallback response");
Conclusion
The Circuit Breaker Pattern is a vital tool in building resilient, fault-tolerant systems. By preventing cascading failures, improving system stability, and enabling graceful recovery, it plays a crucial role in modern software architecture, especially in microservices environments. Whether you’re building a large-scale enterprise application or a smaller distributed system, the circuit breaker can be a game-changer in maintaining reliable operations under failure conditions.