AWS Serverless Lambda Resiliency: Part 2

We continue to address the patterns and considerations for resilience in cloud-native serverless systems with 5 additional scenarios.

by Siddhartha Sood · Divakar R · Apr. 30, 22 · Cloud Zone · Opinion

In this series of articles, we are addressing the patterns for the resilience of cloud-native serverless systems. Please refer to article one for the introduction of this series.

Pattern 3: Lambda Synchronous Invocation and Circuit State Validated by API Gateway and Leveraging Fallback Switchover

In this option, let's consider Lambdas serving synchronous requests through the API Gateway. When there are issues with the external service, the API Gateway starts sending traffic to the fallback service instead.

Circuit is closed:

  • The API Gateway calls the Lambda function.
  • Lambda function calls the external service.
  • The calls to the external service are observed using CloudWatch (based on errors and error rates).

Pattern 3 Circuit Component Diagram: Closed State

Circuit switches to open:

  • Now, let's say, there are issues with the external service.
  • A new request comes in through the API Gateway, which then calls the Lambda function.
  • Lambda function calls the external service.
  • The calls to the external service are observed using CloudWatch.
  • CloudWatch identifies the issues (failure/error/error code) and raises alarm/event.
  • The event source configuration in CloudWatch triggers the Circuit Breaker Lambda.
  • The Circuit Breaker Lambda creates an item in DynamoDB whose Time To Live (TTL) sets how long the circuit stays open. The Lambda then uses the SDK to point the API Gateway at the fallback service.
  • Circuit is now switched to an open state.
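
The Circuit Breaker Lambda's two actions can be sketched roughly as follows. This is a minimal sketch, not the authors' implementation: the table attributes (`circuitId`, `state`, `expiresAt`), the 60-second duration, and all function names are assumptions made for illustration. The `dynamodb` and `apigateway` arguments are expected to behave like the boto3 DynamoDB and API Gateway (REST API) clients and are injected so the logic can be tried without AWS access.

```python
import time

# Hypothetical default: how long the circuit stays open.
OPEN_SECONDS = 60


def build_open_item(circuit_id, now=None, open_seconds=OPEN_SECONDS):
    """Return a DynamoDB item (low-level attribute format) marking the circuit open."""
    now = int(time.time()) if now is None else int(now)
    return {
        "circuitId": {"S": circuit_id},   # hypothetical partition key
        "state": {"S": "OPEN"},           # hypothetical attribute
        # DynamoDB TTL expects a Number attribute holding epoch seconds.
        "expiresAt": {"N": str(now + open_seconds)},
    }


def build_integration_patch(target_uri):
    """Return patch operations that repoint an API Gateway integration URI."""
    return [{"op": "replace", "path": "/uri", "value": target_uri}]


def open_circuit(dynamodb, apigateway, *, table, circuit_id, rest_api_id,
                 resource_id, http_method, stage, fallback_uri, now=None):
    """Record the open state, then switch the gateway to the fallback service."""
    dynamodb.put_item(TableName=table, Item=build_open_item(circuit_id, now))
    apigateway.update_integration(
        restApiId=rest_api_id,
        resourceId=resource_id,
        httpMethod=http_method,
        patchOperations=build_integration_patch(fallback_uri),
    )
    # For a REST API, the change is served only after a new deployment to the stage.
    apigateway.create_deployment(restApiId=rest_api_id, stageName=stage)
```

Closing the circuit would be the same gateway call with the original Lambda's integration URI in place of `fallback_uri`.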

Circuit is open:

  • A new request comes in through the API Gateway.
  • The API gateway invokes the fallback service.

Pattern 3 Circuit Component Diagram: Open State

Circuit switches to closed:

  • The TTL of the item (representing the open state) in DynamoDB expires.
  • The TTL expiry will result in the DynamoDB stream triggering the Circuit Breaker Lambda.
  • The Circuit Breaker Lambda will use the SDK to point the API Gateway back to the original Lambda function.
  • Circuit is now switched to a closed state.
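
To react only to TTL expiry (and not to other deletions), the stream handler can inspect each record. The sketch below assumes the standard DynamoDB Streams record shape, where deletions performed by the TTL process carry a service `userIdentity`. The key name `circuitId` and the function names are hypothetical.

```python
def is_ttl_expiry(record):
    """True if a DynamoDB Streams record is a deletion performed by the TTL process."""
    identity = record.get("userIdentity") or {}
    return (
        record.get("eventName") == "REMOVE"
        and identity.get("type") == "Service"
        and identity.get("principalId") == "dynamodb.amazonaws.com"
    )


def expired_circuits(event):
    """Return the ids of circuits whose state items were expired by TTL."""
    ids = []
    for record in event.get("Records", []):
        if is_ttl_expiry(record):
            keys = record["dynamodb"]["Keys"]
            ids.append(keys["circuitId"]["S"])  # hypothetical key name
    return ids
```

The handler would then close the circuit for each returned id, for example by repointing the API Gateway to the original Lambda function.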

When may this approach be applicable?

  • This approach works for synchronous services.
  • This option does not invoke the original Lambda function when the circuit is open. Hence the cost involved with Lambda invocations is avoided when the circuit is open. However, there will still be cost involved to invoke the fallback service.
  • Calling a fallback service is viable when the external service has issues.
  • The Lambda function is not aware of the circuit breaker.
  • The circuit is either open or closed. There is no half-open state.

Based on the nature of the Lambda service, the canary deployment feature may be suitable so that you can send part of the traffic to the fallback service and the remaining to the original Lambda. The traffic to the original Lambda can be iteratively increased to avoid full load as soon as the external service recovers. This will enable the circuit to achieve a half-open state.

Pattern 4: Lambda Synchronous Invocation and Circuit State Validated by API Gateway and Leveraging Interceptor Function

Circuit is closed:

  • The API Gateway calls the Interceptor Lambda function.
  • Interceptor Lambda function checks the status of the circuit (which is closed) from DynamoDB.
  • As the circuit is closed, the Interceptor Lambda function further calls the Lambda function.
  • Lambda function calls the external service.
  • The calls to the external service are observed using CloudWatch (based on errors and error rates).

Pattern 4 Circuit Component Diagram: Closed State

Circuit switches to open:

  • Now, let's say, there are issues with the external service.
  • A new request comes in through the API Gateway, which then calls the Interceptor Lambda function.
  • Interceptor Lambda function checks the status of the circuit (which is still closed) in DynamoDB.
  • Interceptor Lambda function calls the Lambda function.
  • Lambda function calls the external service.
  • The calls to the external service are observed using CloudWatch.
  • CloudWatch identifies the issues (failure/error/error code) and raises alarm/event.
  • The event source configuration in CloudWatch triggers the Interceptor Lambda function.
  • Interceptor Lambda function creates an item in DynamoDB whose Time To Live (TTL) sets how long the circuit stays open.
  • Circuit is now switched to an open state.

Circuit is open:

  • A new request comes in through the API Gateway, which then calls the Interceptor Lambda function.
  • Interceptor Lambda function checks the status of the circuit (which is open) from DynamoDB.
  • As the circuit is open, the Interceptor Lambda function invokes the fallback service (or alternatively returns an error).

Pattern 4 Circuit Component Diagram: Open State

Circuit switches to half-open:

  • The TTL of the item (representing the open state) in DynamoDB expires.
  • The TTL expiry will result in the DynamoDB stream triggering the Interceptor Lambda function.
  • The Interceptor Lambda function creates an item in DynamoDB that holds an invocation limit on the Lambda function for the half-open period. The item's Time To Live (TTL) sets how long the circuit stays half-open.
  • Circuit is now switched to a half-open state.

Circuit is half-open:

  • A new request comes in through the API Gateway, which then calls the Interceptor Lambda function.
  • Interceptor Lambda function checks the status of the circuit (which is half-open) from DynamoDB.
  • As the circuit is half-open, the Interceptor Lambda function checks whether the invocation limit has been reached. If it has, the Interceptor Lambda function calls the fallback service. If it has not, the Interceptor Lambda function decrements the remaining invocation limit and then calls the functional Lambda function.
  • If CloudWatch identifies any issues (failure/error/error code), it raises the alarm/event, which invokes the Interceptor Lambda function to re-open the circuit.
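
The Interceptor's routing decision across the three states can be sketched as a small pure function. The field names (`state`, `remaining`) and the convention that a missing item means "closed" are assumptions for illustration. With DynamoDB as the store, the decrement would need to be an atomic conditional update, which this in-memory sketch does not show.

```python
def route(circuit):
    """Decide where the Interceptor sends a request.

    `circuit` is None when no state item exists (treated as closed), otherwise
    a dict with hypothetical fields `state` and, when half-open, `remaining`.
    Returns (target, updated_circuit), where target is "primary" or "fallback".
    """
    state = "CLOSED" if circuit is None else circuit["state"]
    if state == "CLOSED":
        return "primary", circuit
    if state == "OPEN":
        return "fallback", circuit
    # HALF_OPEN: let a limited number of requests through to the functional Lambda.
    if circuit["remaining"] <= 0:
        return "fallback", circuit
    updated = dict(circuit, remaining=circuit["remaining"] - 1)
    return "primary", updated
```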

Pattern 4 Circuit Component Diagram: Half-Open State

Circuit switches to closed:

  • Let's say the external service handled the requests as per the service objective.
  • The TTL of the item (representing the half-open state) in DynamoDB expires.
  • The TTL expiry will result in the DynamoDB stream triggering the Interceptor Lambda function.
  • The Interceptor Lambda function can now increase the invocation limit OR switch the circuit to Closed. Let's consider the design here where the circuit is closed.
  • Circuit is now switched to a closed state.
  • The Lambda function starts showing normal behavior expected when the circuit is closed.
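
The TTL-driven transitions (open to half-open, then half-open to closed) can be sketched as a function the stream handler might call with the state of the expired item. The durations, the invocation limit, and the field names are hypothetical, and "closed" is represented here by the absence of an item.

```python
HALF_OPEN_SECONDS = 30   # hypothetical half-open duration
HALF_OPEN_LIMIT = 5      # hypothetical invocation limit while half-open


def next_state_item(expired_state, now):
    """Return the item to write after a state item expires, or None for closed."""
    if expired_state == "OPEN":
        return {
            "state": "HALF_OPEN",
            "expires_at": now + HALF_OPEN_SECONDS,  # becomes the new item's TTL
            "remaining": HALF_OPEN_LIMIT,
        }
    if expired_state == "HALF_OPEN":
        # The half-open window passed without the circuit being re-opened.
        return None
    raise ValueError(f"unexpected expired state: {expired_state!r}")
```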

We can use different approaches to set the open/half-open state time if issues are identified repeatedly. The duration can be based on exponential back-off or random jitter.
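
As one possible way to compute such a duration, the sketch below combines exponential back-off with random jitter (the "equal jitter" variant, so the circuit never reopens with a near-zero duration). The base, cap, and function name are assumptions, not values from this design.

```python
import random


def open_duration(attempt, base=30.0, cap=900.0, rng=random):
    """Seconds to keep the circuit open after the given number of repeated trips.

    The ceiling doubles with each attempt up to `cap`; the result is drawn
    uniformly from the upper half of [0, ceiling].
    """
    ceiling = min(cap, base * (2 ** attempt))
    half = ceiling / 2
    return half + rng.uniform(0, half)
```

With these defaults, the first trip (`attempt=0`) keeps the circuit open for between 15 and 30 seconds, and repeated trips are capped at between 450 and 900 seconds.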

When may this approach be applicable?

  • This approach works for synchronous services.
  • It is acceptable from a business standpoint to function with reduced functionality.
  • There is a fallback function service, which can be an alternate implementation with full or reduced functionality.
  • The advantage here is that the functional Lambda has no knowledge of the circuit breaker and it is handled/abstracted from it by the interceptor Lambda. This would mean that circuit breaker logic does not need to be coded within the functional Lambda.

Pattern 5: Lambda Synchronous Invocation and Circuit State Managed Between Cold Start and Warm Start

Circuit is closed:

  • The API Gateway calls the Lambda function.
  • Lambda function checks the status of the circuit (which is closed) from Global variables.
  • As the circuit is closed, the Lambda function calls the external service.

Pattern 5 Circuit Component Diagram: Closed State

Circuit switches to open:

  • Now, let's say, there are issues with the external service.
  • A new request comes in through the API Gateway, which then calls the Lambda function.
  • Lambda function checks the status of the circuit (which is still closed) in Global variables.
  • As the circuit is closed, the Lambda function calls the external service.
  • The Lambda function encounters issues (failure/error/error code) with the external service. Each failure is recorded in Global variables along with its timestamp. When a configurable number of failures occurs within a configurable time window, the circuit is opened. The Global variables also hold the durations for the open and half-open states, each set as a Time To Live.
  • Circuit is now switched to an open state.
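
The failure counting described above can be sketched with module-level variables, which persist across warm invocations of the same Lambda execution environment. The threshold, window, and open duration below are hypothetical values, and the function names are made up for illustration.

```python
import time
from collections import deque

FAILURE_THRESHOLD = 3   # hypothetical: failures needed to open the circuit
WINDOW_SECONDS = 10     # hypothetical: window in which they must occur
OPEN_SECONDS = 30       # hypothetical: how long the circuit stays open

# Module-level state: survives warm invocations, resets on every cold start.
_failures = deque()
_open_until = 0.0


def record_failure(now=None):
    """Record one failed call; return True if this failure opened the circuit."""
    global _open_until
    now = time.time() if now is None else now
    _failures.append(now)
    # Drop failures that fell out of the time window.
    while _failures and _failures[0] <= now - WINDOW_SECONDS:
        _failures.popleft()
    if len(_failures) >= FAILURE_THRESHOLD:
        _open_until = now + OPEN_SECONDS
        _failures.clear()
        return True
    return False


def is_open(now=None):
    """True while the open-state expiry time has not passed."""
    now = time.time() if now is None else now
    return now < _open_until
```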

Circuit is open:

  • A new request comes in through the API Gateway, which then calls the Lambda function.
  • Lambda function checks the status of the circuit (which is open) from Global variables.
  • As the circuit is open, the Lambda function invokes fallback (or alternatively returns an error).

Pattern 5 Circuit Component Diagram: Open State

Circuit switches to half-open:

  • The TTL of the Global variables (for the open circuit) setting expires.
  • The TTL of the Global variables (for the half-open circuit) is still active.
  • Circuit is now switched to a half-open state.

Circuit is half-open:

  • A new request comes in through the API Gateway, which then calls the Lambda function.
  • Lambda function checks the status of the circuit (which is half-open) from Global variables.
  • The Lambda function checks if the invocation limit is reached as the circuit is half-open. The lambda function calls the fallback (or alternatively returns an error) if the invocation limit is reached. If the invocation limit is not reached, the lambda function decreases the invocation limit and then calls the external service.
  • If the Lambda function identifies any issues (failure/error/error code) it will re-open the circuit.
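
The open, half-open, and closed checks can be derived from two expiry timestamps held in memory, as in the sketch below. The class name, its fields, and the default durations are assumptions. Closed-state failure counting is left out for brevity, so the circuit is opened by calling `trip()` explicitly.

```python
class WarmCircuit:
    """In-memory circuit state for one Lambda execution environment (sketch)."""

    def __init__(self, open_seconds=30, half_open_seconds=30, probe_limit=2):
        self.open_seconds = open_seconds
        self.half_open_seconds = half_open_seconds
        self.probe_limit = probe_limit
        self.open_until = 0
        self.half_open_until = 0
        self.probes_left = 0

    def trip(self, now):
        """Open the circuit; the half-open window starts when the open one ends."""
        self.open_until = now + self.open_seconds
        self.half_open_until = self.open_until + self.half_open_seconds
        self.probes_left = self.probe_limit

    def state(self, now):
        if now < self.open_until:
            return "OPEN"
        if now < self.half_open_until:
            return "HALF_OPEN"
        return "CLOSED"

    def call(self, now, external, fallback):
        """Call `external` if the state allows it, otherwise `fallback`."""
        state = self.state(now)
        if state == "OPEN":
            return fallback()
        if state == "HALF_OPEN":
            if self.probes_left <= 0:
                return fallback()
            self.probes_left -= 1
        try:
            return external()
        except Exception:
            if state == "HALF_OPEN":
                self.trip(now)  # a failed probe re-opens the circuit
            return fallback()
```

A single instance would be created at module level so that it is shared by all warm invocations of the same execution environment.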

Pattern 5 Circuit Component Diagram: Half-Open State

Circuit switches to closed:

  • Let's say the external service handled the requests as per the service objective.
  • The TTL of the item (representing the half-open state) in Global variables expires.
  • Circuit is now switched to a closed state.
  • The Lambda function starts showing normal behavior expected when the circuit is closed.

When may this approach be applicable?

  • This approach works for synchronous services.
  • It is acceptable from a business standpoint to function with reduced functionality.
  • There is a fallback function service, which can be an alternate implementation with full or reduced functionality.
  • The advantage here is that there is no dependency on external circuit breaker state management (through DynamoDB) or requiring an external circuit breaker function. Everything can be managed inside the functional Lambda.
  • The disadvantage is that the circuit state lives only in the memory of a single Lambda execution environment, so the circuit breaker works only across the warm invocations that follow that environment's cold start.
  • Each new cold start would start a new circuit breaker cycle with the assumption that the circuit is closed.

Pattern 6: Lambda Synchronous Invocation and Circuit State Managed Between Cold Start and Warm Start With Externalized Circuit State Management

Circuit State-Managed Between Cold Start and Warm Start with Externalized Circuit State Management

This combines pattern 5 with externalized circuit state management. Here the circuit state is managed both within Lambda Global variables and externally within DynamoDB. This provides additional efficiency, as DynamoDB is read only on Lambda cold starts and not on warm starts.
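
A minimal sketch of this idea is shown below: the state is loaded from the external store once per cold start and then served from a module-level variable. The `load` and `save` callables stand in for DynamoDB reads and writes, and all names here are hypothetical.

```python
# Sentinel that distinguishes "not loaded yet" from a loaded value of None.
_UNSET = object()
_cached_state = _UNSET


def get_state(load):
    """Return the circuit state, reading the external store only on cold start."""
    global _cached_state
    if _cached_state is _UNSET:
        _cached_state = load()  # e.g. a DynamoDB read (assumption)
    return _cached_state


def set_state(state, save):
    """Update the in-memory state and write it through to the external store."""
    global _cached_state
    _cached_state = state
    save(state)  # e.g. a DynamoDB write (assumption)
```

One trade-off of this sketch is that a warm environment does not see state written by other environments until it is recycled or changes the state itself.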

Pattern 7: Lambda Asynchronous Invocation and Circuit State Managed by Invoking Lambda

In this option, let's consider Lambdas serving asynchronous requests through SQS. Here we will adjust the event source mapping (its enabled status, batch size, and batch window) based on the status of the external service. Messages that could not be processed based on the business requirements will be sent to a dead letter queue.

Let's walk through how this option works.

Circuit is closed:

  • The Lambda service reads the messages from the SQS queue (through the event source mapping) and invokes the Lambda function.
  • Lambda function calls the external service.
  • The calls to the external service are observed using CloudWatch (based on errors and error rates).

Pattern 7 Circuit Component Diagram: Closed State

Circuit switches to open:

  • Now, let's say, there are issues with the external service.
  • New messages arrive in the SQS queue, and the Lambda function is invoked with them.
  • Lambda function calls the external service.
  • The calls to the external service are observed using CloudWatch.
  • CloudWatch identifies the issues (failure/error/error code) and raises alarm/event.
  • The event source configuration in CloudWatch triggers the Circuit Breaker Lambda.
  • The Circuit Breaker Lambda function creates an item in DynamoDB whose Time To Live (TTL) sets how long the circuit stays open.
  • The Circuit Breaker Lambda function sets the status of the event source mapping to disabled.
  • Circuit is now switched to an open state.

Circuit is open:

  • As the circuit is open, the Lambda function will not be invoked when there are new messages in SQS.

Pattern 7 Circuit Component Diagram: Open State

Circuit switches to half-open:

  • The TTL of the item (representing the open state) in DynamoDB expires.
  • The TTL expiry will result in the DynamoDB stream triggering the Circuit Breaker Lambda.
  • The Circuit Breaker Lambda function creates an item in DynamoDB whose Time To Live (TTL) sets how long the circuit stays half-open. It then uses the SDK to enable the event source mapping and to update the batch size and batch window for that duration.
  • Circuit is now switched to a half-open state.

Circuit is half-open:

  • New messages arrive in the SQS queue; the Lambda service reads them and invokes the Lambda function.
  • The batch size and batch window are now as per the half-open setting.
  • If CloudWatch identifies any issues (failure/error/error code), it raises the alarm/event, which invokes the Circuit Breaker Lambda to re-open the circuit.
  • SQS moves the messages to the dead letter queue based on a pre-defined setting.


Pattern 7 Circuit Component Diagram: Half-Open State

Circuit switches to closed:

  • Let's say the external service handled the requests as per the service objective.
  • The TTL of the item (representing the half-open state) in DynamoDB expires.
  • The TTL expiry will result in the DynamoDB stream triggering the Circuit Breaker Lambda.
  • The Lambda function can now increase batch size, and batch window values OR switch the circuit to Closed. Let's consider the design here where the circuit is closed.
  • Circuit is now switched to the closed state.
  • The Lambda function starts showing normal behavior expected when the circuit is closed.
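
The event source mapping settings for each circuit state can be kept in one table, as in the sketch below. The keys match the parameter names of the Lambda `UpdateEventSourceMapping` API, but the specific values and the function name are assumptions chosen for illustration.

```python
# Hypothetical settings per circuit state.
MAPPING_SETTINGS = {
    "OPEN": {"Enabled": False},
    # Half-open: trickle messages through one at a time.
    "HALF_OPEN": {"Enabled": True, "BatchSize": 1,
                  "MaximumBatchingWindowInSeconds": 0},
    # Closed: normal throughput.
    "CLOSED": {"Enabled": True, "BatchSize": 10,
               "MaximumBatchingWindowInSeconds": 5},
}


def mapping_update(mapping_uuid, state):
    """Return keyword arguments for updating the event source mapping."""
    return {"UUID": mapping_uuid, **MAPPING_SETTINGS[state]}
```

With boto3, the Circuit Breaker Lambda could pass the result on as `lambda_client.update_event_source_mapping(**mapping_update(uuid, "HALF_OPEN"))`.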

We can use different approaches to set the open/half-open state time if issues are identified repeatedly. The duration can be based on exponential back-off or random jitter.

When may this approach be applicable?

  • This approach works for asynchronous services.
  • It is acceptable from a business standpoint to process messages at a lesser rate.
  • It is feasible to implement event correlation so that each message is processed only once (based on requirement).
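
One simple way to approximate processing each message only once is to skip messages whose identifier has already been handled. The sketch below uses the `messageId` field of SQS records as the correlation key, which is an assumption; a business-level correlation ID may fit better. The in-memory `seen` set stands in for a durable store.

```python
def process_once(messages, handler, seen):
    """Run `handler` on each message whose id is not yet in `seen`.

    Returns the ids processed in this call. A message is marked as seen only
    after its handler succeeds, so a failed message can be retried later.
    """
    processed = []
    for message in messages:
        key = message["messageId"]
        if key in seen:
            continue  # duplicate delivery: skip
        handler(message)
        seen.add(key)
        processed.append(key)
    return processed
```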