Enterprise AI
Artificial intelligence (AI) has continued to change the way the world views what is technologically possible. Moving from theoretical to implementable, the emergence of technologies like ChatGPT has allowed users of all backgrounds to leverage the power of AI. Now, companies across the globe are taking a deeper dive into their own AI and machine learning (ML) capabilities; they're measuring the modes of success needed to become truly AI-driven, moving beyond baseline business intelligence goals and expanding into more innovative uses in areas such as security, automation, and performance. In DZone's Enterprise AI Trend Report, we take the pulse of the industry nearly a year after the ChatGPT phenomenon and evaluate where individuals and their organizations stand today. Through the original research that forms our "Key Research Findings" and articles written by technical experts in the DZone Community, readers will find insights on topics like ethical AI, MLOps, generative AI, large language models, and much more.
Java adoption has shifted from version 1.8 to at least Java 17. Concurrently, Spring Boot has advanced from version 2.x to 3.2.2, and the springdoc project has transitioned from the older 'springdoc-openapi-ui' library to 'springdoc-openapi-starter-webmvc-ui'. These updates mean that readers relying on older articles may find themselves years behind in these technologies. The author has updated this article so that readers use the latest versions and don't struggle with outdated information during migration. This is part one of a three-part series. You can check out the other articles below:

- OpenAPI 3 Documentation With Spring Boot
- Doing More With Springdoc OpenAPI
- Extending Swagger and Springdoc Open API

In this tutorial, we are going to try out a Spring Boot OpenAPI 3-enabled REST project and explore some of its capabilities. The springdoc-openapi Java library has quickly become very compelling. We are going to refer to Building a RESTful Web Service and springdoc-openapi v2.5.0.

Prerequisites

- Java 17.x
- Maven 3.x

Steps

Start by creating a Maven JAR project. Below, you will see the pom.xml to use:

XML
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 https://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <parent>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-parent</artifactId>
    <version>3.2.2</version>
    <relativePath></relativePath> <!-- lookup parent from repository -->
  </parent>
  <groupId>com.example</groupId>
  <artifactId>sample</artifactId>
  <version>0.0.1</version>
  <name>sample</name>
  <description>Demo project for Spring Boot with openapi 3 documentation</description>
  <properties>
    <java.version>17</java.version>
  </properties>
  <dependencies>
    <dependency>
      <groupId>org.springframework.boot</groupId>
      <artifactId>spring-boot-starter-web</artifactId>
    </dependency>
    <dependency>
      <groupId>org.springframework.boot</groupId>
      <artifactId>spring-boot-starter-validation</artifactId>
    </dependency>
    <dependency>
      <groupId>org.springdoc</groupId>
      <artifactId>springdoc-openapi-starter-webmvc-ui</artifactId>
      <version>2.5.0</version>
    </dependency>
  </dependencies>
  <build>
    <plugins>
      <plugin>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-maven-plugin</artifactId>
      </plugin>
    </plugins>
  </build>
</project>

Note the "springdoc-openapi-starter-webmvc-ui" dependency. Now, let's create a small Java bean class.
Java
package sample;

import org.hibernate.validator.constraints.CreditCardNumber;

import jakarta.validation.constraints.Email;
import jakarta.validation.constraints.Max;
import jakarta.validation.constraints.Min;
import jakarta.validation.constraints.NotBlank;
import jakarta.validation.constraints.NotNull;
import jakarta.validation.constraints.Pattern;
import jakarta.validation.constraints.Size;

public class Person {

    private long id;
    private String firstName;

    @NotNull
    @NotBlank
    private String lastName;

    @Pattern(regexp = ".+@.+\\..+", message = "Please provide a valid email address")
    private String email;

    @Email()
    private String email1;

    @Min(18)
    @Max(30)
    private int age;

    @CreditCardNumber
    private String creditCardNumber;

    public String getCreditCardNumber() { return creditCardNumber; }
    public void setCreditCardNumber(String creditCardNumber) { this.creditCardNumber = creditCardNumber; }

    public long getId() { return id; }
    public void setId(long id) { this.id = id; }

    public String getEmail1() { return email1; }
    public void setEmail1(String email1) { this.email1 = email1; }

    @Size(min = 2)
    public String getFirstName() { return firstName; }
    public void setFirstName(String firstName) { this.firstName = firstName; }

    public String getLastName() { return lastName; }
    public void setLastName(String lastName) { this.lastName = lastName; }

    public String getEmail() { return email; }
    public void setEmail(String email) { this.email = email; }

    public int getAge() { return age; }
    public void setAge(int age) { this.age = age; }
}

This is an example of a Java bean. Now, let's create a controller.

Java
package sample;

import org.springframework.web.bind.annotation.RequestBody;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RequestMethod;
import org.springframework.web.bind.annotation.RestController;

import io.swagger.v3.oas.annotations.media.Content;
import io.swagger.v3.oas.annotations.media.ExampleObject;
import jakarta.validation.Valid;

@RestController
public class PersonController {

    @RequestMapping(path = "/person", method = RequestMethod.POST)
    @io.swagger.v3.oas.annotations.parameters.RequestBody(required = true,
        content = @Content(examples = {
            @ExampleObject(value = INVALID_REQUEST, name = "invalidRequest", description = "Invalid Request"),
            @ExampleObject(value = VALID_REQUEST, name = "validRequest", description = "Valid Request") }))
    public Person person(@Valid @RequestBody Person person) {
        return person;
    }

    private static final String VALID_REQUEST = """
            {
              "id": 0,
              "firstName": "string",
              "lastName": "string",
              "email": "abc@abc.com",
              "email1": "abc@abc.com",
              "age": 20,
              "creditCardNumber": "4111111111111111"
            }""";

    private static final String INVALID_REQUEST = """
            {
              "id": 0,
              "firstName": "string",
              "lastName": "string",
              "email": "abcabc.com",
              "email1": "abcabc.com",
              "age": 17,
              "creditCardNumber": "411111111111111"
            }""";
}

Above is a sample REST controller. Side note: Normally I don't like to clutter already annotation-heavy code with additional annotations, but I do think having ready-made examples like these can be useful. Another reason I did this is that the default examples now generated by Swagger UI appear to produce some confusing text when @Pattern is used. It appears to be a Swagger UI issue and not a springdoc issue. Let's make some entries in src\main\resources\application.properties.
Properties files
application-description=@project.description@
application-version=@project.version@
logging.level.org.springframework.boot.autoconfigure=ERROR
# server.error.include-binding-errors is now needed if we
# want to display the errors as shown in this article;
# this can also be avoided in other ways, as we will see
# in later articles
server.error.include-binding-errors=always

The above entries will pass Maven build-related information on to the OpenAPI documentation and also include the new server.error.include-binding-errors property. Finally, let's write the Spring Boot application class:

Java
package sample;

import org.springframework.beans.factory.annotation.Value;
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.context.annotation.Bean;

import io.swagger.v3.oas.models.OpenAPI;
import io.swagger.v3.oas.models.info.Info;
import io.swagger.v3.oas.models.info.License;

@SpringBootApplication
public class SampleApplication {

    public static void main(String[] args) {
        SpringApplication.run(SampleApplication.class, args);
    }

    @Bean
    public OpenAPI customOpenAPI(@Value("${application-description}") String appDescription,
                                 @Value("${application-version}") String appVersion) {
        return new OpenAPI()
                .info(new Info()
                        .title("sample application API")
                        .version(appVersion)
                        .description(appDescription)
                        .termsOfService("http://swagger.io/terms/")
                        .license(new License().name("Apache 2.0").url("http://springdoc.org")));
    }
}

Also, note how the API version and description are leveraged from application.properties. At this stage, this is what the project looks like in Eclipse. Next, execute mvn clean package from the command prompt or terminal. Then, execute java -jar target\sample-0.0.1.jar. You can also launch the application by running the SampleApplication class from your IDE. Now, let's visit the Swagger UI at http://localhost:8080/swagger-ui.html. Click the green POST button and expand the > symbol to the right of Person under Schemas. Expanding the schema section shows the nice part: the contract is automatically detailed by leveraging the JSR-303 annotations on the model. Many of the important annotations are covered and documented out of the box. However, I did not see out-of-the-box support for @jakarta.validation.constraints.Email and @org.hibernate.validator.constraints.CreditCardNumber at this point: they are not documented in the generated Swagger specs, although the constraints themselves are functional. We will discuss this more in the subsequent article. For completeness, let's post a request. Press the Try it out button and feed the valid input below into the Request body section (we can also select "validRequest" from the Examples dropdown):

JSON
{
  "id": 0,
  "firstName": "string",
  "lastName": "string",
  "email": "abc@abc.com",
  "email1": "abc@abc.com",
  "age": 20,
  "creditCardNumber": "4111111111111111"
}

Upon pressing the blue Execute button, we see the successful response. This was only a brief introduction to the capabilities of the dependency:

XML
<dependency>
  <groupId>org.springdoc</groupId>
  <artifactId>springdoc-openapi-starter-webmvc-ui</artifactId>
  <version>2.5.0</version>
</dependency>
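Before moving on to troubleshooting, it is worth seeing what the server.error.include-binding-errors=always property buys us. Posting the "invalidRequest" example from the controller returns a 400 Bad Request whose body includes the binding errors. Abridged, and as a rough sketch only (the exact fields and messages vary across Spring Boot versions), the response looks something like this:

JSON
{
  "timestamp": "2024-04-01T10:15:30.000+00:00",
  "status": 400,
  "error": "Bad Request",
  "errors": [
    {
      "field": "email",
      "rejectedValue": "abcabc.com",
      "defaultMessage": "Please provide a valid email address"
    },
    {
      "field": "age",
      "rejectedValue": 17,
      "defaultMessage": "must be greater than or equal to 18"
    }
  ],
  "path": "/person"
}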
Troubleshooting Tips

- Ensure prerequisites.
- If using the Eclipse IDE, we might need to do a Maven update on the project after creating all the files.
- In the Swagger UI, if you are unable to access the "Schema" definitions link, it might be because you need to come out of the "try it out" mode. Click the one or two Cancel buttons that might be visible.

Source code: Git clone URL (branch: springdoc-openapi-intro-update1).
As we delve into the dynamic world of Kubernetes, understanding its core components and functionalities becomes pivotal for anyone looking to make a mark in the cloud computing and containerization arena. Among these components, static pods hold a unique place, often overshadowed by more commonly discussed resources like deployments and services. In this comprehensive guide, we will unveil the power of static pods, elucidating their utility, operational principles, and how they can be an asset in your Kubernetes arsenal.

Understanding Static Pods

Static pods are Kubernetes pods that are managed directly by the kubelet daemon on a specific node, without the API server observing them. Unlike other pods that are controlled by the Kubernetes API server, static pods are defined by placing their configuration files directly on a node's filesystem, which the kubelet periodically scans, ensuring that the pods defined in these configurations are running.

Why Use Static Pods?

Static pods serve several critical functions in a Kubernetes environment:

- Cluster bootstrapping: They are essential for bootstrapping a Kubernetes cluster before the API server is up and running. Since they do not depend on the API server, they can be used to deploy the control plane components as static pods.
- Node-level system pods: Static pods are ideal for running node-level system components, ensuring that these essential services remain running even if the Kubernetes API server is unreachable.
- Simplicity and reliability: For simpler deployments or edge environments where high availability is not a primary concern, static pods offer a straightforward and reliable deployment option.

Creating Your First Static Pod

Let's walk through the process of creating a static pod. You'll need access to a Kubernetes node to follow along.

1. Access your Kubernetes node. First, SSH into your Kubernetes node:

ssh your_username@your_kubernetes_node

2. Create a pod definition file. Create a simple pod definition file. Let's deploy an Nginx static pod as an example. Save the following configuration in /etc/kubernetes/manifests/nginx-static-pod.yaml:

apiVersion: v1
kind: Pod
metadata:
  name: nginx-static-pod
  labels:
    role: myrole
spec:
  containers:
    - name: nginx
      image: nginx
      ports:
        - containerPort: 80

3. Configure the kubelet to use this directory. Ensure the kubelet is configured to monitor the /etc/kubernetes/manifests directory for pod manifests. This is typically set by the --pod-manifest-path kubelet command-line option (a config-file alternative is sketched after this section).

4. Verify the pod is running. After a few moments, use the docker ps command (or crictl ps if you're using CRI-O or containerd) to check that the Nginx container is running:

docker ps | grep nginx

Or, if your cluster allows it, you can check from the Kubernetes API server with:

kubectl get pods --all-namespaces | grep nginx-static-pod

Note that while you can see the static pod through the API server, you cannot manage it (delete, scale, etc.) through the API server.

Advantages of Static Pods

- Simplicity: Static pods are straightforward to set up and manage on a node-by-node basis.
- Self-sufficiency: They can operate independently of the Kubernetes API server, making them resilient in scenarios where the API server is unavailable.
- Control plane bootstrapping: Static pods are instrumental in the initial setup of a Kubernetes cluster, particularly for deploying control plane components.
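A quick aside on step 3 above: on clusters where the kubelet is driven by a configuration file rather than command-line flags, the equivalent of --pod-manifest-path is the staticPodPath field. A minimal sketch, assuming the commonly used config file location (the actual path varies by distribution):

YAML
# /var/lib/kubelet/config.yaml (location varies by installation)
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
staticPodPath: /etc/kubernetes/manifests

After editing the kubelet configuration, the kubelet typically needs to be restarted (for example, via systemctl restart kubelet on systemd-based nodes) for the change to take effect.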
Considerations and Best Practices

While static pods offer simplicity and independence from the Kubernetes API server, they also come with considerations that should not be overlooked:

- Cluster management: Static pods are not managed by the API server, which means they do not benefit from orchestration features like scaling, lifecycle management, and health checks.
- Deployment strategy: They are best used for node-specific tasks or cluster bootstrapping, rather than general application deployment.
- Monitoring and logging: Ensure that your node-level monitoring and logging tools are configured to include static pods.

Conclusion

Static pods, despite their simplicity, play a critical role in the Kubernetes ecosystem. They offer a reliable method for running system-level services directly on nodes, independent of the cluster's control plane. By understanding how to deploy and manage static pods, you can ensure your Kubernetes clusters are more robust and resilient. Whether you're bootstrapping a new cluster or managing node-specific services, static pods are a tool worth mastering. This beginner's guide aims to demystify static pods and highlight their importance within Kubernetes architectures. As you advance in your Kubernetes journey, remember that the power of Kubernetes lies in its flexibility and the diversity of options it offers for running containerized applications. Static pods are just one piece of the puzzle, offering a unique blend of simplicity and reliability for specific use cases. I encourage you to explore static pods further, experiment with deploying different applications as static pods, and integrate them into your Kubernetes strategy where appropriate. Happy Kubernetes-ing!
Readers of my publications are likely familiar with the idea of employing an API First approach to developing microservices. Countless times I have realized the benefits of describing the anticipated URIs and underlying object models before any development begins. In my 30+ years of navigating technology, however, I’ve come to expect the realities of alternate flows. In other words, I fully expect there to be situations where API First is just not possible. For this article, I wanted to walk through an example of how teams producing microservices can still be successful at providing an OpenAPI specification for others to consume without manually defining an openapi.json file. I also wanted to step outside my comfort zone and do this without using Java, .NET, or even JavaScript. Discovering FastAPI At the conclusion of most of my articles I often mention my personal mission statement: “Focus your time on delivering features/functionality that extends the value of your intellectual property. Leverage frameworks, products, and services for everything else.” – J. Vester My point in this mission statement is to make myself accountable for making the best use of my time when trying to reach goals and objectives set at a higher level. Basically, if our focus is to sell more widgets, my time should be spent finding ways to make that possible – steering clear of challenges that have already been solved by existing frameworks, products, or services. I picked Python as the programming language for my new microservice. To date, 99% of the Python code I’ve written for my prior articles has been the result of either Stack Overflow Driven Development (SODD) or ChatGPT-driven answers. Clearly, Python falls outside my comfort zone. Now that I’ve level-set where things stand, I wanted to create a new Python-based RESTful microservice that adheres to my personal mission statement with minimal experience in the source language. That’s when I found FastAPI. FastAPI has been around since 2018 and is a framework focused on delivering RESTful APIs using Python-type hints. The best part about FastAPI is the ability to automatically generate OpenAPI 3 specifications without any additional effort from the developer’s perspective. The Article API Use Case For this article, the idea of an Article API came to mind, providing a RESTful API that allows consumers to retrieve a list of my recently published articles. 
To keep things simple, let's assume a given Article contains the following properties:

- id: Simple, unique identifier property (number)
- title: The title of the article (string)
- url: The full URL to the article (string)
- year: The year the article was published (number)

The Article API will include the following URIs:

- GET /articles: Will retrieve a list of articles
- GET /articles/{article_id}: Will retrieve a single article by the id property
- POST /articles: Adds a new article

FastAPI in Action

In my terminal, I created a new Python project called fast-api-demo and then executed the following commands:

Shell
$ pip install --upgrade pip
$ pip install fastapi
$ pip install uvicorn

I created a new Python file called api.py and added some imports, plus established an app variable:

Python
from fastapi import FastAPI, HTTPException
from pydantic import BaseModel

app = FastAPI()

if __name__ == "__main__":
    import uvicorn
    uvicorn.run(app, host="localhost", port=8000)

Next, I defined an Article object to match the Article API use case:

Python
class Article(BaseModel):
    id: int
    title: str
    url: str
    year: int

With the model established, I needed to add the URIs … which turned out to be quite easy:

Python
# Route to add a new article
@app.post("/articles")
def create_article(article: Article):
    articles.append(article)
    return article

# Route to get all articles
@app.get("/articles")
def get_articles():
    return articles

# Route to get a specific article by ID
@app.get("/articles/{article_id}")
def get_article(article_id: int):
    for article in articles:
        if article.id == article_id:
            return article
    raise HTTPException(status_code=404, detail="Article not found")

To save myself from involving an external data store, I decided to add some of my recently published articles programmatically:

Python
articles = [
    Article(id=1, title="Distributed Cloud Architecture for Resilient Systems: Rethink Your Approach To Resilient Cloud Services",
            url="https://dzone.com/articles/distributed-cloud-architecture-for-resilient-syste", year=2023),
    Article(id=2, title="Using Unblocked to Fix a Service That Nobody Owns",
            url="https://dzone.com/articles/using-unblocked-to-fix-a-service-that-nobody-owns", year=2023),
    Article(id=3, title="Exploring the Horizon of Microservices With KubeMQ's New Control Center",
            url="https://dzone.com/articles/exploring-the-horizon-of-microservices-with-kubemq", year=2024),
    Article(id=4, title="Build a Digital Collectibles Portal Using Flow and Cadence (Part 1)",
            url="https://dzone.com/articles/build-a-digital-collectibles-portal-using-flow-and-1", year=2024),
    Article(id=5, title="Build a Flow Collectibles Portal Using Cadence (Part 2)",
            url="https://dzone.com/articles/build-a-flow-collectibles-portal-using-cadence-par-1", year=2024),
    Article(id=6, title="Eliminate Human-Based Actions With Automated Deployments: Improving Commit-to-Deploy Ratios Along the Way",
            url="https://dzone.com/articles/eliminate-human-based-actions-with-automated-deplo", year=2024),
    Article(id=7, title="Vector Tutorial: Conducting Similarity Search in Enterprise Data",
            url="https://dzone.com/articles/using-pgvector-to-locate-similarities-in-enterpris", year=2024),
    Article(id=8, title="DevSecOps: It's Time To Pay for Your Demand, Not Ingestion",
            url="https://dzone.com/articles/devsecops-its-time-to-pay-for-your-demand", year=2024),
]

Believe it or not, that completes the development for the Article API microservice.
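Before deploying anywhere, it is also easy to sanity-check the endpoints without starting a server. The sketch below is not part of the original article; it assumes the code above lives in api.py and uses FastAPI's bundled test client (which may require the httpx package to be installed):

Python
# test_api.py: a minimal smoke test for the Article API (hypothetical file name)
from fastapi.testclient import TestClient

from api import app  # assumes the article's code is saved as api.py

client = TestClient(app)

# the list endpoint should return all eight seeded articles
assert len(client.get("/articles").json()) == 8

# fetching a known id returns 200 and the matching record
response = client.get("/articles/1")
assert response.status_code == 200
assert response.json()["id"] == 1

# an unknown id surfaces the HTTPException as a 404
assert client.get("/articles/999").status_code == 404

print("All smoke tests passed")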
For a quick sanity check, I spun up my API service locally:

Shell
$ python api.py
INFO: Started server process [320774]
INFO: Waiting for application startup.
INFO: Application startup complete.
INFO: Uvicorn running on http://localhost:8000 (Press CTRL+C to quit)

Then, in another terminal window, I sent a curl request (and piped it to json_pp):

Shell
$ curl localhost:8000/articles/1 | json_pp
{
  "id": 1,
  "title": "Distributed Cloud Architecture for Resilient Systems: Rethink Your Approach To Resilient Cloud Services",
  "url": "https://dzone.com/articles/distributed-cloud-architecture-for-resilient-syste",
  "year": 2023
}

Preparing To Deploy

Rather than just run the Article API locally, I thought I would see how easily I could deploy the microservice. Since I had never deployed a Python microservice to Heroku before, I felt like now would be a great time to try. Before diving into Heroku, I needed to create a requirements.txt file to describe the dependencies for the service. To do this, I installed and executed pipreqs:

Shell
$ pip install pipreqs
$ pipreqs

This created a requirements.txt file for me, with the following information:

Plain Text
fastapi==0.110.1
pydantic==2.6.4
uvicorn==0.29.0

I also needed a file called Procfile, which tells Heroku how to spin up my microservice with uvicorn. Its contents looked like this:

Shell
web: uvicorn api:app --host=0.0.0.0 --port=${PORT}

Let's Deploy to Heroku

For those of you who are new to Python (as I am), I used the Getting Started on Heroku with Python documentation as a helpful guide. Since I already had the Heroku CLI installed, I just needed to log in to the Heroku ecosystem from my terminal:

Shell
$ heroku login

I made sure to check in all of my updates to my repository on GitLab. Next, the creation of a new app in Heroku can be accomplished using the CLI via the following command:

Shell
$ heroku create

The CLI responded with a unique app name, along with the URL for the app and the git-based repository associated with the app:

Shell
Creating app... done, powerful-bayou-23686
https://powerful-bayou-23686-2d5be7cf118b.herokuapp.com/ | https://git.heroku.com/powerful-bayou-23686.git

Please note: by the time you read this article, my app will no longer be online. Check this out.
When I issue a git remote command, I can see that a remote was automatically added to the Heroku ecosystem: Shell $ git remote heroku origin To deploy the fast-api-demo app to Heroku, all I have to do is use the following command: Shell $ git push heroku main With everything set, I was able to validate that my new Python-based service is up and running in the Heroku dashboard: With the service running, it is possible to retrieve the Article with id = 1 from the Article API by issuing the following curl command: Shell $ curl --location 'https://powerful-bayou-23686-2d5be7cf118b.herokuapp.com/articles/1' The curl command returns a 200 OK response and the following JSON payload: JSON { "id": 1, "title": "Distributed Cloud Architecture for Resilient Systems: Rethink Your Approach To Resilient Cloud Services", "url": "https://dzone.com/articles/distributed-cloud-architecture-for-resilient-syste", "year": 2023 } Delivering OpenAPI 3 Specifications Automatically Leveraging FastAPI’s built-in OpenAPI functionality allows consumers to receive a fully functional v3 specification by navigating to the automatically generated /docs URI: Shell https://powerful-bayou-23686-2d5be7cf118b.herokuapp.com/docs Calling this URL returns the Article API microservice using the widely adopted Swagger UI: For those looking for an openapi.json file to generate clients to consume the Article API, the /openapi.json URI can be used: Shell https://powerful-bayou-23686-2d5be7cf118b.herokuapp.com/openapi.json For my example, the JSON-based OpenAPI v3 specification appears as shown below: JSON { "openapi": "3.1.0", "info": { "title": "FastAPI", "version": "0.1.0" }, "paths": { "/articles": { "get": { "summary": "Get Articles", "operationId": "get_articles_articles_get", "responses": { "200": { "description": "Successful Response", "content": { "application/json": { "schema": { } } } } } }, "post": { "summary": "Create Article", "operationId": "create_article_articles_post", "requestBody": { "content": { "application/json": { "schema": { "$ref": "#/components/schemas/Article" } } }, "required": true }, "responses": { "200": { "description": "Successful Response", "content": { "application/json": { "schema": { } } } }, "422": { "description": "Validation Error", "content": { "application/json": { "schema": { "$ref": "#/components/schemas/HTTPValidationError" } } } } } } }, "/articles/{article_id}": { "get": { "summary": "Get Article", "operationId": "get_article_articles__article_id__get", "parameters": [ { "name": "article_id", "in": "path", "required": true, "schema": { "type": "integer", "title": "Article Id" } } ], "responses": { "200": { "description": "Successful Response", "content": { "application/json": { "schema": { } } } }, "422": { "description": "Validation Error", "content": { "application/json": { "schema": { "$ref": "#/components/schemas/HTTPValidationError" } } } } } } } }, "components": { "schemas": { "Article": { "properties": { "id": { "type": "integer", "title": "Id" }, "title": { "type": "string", "title": "Title" }, "url": { "type": "string", "title": "Url" }, "year": { "type": "integer", "title": "Year" } }, "type": "object", "required": [ "id", "title", "url", "year" ], "title": "Article" }, "HTTPValidationError": { "properties": { "detail": { "items": { "$ref": "#/components/schemas/ValidationError" }, "type": "array", "title": "Detail" } }, "type": "object", "title": "HTTPValidationError" }, "ValidationError": { "properties": { "loc": { "items": { "anyOf": [ { "type": "string" }, { "type": "integer" 
} ] }, "type": "array", "title": "Location" }, "msg": { "type": "string", "title": "Message" }, "type": { "type": "string", "title": "Error Type" } }, "type": "object", "required": [ "loc", "msg", "type" ], "title": "ValidationError" } } } } As a result, the following specification can be used to generate clients in a number of different languages via OpenAPI Generator. Conclusion At the start of this article, I was ready to go to battle and face anyone not interested in using an API First approach. What I learned from this exercise is that a product like FastAPI can help define and produce a working RESTful microservice quickly while also including a fully consumable OpenAPI v3 specification…automatically. Turns out, FastAPI allows teams to stay focused on their goals and objectives by leveraging a framework that yields a standardized contract for others to rely on. As a result, another path has emerged to adhere to my personal mission statement. Along the way, I used Heroku for the first time to deploy a Python-based service. This turned out to require little effort on my part, other than reviewing some well-written documentation. So another mission statement bonus needs to be mentioned for the Heroku platform as well. If you are interested in the source code for this article you can find it on GitLab. Have a really great day!
Bedrock is the new Amazon service that democratizes users' access to the most up-to-date Foundation Models (FMs) made available by some of the highest-ranked AI actors. The list is quite impressive, and it includes but isn't limited to:

- Titan
- Claude
- Mistral AI
- Llama2
- ...

Depending on your AWS region, some of these FMs might not be available. For example, as of this post, in my region, which is eu-west-3, the only available FMs are Titan and Mistral AI, but things are changing very fast. So, what's the point of using a service which, apparently, does nothing more than give you access to other FMs? Well, the added value of Amazon Bedrock is to expose all these FMs via APIs, giving you the opportunity to easily integrate generative AI into your applications through ubiquitous techniques like serverless or REST. This is what this post tries to demonstrate. So, let's go!

A Generative AI Gateway

The project chosen to illustrate this post is a Generative AI Gateway, where the user is given access to a certain number of FMs, each one specialized in a different type of use case: for example, text generation, conversational interfaces, text summarization, image generation, etc. The diagram below shows the general architecture of the sample application.

The sample application architecture diagram

As you can see, the sample application consists of the following components:

- A web front-end that allows the user to select an FM, to configure its parameters (like the temperature, the max tokens, etc.), and to start the dialog with it, for example by asking questions. Our application being a Quarkus one, we are using here the quarkus-primefaces extension.
- An AWS REST Gateway that aims at exposing dedicated endpoints, depending on the chosen FM. Here we're using the quarkus-amazon-lambda-rest extension which, as you'll see soon, is able to automatically generate the SAM (Serverless Application Model) template required to deploy the REST Gateway to AWS.
- Several REST endpoints processing POST requests and invoking the chosen FM via a Bedrock client. The FM responses are brought back to our web application through the REST Gateway.

Let's look now in greater detail at the implementation.

The REST Gateway

The module bedrock-gateway-api of our Maven multi-module project implements this component. It consists of a Quarkus RESTEasy API exposing several endpoints which process POST requests, taking the user interaction as input parameters and returning the FM responses. The input parameters are strings and, in cases where the user requests result in a really large amount of text, input files. The endpoints process these POST requests by converting the associated input into an FM-specific syntax, including the following parameters:

- The temperature: a real number between 0 and 1 which influences the FM's predictability. A lower value produces more predictable output, while a higher one generates a more random response.
- The top P: a real number between 0 and 1 whose value selects the most likely tokens in a distribution. A lower value results in a more limited number of choices for the response.
- The max tokens: an integer value representing the maximum number of tokens that the FM will generate for any given request.

The Bedrock documentation is at your disposal for all the required details concerning the parameters above.
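To make these parameters concrete before diving into the Java implementation, here is a minimal Python sketch (not taken from the project) that invokes a Titan text model through the same Bedrock runtime API, showing where temperature, top P, and max tokens land in the request payload. The model ID and payload shape follow Amazon's documented Titan text format; valid AWS credentials and model access in your region are assumed:

Python
import json

import boto3

# Bedrock runtime client; the region must be one where the model is enabled
client = boto3.client("bedrock-runtime", region_name="eu-west-3")

# Titan text models expect the tuning parameters inside textGenerationConfig
body = json.dumps({
    "inputText": "Explain Amazon Bedrock in one sentence.",
    "textGenerationConfig": {
        "temperature": 0.5,    # predictability vs. randomness
        "topP": 0.9,           # cutoff for the most likely tokens
        "maxTokenCount": 200,  # upper bound on generated tokens
    },
})

response = client.invoke_model(
    modelId="amazon.titan-text-express-v1",
    contentType="application/json",
    accept="application/json",
    body=body,
)

result = json.loads(response["body"].read())
print(result["results"][0]["outputText"])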
The Bedrock client used to interact with the FM service is instantiated as shown below:

Java
private final BedrockRuntimeAsyncClient client =
    BedrockRuntimeAsyncClient.builder().region(Region.EU_WEST_3).build();

This requires the following Maven artifact:

XML
<dependency>
  <groupId>software.amazon.awssdk</groupId>
  <artifactId>bedrockruntime</artifactId>
</dependency>

There is a synchronous and an asynchronous Bedrock client and, given the relative latency generally associated with an FM invocation, we have chosen the second one.

The Web Front-End

The web front-end is a simple Jakarta Faces application implemented using the PrimeFaces library, as well as the Facelets notation to define the layouts. If this architecture choice surprises readers more accustomed to JavaScript/TypeScript-based front-ends, then please have a look at this article. The only special thing to notice is the way it uses the MicroProfile JAX-RS Client implementation by Quarkus to call the AWS REST Gateway.

Java
@RegisterRestClient
@Path("/bedrock")
@Produces(MediaType.TEXT_PLAIN)
@Consumes(MediaType.APPLICATION_JSON)
public interface BedrockAiEndpoint {

    @POST
    @Path("mistral2")
    Response callMistralFm(BedrockAiInputParam bedrockAiInputParam);

    @POST
    @Path("titan2")
    Response callTitanFm(BedrockAiInputParam bedrockAiInputParam);
}

This interface is all that's required; Quarkus will generate the associated client implementation class from it.

Running the Sample Application

The application can be run in two ways:

- Executing the AWS REST Gateway and the associated AWS Lambda endpoints locally
- Executing the AWS REST Gateway and the associated AWS Lambda endpoints in the cloud

Running Locally

The shell script named run-local.sh runs the AWS REST Gateway locally, together with the associated AWS Lambda endpoints. Here is the code:

Shell
#!/bin/bash
mvn -Durl=http://localhost:3000 clean install
sed -i 's/java11/java17/g' bedrock-gateway-api/target/sam.jvm.yaml
sam local start-api -t ./bedrock-gateway-api/target/sam.jvm.yaml --log-file ./bedrock-gateway-api/sam.log &
mvn -DskipTests=false failsafe:integration-test
docker run --name bedrock -p 8082:8082 --rm --network host nicolasduminil/bedrock-gateway-web:1.0-SNAPSHOT
./cleanup-local.sh

The first thing we need to do here is build the application by running the Maven command. This will result, among other things, in a Docker image named nicolasduminil/bedrock-gateway-web, which is dedicated to running the web front-end. It will also result in the generation by Quarkus of the SAM template (target/sam.jvm.yaml) that creates the AWS CloudFormation stack containing the AWS REST Gateway, together with the endpoints' AWS Lambda functions. For some reason, the Quarkus quarkus-amazon-lambda-rest extension used for this purpose configures the runtime as Java 11 and, even after having contacted the support, I didn't find any way to change that. Accordingly, the sed command is used in the script to modify the runtime to Java 17. Then, the SAM CLI is used to run the start-api command, which executes the gateway locally with the required endpoints. Next, we are in a position to run the integration tests, on behalf of the Maven failsafe plugin. We couldn't do it while initially running the build, as the local stack wasn't deployed yet. Last but not least, the script starts a Docker container running the nicolasduminil/bedrock-gateway-web image, created previously by the quarkus-container-image-jib extension. This is our front end.
Now, in order to test it, you can jump to the next section, which explains how.

Running in the Cloud

The script named deploy.sh, shown below, deploys our application in the cloud:

Shell
#!/bin/bash
mvn -pl bedrock-gateway-api -am clean install
sed -i 's/java11/java17/g' bedrock-gateway-api/target/sam.jvm.yaml
RANDOM=$$
BUCKET_NAME=bedrock-gateway-bucket-$RANDOM
STACK_NAME=bedrock-gateway-stack
echo $BUCKET_NAME > bucket-name.txt
aws s3 mb s3://$BUCKET_NAME
sam deploy -t bedrock-gateway-api/src/main/resources/template.yaml --s3-bucket $BUCKET_NAME --stack-name $STACK_NAME --capabilities CAPABILITY_IAM
API_ENDPOINT=$(aws cloudformation describe-stacks --stack-name $STACK_NAME --query 'Stacks[0].Outputs[0].OutputValue' --output text)
mvn -pl bedrock-gateway-web -Durl=$API_ENDPOINT clean install
docker run --name bedrock -p 8082:8082 --rm --network host nicolasduminil/bedrock-gateway-web:1.0-SNAPSHOT

This time things are a bit more complicated. The Maven build in the script's first line uses the -pl switch to select only the bedrock-gateway-api module. This is because, in this case, we don't know in advance the AWS REST Gateway URL, which the other module, bedrock-gateway-web, needs in order to configure the MicroProfile JAX-RS client. Next, the sed command serves the same purpose as previously. But in order to deploy our stack in the cloud, we need an S3 bucket and, since S3 bucket names have to be unique worldwide, we generate one randomly and store it in a text file so that we can find it later, when it comes time to destroy the bucket. Now, it's time to deploy our CloudFormation stack. Please notice the way we capture the associated URL by using the --query and --output options. This is the moment to build the bedrock-gateway-web module, as we now have the AWS REST Gateway URL, which we pass to the build via Maven's -D option. At this point, we only have to start our Docker container and begin testing.

Testing the Application

In order to test the application, be it locally or in the cloud, proceed as follows:

1. Clone the repository:

Shell
$ git clone https://github.com/nicolasduminil/bedrock-gateway.git

2. cd into the root directory:

Shell
$ cd bedrock-gateway

3. Run the start script (run-local.sh or deploy.sh). The execution might take a while, especially if this is the first time you're running it.
4. Point your preferred browser at http://localhost:8082; you'll be presented with the initial screen.
5. Using the menu bar, select the Titan sandbox. A new screen will be presented to you.
6. Using the sliders, configure the Temperature, Top P, and Max tokens parameters as you wish. Then type your question for the chosen FM into the text area labeled Prompt. Its response will be displayed in the rightmost text area, labeled Response.

Please use different combinations of parameters to notice the differences between the two FMs' responses. And in case you're testing in the cloud, don't forget to run the cleanup.sh script when finished, so as to avoid being billed. Have fun!
If you're eager to learn or understand decision trees, I invite you to explore this article. Alternatively, if decision trees aren't your current focus, you may opt to scroll through social media.

About Decision Trees

Figure 1: Simple decision tree

The image above shows an example of a simple decision tree. Decision trees are tree-shaped diagrams used for making decisions based on a series of logical conditions. In a decision tree, each node represents a decision statement, and the tree proceeds to make a decision based on whether the given statement is true or false.

There are two main types of decision trees: classification trees and regression trees. A classification tree categorizes problems by classifying the output of the decision statement into categories using if-else logical conditions, whereas a regression tree maps the output to numeric values.

In Figure 2, the topmost node of a decision tree is called the root node, while the nodes following the root node are referred to as internal nodes or branches. These branches are characterized by arrows pointing both toward and away from them. At the bottom of the tree are the leaf nodes, which carry the final classification or decision of the tree. Leaf nodes are identifiable by arrows pointing to them, but not away from them.

Figure 2: Nodes of a decision tree

Primary Objective of Decision Trees

The primary objective of a decision tree is to partition the given data into subsets in a manner that maximizes the purity of the outcomes.

Advantages of Decision Trees

- Simplicity: Decision trees are straightforward to understand, interpret, and visualize.
- Minimal data preparation: They require minimal effort for data preparation compared to other algorithms.
- Handling of data types: Decision trees can handle both numeric and categorical data efficiently.
- Robustness to non-linear parameters: Non-linear parameters have minimal impact on the performance of decision trees.

Disadvantages of Decision Trees

- Overfitting: Decision trees may overfit the training data, capturing noise and leading to poor generalization on unseen data.
- High variance: The model may become unstable with small variations in the training data, resulting in high variance.
- Low bias, high complexity: Highly complex decision trees have low bias, making them prone to difficulties in generalizing to new data.

Important Terms in Decision Trees

Below are important terms that are also used for measuring impurity in decision trees:

1. Entropy

Entropy is a measure of randomness or unpredictability in a dataset. It quantifies the impurity of the dataset. A dataset with high entropy contains a mix of different classes or categories, making predictions more uncertain.

Example: Consider a dataset containing data from various animals, as in Figure 3. If the dataset includes a diverse range of animals with no clear patterns or distinctions, it has high entropy.

Figure 3: Animal datasets

2. Information Gain

Information gain is the measure of the decrease in entropy after splitting the dataset based on a particular attribute or condition. It quantifies the effectiveness of a split in reducing uncertainty.

Example: When we split the data into subgroups based on specific conditions (e.g., features of the animals) as in Figure 3, we calculate information gain by subtracting the entropy of each subgroup from the entropy before the split. Higher information gain indicates a more effective split that results in greater homogeneity within subgroups.

3. Gini Impurity

Gini impurity is another measure of impurity or randomness in a dataset. It calculates the probability of misclassifying a randomly chosen element if it were randomly labeled according to the distribution of labels in the dataset. In decision trees, Gini impurity is often used as an alternative to entropy for evaluating splits.

Example: Suppose we have a dataset with multiple classes or categories. The Gini impurity is high when the classes are evenly distributed or when there is no clear separation between classes. A low Gini impurity indicates that the dataset is relatively pure, with most elements belonging to the same class.
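For reference, the standard formulas behind these three impurity measures are given below (these are the textbook definitions, not formulas taken from the article's figures). For a node S containing samples from c classes, let p_i be the proportion of samples belonging to class i:

LaTeX
\begin{aligned}
H(S) &= -\sum_{i=1}^{c} p_i \log_2 p_i
  && \text{(entropy)} \\
IG(S, A) &= H(S) - \sum_{k} \frac{|S_k|}{|S|}\, H(S_k)
  && \text{(information gain for splitting } S \text{ into subsets } S_k \text{ on attribute } A\text{)} \\
G(S) &= 1 - \sum_{i=1}^{c} p_i^2
  && \text{(Gini impurity)}
\end{aligned}

As a quick sanity check: a two-class node split 50/50 has entropy 1 and Gini impurity 0.5 (maximally impure), while a pure node has entropy 0 and Gini impurity 0.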
Classifications and Variations

Implementation in Python

The following is used to predict lung cancer in patients.

1. Importing necessary libraries for data analysis and visualization in Python:

Python
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

# to ensure plots are displayed inline in Notebook
%matplotlib inline

# Set Seaborn style for plots
sns.set_style("whitegrid")
# Set default Matplotlib style
plt.style.use("fivethirtyeight")

2. Uploading the CSV file containing the data and loading it:

Python
import pandas as pd

# Load the data from the CSV file
df = pd.read_csv('survey_lung_cancer.csv')

Python
df.head()  # Displaying first five rows of the dataframe

EDA (Exploratory Data Analysis):

Python
sns.countplot(x='LUNG_CANCER', data=df)  # Count plot using Seaborn
# to visualize the distribution of values in "LUNG_CANCER" column

Python
# title AGE
from matplotlib import pyplot as plt
df['AGE'].plot(kind='hist', bins=20, title='AGE')
plt.gca().spines[['top', 'right']].set_visible(False)

3. Iterating through columns, identifying categorical columns, and appending them:

Python
categorical_col = []
for column in df.columns:
    if df[column].dtype == object and len(df[column].unique()) <= 50:
        categorical_col.append(column)

df['LUNG_CANCER'] = df.LUNG_CANCER.astype("category").cat.codes

4. Removing the column "LUNG_CANCER" for further processing:

Python
categorical_col.remove('LUNG_CANCER')

5. Encoding categorical variables using LabelEncoder:

Python
from sklearn.preprocessing import LabelEncoder

# creating an instance of the LabelEncoder class;
# LabelEncoder will be used to transform categorical values into numerical labels
label = LabelEncoder()
for column in categorical_col:
    df[column] = label.fit_transform(df[column])

6. Dataset splitting for machine learning, train_test_split:

Python
from sklearn.model_selection import train_test_split

# X contains the features (all columns except 'LUNG_CANCER')
# y contains the target variable ('LUNG_CANCER') from the DataFrame df
X = df.drop('LUNG_CANCER', axis=1)
y = df.LUNG_CANCER

# performing the split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

7. Function for model evaluation and reporting: Overall, the function below serves as a convenient tool for assessing the performance of classification models and generating detailed reports, facilitating model evaluation and interpretation.
Python
# import functions from scikit-learn for model evaluation
from sklearn.metrics import accuracy_score, confusion_matrix, classification_report

# clf: The classifier model to be evaluated
# X_train, y_train: The features and target variable of the training set
# X_test, y_test: The features and target variable of the testing set
def print_score(clf, X_train, y_train, X_test, y_test, train=True):
    if train:
        pred = clf.predict(X_train)
        clf_report = pd.DataFrame(classification_report(y_train, pred, output_dict=True))
        print("Train Result:\n_________________________")
        print(f"Accuracy Score: {accuracy_score(y_train, pred) * 100:.2f}%")
        print("_________________________")
        print(f"CLASSIFICATION REPORT:\n{clf_report}")
        print("_________________________________________________________________________")
        print(f"Confusion Matrix: \n {confusion_matrix(y_train, pred)}\n")
    elif train == False:
        pred = clf.predict(X_test)
        clf_report = pd.DataFrame(classification_report(y_test, pred, output_dict=True))
        print("\nTest Result:\n_________________________")
        print(f"Accuracy Score: {accuracy_score(y_test, pred) * 100:.2f}%")
        print("_________________________")
        print(f"CLASSIFICATION REPORT:\n{clf_report}")
        print("_________________________________________________________________________")
        print(f"Confusion Matrix: \n {confusion_matrix(y_test, pred)}\n")

Training and evaluation of decision tree classifier: Overall, this code provides a comprehensive evaluation of the decision tree classifier's performance on both the training and testing sets, including the accuracy score, classification report, and confusion matrix for each set. During the training process, the decision tree algorithm recursively splits nodes using an impurity criterion to build a tree that maximizes purity at each step (note that scikit-learn's DecisionTreeClassifier uses Gini impurity by default; entropy and information gain are used when criterion="entropy").

Python
from sklearn.tree import DecisionTreeClassifier

tree_clf = DecisionTreeClassifier(random_state=42)
tree_clf.fit(X_train, y_train)

print_score(tree_clf, X_train, y_train, X_test, y_test, train=True)
print_score(tree_clf, X_train, y_train, X_test, y_test, train=False)

The results above indicate that the decision tree classifier achieved high accuracy and performance on the training set, with some level of overfitting, as evident from the difference in performance between the training and testing sets. While the classifier performed well on the testing set, there is room for improvement, particularly in terms of reducing false positives and false negatives. Further tuning of hyperparameters or exploring other algorithms may help improve generalization performance.

8. Visualization of decision tree classifier:

Python
# Importing Dependencies
# Image is used to display images in the IPython environment
# StringIO is used to create a file-like object in memory
# export_graphviz is used to export the decision tree in Graphviz DOT format
# pydot is used to interface with the Graphviz library
from IPython.display import Image
from six import StringIO
from sklearn.tree import export_graphviz
import pydot

features = list(df.columns)
features.remove("LUNG_CANCER")

Python
dot_data = StringIO()
export_graphviz(tree_clf, out_file=dot_data, feature_names=features, filled=True)
graph = pydot.graph_from_dot_data(dot_data.getvalue())
Image(graph[0].create_png())
9. Training and evaluation of Random Forest classifier:

Python
from sklearn.ensemble import RandomForestClassifier

# Creating an instance of the Random Forest classifier with n_estimators=100,
# which specifies the number of decision trees in the forest
rf_clf = RandomForestClassifier(n_estimators=100)
rf_clf.fit(X_train, y_train)

print_score(rf_clf, X_train, y_train, X_test, y_test, train=True)
print_score(rf_clf, X_train, y_train, X_test, y_test, train=False)

The code below generates heatmaps for both the training and testing sets' confusion matrices. The heatmaps use different shades to represent the counts in the confusion matrix. The diagonal elements (true positives and true negatives) will have higher values and appear lighter, while off-diagonal elements (false positives and false negatives) will have lower values and appear darker.

Python
import seaborn as sns
import matplotlib.pyplot as plt
from sklearn.metrics import confusion_matrix

# The original snippet references cm_train and cm_test without defining them;
# they are the confusion matrices computed from the classifier's predictions
cm_train = confusion_matrix(y_train, rf_clf.predict(X_train))
cm_test = confusion_matrix(y_test, rf_clf.predict(X_test))

# Create heatmap for training set
plt.figure(figsize=(8, 6))
sns.heatmap(cm_train, annot=True, fmt='d', cmap='viridis', annot_kws={"size": 16})
plt.title('Confusion Matrix for Training Set')
plt.xlabel('Predicted labels')
plt.ylabel('True labels')
plt.show()

# Create heatmap for testing set
plt.figure(figsize=(8, 6))
sns.heatmap(cm_test, annot=True, fmt='d', cmap='plasma', annot_kws={"size": 16})
plt.title('Confusion Matrix for Testing Set')
plt.xlabel('Predicted labels')
plt.ylabel('True labels')
plt.show()

XGBoost for Classification

Python
from xgboost import XGBClassifier
from sklearn.metrics import accuracy_score

# Instantiate XGBClassifier
xgb_clf = XGBClassifier()

# Train the classifier
xgb_clf.fit(X_train, y_train)

# Predict on the testing set
y_pred = xgb_clf.predict(X_test)

# Evaluate accuracy
accuracy = accuracy_score(y_test, y_pred)
print("Accuracy:", accuracy)

The accuracy above indicates that the model's predictions align closely with the actual class labels, demonstrating its effectiveness in distinguishing between the classes. The code below generates a bar plot showing the relative importance of the top features in the XGBoost model. The importance is typically calculated based on metrics such as gain, cover, or frequency of feature usage across all trees in the ensemble.

Python
from xgboost import plot_importance
import matplotlib.pyplot as plt

# Plot feature importance
plt.figure(figsize=(10, 6))
plot_importance(xgb_clf, max_num_features=10)  # Specify the maximum number of features to show
plt.show()

10. Plotting the first tree in the XGBoost model:

Python
from xgboost import plot_tree

# Plot the first tree
plt.figure(figsize=(10, 20))
plot_tree(xgb_clf, num_trees=0, rankdir='TB')  # Specify the tree number to plot
plt.show()

Conclusion

In conclusion, this article gives an idea of how decision trees and their advanced variants like Random Forest and XGBoost offer powerful tools for classification and regression machine learning tasks. Through this journey, we've explored the fundamental concepts of decision trees, including entropy, information gain, and Gini impurity, which form the basis of their decision-making process. As we continue to delve deeper into the realm of machine learning, the versatility and effectiveness of decision trees and their variants underscore their significance in solving real-world problems across diverse domains.
Whether it's classifying medical conditions, predicting customer behavior, or optimizing business processes, decision trees remain a cornerstone in the arsenal of machine learning techniques, driving innovation and progress in the field.
Reactive programming has become increasingly popular in modern software development, especially for building scalable and resilient applications. Kotlin, with its expressive syntax and powerful features, has gained traction among developers for building reactive systems. In this article, we'll delve into reactive programming using Kotlin Coroutines with Spring Boot, comparing it with WebFlux, another, more complex option for reactive programming in the Spring ecosystem.

Understanding Reactive Programming

Reactive programming is a programming paradigm that deals with asynchronous data streams and the propagation of changes. It focuses on processing streams of data and reacting to changes as they occur. Reactive systems are inherently responsive, resilient, and scalable, making them well-suited for building modern applications that need to handle high concurrency and real-time data.

Kotlin Coroutines

Kotlin Coroutines provide a way to write asynchronous, non-blocking code in a sequential manner, making asynchronous programming easier to understand and maintain. Coroutines allow developers to write asynchronous code in a more imperative style, resembling synchronous code, which can lead to cleaner and more readable code.

Kotlin Coroutines vs. WebFlux

Spring Boot is a popular framework for building Java- and Kotlin-based applications. It provides a powerful and flexible programming model for developing reactive applications. Spring Boot's support for reactive programming comes in the form of Spring WebFlux, which is built on top of Project Reactor, a reactive library for the JVM. Both Kotlin Coroutines and WebFlux offer solutions for building reactive applications, but they differ in their programming models and APIs.

1. Programming Model

- Kotlin Coroutines: Kotlin Coroutines use suspend functions and coroutine builders like launch and async to define asynchronous code. Coroutines provide a sequential, imperative style of writing asynchronous code, making it easier to understand and reason about.
- WebFlux: WebFlux uses a reactive programming model based on the Reactive Streams specification. It provides a set of APIs for working with asynchronous data streams, including Flux and Mono, which represent streams of multiple and single values, respectively.

2. Error Handling

- Kotlin Coroutines: Error handling in Kotlin Coroutines is done using standard try-catch blocks, making it similar to handling exceptions in synchronous code.
- WebFlux: WebFlux provides built-in support for error handling through operators like onErrorResume and onErrorReturn, allowing developers to handle errors in a reactive manner.

3. Integration With Spring Boot

- Kotlin Coroutines: Kotlin Coroutines can be seamlessly integrated with Spring Boot applications using the spring-boot-starter-web dependency and the kotlinx-coroutines-spring library.
- WebFlux: Spring Boot provides built-in support for WebFlux, allowing developers to easily create reactive RESTful APIs and integrate with other Spring components.

Show Me the Code

The Power of the Reactive Approach Over the Imperative Approach

The provided code snippets illustrate the implementation of a straightforward scenario using both the imperative and reactive paradigms. This scenario involves two stages, each taking 1 second to complete. In the imperative approach, the service responds in 2 seconds, as it executes both stages sequentially. Conversely, in the reactive approach, the service responds in 1 second by executing the two stages in parallel.
However, even in this simple scenario, the reactive solution exhibits some complexity, which could escalate significantly in real-world business scenarios. Here's the Kotlin code for the base service:

Kotlin
@Service
class HelloService {

    fun getGreetWord(): Mono<String> =
        Mono.fromCallable {
            Thread.sleep(1000)
            "Hello"
        }

    fun formatName(name: String): Mono<String> =
        Mono.fromCallable {
            Thread.sleep(1000)
            name.replaceFirstChar { it.uppercase() }
        }
}

Imperative Solution

Kotlin
fun greet(name: String): String {
    val greet = helloService.getGreetWord().block()
    val formattedName = helloService.formatName(name).block()
    return "$greet $formattedName"
}

Reactive Solution

Kotlin
fun greet(name: String): Mono<String> {
    val greet = helloService.getGreetWord().subscribeOn(Schedulers.boundedElastic())
    val formattedName = helloService.formatName(name).subscribeOn(Schedulers.boundedElastic())
    return greet
        .zipWith(formattedName)
        .map { "${it.t1} ${it.t2}" }
}

In the imperative solution, the greet function awaits the completion of the getGreetWord and formatName methods sequentially before returning the concatenated result. In the reactive solution, on the other hand, the greet function uses reactive programming constructs to execute the tasks concurrently, using the zipWith operator to combine the results once both stages are complete.

Simplifying Reactivity With Kotlin Coroutines

To simplify the complexity inherent in reactive programming, Kotlin's coroutines provide an elegant solution. Below is a Kotlin coroutine example implementing the same scenario discussed earlier:

Kotlin
@Service
class CoroutineHelloService {

    suspend fun getGreetWord(): String {
        delay(1000)
        return "Hello"
    }

    suspend fun formatName(name: String): String {
        delay(1000)
        return name.replaceFirstChar { it.uppercase() }
    }

    fun greet(name: String) = runBlocking {
        val greet = async { getGreetWord() }
        val formattedName = async { formatName(name) }
        "${greet.await()} ${formattedName.await()}"
    }
}

In the provided code snippet, we leverage Kotlin coroutines to simplify reactive programming complexities. The CoroutineHelloService class defines the suspend functions getGreetWord and formatName, which simulate asynchronous operations using delay. The greet function reads like imperative code while still running the two stages concurrently: within a runBlocking coroutine builder, it starts each suspend function with async and then awaits both results, combining them into a single greeting string in about 1 second rather than 2.

Conclusion

In this exploration, we compared reactive programming in Kotlin Coroutines with Spring Boot to WebFlux. Kotlin Coroutines offer a simpler, more sequential-looking approach, while WebFlux, based on Reactive Streams, provides a comprehensive set of APIs with a steeper learning curve. Code examples demonstrated how reactive solutions outperform imperative ones by leveraging parallel execution. Kotlin Coroutines emerged as a concise alternative, seamlessly integrated with Spring Boot, simplifying reactive programming complexities. In summary, Kotlin Coroutines excel in simplicity and integration, making them a compelling choice for developers aiming to streamline reactive programming in Spring Boot applications.
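As an aside that is not part of the original article: the start-both-then-await pattern used in the coroutine version above is language-agnostic. For readers following the Python examples elsewhere in this collection, the same two-stage scenario looks like this with Python's asyncio (a minimal sketch with illustrative names):

Python
import asyncio

async def get_greet_word() -> str:
    await asyncio.sleep(1)  # simulated 1-second stage
    return "Hello"

async def format_name(name: str) -> str:
    await asyncio.sleep(1)  # simulated 1-second stage
    return name.capitalize()

async def greet(name: str) -> str:
    # start both stages concurrently, then await both:
    # total latency is roughly 1 second instead of 2
    greet_word, formatted_name = await asyncio.gather(get_greet_word(), format_name(name))
    return f"{greet_word} {formatted_name}"

print(asyncio.run(greet("world")))  # prints "Hello World"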
Origin of Cell-Based Architecture In the rapidly evolving domain of digital services, the need for scalable and resilient architectures (resilience being the system's ability to recover from failure quickly) has never been greater. The introduction of cell-based architecture marks a pivotal shift tailored to meet the surging demands of hyper-scaling (an architecture's ability to scale rapidly in response to fluctuating demand), and it has become a foundation for digital success. It's a strategy that empowers tech behemoths like Amazon and Facebook, along with service platforms such as DoorDash, to skillfully navigate the tidal waves of digital traffic during peak moments and ensure service to millions of users worldwide without a hitch. Consider the surge Amazon faces on Prime Day or the global traffic spike Facebook navigates during significant events. Similarly, DoorDash's quest to flawlessly handle a flood of orders showcases a recurring theme: the critical need for an architecture that scales both vertically and horizontally, expanding capacity without sacrificing system integrity or the user experience. In the current landscape, where startups frequently encounter unprecedented growth rates, the dream of scaling quickly can become a nightmare of scalability issues. Hypergrowth, rapid expansion that surpasses expectations, presents a formidable challenge, risking a company's collapse if it fails to scale efficiently. This challenge birthed the concept of hyper-scaling, emphasizing an architecture's nimbleness in adapting and growing to meet dynamic demands. Essential to this strategy is extensive parallelization and rigorous fault isolation, ensuring companies can scale without succumbing to the pitfalls of rapid growth. Cell-based architecture emerges as a beacon for applications and services where downtime is not an option. In scenarios where every second of inactivity spells significant reputational or financial loss, this architectural paradigm proves invaluable. It is especially crucial for: Applications requiring uninterrupted operation to ensure customer satisfaction and maintain business continuity. Financial services vital for maintaining economic stability. Ultra-scale systems where failure is an unthinkable option. Multi-tenant services requiring segregated resources for specific clients. This architectural innovation was developed in direct response to the increasing need for modern, rapidly expanding digital services. It provides a scalable, resilient framework supporting continuous service delivery and operational superiority. Understanding Cell-Based Architecture What Exactly Is Cell-Based Architecture? Cell-based architecture is a modern approach to creating digital services that are both scalable and resilient, taking cues from the principles of distributed systems and microservices design patterns. This architecture breaks down an extensive system into smaller, independent units called cells. Each cell is self-sufficient, containing a specific segment of the system's functionality, data storage, compute, application logic, and dependencies. This modular setup allows each cell to be scaled, deployed, and managed independently, enhancing the system's ability to grow and recover from failures without widespread impact.
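To make the definition concrete, below is a minimal Python sketch of how a cell might be modeled; the class and field names are hypothetical illustrations under the assumptions above, not a prescribed implementation:

Python

from dataclasses import dataclass, field


@dataclass
class Cell:
    """A self-contained unit that owns its own storage, capacity, and tenants."""
    name: str                  # e.g., "cell-us-east-1a-001" (hypothetical naming)
    capacity: int              # maximum number of tenants this cell is sized for
    database_url: str          # dedicated data store, never shared across cells
    healthy: bool = True       # health flag consulted by the router and control plane
    tenants: set = field(default_factory=set)

    def has_room(self) -> bool:
        return self.healthy and len(self.tenants) < self.capacity

    def assign(self, tenant_id: str) -> None:
        # A control plane would call this when provisioning or migrating a customer.
        if not self.has_room():
            raise RuntimeError(f"{self.name} is full or unhealthy")
        self.tenants.add(tenant_id)

Because each Cell carries its own database_url and capacity, nothing in the sketch is shared between instances, which is precisely the isolation property the architecture relies on.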
Drawing an analogy to urban planning, consider cell-based architecture akin to a well-designed metropolis where each neighborhood operates autonomously, equipped with its own services and amenities, yet contributes to the city's overall prosperity. In times of disruption, such as a power outage or a water main break, only the affected neighborhood experiences downtime while the rest of the city thrives. Just as a single neighborhood can experience disruption without paralyzing the entire city, a cell encountering an issue in this architectural framework does not trigger a system-wide failure. This ensures the digital service remains robust and reliable, maintaining high uptime and resilience. Fig. 1: Cell-Based Architecture Key Components Cell: Akin to neighborhoods, cells are the foundational building blocks of this architecture. Each cell is an autonomous microservice cluster with resources capable of handling a subset of service responsibilities. A cell is a stand-alone version of the application with its own computing power, load balancer, and databases. This setup allows each cell to operate independently, making it possible to deploy, monitor, and maintain them separately. This independence means that if one cell runs into problems, it doesn't affect the others, which helps the system to scale effectively and stay robust. Cell Router: Cell routers play a critical role, similar to a city's traffic management system. They dynamically route requests to the most appropriate cell based on factors such as load, geographic location, or specific service requirements. By efficiently balancing the load across various cells, cell routers ensure that each request is processed by the cell best suited to handle it, optimizing system performance and the user experience, much like how traffic lights and signs direct the flow of vehicles to ensure smooth transit within a city. Inter-Cell Communication Layer: Despite the autonomy of individual cells, cooperation between them is essential for handling tasks across the system. The inter-cell communication layer facilitates secure and efficient message exchange between cells. This layer acts as the public transportation system of our city analogy, connecting different neighborhoods (cells) to ensure seamless collaboration and unified service delivery across the entire architecture. It ensures that even as cells operate independently, they can still work together effectively, mirroring how different parts of a city are connected yet function cohesively. Control Plane: The control plane is a critical component of cell-based architecture, acting as the central hub for administrative operations. It oversees tasks such as setting up new cells (provisioning), shutting down existing cells (de-provisioning), and moving customers between cells (migrating). This ensures that the infrastructure remains responsive to the system's and its users' needs, allowing for dynamic resource allocation and seamless service continuity. Why and When to Use Cell-Based Architecture? Why Use It?
Cell-based architecture offers a robust framework for efficiently scaling digital services, guaranteeing their resilience and adaptability during expansion. Below is a breakdown of its advantages: Higher Scalability: By defining and managing the capacity of each cell, you can add more cells to scale out (handle growth by adding more system components, such as databases and servers, and spreading the workload evenly). This avoids hitting the resource limits that come with scaling up (accommodating growth by increasing the size of a system's component, such as a database, server, or subsystem). As demand grows, you add more cells, each a contained unit with known capacities, making the system inherently scalable. Safer Deployments: Deployments and rollbacks are smoother with cells. You can deploy changes to one cell at a time, minimizing the impact of any issues. Canary cells can be used to test new deployments under actual conditions with minimal risk, providing a safety net for broader deployment. Easy Testability: Testing large, spread-out systems can be challenging, especially as they get bigger. However, with cell-based architecture, each cell is kept to a manageable size, making it much simpler to test how cells behave at their largest capacity. Testing a whole large service can be too expensive and complex, but testing just one cell is doable because you can simulate the maximum load a cell can handle, similar to the largest workload a single customer might place on your application. This makes it practical and cost-effective to ensure each cell runs smoothly. Lower Blast Radius: Cell-based architecture limits the spread of failures by isolating issues within individual cells, much like neighborhoods in a city. This division ensures that a problem in one cell doesn't affect the entire system, maintaining overall functionality. Each cell operates independently, minimizing any single incident's impact area, or "blast radius," akin to the regional isolation seen in large-scale services. This setup enhances system resilience by keeping disruptions contained and preventing widespread outages. Fig. 2: Cell-based architecture services exhibit enhanced resilience to failures and feature a reduced blast radius compared to traditional services Improved Reliability and Recovery Higher Mean Time Between Failures (MTBF): Cell-based architecture increases the system's reliability by reducing how often problems occur. This design keeps each cell small and manageable, allowing for regular checks and maintenance, smoothing operations and making them more predictable. With customers distributed across different cells, any issues affect only a limited set of requests and users. Changes are tested on just a few cells at a time, making it easy to revert without widespread impact. For example, if you have customers divided across ten cells, a problem in one cell affects only 10% of your customers. This controlled approach to managing changes and addressing issues quickly means the system experiences fewer disruptions, leading to a more stable and reliable service. Lower Mean Time to Recovery (MTTR): Recovery is quicker and more straightforward with cells since you deal with a smaller, contained issue rather than a system-wide problem. Higher Availability: Cell-based architecture can lead to fewer and shorter failures, improving the overall uptime of your service.
Even though there might be more potential points of failure (each cell could theoretically fail), the impact of each failure is significantly reduced, and failures are easier to fix. When to Use It? Here's a brief guide to help you understand when it's advantageous to use this architectural strategy: High-Stakes Applications: If downtime could severely impact your customers, tarnish your reputation, or result in substantial financial loss, a cell-based approach can safeguard against widespread disruptions. Critical Economic Infrastructure: Cell-based architecture ensures continuous operation for financial services industries (FSI), where workloads are pivotal to economic stability. Ultra-Scale Systems: Systems too large or critical to fail (those that must maintain operation under almost any circumstance) are prime candidates for cell-based design. Stringent Recovery Objectives: Cell-based architecture offers quick recovery capabilities for workloads requiring a Recovery Point Objective (RPO) of less than 5 seconds and a Recovery Time Objective (RTO) of less than 30 seconds. Multi-Tenant Services With Dedicated Needs: For services where tenants demand fully dedicated resources, assigning them their own cell ensures isolation and dedicated performance. Although cell-based architecture brings considerable benefits to handling critical workloads, it also comes with its own hurdles, such as heightened complexity, elevated costs, the necessity for specialized tools and practices, and the need for investment in a routing layer. For a more in-depth analysis of these challenges, please see the "Weighing the Scales: Benefits and Challenges" section below. Implementing Cell-Based Architecture This section highlights critical design factors that come into play while designing and implementing a cell-based architecture. Designing a Cell Cell design is a foundational aspect of cell-based architecture, where a system is divided into smaller, self-contained units known as cells. Each cell operates independently with its own resources, making the entire system more scalable and resilient. To embark on cell design, identify distinct functionalities within your system that can be isolated into individual cells. This might involve grouping services by their operational needs or user base. Once you've defined these boundaries, equip each cell with the necessary resources, such as databases and application logic, to ensure it can function autonomously. This setup facilitates targeted scaling and recovery and minimizes the impact of failures, as issues in one cell won't spill over to others. Implementing effective communication channels between cells and establishing comprehensive monitoring are crucial steps to maintain system cohesion and oversee cell performance. By systematically organizing your architecture into cells, you create a robust framework that enhances the manageability and adaptability of your system. Here are a few ideas on cell design that can be leveraged to bolster system resilience: Distribute Cells Across Availability Zones: By positioning cells across different availability zones (AZs), you can protect your system against the failure of a single data center or geographic location. This geographical distribution ensures that even if one AZ encounters issues, other cells in different AZs can continue to operate, maintaining overall system availability and reducing the risk of complete service downtime.
Implement Redundant Cell Configurations: Creating redundant copies of cells within and across AZs can further enhance resilience. This redundancy means that if one cell fails, its responsibilities can be immediately taken over by a duplicate cell, minimizing service disruption. This approach requires careful synchronization between cells to ensure data consistency but significantly improves fault tolerance. Design Cells for Autonomous Operation: Ensuring that each cell can operate independently, with its own set of resources, databases, and application logic, is crucial. This independence allows cells to be isolated from failures elsewhere in the system. Even if one cell experiences a problem, it won't spread to others, localizing the impact and making it easier to identify and rectify issues. Use Load Balancers and Cell Routers Strategically: Integrating load balancers and cell routers that are aware of cell locations and health statuses can help efficiently redirect traffic away from troubled cells or AZs. This dynamic routing capability allows for real-time adjustments to traffic flow, directing users to the healthiest available cells and balancing the load to prevent overburdening any single cell or AZ. Facilitate Easy Cell Replication and Deployment: Design cells with replication and redeployment in mind. In case of a cell or AZ failure, having mechanisms for quickly spinning up new cells in alternative locations can be invaluable. Automation tools and templates for cell deployment can expedite this process, reducing recovery times and enhancing overall system resilience. Regularly Test Failover Processes: Regular testing of cell failover processes, including simulated failures and recovery drills, can ensure that your system responds as expected during actual outages. These tests can reveal potential weaknesses in your cell design and failover strategies, allowing for continuous improvement of system resilience. By incorporating these ideas into your cell design, you can create a more resilient system capable of withstanding various failure scenarios while minimizing the impact on service availability and performance. Cell Partitioning Cell partitioning is a crucial technique in cell-based architecture. It focuses on dividing a system's workload among distinct cells to optimize performance, scalability, and resilience. It involves categorizing and directing user requests or data to specific cells based on predefined criteria. This process ensures no cell becomes overwhelmed, enhancing system reliability and efficiency. How Cell Partitioning Can Be Done: Identify Partition Criteria: Determine the basis for distributing workloads among cells. Typical criteria include geographic location, user ID, request type, or date range. This step is pivotal in defining how the system categorizes and routes requests to the appropriate cells. Implement Routing Logic: Develop a routing mechanism within the cell router or API gateway that uses the identified criteria to direct incoming requests to the correct cell (a minimal sketch follows these steps). This might involve dynamic decision-making algorithms that consider current cell load and availability. Continuous Monitoring and Adjustment: Regularly monitor the performance and load distribution across cells. Use this data to adjust partitioning criteria and routing logic to maintain optimal system performance and scalability.
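To illustrate the routing step above, here is a minimal Python sketch of hash-based routing on a partition key; all names are hypothetical, and a production router would typically use consistent hashing so that adding or removing cells reshuffles as few keys as possible:

Python

import hashlib


class CellRouter:
    """Deterministically routes a partition key (here, a user ID) to a cell."""

    def __init__(self, cells):
        self.cells = cells                      # e.g., ["cell-a", "cell-b", "cell-c"]
        self.health = {c: True for c in cells}  # updated by periodic health checks

    def _bucket(self, key: str) -> int:
        # Hash the partition key so the same user always maps to the same cell.
        digest = hashlib.sha256(key.encode()).hexdigest()
        return int(digest, 16) % len(self.cells)

    def route(self, user_id: str) -> str:
        # Start at the hashed cell; fail over to the next healthy one if needed.
        start = self._bucket(user_id)
        for offset in range(len(self.cells)):
            cell = self.cells[(start + offset) % len(self.cells)]
            if self.health[cell]:
                return cell
        raise RuntimeError("no healthy cells available")


router = CellRouter(["cell-a", "cell-b", "cell-c"])
print(router.route("customer-42"))  # the same user consistently lands on the same cell

The failover loop in route is what keeps a single unhealthy cell from turning into a user-visible outage: traffic deterministically spills to the next healthy cell until the health check marks the original cell as recovered.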
Partitioning Algorithms: Several algorithms can be utilized for effective cell partitioning, each with its strengths and tailored to different types of workloads and system requirements: Consistent Hashing: Requests are distributed based on the hash values of the partition key (e.g., user ID), ensuring even workload distribution and minimal reorganization when cells are added or removed. Range-Based Partitioning: Divides data into ranges (e.g., alphabetical or numerical) and assigns each range to a specific cell. This is ideal for ordered data, allowing efficient query operations. Round Robin: This method distributes requests evenly across all available cells in a cyclic manner. It is straightforward and helpful in achieving a basic level of load balancing. Sharding: Similar to range-based partitioning but more complex, sharding involves splitting large databases into smaller, faster, more easily managed parts, or "shards," each handled by a separate cell. Dynamic Partitioning: Adjusts partitioning in real time based on workload characteristics or system performance metrics. This approach requires advanced algorithms capable of analyzing system states and making immediate adjustments. By thoughtfully implementing cell partitioning and choosing the appropriate algorithm, you can significantly enhance your cell-based architecture's performance, scalability, and resilience. Regular review and adjustment of your partitioning strategy ensures it continues to meet your system's evolving needs. Implementing a Cell Router In cell-based architecture, the cell router is crucial for steering traffic to the correct cells, ensuring efficient workload management and scalability. An effective cell router hinges on two key elements: traffic routing logic and failover strategies, which maintain system reliability and optimize performance. Implementing Traffic Routing Logic: Start by defining the criteria for how requests are directed to various cells, including the users' geographic location, the type of request, and the specific services needed. The aim is to reduce latency and evenly distribute the load. Employ dynamic routing that adapts to cell availability and workload changes in real time, possibly through integration with a service discovery tool that monitors each cell's status and location. Establishing Failover Strategies: Solid failover processes are essential for the cell router to ensure the system's dependability. Should any cell become unreachable, the router must automatically reroute traffic to the next available cell, requiring minimal manual intervention. This is achieved by implementing health checks across cells to swiftly identify and respond to failures, thus keeping the user experience smooth and the service highly available, even during cell outages. Fig. 3: The cell router ensures a smooth user experience by redirecting traffic to healthy cells during outages, maintaining uninterrupted service availability For the practical implementation of a cell router, you can take one of the following approaches: Load Balancers: Use cloud-based load balancers that dynamically direct traffic based on specific request attributes, such as URL paths or headers, according to set rules. API Gateways: An API gateway can serve as the primary entry for all incoming requests and route them to the appropriate cell based on configured logic.
Service Mesh: A service mesh offers a network layer that facilitates efficient service-to-service communication and routes requests based on policies, service discovery, and health status. Custom Router Service: Developing a custom service allows routing decisions based on detailed request content, current cell load, or bespoke business logic, offering tailored control over traffic management. Choosing the right implementation strategy for a cell router depends on specific needs, such as the granularity of routing decisions, integration capabilities with existing systems, and management simplicity. Each method provides varying degrees of control, complexity, and adaptability to cater to distinct architectural requirements. Cell Sizing Cell sizing in a cell-based architecture refers to determining each cell's optimal size and capacity to ensure it can handle its designated workload effectively without becoming overburdened. Proper cell sizing is crucial for several reasons: Balanced Load Distribution: Correctly sized cells help achieve a balanced distribution of workloads across the system, preventing any single cell from becoming a bottleneck. Scalability: Well-sized cells can scale more efficiently. As demand increases, the system can add more cells or adjust resources within existing cells to accommodate growth. Resilience and Recovery: Smaller, well-defined cells can isolate failures more effectively, limiting the impact of any single point of failure. This makes the system more resilient and simplifies recovery processes. Cost Efficiency: Optimizing cell size helps utilize resources more efficiently, avoiding unnecessary expenditure on underutilized capacities. How Is Cell Sizing Done? Cell sizing involves a careful analysis of several factors: Workload Analysis: Understand the nature and volume of each cell's workload. This includes peak demand times, data throughput, and processing requirements. Resource Requirements: Based on the workload analysis, estimate the resources (CPU, memory, storage) each cell needs to operate effectively under various conditions. Performance Metrics: Consider key performance indicators (KPIs) that define successful cell operation. This could include response times, error rates, and throughput. Scalability Goals: Define how the system should scale in response to increased demand. This will influence whether cells should be designed to scale up (increase resources in a cell) or scale out (add more cells). Testing and Adjustment: Validate cell size assumptions by testing under simulated workload conditions. Monitoring real-world performance and adjusting as needed is a continuous part of cell sizing. Effective cell sizing often involves a combination of theoretical analysis and empirical testing. Starting with a best-guess estimate based on workload characteristics and adjusting based on observed performance ensures that cells remain efficient, responsive, and cost-effective as the system evolves. Cell Deployment Cell deployment in a cell-based architecture is the process of distributing and managing your application's workload across multiple self-contained units called cells. This strategy ensures scalability, resilience, and efficient resource use. Here's a concise guide on how it's typically done and the technology choices available for effective implementation. How Is Cell Deployment Done? Automated Deployment Pipelines: Start by setting up automated deployment pipelines.
These pipelines handle your application's packaging, testing, and deployment to various cells. Automation ensures consistency, reduces errors, and enables rapid deployment across cells. Blue/Green Deployments: Use blue/green deployment strategies to minimize downtime and reduce risk. By deploying the new version of your application to a separate environment (green) while keeping the current version (blue) running, you can switch traffic to the new version once it's fully ready and tested. Canary Releases: Gradually roll out updates to a small subset of cells or users before making them available system-wide. This allows you to monitor the impact of changes and roll them back if necessary without affecting all users. Technology Choices for Cell Deployment: Container Orchestration Tools: Tools such as Kubernetes, AWS ECS, and Docker Swarm are crucial for orchestrating cell deployments, enabling the encapsulation of applications into containers for streamlined deployment, scaling, and management across various cells. CI/CD Tools: Continuous Integration and Continuous Deployment (CI/CD) tools such as Jenkins, GitLab CI, CircleCI, and AWS CodePipeline facilitate the automation of testing and deployment processes, ensuring that new code changes can be efficiently rolled out. Infrastructure as Code (IaC): Tools like Terraform and AWS CloudFormation allow you to define your infrastructure in code, making it easier to replicate and deploy cells across different environments or cloud providers. Service Meshes: Service meshes like Istio or Linkerd provide advanced traffic management capabilities, including canary deployments and service discovery, which are crucial for managing communication and cell updates. By leveraging these deployment strategies and technologies, you can achieve a high degree of automation and control in your cell deployments, ensuring your application remains scalable, reliable, and easy to manage. Cell Observability Cell observability is crucial in a cell-based architecture to ensure you have comprehensive visibility into each cell's health, performance, and operational metrics. It allows you to monitor, troubleshoot, and optimize the system effectively, enhancing overall reliability and user experience. Implementing Cell Observability: To achieve thorough cell observability, focus on three key areas: logging, monitoring, and tracing. Logging captures detailed events and operations within each cell. Monitoring tracks key performance indicators and health metrics in real time. Tracing follows requests as they move through the cells, identifying bottlenecks or failures in the workflow. Technology Choices for Cell Observability: Logging Tools: Solutions like Elasticsearch, Logstash, Kibana (ELK Stack), or Splunk provide powerful logging capabilities, allowing you to aggregate and analyze logs from all cells centrally. Monitoring Solutions: Prometheus, coupled with Grafana for visualization, offers robust monitoring capabilities with support for custom metrics. Cloud-native services like Amazon CloudWatch or Google Cloud Operations (formerly Stackdriver) provide integrated monitoring solutions tailored for applications deployed on their respective cloud platforms. Distributed Tracing Systems: Tools like Jaeger, Zipkin, and AWS X-Ray enable distributed tracing, helping you to understand the flow of requests across cells and identify latency issues or failures in microservices interactions.
Service Meshes: Service meshes such as Istio or Linkerd inherently offer observability features, including monitoring, logging, and tracing requests between cells without requiring changes to your application code. By leveraging these tools and focusing on comprehensive observability, you can ensure that your cell-based architecture remains performant, resilient, and capable of supporting your application's dynamic needs. Weighing the Scales: Benefits and Challenges Adopting cell-based architecture transforms the structural and operational dynamics of digital services. Breaking down a service into independently scalable and resilient units (cells) offers a robust framework for managing complexity and ensuring system availability. However, this architectural paradigm also introduces new challenges and complexities. Here's a deeper dive into the technical advantages and considerations. Benefits Horizontal Scalability: Unlike traditional scale-up approaches, cell-based architecture enables horizontal scaling by adding more cells. This method alleviates common bottlenecks associated with centralized databases or shared resources, allowing for linear scalability as user demand increases. Fault Isolation and Resilience: The architecture's compartmentalized design ensures that failures are contained within individual cells, significantly reducing the system's overall blast radius. This isolation enhances the system's resilience, as issues in one cell can be mitigated or repaired without impacting the entire service. Deployment Agility: Leveraging cells allows for incremental deployments and feature rollouts, akin to implementing rolling updates across microservices. This granularity in deployment strategy minimizes downtime and enables a more flexible response to market or user demands. Simplified Operational Complexity: While the initial setup is complex, the ongoing operation and management of cells can be more straightforward than in monolithic architectures. Each cell's autonomy simplifies monitoring, troubleshooting, and scaling efforts, as operational tasks can be executed in parallel across cells. Challenges (Considerations) Architectural Complexity: Transitioning to or implementing cell-based architecture demands a meticulous design phase, focusing on defining cell boundaries, data partitioning strategies, and inter-cell communication protocols. This complexity requires a deep understanding of distributed systems principles and may necessitate a shift in development and operational practices. Resource and Infrastructure Overhead (Higher Cost): Each cell operates with its own set of resources and infrastructure, potentially leading to increased overhead compared to shared-resource models. Optimizing resource utilization and cost-efficiency becomes paramount, especially as the number of cells grows. Inter-Cell Communication Management: Ensuring coherent and efficient communication between cells without introducing tight coupling or significant latency is a critical challenge. Developers must design a communication layer that supports the necessary interactions while maintaining cells' independence and avoiding performance bottlenecks. Data Consistency and Synchronization: Maintaining data consistency across cells, especially in scenarios requiring global state or real-time data synchronization, adds another layer of complexity. Implementing strategies like event sourcing, CQRS (Command Query Responsibility Segregation), or distributed sagas may be necessary to address these challenges.
Specialized Tools and Practices: Operating a cell-based architecture requires specialized operational tools and practices to effectively manage multiple instances of workloads. Routing Layer Investment: A robust cell routing layer is essential for directing traffic appropriately across cells, necessitating additional investment in technology and expertise. Navigating the Trade-offs Opting for cell-based architecture involves navigating these trade-offs and evaluating whether the benefits of scalability, resilience, and operational agility outweigh the complexities of implementation and management. It is most suitable for services requiring high availability, those undergoing rapid expansion, or systems where modular scaling and failure isolation are critical. Best Practices and Pitfalls Best Practices Adopting a cell-based architecture can significantly enhance the scalability and resilience of your applications. Here are streamlined best practices for implementing this approach effectively: Begin With a Solid Foundation Treat Your Current Setup as Cell Zero: View your existing system as the initial cell, gradually introducing traffic routing and distribution across new cells. Launch With Multiple Cells: Implement more than one cell from the beginning to quickly learn and adapt to the operational dynamics of a cell-based environment. Plan for Flexibility and Growth Implement a Cell Migration Mechanism Early: Prepare for the need to move customers between cells, ensuring you can scale and adjust without disruption. Focus on Reliability Conduct a Failure Mode Analysis: Identify and assess potential failures within each cell and their impact, developing strategies to ensure robustness and minimize cross-cell effects. Ensure Independence and Security Maintain Cell Autonomy: Design cells to be self-sufficient, with dedicated resources and clear ownership, possibly by a single team. Secure Communication: Use versioned, well-defined APIs for cell interactions and enforce security policies at the API gateway level. Minimize Dependencies: Keep inter-cell dependencies low to preserve the architecture's benefits, such as fault isolation. Optimize Deployment and Operations Avoid Shared Resources: Each cell should have its own data storage to eliminate global state dependencies. Deploy in Waves: Introduce updates and deployments in phases across cells for better change management and quick rollback capabilities. By following these practices, you can leverage cell-based architecture to create systems that are not only scalable and resilient but also manageable and secure, ready to meet the challenges of modern digital demands. Common Pitfalls While cell-based architecture offers significant advantages for scalability and resilience, it also introduces specific challenges and pitfalls that organizations need to be aware of when adopting this approach: Complexity in Management and Operations Increased Operational Overhead: Managing multiple cells can introduce complexity in deployment, monitoring, and operations, requiring robust automation and orchestration tools to maintain efficiency. Consistency Management: Ensuring data consistency across cells, especially in stateful applications, can be challenging and might require sophisticated synchronization mechanisms. Initial Setup and Migration Challenges Complex Migration Process: Transitioning to a cell-based architecture from a traditional setup can be complex, requiring careful planning to avoid service disruption and data loss.
Steep Learning Curve: Teams may face a learning curve in understanding cell-based concepts and best practices, necessitating training and potentially slowing initial progress. Design and Architectural Considerations Cell Isolation: Properly isolating cells to prevent failure propagation requires meticulous design, failing which the system might not fully realize the benefits of fault isolation. Optimal Cell Size: Determining the optimal size for cells can be tricky, as overly small cells may lead to inefficiencies, while overly large cells might compromise scalability and resilience. Resource Utilization and Cost Implications Potential for Increased Costs: If not carefully managed, the duplication of resources across cells can lead to increased operational costs. Underutilization of Resources: Balancing resource allocation to prevent underutilization while avoiding over-provisioning requires continuous monitoring and adjustment. Networking and Communication Overhead Network Complexity: Cell-based architecture may introduce additional network complexity, including the need for sophisticated routing and load-balancing strategies. Inter-Cell Communication: Communication between cells, especially in geographically distributed setups, can introduce latency and requires secure, reliable networking solutions. Security and Compliance Security Configuration: Each cell's need for individual security configurations can complicate enforcing consistent security policies across the architecture. Compliance Verification: Verifying that each cell complies with regulatory requirements can be more challenging in a distributed architecture, requiring robust auditing mechanisms. Scalability vs. Cohesion Trade-Off Dependency Management: While minimizing dependencies between cells enhances fault tolerance, it can also lead to challenges in maintaining application cohesion and consistency. Data Duplication: Avoiding shared resources may result in data duplication and synchronization challenges, impacting system performance and consistency. Organizations should invest in robust planning, adopt comprehensive automation and monitoring tools, and ensure ongoing team training to mitigate these pitfalls. Understanding these challenges upfront can help design a more resilient, scalable, and efficient cell-based architecture. Cell-Based Wins in the Real World Cell-based architecture has become essential for managing scalability and ensuring system resilience, from high-growth startups to tech giants like Amazon and Facebook. This architectural model has been adopted across various industries, reflecting its effectiveness in handling large-scale, critical workloads. Here's a brief look at how DoorDash, Slack, and Roblox have implemented cell-based architecture to address their unique challenges. DoorDash's Transition to Cell-Based Architecture Faced with the demands of hypergrowth, DoorDash migrated from a monolithic system to a cell-based architecture, marking a pivotal shift in its operational strategy. This transition, known as Project SuperCell, was driven by the need to efficiently manage fluctuating demand and maintain consistent service reliability across diverse markets. By leveraging AWS's cloud infrastructure, DoorDash was able to isolate failures within individual cells, preventing widespread system disruptions.
It significantly enhanced their ability to scale resources and maintain service reliability, even during peak times, demonstrating the transformative potential of adopting a cell-based approach. Slack's Migration to Cell-Based Architecture Slack underwent a major shift to a cell-based architecture to lessen the impact of gray failures and boost service redundancy. The move was prompted by a review of a network outage, which revealed the risks of depending solely on a single availability zone. The new cellular structure aims to confine failures more effectively and minimize the extent of potential site outages. With the adoption of isolated services in each availability zone, Slack has enabled its internal services to function independently within each zone, curtailing the fallout from outages and speeding up the recovery process. This significant redesign has markedly improved Slack's system resilience, underscoring cell-based architecture's role in ensuring high service availability and quality. Roblox's Strategic Shift to Cellular Infrastructure Roblox's shift to a cell-based architecture showcases its response to rapid growth and the need to support over 70 million daily active users with reliable, low-latency experiences. Roblox created isolated clusters within its data centers by adopting a cellular infrastructure, enhancing system resilience through service replication across cells. This setup allowed for the deactivation of non-functional cells without disrupting service, effectively containing failures. The move to cellular infrastructure has significantly boosted Roblox's system reliability, enabling the platform to offer always-on, immersive experiences worldwide. This strategy highlights the effectiveness of cell-based architecture in managing large-scale, dynamic workloads and maintaining high service quality as platforms expand. These examples from DoorDash, Slack, and Roblox illustrate the strategic value of cell-based architecture in addressing the challenges of scale and reliability. By isolating workloads into independent cells, these companies have achieved greater scalability, fault tolerance, and operational efficiency, showcasing the effectiveness of this approach in supporting dynamic, high-demand services. Key Takeaways Cell-based architecture represents a transformative approach for organizations aiming to achieve hyper-scalability and resilience in the digital era. Companies like Amazon, Facebook, DoorDash, and Slack have demonstrated its efficacy in managing hypergrowth and ensuring uninterrupted service by segmenting systems into independent, self-sufficient cells. This architectural strategy facilitates dynamic scaling and robust fault isolation, but it also demands careful consideration of increased complexity, resource allocation, and the need for specialized operational tools. As businesses continue to navigate the demands of digital growth, the adoption of cell-based architecture emerges as a strategic solution for sustaining operational integrity and delivering consistent user experiences amidst the ever-evolving digital landscape. Acknowledgments This article draws upon the collective knowledge and experiences of industry leaders and practitioners, including insights from technical blogs, case studies from companies like Amazon, Slack, and DoorDash, and contributions from the wider tech community.
References https://docs.aws.amazon.com/wellarchitected/latest/reducing-scope-of-impact-with-cell-based-architecture/reducing-scope-of-impact-with-cell-based-architecture.html https://github.com/wso2/reference-architecture/blob/master/reference-architecture-cell-based.md https://newsletter.systemdesign.one/p/cell-based-architecture https://highscalability.com/cell-architectures/ https://www.youtube.com/watch?v=ReRrhU-yRjg https://slack.engineering/slacks-migration-to-a-cellular-architecture/ https://blog.roblox.com/2023/12/making-robloxs-infrastructure-efficient-resilient/
In the contemporary data landscape, characterized by vast volumes of diverse data sources, the necessity of anomaly detection intensifies. As organizations aggregate substantial datasets from disparate origins, the identification of anomalies assumes a pivotal role in reinforcing security protocols, streamlining operational workflows, and upholding stringent quality standards. Through the application of sophisticated methodologies encompassing statistical analysis, machine learning, and data visualization, anomaly detection emerges as a potent instrument for uncovering latent insights, mitigating risks, and facilitating real-time decision-making processes. This article centers on a focused application scenario: the detection of anomalies within a video/audio streaming platform to gauge real-time content delivery quality. Our objective is clear: to assess the quality of streaming video/audio content, ultimately enhancing the customer experience. Central to this discussion is the utilization of Quality of Service (QoS) metrics, complemented by GEO-IP services, to enrich data capture and facilitate proactive monitoring, detection, and intervention. What Is Quality of Service? Quality of service (QoS) refers to the measurement of the precision and reliability of the services provided to a platform, assessed through various metrics. It's a commonly employed concept in networking circles to ensure the optimal performance of a platform. This article focuses on establishing QoS metrics tailored specifically for video or audio content. We achieve this by extracting necessary metrics at the client edge (customer devices) and enhancing their attributes to provide deeper insights for business purposes. Why Quality of Service? The importance of "quality of service" lies in its ability to fulfill the specific needs of consumers. For instance, when customers are enjoying a live sports event through OTT streaming platforms like YouTube, it becomes paramount for the streaming company to assess the video quality across various regions. This necessity extends beyond video streaming to other sectors such as podcasting, audiobooks, and even award-show streaming services. How QoS Metrics Can Help in Anomaly Detection Integral to anomaly detection, QoS metrics furnish essential data and insights to pinpoint abnormal behavior and potential security risks across applications, systems, and networks. Continuous monitoring of metrics such as buffering ratio, bandwidth, and throughput enables the detection of anomalies through deviations from established thresholds or behavioral patterns, triggering alerts for swift intervention. Furthermore, QoS metrics facilitate root cause analysis by pinpointing underlying causes of anomalies, guiding the formulation of effective corrective actions. We need to design a solution that identifies anomalies across three states of interest for a streaming platform, New York, New Jersey, and Tamil Nadu, and ensures smooth streaming quality. We will leverage AWS components to complement this solution. How Can We Solve This Problem Using Streaming Architecture? To comprehensively analyze the situation, we require additional attributes beyond just geographical location. For instance, in cases of streaming quality issues, organizations must ascertain whether the problem stems from the Internet Service Provider or if it is linked to recent code releases, potentially affecting specific operating systems on devices.
Overall, there's a need for a Quality of Service (QoS) API service capable of collecting pertinent data from the client devices and relaying it to an API, which in turn disseminates these attributes to downstream components. With the initial details provided by the client, the downstream components can enhance the dataset. The JSON object below illustrates the basic information transmitted by the client device for a single event. Sample JSON event from client device:

JSON

{
  "video_start_time": "2023-09-10 10:30:30",
  "video_end_time": "2023-09-10 10:30:33",
  "total_play_time_mins": "60",
  "uip": "10.122.9.22",
  "video_id": "xxxxxxxxxxxxxxx",
  "device_type": "ios",
  "device_model": "iphone11"
}

Architecture Option 1 The application code on the device can call the API Gateway, linked to a Kinesis proxy, which connects to a Kinesis stream. This setup facilitates near real-time analysis of client data at this layer. Subsequently, data transformation can occur using a Lambda function, followed by storage in S3 for further analysis. This architecture addresses two primary use cases: first, the capability to analyze incoming QoS data in near real-time through the Kinesis stream, leveraging AWS tools like Kinesis Analytics for ad-hoc analytics with reduced latency; second, the ability to write data to S3 using simple Lambda code, allowing batch analytics to be conducted. The approach outlined above effectively addresses scalability concerns in a streaming solution by leveraging various AWS components. In our specific use case, enriching incoming data with geo-IP locations is essential, since we need information like country, state, and ISP. To achieve this, we can utilize a geo API, such as MaxMind, to incorporate geo-location, IP address, and other relevant dimensions. Alternatively, let's explore an architecture that assumes analytics are performed every minute, eliminating the need for a streaming layer and focusing solely on a delivery layer. Architecture Option 2 In this scenario, we'll illustrate the process of enriching data with geo and ISP-specific attributes to facilitate anomaly detection. Clients initiate the process by calling the API Gateway and passing along the relevant attributes. These values are then transmitted to Kinesis Firehose via the Kinesis proxy. A transformation Lambda function within Kinesis Firehose executes a straightforward Python script to retrieve geo-IP details from the MaxMind service. Subsequently, Kinesis Firehose batches the data and transfers it to S3. S3 serves as the central repository of truth for anomaly detection, housing all the necessary data for analysis. Below is a sample code snippet for calling the service to retrieve geo-IP details. As depicted, the code primarily centers on retrieving information from the MaxMind .mmdb files supplied by the provider. Various methods exist for obtaining geo-IP data; in this instance, I've chosen to have the .mmdb files accessible via an S3 path. Alternatively, you can opt to retrieve them through API calls. The enriched data is then returned to Kinesis Firehose, where it undergoes batching, compression, and subsequent delivery to S3.
Python

import base64
import json
import urllib.request

import geoip2.database

s3_city_url = "<maxmind_s3_url_path_for_city_details_mmdb_file>"
s3_isp_url = "<maxmind_s3_url_path_for_isp_details_mmdb_file>"

# Download the MaxMind databases once per Lambda container and open a
# reader for each, so individual records don't pay the download cost.
city_db_path, _ = urllib.request.urlretrieve(s3_city_url, "/tmp/city.mmdb")
isp_db_path, _ = urllib.request.urlretrieve(s3_isp_url, "/tmp/isp.mmdb")
city_reader = geoip2.database.Reader(city_db_path)
isp_reader = geoip2.database.Reader(isp_db_path)


def qos_handler(event, context):
    def enrich_record(record):
        try:
            # Firehose delivers each record base64-encoded.
            decoded_data = base64.b64decode(record['data'])
            streaming_event = json.loads(decoded_data.decode("utf-8"))

            # Look up city/geo and ISP attributes for the client IP.
            city_response = city_reader.city(streaming_event['uip'])
            isp_response = isp_reader.isp(streaming_event['uip'])

            streaming_event['cityname'] = city_response.city.name
            streaming_event['postalcode'] = city_response.postal.code
            streaming_event['metrocode'] = city_response.location.metro_code
            streaming_event['timezone'] = city_response.location.time_zone
            streaming_event['countryname'] = city_response.country.name
            streaming_event['countryisocode'] = city_response.country.iso_code
            streaming_event['origip'] = streaming_event['uip']
            streaming_event['ispname'] = isp_response.isp

            json_data = json.dumps(streaming_event)
            encoded = base64.b64encode(json_data.encode("utf-8"))
            return {
                'recordId': record['recordId'],
                'result': "Ok",
                'data': encoded.decode("utf-8")
            }
        except Exception as e:
            print("enrichment failed:", e)
            # Flag the record so Firehose can route it to its error output.
            return {
                'recordId': record['recordId'],
                'result': "ProcessingFailed",
                'data': record['data']
            }

    output = [enrich_record(record) for record in event['records']]
    return {'records': output}

Analytics on Streamed Data After the data reaches S3, we can conduct ad-hoc analytics on it. Various options are available for analyzing the data once it resides in S3. It can be loaded into a data warehousing platform such as Redshift or Snowflake. Alternatively, if a data lake or data mesh serves as the source of truth, the data can be replicated there. During the analysis in S3, we primarily calculate the buffering ratio using the following formula:

Plain Text

The ratio is obtained by dividing the buffering time by the total playtime.

In our example:
"video_start_time":"2023-09-10 10:30:30",
"video_end_time":"2023-09-10 10:30:33",
"total_play_time_mins" : "60",

Buffering_ratio = diff(video_end_time, video_start_time) / total_play_time
Buffering_ratio = 3 seconds / 3,600 seconds ≈ 0.00083

Detecting Anomalies To continue further, the following attributes will be available as rows in tabular format during the ETL operation at the data warehousing (DWH) stage. These values will be stored for each video/audio ID. By establishing a materialized view over the set of records stored across a certain period, we can compute the average value and percentiles of the buffering ratio metric mentioned earlier. Sample JSON event with buffering ratio:

JSON

{
  "video_start_time": "2023-09-10 10:30:30",
  "video_end_time": "2023-09-10 10:30:33",
  "total_play_time_mins": "60",
  "uip": "10.122.9.22",
  "video_id": "xxxxxxxxxxxxxxx",
  "device_type": "ios",
  "device_model": "iphone11",
  "buffering_ratio": "0.00083",
  "isp": "isp1",
  "country": "USA",
  "state": "NJ"
}

For simplicity, let's focus on one metric, the buffering ratio, to gauge the streaming quality of sports matches or podcasts for customers.
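To make the detection step concrete, here is a minimal Python sketch; the sample events, field names, and the z-score threshold are illustrative assumptions rather than the production pipeline, which would read these rows from the warehouse table or materialized view described above:

Python

from statistics import mean, stdev

# Illustrative per-event rows; in practice these come from the DWH/materialized view.
events = [
    {"state": "NY", "buffering_ratio": 0.090},
    {"state": "NY", "buffering_ratio": 0.110},
    {"state": "NJ", "buffering_ratio": 0.010},
    {"state": "NJ", "buffering_ratio": 0.015},
    {"state": "TN", "buffering_ratio": 0.020},
    {"state": "TN", "buffering_ratio": 0.018},
]

# Aggregate buffering ratios per state.
per_state = {}
for e in events:
    per_state.setdefault(e["state"], []).append(e["buffering_ratio"])
state_avgs = {state: mean(values) for state, values in per_state.items()}

# Flag states whose average deviates sharply from the overall average.
overall_mean = mean(state_avgs.values())
overall_std = stdev(state_avgs.values())
for state, avg in state_avgs.items():
    z_score = (avg - overall_mean) / overall_std
    if abs(z_score) > 1.0:  # illustrative threshold, tuned per business need
        print(f"anomaly: {state} avg buffering ratio {avg:.4f} (z={z_score:.2f})")

Run against this sample data, the check flags NY, which mirrors the walkthrough that follows.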
After capturing the real-time events and visualizing the tabular data, it becomes clear that NY exhibits a higher buffering ratio (out of the three states the organization is interested in), indicating that viewers may experience sluggish content delivery. This observation prompts further investigation into potential issues related to ISPs or networking by delving into other dimensions gathered from GEO-IP or device attributes. As a first step, content providers delve deeper into geographical dimensions at the city level and identify that Manhattan has the highest buffering ratio among the NY cities examined. Following this, content providers delve into the metrics associated with internet service provider (ISP) details specifically for Manhattan to identify potential causes. This examination uncovers that ISP1 exhibited a higher buffering ratio, and upon further investigation, it appears that ISP1 encountered internet speed issues only in Manhattan. These proactive analyses empower content providers to detect anomalies and evaluate their repercussions on consumers in particular regions, enabling proactive outreach to affected consumers. Comparable analyses can be expanded to other factors such as device types and models. These steps demonstrate how anomaly detection can be carried out with robust data engineering, streaming solutions, and business intelligence in place. This data can, in turn, be fed to machine learning algorithms for enhanced detection. Conclusion This article delved into leveraging QoS metrics for anomaly detection during content streaming in video or audio applications. A particular emphasis was placed on enriching data with GEO-IP details using the MaxMind service, facilitating issue triage across specific dimensions such as country, state, county, or ISP. Architectural options were also presented for implementing streaming solutions, accommodating both ad-hoc near real-time and batch analytics to pinpoint anomalies. I trust this article serves as a helpful starting point for exploring anomaly detection approaches within your organization. Notably, the discussed solution extends beyond OTT platforms, being applicable to diverse domains such as the financial sector, where near real-time anomaly detection is essential.
In an era where the pace of software development and deployment is accelerating, the significance of having a robust and integrated DevOps environment cannot be overstated. Azure DevOps, Microsoft's suite of cloud-based DevOps services, is designed to support teams in planning work, collaborating on code development, and building and deploying applications with greater efficiency and reduced lead times. The objective of this blog post is twofold: first, to introduce Azure DevOps, shedding light on its components and how they converge to form a powerful DevOps ecosystem, and second, to provide a balanced perspective by delving into the advantages and potential drawbacks of adopting Azure DevOps. Whether you're contemplating the integration of Azure DevOps into your workflow or seeking to optimize your current DevOps practices, this post aims to equip you with a thorough understanding of what Azure DevOps has to offer, helping you make an informed decision tailored to your organization's unique requirements. What Is Azure DevOps? Azure DevOps represents the evolution of Visual Studio Team Services, capturing over 20 years of investment and learning in providing tools to support software development teams. As a cornerstone in the realm of DevOps solutions, Azure DevOps offers a suite of tools catering to the diverse needs of software development teams. Microsoft provides this product in the Cloud with Azure DevOps Services or on-premises with Azure DevOps Server. It offers integrated features accessible through a web browser or IDE client. At its core, Azure DevOps comprises five key components, each designed to address specific aspects of the development process. These components are not only powerful in isolation but also offer enhanced benefits when used together, creating a seamless and integrated experience for users. Azure Boards It offers teams a comprehensive solution for project management, including agile planning, work item tracking, and visualization tools. It enables teams to plan sprints, track work with Kanban boards, and use dashboards to gain insights into their projects. This component fosters enhanced collaboration and transparency, allowing teams to stay aligned on goals and progress. Azure Repos It is a set of version control tools designed to manage code efficiently. It provides Git (distributed version control) or Team Foundation Version Control (centralized version control) for source code management. Developers can collaborate on code, manage branches, and track version history with complete traceability. This component ensures streamlined and accessible code management, allowing teams to focus on building rather than merely managing their codebase. Azure Pipelines Azure Pipelines automates the stages of the application's lifecycle, from continuous integration and continuous delivery to continuous testing, build, and deployment. It supports any language, platform, and cloud, offering a flexible solution for deploying code to multiple targets such as virtual machines, various environments, containers, on-premises, or PaaS services. With Azure Pipelines, teams can ensure that code changes are automatically built, tested, and deployed, facilitating faster and more reliable software releases. Azure Test Plans Azure Test Plans provide a suite of tools for test management, enabling teams to plan and execute manual, exploratory, and automated testing within their CI/CD pipelines. 
Furthermore, Azure Test Plans ensure end-to-end traceability by linking test cases and suites to user stories, features, or requirements. They facilitate comprehensive reporting and analysis through configurable tracking charts, test-specific widgets, and built-in reports, empowering teams with actionable insights for continuous improvement and thus providing a framework for rigorous testing that ensures applications meet the highest standards before release. Azure Artifacts It allows teams to manage and share software packages and dependencies across the development lifecycle, offering a streamlined approach to package management. This feature supports various package formats, including npm, NuGet, Python, Cargo, Maven, and Universal Packages, fostering efficient development processes. This service not only accelerates development cycles but also enhances reliability and reproducibility by providing a reliable source for package distribution and version control, ultimately empowering teams to deliver high-quality software products with confidence. Below is an example of an architecture leveraging various Azure DevOps services (image captured from Microsoft). Benefits of Leveraging Azure DevOps Azure DevOps presents a compelling array of benefits that cater to the multifaceted demands of modern software development teams. Its comprehensive suite of tools is designed to streamline and optimize various stages of the development lifecycle, fostering efficiency, collaboration, and quality. Here are some of the key advantages: Seamless Integration One of Azure DevOps' standout features is its ability to seamlessly integrate with a plethora of tools and platforms, whether they are from Microsoft or other vendors. This interoperability is crucial for anyone who uses a diverse set of tools in their development processes. Scalability and Flexibility Azure DevOps is engineered to scale alongside your business. Whether you're working on small projects or large enterprise-level solutions, Azure DevOps can handle the load, providing the same level of performance and reliability. This scalability is a vital attribute for enterprises that foresee growth or experience fluctuating demands. Enhanced Collaboration and Visibility Collaboration is at the heart of Azure DevOps. With features like Azure Boards, teams can have a centralized view of their projects, track progress, and coordinate efforts efficiently. This visibility is essential for aligning cross-functional teams, managing dependencies, and ensuring that everyone is on the same page. Continuous Integration and Deployment (CI/CD) Azure Pipelines provides robust CI/CD capabilities, enabling teams to automate the building, testing, and deployment of their applications. This automation is crucial for accelerating time-to-market and improving software quality. By automating these processes, teams can detect and address issues early, reduce manual errors, and ensure that the software is always in a deployable state, thereby enhancing operational efficiency and software reliability. Drawbacks of Azure DevOps While Azure DevOps offers a host of benefits, it's essential to acknowledge and understand its potential drawbacks. Like any tool or platform, it may not be the perfect fit for every organization or scenario.
Here are some of the disadvantages that one might encounter: Vendor Lock-In By adopting Azure DevOps services for project management, version control, continuous integration, and deployment, organizations may find themselves tightly integrated into the Microsoft ecosystem. This dependency could limit flexibility and increase reliance on Microsoft's tools and services, making it challenging to transition to alternative platforms or technologies in the future. Integration Challenges Although Azure DevOps boasts impressive integration capabilities, there can be challenges when interfacing with certain non-Microsoft or legacy systems. Some integrations may require additional customization or the use of third-party tools, potentially leading to increased complexity and maintenance overhead. For organizations heavily reliant on non-Microsoft products, this could pose integration and workflow continuity challenges. Cost Considerations Azure DevOps operates on a subscription-based pricing model, which, while flexible, can become a significant expense at scale, especially for larger teams or enterprises with extensive requirements. The cost can escalate based on the number of users, the level of access needed, and the use of additional features and services. For smaller teams or startups, the pricing may be a considerable factor when deciding whether Azure DevOps is the right solution for their needs. Potential for Over-Complexity With its myriad of features and tools, there's a risk of over-complicating workflows and processes within Azure DevOps. Teams may find themselves navigating through a plethora of options and configurations, which, if not properly managed, can lead to inefficiency rather than improved productivity. Organizations must strike a balance between leveraging Azure DevOps' capabilities and maintaining simplicity and clarity in their processes. While these disadvantages are noteworthy, they do not necessarily diminish the overall value that Azure DevOps can provide to an organization. It's crucial for organizations to carefully assess their specific needs, resources, and constraints when considering Azure DevOps as their solution. By acknowledging these potential drawbacks, organizations can plan effectively, ensuring that their adoption of Azure DevOps is strategic, well-informed, and aligned with their operational goals and challenges. Conclusion In the landscape of modern software development, Azure DevOps stands out as a robust and comprehensive platform, offering a suite of tools designed to enhance and streamline the DevOps process. Its integration capabilities, scalability, and extensive features make it an attractive choice for any organization or enterprise. However, like any sophisticated platform, Azure DevOps comes with its own set of challenges and considerations. The vendor lock-in, integration complexities, cost factors, and potential for over-complexity are aspects that organizations need to weigh carefully. It's crucial for enterprises to undertake a thorough analysis of their specific needs, resources, and constraints when evaluating Azure DevOps as a solution. The decision to adopt Azure DevOps should be guided by a strategic assessment of how well its advantages align with the organization's goals and how its disadvantages might impact operations.
For many enterprises, the benefits of streamlined workflows, enhanced collaboration, and improved efficiency will outweigh the drawbacks, particularly when the adoption is well-planned and aligned with the organization's objectives.
The ExecutorService in Java provides a flexible and efficient framework for asynchronous task execution. It abstracts away the complexities of managing threads manually and allows developers to focus on the logic of their tasks. Overview The ExecutorService interface is part of the java.util.concurrent package and represents an asynchronous task execution service. It extends the Executor interface, which defines a single method execute(Runnable command) for executing tasks. Executors Executors is a utility class in Java that provides factory methods for creating and managing different types of ExecutorService instances. It simplifies the process of instantiating thread pools and allows developers to easily create and manage executor instances with various configurations. The Executors class provides several static factory methods for creating different types of executor services: FixedThreadPool: Creates an ExecutorService with a fixed number of threads. Tasks submitted to this executor are executed concurrently by the specified number of threads. If a thread is idle and no tasks are available, it remains alive but dormant until needed. Java ExecutorService executor = Executors.newFixedThreadPool(5); CachedThreadPool: Creates an ExecutorService with an unbounded thread pool that automatically adjusts its size based on the workload. Threads are created as needed and reused for subsequent tasks. If a thread remains idle for a certain period, it may be terminated to reduce resource consumption. In a cached thread pool, submitted tasks are not queued but immediately handed off to a thread for execution. If no threads are available, a new one is created. If a server is so heavily loaded that all of its CPUs are fully utilized, and more tasks arrive, more threads will be created, which will only make matters worse. Idle threads are kept alive for 60 seconds by default; a thread that remains idle beyond that is terminated. Therefore, on a heavily loaded production server, you are much better off using Executors.newFixedThreadPool, which gives you a pool with a fixed number of threads, or using the ThreadPoolExecutor class directly, for maximum control. Java ExecutorService executor = Executors.newCachedThreadPool(); SingleThreadExecutor: Creates an ExecutorService with a single worker thread. Tasks are executed sequentially by this thread in the order they are submitted. This executor is useful for tasks that require serialized execution or have dependencies on each other. Java ExecutorService executor = Executors.newSingleThreadExecutor(); ScheduledThreadPool: Creates an ExecutorService that can schedule tasks to run after a specified delay or at regular intervals. It provides methods for scheduling tasks with a fixed delay or at a fixed rate, allowing for periodic execution of tasks (see the sketch after this list). newWorkStealingPool: Creates a work-stealing thread pool with the target parallelism level. This executor is based on the ForkJoinPool and is capable of dynamically adjusting its thread pool size to utilize all available processor cores efficiently (see the sketch after this list). Overall, the Executors class simplifies the creation and management of executor instances.
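Neither of the last two factory methods has an inline example above, so here is a quick sketch of both in the same style (the parallelism level of 4 is an illustrative value): Java ScheduledExecutorService scheduler = Executors.newScheduledThreadPool(2); ExecutorService workStealingPool = Executors.newWorkStealingPool(4); A fuller ScheduledThreadPool example appears in the ScheduledThreadPool section below.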
ExecutorService Tasks can be submitted to an ExecutorService for execution. These tasks are typically instances of Runnable or Callable, representing units of work that need to be executed asynchronously. Below are the methods in ExecutorService. 1. execute(Runnable command): Executes the given task asynchronously. Java ExecutorService executor = Executors.newFixedThreadPool(5); executor.execute(() -> { System.out.println("Task executed asynchronously"); }); 2. submit(Callable<T> task): Submits a task for execution and returns a Future representing the pending result of the task. Java ExecutorService executor = Executors.newSingleThreadExecutor(); Future<Integer> future = executor.submit(() -> { // Task logic return 42; }); 3. shutdown(): Initiates an orderly shutdown of the ExecutorService, allowing previously submitted tasks to execute before terminating. 4. shutdownNow(): Attempts to stop all actively executing tasks, halts the processing of waiting tasks, and returns a list of the tasks that were awaiting execution. Java List<Runnable> pendingTasks = executor.shutdownNow(); 5. awaitTermination(long timeout, TimeUnit unit): Blocks until all tasks have completed execution after a shutdown request, or the timeout occurs, or the current thread is interrupted, whichever happens first. Java boolean terminated = executor.awaitTermination(10, TimeUnit.SECONDS); if (terminated) { System.out.println("All tasks have completed execution"); } else { System.out.println("Timeout occurred before all tasks completed"); } 6. invokeAny(Collection<? extends Callable<T>> tasks): Executes the given tasks, returning the result of one that successfully completes. This method is useful when we have multiple tasks to run but only care about the result of whichever one completes successfully first; the remaining tasks are cancelled. Java ExecutorService executor = Executors.newCachedThreadPool(); Set<Callable<String>> callables = new HashSet<>(); callables.add(() -> "Task 1"); callables.add(() -> "Task 2"); String result = executor.invokeAny(callables); System.out.println("Result: " + result); 7. invokeAll(Collection<? extends Callable<T>> tasks): Executes the given tasks, returning a list of Future objects representing their pending results. Java List<Callable<Integer>> tasks = Arrays.asList(() -> 1, () -> 2, () -> 3); List<Future<Integer>> futures = executor.invokeAll(tasks); for (Future<Integer> future : futures) { System.out.println("Result: " + future.get()); } Implementations The ExecutorService interface is typically implemented by various classes provided by the Java concurrency framework, such as ThreadPoolExecutor, ScheduledThreadPoolExecutor, and ForkJoinPool. Considerations Configure the thread pool size carefully to avoid underutilization or excessive resource consumption. Consider factors such as task submission rate, task priority, resource constraints, and the desired behavior in case of queue overflow. Choose the queue type that best meets your application's requirements for scalability, performance, and resource utilization. Handle exceptions and task cancellation properly to ensure robustness and reliability. Understand the concurrency semantics and potential thread safety issues in concurrent code. To create an instance of ExecutorService, we can pass a ThreadFactory and the task queue to be used while creating the pool. A ThreadFactory is an interface used to create new threads. It provides a way to encapsulate the logic for creating threads, allowing for customization of thread creation behavior. The primary purpose of a ThreadFactory is to decouple the thread creation process from the rest of the application logic, making it easier to manage and customize thread creation. It is preferable to pass a custom ThreadFactory, as it helps in setting a thread name prefix and priority if required.
Java static final String prefix = "app.name.task"; ExecutorService executorService = Executors.newFixedThreadPool(5, r -> { Thread t = new Thread(r); t.setName(prefix + "-" + t.getId()); // Customize the thread name if needed return t; }); TaskQueues When tasks are submitted to an ExecutorService and none of the threads in the pool are available to process them, the tasks are stored in a queue. Below are the different queue options to choose from, followed by a sketch showing how a queue is wired into a pool. Unbounded Queue: An unbounded queue, such as LinkedBlockingQueue, has no fixed capacity and can grow dynamically to accommodate an unlimited number of tasks. It is suitable for scenarios where the task submission rate is unpredictable or where tasks need to be queued indefinitely without the risk of rejection due to queue overflow. However, keep in mind that unbounded queues can potentially lead to memory exhaustion if tasks are submitted at a faster rate than they can be processed. Bounded Queue: A bounded queue, such as ArrayBlockingQueue with a specified capacity, has a fixed size limit and can only hold a finite number of tasks. It is suitable for scenarios where resource constraints or backpressure mechanisms need to be enforced to prevent excessive memory usage or system overload. Tasks may be rejected or handled according to a specified rejection policy when the queue reaches its capacity. Priority Queue: A priority queue, such as PriorityBlockingQueue, orders tasks based on their priority or a specified comparator. It is suitable for scenarios where tasks have different levels of importance or urgency, and higher-priority tasks need to be processed before lower-priority ones. Priority queues ensure that tasks are executed in the order of their priority, regardless of their submission order. Synchronous Queue: A synchronous queue, such as SynchronousQueue, is a special type of queue that enables one-to-one task handoff between producer and consumer threads. It has a capacity of zero and requires both a producer and a consumer to be available simultaneously for task exchange to occur. Synchronous queues are suitable for scenarios where strict synchronization and coordination between threads are required, such as handoff between thread pools or bounded resource access.
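Here is a minimal sketch tying these ideas together: a ThreadPoolExecutor built directly with a bounded queue, a custom ThreadFactory, and a rejection policy. The pool sizes, queue capacity, and policy are illustrative choices, not recommendations.

Java

import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class BoundedQueuePoolSketch {
    public static void main(String[] args) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                4,                             // core pool size
                8,                             // maximum pool size
                60L, TimeUnit.SECONDS,         // keep-alive for threads above the core size
                new ArrayBlockingQueue<>(100), // bounded queue: at most 100 waiting tasks
                r -> {                         // custom ThreadFactory for named threads
                    Thread t = new Thread(r);
                    t.setName("app.name.task-" + t.getId());
                    return t;
                },
                new ThreadPoolExecutor.CallerRunsPolicy()); // on overflow, the submitter runs the task

        pool.execute(() -> System.out.println("Running on " + Thread.currentThread().getName()));
        pool.shutdown();
    }
}

With CallerRunsPolicy, queue overflow produces natural backpressure: the submitting thread executes the rejected task itself and is therefore slowed down, instead of tasks being dropped silently.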
ScheduledThreadPool The ScheduledThreadPoolExecutor inherits thread pool management capabilities from ThreadPoolExecutor and provides functionality for scheduling tasks to run after a given delay or periodically at defined intervals. Here's a detailed explanation: Runnable and Callable Tasks: You define the tasks you want to schedule using these interfaces, just as with a regular ExecutorService. ScheduledFuture: This interface represents the result of a scheduled task submission. It allows checking the task's completion status, canceling the task before execution, and (for Callable tasks) retrieving the result upon completion. Scheduling Capabilities schedule(Runnable task, long delay, TimeUnit unit): Schedules a Runnable task to be executed after a specified delay in the given time unit (e.g., seconds, milliseconds). scheduleAtFixedRate(Runnable command, long initialDelay, long period, TimeUnit unit): Schedules a fixed-rate execution of a Runnable task. The task is first executed after the initialDelay, and subsequent executions occur with a constant period between them. scheduleWithFixedDelay(Runnable command, long initialDelay, long delay, TimeUnit unit): Schedules a fixed-delay execution of a Runnable task. Similar to scheduleAtFixedRate, but the delay is measured between the completion of the previous execution and the start of the next. Key Considerations Thread Pool Management: ScheduledThreadPoolExecutor maintains a fixed-sized thread pool by default. You can configure the pool size during object creation. Delayed Execution: Scheduled tasks are not guaranteed to execute precisely at the specified time. The actual execution time might be slightly different due to factors like thread availability and workload. Long-Running Executions: With fixed-rate scheduling, if the task execution time exceeds the period, subsequent executions may start late, but they will not run concurrently. Cancellation: You can cancel a scheduled task using the cancel method of the returned ScheduledFuture object (a short sketch follows the example below). However, cancellation success depends on the task's state (not yet started, running, etc.). Java import java.util.concurrent.ScheduledExecutorService; import java.util.concurrent.Executors; import java.util.concurrent.TimeUnit; public class ScheduledThreadPoolExample { public static void main(String[] args) throws InterruptedException { // Create a ScheduledThreadPoolExecutor with 2 threads ScheduledExecutorService scheduler = Executors.newScheduledThreadPool(2); // Schedule a task with a 2-second delay Runnable task1 = () -> System.out.println("Executing task 1 after a delay"); scheduler.schedule(task1, 2, TimeUnit.SECONDS); // Schedule a task to run every 5 seconds with a fixed rate Runnable task2 = () -> System.out.println("Executing task 2 at fixed rate"); scheduler.scheduleAtFixedRate(task2, 1, 5, TimeUnit.SECONDS); // Schedule a task to run every 3 seconds with a fixed delay Runnable task3 = () -> System.out.println("Executing task 3 with fixed delay"); scheduler.scheduleWithFixedDelay(task3, 0, 3, TimeUnit.SECONDS); // Wait for some time to allow tasks to be executed Thread.sleep(15000); // Shutdown the scheduler scheduler.shutdown(); } }
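As noted under Cancellation, a periodic task can be stopped through its ScheduledFuture handle. Below is a brief sketch of this; the timings are arbitrary values chosen for illustration.

Java

import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

public class CancelScheduledTaskSketch {
    public static void main(String[] args) throws InterruptedException {
        ScheduledExecutorService scheduler = Executors.newScheduledThreadPool(1);

        // A periodic task that ticks every second.
        ScheduledFuture<?> handle = scheduler.scheduleAtFixedRate(
                () -> System.out.println("tick"), 0, 1, TimeUnit.SECONDS);

        Thread.sleep(3500); // let the task run a few times

        // Cancel the schedule; pass true instead of false to also interrupt an in-flight run.
        boolean cancelled = handle.cancel(false);
        System.out.println("Cancelled: " + cancelled);

        scheduler.shutdown();
    }
}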
Shut Down ExecutorService Gracefully To efficiently shut down an ExecutorService, you can follow these steps: Call the shutdown() method to initiate the shutdown process. This method allows previously submitted tasks to execute before terminating but prevents the submission of new tasks. Call the shutdownNow() method if you want to force the ExecutorService to terminate immediately. This method attempts to stop all actively executing tasks, halts the processing of waiting tasks, and returns a list of the tasks that were awaiting execution but were never started. Await termination by calling the awaitTermination() method. This method blocks until all tasks have completed execution after a shutdown request, or the timeout occurs, or the current thread is interrupted, whichever happens first. Here's an example: Java ExecutorService executor = Executors.newFixedThreadPool(10); // Execute tasks using the executor // Shutdown the executor executor.shutdown(); try { // Wait for all tasks to complete or timeout after a certain period if (!executor.awaitTermination(60, TimeUnit.SECONDS)) { // If the timeout occurs, force shutdown executor.shutdownNow(); // Optionally, wait for the tasks to be forcefully terminated if (!executor.awaitTermination(60, TimeUnit.SECONDS)) { // Log a message indicating that some tasks failed to terminate } } } catch (InterruptedException ex) { // Log the interruption executor.shutdownNow(); // Preserve interrupt status Thread.currentThread().interrupt(); } In summary, ExecutorService is a versatile framework that helps developers write efficient, scalable, and maintainable concurrent code.