Microservices

A microservices architecture is a development method for designing applications as modular services that seamlessly adapt to a highly scalable and dynamic environment. Microservices help solve complex issues such as speed and scalability, while also supporting continuous testing and delivery. This Zone will take you through breaking down the monolith step by step and designing a microservices architecture from scratch. Stay up to date on the industry's changes with topics such as container deployment, architectural design patterns, event-driven architecture, service meshes, and more.

Latest Refcards and Trend Reports
Trend Report: Microservices and Containerization
Refcard #379: Getting Started With Serverless Application Architecture
Refcard #346: Microservices and Workflow Engines

DZone's Featured Microservices Resources

Integrating AWS Secrets Manager With Spring Boot
By Kushagra Shandilya
In a microservices architecture, it's common to have multiple services that need access to sensitive information, such as API keys, passwords, or certificates. Storing this sensitive information in code or configuration files is not secure because it's easy for attackers to gain access to it if they can access your source code or configuration files. To protect sensitive information, microservices often use a secrets management system, such as Amazon Secrets Manager, to securely store and manage this information. Secrets management systems provide a secure and centralized way to store and manage secrets, and they typically provide features such as encryption, access control, and auditing.

Amazon Secrets Manager is a fully managed service that makes it easy to store and retrieve secrets, such as database credentials, API keys, and other sensitive information. It provides a secure and scalable way to store secrets and integrates with other AWS services to enable secure access to these secrets from your applications and services. Some benefits of using Amazon Secrets Manager in your microservices include:

- Centralized management: You can store all your secrets in a central location, which makes it easier to manage and rotate them.
- Fine-grained access control: You can control who has access to your secrets and use AWS Identity and Access Management (IAM) policies to grant or revoke access as needed.
- Automatic rotation: You can configure Amazon Secrets Manager to automatically rotate your secrets on a schedule, which reduces the risk of compromised secrets.
- Integration with other AWS services: You can use Amazon Secrets Manager to securely access secrets from other AWS services, such as Amazon RDS or AWS Lambda.

Overall, using a secrets management system like Amazon Secrets Manager can help improve the security of your microservices by reducing the risk of sensitive information being exposed or compromised. In this article, we will discuss how you can define a secret in Amazon Secrets Manager and later pull it using a Spring Boot microservice.

Creating the Secret

To create a new secret in Amazon Secrets Manager, you can follow these steps:

1. Open the Amazon Secrets Manager console by navigating to the AWS Management Console, selecting "Secrets Manager" from the list of services, and then clicking "Create secret" on the main page.
2. Choose the type of secret you want to create: You can choose between "Credentials for RDS database" or "Other type of secrets." If you select "Other type of secrets," you will need to enter a custom name for your secret.
3. Enter the secret details: The information you need to enter will depend on the type of secret you are creating. For example, if you are creating a database credential, you will need to enter the username and password for the database.
4. Configure the encryption settings: By default, Amazon Secrets Manager uses AWS KMS to encrypt your secrets. You can choose to use the default KMS key or select a custom key.
5. Define the secret permissions: You can define who can access the secret by adding one or more AWS Identity and Access Management (IAM) policies.
6. Review and create the secret: Once you have entered all the required information, review your settings and click "Create secret" to create the secret.

Alternatively, you can also create secrets programmatically using the AWS SDK or CLI.
Here's an example of how you can create a new secret using the AWS CLI:

```shell
aws secretsmanager create-secret --name my-secret --secret-string '{"username": "myuser", "password": "mypassword"}'
```

This command creates a new secret called "my-secret" with a JSON-formatted secret string containing a username and password. You can replace the secret string with any other JSON-formatted data you want to store as a secret.

You can also create these secrets from your microservice.

Add the AWS SDK for Java dependency to your project by adding the following to your pom.xml file:

```xml
<dependency>
    <groupId>com.amazonaws</groupId>
    <artifactId>aws-java-sdk-secretsmanager</artifactId>
    <version>1.12.83</version>
</dependency>
```

Initialize the AWS Secrets Manager client by adding the following code to your Spring Boot application's configuration class:

```java
@Configuration
public class AwsConfig {

    @Value("${aws.region}")
    private String awsRegion;

    @Bean
    public AWSSecretsManager awsSecretsManager() {
        return AWSSecretsManagerClientBuilder.standard()
                .withRegion(awsRegion)
                .build();
    }
}
```

This code creates a new bean for the AWS Secrets Manager client and injects the AWS region from the application.properties file.

Create a new secret by adding the following code to your Spring Boot service class:

```java
@Autowired
private AWSSecretsManager awsSecretsManager;

public void createSecret(String secretName, String secretValue) {
    CreateSecretRequest request = new CreateSecretRequest()
            .withName(secretName)
            .withSecretString(secretValue);

    CreateSecretResult result = awsSecretsManager.createSecret(request);
    String arn = result.getARN();
    System.out.println("Created secret with ARN: " + arn);
}
```

This code creates a new secret with the specified name and value. It uses the CreateSecretRequest class to specify the name and value of the secret and then calls the createSecret method of the AWS Secrets Manager client to create the secret. The method returns a CreateSecretResult object, which contains the ARN (Amazon Resource Name) of the newly created secret.

These are just some basic steps to create secrets in Amazon Secrets Manager. Depending on your use case and requirements, there may be additional configuration or setup needed.
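If you later need to store a new value for an existing secret from your service, the same v1 SDK client also exposes a put-secret-value operation. The following is a small illustrative sketch, not taken from the original article; the surrounding class is hypothetical and error handling is omitted:

```java
import com.amazonaws.services.secretsmanager.AWSSecretsManager;
import com.amazonaws.services.secretsmanager.model.PutSecretValueRequest;
import com.amazonaws.services.secretsmanager.model.PutSecretValueResult;

public class SecretUpdater {

    private final AWSSecretsManager awsSecretsManager;

    public SecretUpdater(AWSSecretsManager awsSecretsManager) {
        this.awsSecretsManager = awsSecretsManager;
    }

    // Stores a new version of an existing secret (sketch only).
    public void updateSecret(String secretName, String newSecretValue) {
        PutSecretValueRequest request = new PutSecretValueRequest()
                .withSecretId(secretName)
                .withSecretString(newSecretValue);

        PutSecretValueResult result = awsSecretsManager.putSecretValue(request);
        System.out.println("Stored new secret version: " + result.getVersionId());
    }
}
```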
Pulling the Secret Using Microservices

Here are the complete steps for pulling a secret from Amazon Secrets Manager using Spring Boot.

First, you need to add the following dependencies to your Spring Boot project:

```xml
<dependency>
    <groupId>com.amazonaws</groupId>
    <artifactId>aws-java-sdk-secretsmanager</artifactId>
    <version>1.12.37</version>
</dependency>
<dependency>
    <groupId>com.amazonaws</groupId>
    <artifactId>aws-java-sdk-core</artifactId>
    <version>1.12.37</version>
</dependency>
<dependency>
    <groupId>org.springframework.cloud</groupId>
    <artifactId>spring-cloud-starter-aws</artifactId>
    <version>2.3.2.RELEASE</version>
</dependency>
```

Next, you need to configure the AWS credentials and region in your application.yml file:

```yaml
aws:
  accessKey: <your-access-key>
  secretKey: <your-secret-key>
  region: <your-region>
```

Create a configuration class for pulling the secret:

```java
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.cloud.aws.secretsmanager.AwsSecretsManagerPropertySource;
import org.springframework.context.annotation.Configuration;

import com.amazonaws.services.secretsmanager.AWSSecretsManager;
import com.amazonaws.services.secretsmanager.AWSSecretsManagerClientBuilder;
import com.amazonaws.services.secretsmanager.model.GetSecretValueRequest;
import com.amazonaws.services.secretsmanager.model.GetSecretValueResult;
import com.fasterxml.jackson.databind.ObjectMapper;

@Configuration
public class SecretsManagerPullConfig {

    @Autowired
    private AwsSecretsManagerPropertySource awsSecretsManagerPropertySource;

    public <T> T getSecret(String secretName, Class<T> valueType) throws Exception {
        AWSSecretsManager client = AWSSecretsManagerClientBuilder.defaultClient();

        String secretId = awsSecretsManagerPropertySource.getProperty(secretName);

        GetSecretValueRequest getSecretValueRequest = new GetSecretValueRequest()
                .withSecretId(secretId);
        GetSecretValueResult getSecretValueResult = client.getSecretValue(getSecretValueRequest);

        String secretString = getSecretValueResult.getSecretString();
        ObjectMapper objectMapper = new ObjectMapper();
        return objectMapper.readValue(secretString, valueType);
    }
}
```

In your Spring Boot service, you can inject the SecretsManagerPullConfig class and call the getSecret method to retrieve the secret:

```java
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Service;

@Service
public class MyService {

    @Autowired
    private SecretsManagerPullConfig secretsManagerPullConfig;

    public void myMethod() throws Exception {
        MySecrets mySecrets = secretsManagerPullConfig.getSecret("mySecrets", MySecrets.class);

        System.out.println(mySecrets.getUsername());
        System.out.println(mySecrets.getPassword());
    }
}
```

In the above example, MySecrets is a Java class that represents the structure of the secret in Amazon Secrets Manager. The getSecret method returns an instance of MySecrets that contains the values of the secret.

Note: The above code assumes the Spring Boot application is running on an EC2 instance with an IAM role that has permission to read the secret from Amazon Secrets Manager. If you are running the application locally or in a different environment, you will need to provide AWS credentials with the necessary permissions to read the secret.
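The MySecrets class itself is not shown in the article. A minimal sketch of what it could look like, assuming the secret is the JSON object with username and password fields created earlier (those field names are an assumption based on that example):

```java
// Hypothetical POJO matching the JSON secret {"username": "...", "password": "..."}.
// Jackson maps the JSON keys onto these fields when getSecret deserializes the secret string.
public class MySecrets {

    private String username;
    private String password;

    public MySecrets() {
        // Default constructor required by Jackson.
    }

    public String getUsername() {
        return username;
    }

    public void setUsername(String username) {
        this.username = username;
    }

    public String getPassword() {
        return password;
    }

    public void setPassword(String password) {
        this.password = password;
    }
}
```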
Conclusion

Amazon Secrets Manager is a secure and convenient way to store and manage secrets such as API keys, database credentials, and other sensitive information in the cloud.

By using Amazon Secrets Manager, you can avoid hardcoding secrets in your Spring Boot application and, instead, retrieve them securely at runtime. This reduces the risk of exposing sensitive data in your code and makes it easier to manage secrets across different environments.

Integrating Amazon Secrets Manager with Spring Boot is a straightforward process thanks to the AWS SDK for Java. With just a few lines of code, you can create and retrieve secrets from Amazon Secrets Manager in your Spring Boot application. This allows you to build more secure and scalable applications that can be easily deployed to the cloud.

Overall, Amazon Secrets Manager is a powerful tool that can help you manage your application secrets in a more secure and efficient way. By integrating it with Spring Boot, you can take advantage of its features and benefits without compromising the performance or functionality of your application.
Building Micronaut Microservices Using MicrostarterCLI
By Ahmed Al-Hashmi
MicrostarterCLI is a rapid development tool. It helps you, as a developer, generate the standard reusable code, configuration, or patterns you need in your application. In a previous article, I went through a basic example of creating REST and GraphQL endpoints in a Micronaut application. This article demonstrates an example of bootstrapping a Micronaut microservices application using MicrostarterCLI.

The application's architecture consists of the following:

- Fruit Service: A simple CRUD service for Fruit objects.
- Vegetable Service: A simple CRUD service for Vegetable objects.
- Eureka service discovery server
- Consul configuration server
- Spring Cloud Gateway

Set Up the Environment

Download the MicrostarterCLI 2.5.0 or higher zip file. Then, unzip it and configure the folder in the environment variables. You will then be able to access mc.jar, mc.bat, and mc from any folder in the operating system. To verify the configuration, check the MicrostarterCLI by running the below command from the command prompt:

```powershell
mc --version
```

My environment details are as follows:

- Operating system: Windows 11
- Java version: 11
- IDE: IntelliJ

Let's Start Development

Step 0: Create a Workspace Directory

Create a folder in your system in which you will save the projects. This step is optional, but it's helpful for organizing the work. In this article, the workspace is c:\workspace.

Step 1: Create the Fruit Service

The FruitService is a simple service with all CRUD operations to handle Fruit objects. For simplicity, we will use the H2 database. We will use the MicrostarterCLI to generate all CRUD operation code and configuration in the following steps.

First, generate a Micronaut application from Micronaut Launch and extract the project zip file into the c:\workspace folder. Alternatively, we can use the init command of the MicrostarterCLI to generate the project directly from Micronaut Launch as follows:

```powershell
mc init --name FruitService --package io.hashimati
```

After running the init command, go to the FruitService directory. Then, run the entity command to add the required dependencies, configure the service, and generate the necessary CRUD services code:

```powershell
cd fruit
mc entity -e Fruit
```

Once you run the command, the MicrostarterCLI will start the configuration questions. Answer them as follows:

- Is the application monolithic? -> no (specifies whether the service is monolithic or a microservice)
- Enter the server port number between 0 - 65535 -> -1 (sets the server port; this service will use a random port number)
- Enter the service id: -> fruit-service (sets the service ID)
- Select Reactive Framework -> reactor (to use the Reactor framework in case the developer wants to use reactive data access)
- Do you want to use Lombok? -> yes (to use the Lombok library to generate the entity class)
- Select Annotation: -> Micronaut (Micronaut enables the developer to use Micronaut, JAX-RS, or Spring; this step instructs the MicrostarterCLI to use Micronaut annotations in the generated code)
- Enter the database name: -> fruits (sets the database name)
- Select Database Type: -> H2 (specifies the database management engine)
- Select Database Backend: -> JDBC (specifies the database access)
- Select Data Migration: -> liquibase (to use Liquibase as a data migration tool to create the database schema)
- Select Messaging type: -> none
- Do you want to add cache-caffeine support? -> yes (to use caching)
- Do you want to add the Micrometer feature? -> yes (to collect metrics on the endpoints and service calls)
- Select Distributed Tracing: -> Jaeger (to use Jaeger for distributed tracing)
- Do you want to add GraphQL-Java-Tools support? -> yes (adds the GraphQL dependency to the project)
- Do you want to add GRPC support? -> no (if the answer is "yes," MicrostarterCLI will prepare the project for GRPC services; this example will not use GRPC)
- Do you want to use File Services? -> no (this article will not use storage services)
- Do you want to configure VIEWS? -> yes (confirms that you want to add a "view" dependency)
- Select the views configuration -> views-thymeleaf (adds the Thymeleaf dependency)

Once you complete the configuration, the MicrostarterCLI will ask you to enter the name of the collection or table. Then, it will prompt you to enter the attributes. Enter them as follows:

- name: type String, no validation, FindBy() method: yes, FindAllBy() method: yes, UpdateBy() method: no
- quantity: type Int, no validation, FindBy() method: no, FindAllBy() method: yes, UpdateBy() method: no

By the end of this step, the MicrostarterCLI will have generated the entity, service, controller, client, and test controller classes, the Liquibase XML files, and the configuration.

Step 2: Create the Vegetable Service

The Vegetable service will host the CRUD service for Vegetable objects. To define it, we repeat the same steps as in Step 1.

Step 3: Create the Eureka Server

In this step, we will create the Eureka service discovery server. The service will listen on port 8761. The Fruit and Vegetable services will register with the Eureka server because we point their Eureka clients to localhost and port 8761. To create the Eureka server project using MicrostarterCLI, run the below command from c:\workspace:

```shell
mc eureka -version 2.7.8 --javaVersion 11
```

Step 4: Create a Gateway Service

The last component we will create is the Spring Cloud Gateway. The gateway service will listen on port 8080. To generate the gateway project using the MicrostarterCLI, run the gateway command below:

```shell
mc gateway -version 2.7.8 --javaVersion 11
```

Step 5: Gateway/Microservice Routes Configuration

In this step, we will configure the root routes for both the Fruit and Vegetable APIs. As generated by the MicrostarterCLI, the root route for the Fruit API (/api/v1/fruit) is defined in the @Controller annotation of the io.hashimati.controllers.FruitController class of the Fruit Service. The Vegetable API's root route is /api/v1/vegetable, defined in the io.hashimati.controllers.VegetableController class of the Vegetable Service.

To register the routes, we will use the register subcommand of the gateway command. Go to c:\workspace\gateway and run the below command:

```shell
mc gateway register
```

When the command runs, the MicrostarterCLI will prompt you to enter the service ID, the service name, and the routes. Run the register subcommand twice to configure the root routes for the Fruit and the Vegetable APIs. By completing this step, you have configured the CRUD endpoints for the Fruit and Vegetable services behind the gateway.

Run and Try

To run the services, we will create a run.bat file that launches all of them:

```shell
cd c:\workspace\eureka\
start gradlew bootRun -x test&
cd c:\workspace\FruitService
start gradlew run -x test&
cd c:\workspace\VegetableService
start gradlew run -x test&
cd c:\workspace\gateway
start gradlew bootRun -x test&
```

After running the run.bat file, all services start. Wait until all the services complete their start-up process, then open the Eureka dashboard. You should see all services registered on the Eureka server. To try the CRUD services, you can use IntelliJ's .http file support.
We will create a test.http file as follows:

```http
POST http://localhost:8080/api/v1/fruit/save
Content-Type: application/json

{
  "name": "Apple",
  "quantity": 100
}

###

GET http://localhost:8080/api/v1/fruit/findAll

###

POST http://localhost:8080/api/v1/vegetable/save
Content-Type: application/json

{
  "name": "Onion",
  "quantity": 100
}

###

GET http://localhost:8080/api/v1/vegetable/findAll
```

Running the requests from IntelliJ shows that everything works.

Conclusion

Using MicrostarterCLI, you can generate the configuration of the components needed for a microservices architecture, such as JWT security, distributed tracing, messaging, and observability. MicrostarterCLI also supports the Groovy and Kotlin programming languages. Please visit the project repository for more information, and check out the article example here. Happy coding!
Kubernetes-Native Development With Quarkus and Eclipse JKube
By Eric Deandrea
Spring Cloud: How To Deal With Microservice Configuration (Part 2)
By Mario Casari
Testing Challenges Related to Microservice Architecture
By Harshit Paul
Monolithic First

In recent years, microservices architecture has become a popular buzzword in the software industry. The idea of breaking down a monolithic application into smaller, independent services that can be deployed and scaled independently sounds appealing. However, before you jump on the microservices bandwagon, there are a few things to consider.

Monolithic architecture is an approach in which an entire application is built as a single, cohesive unit. It's a traditional architecture pattern that has been in use for a long time and has proven successful in many applications. With monolithic architecture, all the components of the application are tightly coupled, and it can be challenging to make changes to one component without affecting the others.

However, before moving to a microservices architecture, it is important to consider whether the application actually needs it. For smaller applications with limited functionality, monolithic architecture can still be a viable option. It is more straightforward to develop, deploy, and maintain a monolithic application. One of the primary benefits of monolithic architecture is that it provides a simple and cohesive development experience. It is easier to write code that is well organized and easy to maintain, and it is easier to test and debug the application because everything lives in a single codebase. This reduces the complexity of managing multiple services.

To transition from a monolithic architecture to a microservices architecture, it is important to have a clear understanding of functional domains. Functional domains refer to the different parts of an application that perform specific tasks or functions. A good understanding of functional domains is essential before deciding which services to split out and how they should communicate. A well-defined functional domain can help developers create independent services that can be managed and deployed independently, which makes it easier to maintain and scale the application. Moreover, a good understanding of functional domains can help developers avoid coupling between services.

Another concept you need to understand is the Bounded Context, which is a critical concept in Domain-Driven Design (DDD) and plays a crucial role in microservices architecture. In simple terms, it defines the scope and context of a business domain. It is essential to identify the Bounded Context for a domain before starting to design and develop a microservices-based solution. In DDD, a Bounded Context is a logical boundary that segregates a domain's components based on their functionalities and sub-domains. It helps create a clear understanding of the different contexts and components within a domain and their interactions, dependencies, and constraints. Each Bounded Context has its own domain model, which is a representation of the domain's concepts and rules within that context. This model encapsulates the domain logic, which should be loosely coupled from other Bounded Contexts, promoting autonomy and minimizing the risk of affecting other parts of the system when changes are made. By dividing a domain into smaller, well-defined contexts, teams can focus on specific areas of the business logic, reducing the complexity of the system and making it easier to manage, scale, and maintain.
Furthermore, Bounded Contexts also facilitate effective collaboration between teams working on different contexts, as they can use a common language and understand the implications of their work within the broader domain. In summary, identifying and defining Bounded Contexts is critical when designing and implementing microservices. It helps ensure that the services are loosely coupled and autonomous and have a clear understanding of their roles and responsibilities within the domain. By adhering to Bounded Context principles, teams can develop maintainable, scalable, and robust microservices-based solutions that align with the domain's business requirements.

Finally, when building microservices, it is important not to rely on synchronous REST calls for communication between services. This can result in tight coupling between services, making it difficult to maintain and scale the application. Instead, services should communicate using lightweight protocols like gRPC or event-driven architectures built on a broker like Apache Kafka. This approach reduces the coupling between services and makes it easier to replace or update a service without affecting the rest of the system (a minimal sketch of this kind of event publishing follows the conclusion below).

In conclusion, before adopting a microservices architecture, it is important to consider whether monolithic architecture is still a viable option. Having a clear understanding of functional domains can help in the transition to a microservices architecture. Finally, using lightweight protocols for communication between services can reduce coupling and make it easier to maintain and scale the application.
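As a hedged illustration of the event-driven alternative mentioned above (not from the original article; the broker address, topic name, and payload are assumptions), publishing an event to Apache Kafka instead of calling another service over REST could look roughly like this:

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class OrderEventPublisher {

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // assumed local broker
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

        try (Producer<String, String> producer = new KafkaProducer<>(props)) {
            // Publish an event instead of invoking the consuming service directly over REST.
            // Topic name and payload are illustrative only.
            ProducerRecord<String, String> event = new ProducerRecord<>(
                    "order-events", "order-42", "{\"orderId\":42,\"status\":\"CREATED\"}");
            producer.send(event);
            producer.flush();
        }
    }
}
```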

By Farith Jose Heras García
Microservices 101: Transactional Outbox and Inbox

One of the fundamental aspects of microservice architecture is data ownership. Encapsulation of the data and logic prevents tight coupling of services. Since services only expose information via public interfaces (like a stable REST API) and hide the inner implementation details of data storage, they can evolve their schema independently of one another.

A microservice should be an autonomous unit that can fulfill most of its assignments with its own data. It can also ask other microservices for missing pieces of information required to complete its tasks and, optionally, store them as a denormalized copy in its own storage.

Inevitably, services also have to exchange messages. Usually, it's essential to ensure that the sent message reaches its destination, and losing it could have serious business implications. Proper implementation of communication patterns between services might be one of the most critical aspects of applying microservices architecture. It's quite easy to drop the ball by introducing unwanted coupling or unreliable message delivery.

What Can Go Wrong?

Let's consider a simple scenario of service A having just finished processing some data. It has committed the transaction that saved a couple of rows in a relational database. Now it needs to notify service B that it has finished its task and new information is available for fetching.

The simplest solution would be just to send a synchronous REST request (most probably POST or PUT) to service B directly after the transaction is committed. This approach has some drawbacks. Arguably, the most important one is the tight coupling between services caused by the synchronous nature of the REST protocol. If either of the services is down because of maintenance or failure, the message will not be delivered. This kind of relationship is called temporal coupling because both nodes of the system have to be available throughout the whole duration of the request.

Introducing an additional layer, a message broker, decouples both services. Now service A doesn't need to know the exact network location of B to send the request, just the location of the message broker. The broker is responsible for delivering the message to the recipient. If B is down, it's the broker's job to keep the message as long as necessary to pass it on successfully.

If we take a closer look, we might notice that the problem of temporal coupling persists even with the messaging layer. This time, though, it's the broker and service A that are using synchronous communication and are coupled together. The service sending the message can assume it was correctly received by the broker only if it gets back an ACK response. If the message broker is unavailable, it can't receive the message and won't respond with an acknowledgment.

Messaging systems are very often sophisticated and durable distributed systems, but downtime will still happen from time to time. Failures are very often short-lasting. For example, an unstable node can be restarted and become operational again after a short period of downtime. Accordingly, the most straightforward way to increase the chance of the message getting through is simply retrying the request. Many HTTP clients can be configured to retry failed requests. But there's a catch. Since we're never sure whether our message reached its destination (maybe the request got through, but just the response was lost?), retrying the request can cause the message to be delivered more than once. Thus, it's crucial to deduplicate messages on the recipient side.
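The retry-plus-deduplication idea above can be sketched with the JDK's built-in HTTP client. This is not code from the article; the endpoint, the retry count, and the idempotency-key header are illustrative assumptions:

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.UUID;

public class NotifyWithRetry {

    public static void main(String[] args) throws InterruptedException {
        HttpClient client = HttpClient.newHttpClient();

        // A client-generated key lets the recipient detect duplicates caused by retries.
        String idempotencyKey = UUID.randomUUID().toString();

        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://service-b.local/api/notifications")) // assumed endpoint
                .header("Idempotency-Key", idempotencyKey)
                .POST(HttpRequest.BodyPublishers.ofString("{\"event\":\"DATA_READY\"}"))
                .build();

        int maxAttempts = 3;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
                if (response.statusCode() < 500) {
                    break; // delivered (or rejected for a non-retriable reason)
                }
            } catch (Exception e) {
                // Network failure or timeout: fall through and retry.
            }
            Thread.sleep(1000L * attempt); // simple linear backoff between attempts
        }
    }
}
```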
The same principle can be applied to asynchronous communication. For example, Kafka producers can retry the delivery of a message to the broker in case of retriable errors like NotEnoughReplicasException. We can also configure the producer as idempotent, and Kafka will automatically deduplicate repeated messages.

Unfortunately, there's bad news: even retrying doesn't guarantee that the message will reach the target service or the message broker. Since the message is stored only in memory, if service A crashes before it's able to successfully transfer the message, it will be irretrievably lost. A situation like this can leave our system in an inconsistent state. On the one hand, the transaction on service A has been successfully committed, but on the other hand, service B will never be notified about that event. A good example of the consequences of such a failure might be communication between two services where the first one has deducted some loyalty points from a user account and now needs to let the other service know it must send a certain prize to the customer. If the message never reaches the other service, the user will never get their gift.

So maybe the solution to this problem is sending the message first, waiting for the ACK, and only then committing the transaction? Unfortunately, that wouldn't help much. The system can still fail after sending the message but just before the commit. The database will detect that the connection to the service is lost and abort the transaction. Nevertheless, the destination service will still receive a notification that the data was modified.

That's not all. The commit can be blocked for some time if there are other concurrent transactions holding locks on database objects the transaction is trying to modify. In most relational databases, data altered within a transaction with an isolation level equal to or stronger than read committed (the default in Postgres) will not be visible until the transaction completes. If the target service receives a message before the transaction is committed, it may try to fetch the new information but will only get stale data. Additionally, by making a request before the commit, service A extends the duration of its transaction, which may block other transactions. This can sometimes pose a problem too, for example, if the system is under high load.

So are we doomed to live with the fact that our system will occasionally get into an inconsistent state? The old-school approach to ensuring consistency across services would be a pattern like distributed transactions (for example, 2PC), but there's also another neat trick we can utilize.

Transactional Outbox

The problem we're facing is that we can't atomically both perform an external call (to the message broker, another service, etc.) and commit an ACID transaction. In the happy-path scenario, both tasks will succeed, but problems start when one of them fails for any reason. I will try to explain how we can overcome these issues by introducing the transactional outbox pattern.

As the first step, we need to introduce a table that stores all messages intended for delivery: that's our message outbox. Then, instead of making requests directly, we just save the message as a row in the new table. Doing an INSERT into the message outbox table is an operation that can be part of a regular database transaction. If the transaction fails or is rolled back, no message is persisted in the outbox.
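A minimal sketch of what that first step could look like, assuming Spring's JdbcTemplate and a relational outbox table (the table layout, column names, and SQL are illustrative assumptions, not taken from the article):

```java
import java.util.UUID;
import org.springframework.jdbc.core.JdbcTemplate;
import org.springframework.stereotype.Service;
import org.springframework.transaction.annotation.Transactional;

@Service
public class OrderService {

    /*
     * Assumed outbox table, created up front, e.g.:
     *   CREATE TABLE outbox (
     *     id VARCHAR(36) PRIMARY KEY,
     *     topic VARCHAR(255) NOT NULL,
     *     payload TEXT NOT NULL,
     *     sent BOOLEAN NOT NULL DEFAULT FALSE,
     *     created_at TIMESTAMP NOT NULL DEFAULT now()
     *   );
     */

    private final JdbcTemplate jdbcTemplate;

    public OrderService(JdbcTemplate jdbcTemplate) {
        this.jdbcTemplate = jdbcTemplate;
    }

    @Transactional
    public void completeOrder(long orderId) {
        // 1. The regular business write.
        jdbcTemplate.update("UPDATE orders SET status = 'COMPLETED' WHERE id = ?", orderId);

        // 2. The outgoing message, written in the SAME transaction.
        //    If the transaction rolls back, the message is never persisted.
        jdbcTemplate.update(
                "INSERT INTO outbox (id, topic, payload, sent) VALUES (?, ?, ?, FALSE)",
                UUID.randomUUID().toString(),
                "order-events",
                "{\"orderId\":" + orderId + ",\"status\":\"COMPLETED\"}");
    }
}
```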
In the second step, we must create a background worker process that polls data from the outbox table at scheduled intervals. If the process finds a row containing an unsent message, it needs to publish it (send it to an external service or broker) and mark it as sent. If delivery fails for any reason, the worker can retry the delivery in the next round.

Marking the message as delivered involves executing the request and then a database transaction (to update the row). That means we are still dealing with the same problems as before. After a successful request, the transaction can fail, and the row in the outbox table won't be modified. Since the message status is still pending (it wasn't marked), it will be re-sent, and the target will get the message twice. In other words, the outbox pattern doesn't prevent duplicate requests; these still have to be handled on the recipient side (or by the message broker).

The main improvement of the transactional outbox is that the intent to send the message is now persisted in durable storage. If the service dies before it's able to make a successful delivery, the message will stay in the outbox. After restarting, the background process will fetch the message and send the request again. Eventually, the message will reach its destination. Ensured message delivery with possible duplicated requests means we've got an at-least-once processing guarantee, and recipients won't lose any notifications (except in the case of some catastrophic failure causing data loss in the database). Neat!

Not surprisingly, though, this pattern comes with some weak points.

First of all, implementing the pattern requires writing some boilerplate code. The code for storing the message in the outbox should be hidden under a layer of abstraction so it won't interfere with the rest of the codebase. Additionally, we need to implement the scheduled process that reads messages from the outbox.

Secondly, polling the outbox table can sometimes put significant stress on your database. The query to fetch messages is usually as plain as a simple SELECT statement. Nevertheless, it needs to be executed at a high frequency (usually below 1s, very often way below). To reduce the load, the check frequency can be decreased, but if polling happens too rarely, it will impact the latency of message delivery. You can also decrease the number of database calls by simply increasing the batch size. Nonetheless, with a large number of messages selected, if the request fails, none of them will be marked as delivered.

The throughput of the outbox can be boosted by increasing parallelism. Multiple threads or instances of the service can each pick up a bunch of rows from the outbox and send them concurrently. To prevent different readers from picking up the same message and publishing it more than once, you need to block rows that are currently being handled. A neat solution is locking a row with the SELECT ... FOR UPDATE SKIP LOCKED statement (SKIP LOCKED is available in some relational databases, for example, Postgres and MySQL). Other readers can then fetch other, unblocked rows.

Last but not least, if you're sending massive amounts of messages, the outbox table will bloat very quickly. To keep its size under control, you can create another background process that deletes old, already-sent messages. Alternatively, you can simply remove the message from the table just after the request is acknowledged.
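Continuing the sketch started above (same illustrative outbox table, Spring scheduling assumed, and the publisher is a placeholder interface wrapping whatever broker client is in use), a polling worker using SKIP LOCKED could look roughly like this:

```java
import java.util.List;
import java.util.Map;
import org.springframework.jdbc.core.JdbcTemplate;
import org.springframework.scheduling.annotation.Scheduled;
import org.springframework.stereotype.Component;
import org.springframework.transaction.annotation.Transactional;

@Component
public class OutboxRelay {

    private final JdbcTemplate jdbcTemplate;
    private final MessagePublisher publisher; // placeholder abstraction over the broker client

    public OutboxRelay(JdbcTemplate jdbcTemplate, MessagePublisher publisher) {
        this.jdbcTemplate = jdbcTemplate;
        this.publisher = publisher;
    }

    @Scheduled(fixedDelay = 500) // poll twice per second; tune for your latency/load trade-off
    @Transactional
    public void relayPendingMessages() {
        // Lock a small batch of unsent rows; concurrent workers skip rows locked by others.
        List<Map<String, Object>> batch = jdbcTemplate.queryForList(
                "SELECT id, topic, payload FROM outbox "
                + "WHERE sent = FALSE ORDER BY created_at "
                + "LIMIT 10 FOR UPDATE SKIP LOCKED");

        for (Map<String, Object> row : batch) {
            publisher.publish((String) row.get("topic"), (String) row.get("payload"));
            // Mark as sent only after the broker acknowledged the message.
            jdbcTemplate.update("UPDATE outbox SET sent = TRUE WHERE id = ?", row.get("id"));
        }
    }

    public interface MessagePublisher {
        void publish(String topic, String payload);
    }
}
```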
A more sophisticated approach to getting data out of the outbox table is called database log tailing. In relational databases, every operation is recorded in the WAL (write-ahead log), which can be queried for new entries concerning rows inserted into the message outbox. This kind of processing is called CDC (change data capture). To use this technique, your database has to offer CDC capabilities, or you'd need to use some kind of framework (like Debezium).

I hope I have convinced you that the outbox pattern can be a great asset for increasing the robustness and reliability of your system. If you need more information, a remarkable source to learn more about the transactional outbox and its various applications is the book Microservices Patterns by Chris Richardson.

Inbox Pattern

A properly implemented outbox pattern ensures that the message will eventually reach its target. To get this guarantee end to end with a messaging system, the broker also needs to assure at-least-once delivery to the consumer (as Kafka or RabbitMQ do).

In most circumstances, we do not only want to deliver the message. It's also important to ensure that the task triggered by the message is completed. Hence, it's essential to acknowledge the message receipt only after the task's completion! If the task fails (or the whole service crashes), the message will not be acked and will get redelivered. After receiving the message again, the service can retry processing (and then ack the message once the task is finished). If we do things the other way around, acking the message first and only then starting processing, we lose the at-least-once guarantee. If the service crashes while performing the task, the message will be effectively lost. Since it is already acknowledged, there will be no retries of delivery.

The at-least-once processing guarantee means that the recipient will, from time to time, get repeated messages. For that reason, it's crucial to actively eliminate duplicate messages. Alternatively, we can make our process idempotent, which means that no matter how many times it is re-run, it should not further modify the system's state.

Message delivery can be retried in case of an explicit processing failure. In this case, the recipient could return a NACK (negative acknowledgment) to the broker or respond with an error code when synchronous communication is used. This is a clear sign that the task was not successful and the message has to be sent again.

The more interesting scenario happens when the work takes more time to finish than expected. In such a situation, the message can, after some time, be considered lost by the sender (for example, an HTTP request can hit a timeout, the visibility timeout of SQS can pass, etc.) and be scheduled for redelivery. The recipient will get the message again even if the first task is still alive. Since the message is not yet acked, even a proper deduplication mechanism won't prevent multiple concurrent processes triggered by these repeated messages. Timeouts can be fine-tuned so they account for prolonged task handling. On the other hand, finding the appropriate value is sometimes difficult, especially when completing an assignment takes an unpredictable amount of time.

Repeated work is very often not a big deal with correctly working duplicate elimination. We can simply get the message and discard the result after task completion (or upsert if it's idempotent). But what if the processing involves some costly action?
For example, after receiving the request, the client could spin up an additional VM to handle the assignment. To avoid unnecessary waste of resources, we could adopt another communication pattern: the transactional inbox.

The inbox pattern is quite similar to the outbox pattern (but, let's say, it works backward). As the first step, we create a table that works as an inbox for our messages. Then, after receiving a new message, we don't start processing right away but only insert the message into the table and ACK. Finally, a background process picks up the rows from the inbox at a convenient pace and spins up processing. After the work is complete, the corresponding row in the table can be updated to mark the assignment as complete (or just removed from the inbox).

If received messages have any kind of unique key, they can be deduplicated before being saved to the inbox. Repeated messages can be caused by a crash of the recipient just after saving the row to the table but before sending a successful ack. Nevertheless, that is not the only potential source of duplicates since, after all, we're dealing with an at-least-once guarantee.

Again, the throughput of processing can be improved by increasing parallelism. With multiple workers concurrently scanning the inbox table for new tasks, you need to remember to lock rows that were already picked up by other readers. Another possible optimization: instead of waiting for the worker process to select the task from the inbox, we can start it asynchronously right after persisting it in the table. If the process crashes, the message will still be available in the inbox.

The inbox pattern can be very helpful when the ordering of messages is important. Sometimes the order is guaranteed by the messaging system (like Kafka with the idempotence configuration turned on), but that is not the case for every broker. Similarly, HTTP requests can interleave if the client doesn't make them in sequence in a single thread. Fortunately, if messages contain a monotonically increasing identifier, the order can be restored by the worker process while reading from the inbox. It can detect missing messages, hold on until they arrive, and then handle them in sequence.

What are the disadvantages? Similar to the outbox pattern: increased latency, additional boilerplate, and more load on the database. Very often, the recipient service can cope without the inbox. If the task doesn't take long to finish or completes in a predictable amount of time, it can just ack the message after processing. Otherwise, it might be worthwhile to spend some effort implementing the pattern.

Wrapping Up

As I stated at the beginning of this article: setting up proper and reliable communication channels between microservices is not a piece of cake! Thankfully, we can accomplish a lot by using the correct patterns. It's always important to consider what guarantees your system needs and then apply appropriate solutions. And always remember to deduplicate your messages :)

By Krzysztof Atlasik
Journey to Event Driven, Part 1: Why Event-First Programming Changes Everything

More and more businesses of all types are using automated systems to perform their functions effectively. With ever-growing and changing business requirements, there is a need for modern, robust applications that can constantly adapt to business needs. To create such applications successfully, you should choose the right architecture. One of the most efficient modern architectures is event-driven architecture. In this article, we will explain in detail how it works, the ways to implement it, and its main advantages.

Why Does the Old Architecture Not Meet Modern Requirements?

Modern applications are very different from those built ten years ago. Now there are opportunities to store data in the cloud, split data storage into parts, and move data in real time between different parts of the globe. Modern applications need to run continuously and seamlessly and be elastic, global, and cloud-based. Legacy architectures cannot efficiently perform the tasks that come with today's business requirements.

Today, ever-growing businesses use microservices, IoT, event hubs, cloud computing, machine learning, and more to meet their needs. However, to build modern, robust, scalable applications that can manage large amounts of data in real time, you need to start from the ground up and choose the right application development architecture. Event-driven architecture is a perfect choice for this. Applications built on an event-driven architecture are more flexible, scalable, contextual, and responsive.

What Is an Event?

An event is any significant occurrence or change in the state of some system. Events exist everywhere and occur all the time. They can be triggered by the user (such as a mouse click or keystroke), by an external source (such as the output of a sensor), or originate from within the system (such as when a program is loaded). Examples of events are client requests, sensor readings, packages delivered to their destination, denial of unauthorized access attempts, sending an email to a user, blocking an account, etc.

An application that runs on an event-driven architecture can send information about an event to all interested people and systems as soon as the event occurs. This allows the business to react faster and benefit from the event. For example, you can attract potential users before your competitors, change production, reallocate resources, etc. Therefore, an event-driven architecture is a better architectural approach than ones that wait for the system to periodically request updates.

An event can have two parts:

- The event header includes information such as the name of the event, the timestamp of the event, and the type of the event.
- The event body provides information about the detected state change.

Event Flow Levels

An event-driven architecture can be built on four logical layers. Below, we describe the actions that occur on each layer.

Event Producer

The event producer captures the fact of an event and represents it as an event message. Event producers can be users, various programs, or physical sensors. An important task when designing and implementing an event producer is therefore to transform data collected from a diverse set of data sources into a single, standardized form.

Event Channel

The second logical layer is the event channel. It is a mechanism for propagating information collected from an event emitter to an event handler or receiver.
An example of an event channel could be a TCP/IP connection or any type of input file. Multiple event channels can be open at the same time and read asynchronously. Events are then stored in a queue, waiting to be processed by the event handling mechanism.

Event Handling Mechanism

The event handling mechanism performs event identification and selection and executes the appropriate reaction or series of reactions.

Downstream Event-Driven Activity

This logical layer shows the consequences of the event. This can happen in different ways; for example, an application can display some kind of notification on the screen. Depending on the level of automation provided by the event handling mechanism, this layer may not be required.

Event Handling

There are several different ways to handle events: simple, streaming, complex, or online. They can be used individually; however, in a mature event-driven architecture, they are often used together. Let's take a closer look at each of these event-handling methods.

Simple Event Handling

When an event occurs that changes a particular measurable state of the system, simple event handling is performed. The occurrence of an observable event triggers follow-up actions. Simple event handling is typically used to control the flow of work in real time. This reduces delay times and costs.

Event Streaming

When an event stream is processed, both ordinary and observable events occur. Ordinary events are checked for visibility and communicated to information subscribers. Event stream processing is typically used to manage the flow of real-time information within an enterprise, which allows the company to make timely decisions.

Complex Event Processing

Complex event processing is used to detect and respond to business anomalies, threats, and opportunities. It allows you to evaluate a collection of events and then make further decisions. Complex event processing looks at patterns of observable and ordinary events to infer that a complex event has occurred. This way of handling events requires sophisticated event interpreters, event pattern definition and matching, and correlation techniques.

Online Event Processing

This method uses asynchronous distributed event logs to process complex events and manage persistent data. It allows you to compose related events of the same scenario across dissimilar systems. Online event processing provides high consistency and flexible distribution patterns with high scalability.

What Is Event-Driven Architecture?

Event-driven architecture is a modern software application design approach. It uses events to trigger and communicate between decoupled services. The core of a solution built on this architecture is the collection, transmission, processing, and storage of events.

An event-driven architecture is loosely coupled because event producers don't know which event consumers are listening for an event, and the event doesn't know what the consequences of raising it are. It requires minimal communication between services, making it a good choice for modern distributed application architectures, and it is often used in modern applications built with microservices.

Unlike traditional architectures, which treat data as packets of information to be inserted and updated at intervals and respond to user-initiated requests rather than new information, the event-driven architecture allows applications to respond to events as they occur.
Event-driven architecture is versatile and works well with unpredictable, non-linear events, and it allows you to create and react to a large number of events in real time. This architecture is used by many modern applications, such as customer interaction systems that need to use real-time customer data.

How Does It Work?

The event-driven architecture consists of three key components:

- Event producers collect event data and store it in storage.
- Event routers filter and send events to consumers.
- Event consumers receive an event and, if necessary, respond to it.

Event producers do not know about the consumers or the results of event processing. Once an event is detected, it is propagated from the event producer to the event consumers via event channels. The event router processes the event asynchronously. When an event has occurred, event consumers are informed about it. They may handle the event or may only be affected by it. The event router responds to the event and dispatches the action to the appropriate consumers.

Event-Driven Architecture Models

An event-driven architecture can use a publish/subscribe model or an event stream model.

Publish/Subscribe Model

When an event is published, it is sent to each subscriber thanks to the messaging infrastructure that keeps track of subscriptions. After that, it cannot be replayed, and new subscribers do not see it.

Event Streaming Model

In this model, events are logged. They are strictly ordered and durable. Clients do not subscribe to a stream but can read events from any part of the stream. Unlike the publish/subscribe model, a client can join at any time and can replay events.

There are several variations of event handling:

- Simple event handling: The event immediately fires an action in the consumer.
- Complex event processing: The consumer processes a series of events, looks for patterns in them, and performs actions under certain conditions. Technologies such as Azure Stream Analytics or Apache Storm can be used here.
- Event stream processing: Stream processors process or transform a stream of events. Streaming platforms such as Memphis.dev, Azure IoT Hub, or Apache Kafka are used for this.

Related Patterns

When implementing an application based on an event-driven architecture, you should use specific design patterns. Below, we look at the main patterns associated with event-driven architecture.

Event Sourcing

This pattern is used in applications that need to keep a history of business facts. Traditional domain-specific implementations keep only the last committed state, replacing the previous one. Thus, if you need to know how data has changed over time, you need to add historical records; in order to do this, you would create a log table.

An event store keeps the state of an object as a sequence of state-changing events ordered in time. When the state of the system changes, the application fires a state change notification event. The state change event is stored in the event log in chronological order. The event log is often used by business analysts to gain insight into the operation of the business. The event log stores three pieces of information:

- The type of event or collection
- The event sequence number
- The data as a serialized object

Strangler

If you need to port a monolithic application to microservices, then you should use this pattern.
It allows you to gradually replace the feature set of a monolithic application with microservices without having to rewrite the application from scratch. At the same time, you can keep the microservices and the application running in parallel. One problem that arises when implementing this pattern is determining where writes and reads occur and how data should be replicated between contexts.

Decompose by Subdomain

When implementing an application, you will face the question of how to divide it into microservices. A good option for identifying and classifying business functions, and accordingly microservices, is to use Domain-Driven Design subdomains. DDD treats an application as a domain, which consists of several subdomains. Each subdomain corresponds to a separate part of the application. Examples of subdomains are product catalog, inventory management, order management, delivery management, etc.

Database per Service

The database-per-service pattern allows each service to store its data privately, accessible only through its API. The services are loosely coupled, which limits the impact on other services when a database schema changes. The database technology is selected based on business requirements. There are several different ways to keep the persistent data of a service private. For example, if you are using a relational database, you can use one of the following options:

- Private-tables-per-service: Each service owns a set of tables that only that service should have access to.
- Schema-per-service: Each service has a database schema that is private to that service.
- Database-server-per-service: Each service has its own database server.

In addition, you can create additional barriers. For example, assign different database user IDs to each service and use a database access control mechanism such as grants.

Command Query Responsibility Segregation (CQRS)

A domain model encapsulates and structures domain data and maintains the correctness of that data as it changes. That said, it can be overwhelmed by the management of complex aggregate objects, concurrent updates, and multiple end-to-end views. Therefore, there is a need to reorganize this model. The CQRS pattern can be used to refactor a model to decouple different aspects of data usage.

The CQRS pattern allows you to separate data read operations from data update operations. An operation can read data or it can write data, but not both. This makes it easier to implement read and write operations and makes them independent. The full CQRS pattern uses separate databases and APIs for reading and writing. You can implement this pattern step by step:

- Stage 0: Typical application data access
- Stage 1: Separate APIs for reading and writing
- Stage 2: Separate models for reading and writing
- Stage 3: Separate databases for reading and writing

Saga

The Saga pattern allows you to split long-running transactions into sub-transactions that can be interleaved with other transactions. Using this pattern addresses the constraint of having one database per microservice when dealing with long-running transactions. A saga is a sequence of local transactions where each transaction updates data within a single service, so each next step can be triggered by the completion of the previous one.

Dead Letter Queue

When using event-driven microservices, in some cases it is necessary to call a service using an HTTP or RPC call. The call may not go through. You can then use the dead-letter queue pattern to retry and handle failed calls.
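A minimal sketch of the dead-letter idea above, assuming Apache Kafka is the broker (the topic names, consumer group, broker address, and error handling are illustrative assumptions): a consumer that fails to process a record forwards it to a dedicated dead-letter topic for later inspection or retry.

```java
import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class OrderEventsConsumer {

    public static void main(String[] args) {
        Properties consumerProps = new Properties();
        consumerProps.put("bootstrap.servers", "localhost:9092"); // assumed broker address
        consumerProps.put("group.id", "order-processor");
        consumerProps.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        consumerProps.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        Properties producerProps = new Properties();
        producerProps.put("bootstrap.servers", "localhost:9092");
        producerProps.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        producerProps.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(consumerProps);
             KafkaProducer<String, String> dlqProducer = new KafkaProducer<>(producerProps)) {

            consumer.subscribe(List.of("order-events"));
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<String, String> record : records) {
                    try {
                        process(record.value()); // placeholder for the real business logic
                    } catch (Exception e) {
                        // Processing failed: park the message on the dead-letter topic.
                        dlqProducer.send(new ProducerRecord<>("order-events.DLQ", record.key(), record.value()));
                    }
                }
            }
        }
    }

    private static void process(String payload) {
        // Placeholder for the real processing logic.
    }
}
```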
Transactional Outbox

Often a service needs to update the database and publish a message at the same time. Hence, the question arises: how do you do this reliably, avoiding data inconsistencies and errors? The transactional outbox pattern is suitable for this. To implement it, the service that updates the database also inserts the message into an outgoing message (outbox) table as part of the same local transaction; for non-relational databases, it adds the message to an attribute of the record being updated. A separate message relay process then publishes the messages to the message broker.

When Should You Use Event-Driven Architecture?

The event-driven architecture is often used in modern applications that use microservices and in applications with disconnected components, because it increases flexibility and the speed of data movement. To use this architecture effectively, your application must meet several important criteria. To ensure that each event is processed, you must have a reliable event source that can arrange for the delivery of each event. Since the number of incoming events can be very large, you need to be able to process them asynchronously to keep the application efficient. And to be able to restore the system state, the events must be deduplicated and ordered.

There are types of software where an event-driven architecture makes the most sense. Below we list some of them.

An event-driven architecture is effective for coordinating systems across teams working and deploying across regions and accounts. You can use an event router to transfer data between systems. This allows you to design, scale, and deploy services independently of other teams.

You can use an event-driven architecture to monitor the status of resources and send notifications. It allows you to monitor and receive alerts about any anomalies, changes, and updates.

This architecture can be used when many systems need to operate in response to an event. It allows you to fan out and process events in parallel without having to write code to send them to each consumer. The router will send the event to the relevant systems, each of which can process the event in parallel for different purposes.

An event-driven architecture can be used to exchange information between systems that run on different stacks without coupling. The event router establishes indirection and communication between systems so they can exchange messages and data while remaining independent. This architecture is effective when multiple subsystems need to process the same events.

Using an event-driven architecture allows you to create applications that can process real-time data with minimal latency. If your system must perform complex event handling, such as pattern matching or aggregation over time windows, then you should choose this architecture. An event-driven architecture also suits applications that transfer large amounts of data at high rates, such as the Internet of Things.

Benefits of Using Event-Driven Architecture

Using an event-driven architecture in systems and applications allows organizations to improve the scalability and responsiveness of applications and access to the data and context needed to make effective decisions. In this architecture, events are captured as they occur, allowing event producers and event consumers to exchange information in real time.
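As a rough illustration of that producer/consumer decoupling, here is a minimal in-memory publish/subscribe sketch in Python (the event names and handlers are invented; a real system would rely on a broker or a managed event router):

from collections import defaultdict

class EventRouter:
    """Tiny in-memory stand-in for an event router: producers publish once, the router fans out."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, event_type, handler):
        self._subscribers[event_type].append(handler)

    def publish(self, event_type, payload):
        # The producer never knows who (if anyone) consumes the event.
        for handler in self._subscribers[event_type]:
            handler(payload)

router = EventRouter()
router.subscribe("order_placed", lambda event: print("billing saw", event))
router.subscribe("order_placed", lambda event: print("shipping saw", event))
router.publish("order_placed", {"order_id": 42})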
You can create a flexible system that adapts to change and makes real-time decisions with an event-driven architecture. This allows you to make manual and automated decisions using all available data that reflects the current state of your systems. There are also some other advantages of using event-driven architecture. Let's list the main ones.

You can decouple your services so that they are not directly connected and only interact through the event router. This lets you build services that don't depend on each other: if one service goes down, the rest will continue to run. In this case, the event router acts as a buffer that can withstand workload spikes.

The event router allows you to automatically filter and send events to consumers. It also eliminates the need for tight coordination between producer and consumer services. This reduces the amount of code that needs to be written to poll, filter, and route events, and speeds up the development process.

The event router can be used to audit your application and define policies. These policies can determine which users and systems have permission to access your data and which can publish and subscribe to the router. You can also encrypt events.

Using an event-driven architecture reduces network bandwidth consumption and uses less CPU and other resources, since everything happens on demand and you do not need to pay for continuous polling to check for an event. This allows you to cut costs.

In this architecture, producers and consumers are not connected. In addition, it allows you to easily add new consumers to the system, since there is no point-to-point integration.

Wrapping Up

In this article, we talked about why traditional architectures do not meet the requirements of modern business applications, and how the event-driven architecture that has emerged in response opens up new opportunities for technology and business. In the new architecture, we drop the event-command pattern and just emit events. This approach allows you to create flexible systems that are easily scalable, can work around the clock without interruption, and are easy to evolve.

By Sveta Gimpelson
Is It Time To Go Back to the Monolith?
Is It Time To Go Back to the Monolith?

History repeats itself. Everything old is new again and I’ve been around long enough to see ideas discarded, rediscovered, and return triumphantly to overtake the fad. In recent years SQL has made a tremendous comeback from the dead. We love relational databases all over again. I think the monolith will have its space odyssey moment again. Microservices and serverless are trends pushed by the cloud vendors, designed to sell us more cloud computing resources. Microservices make very little sense financially for most use cases. Yes, they can ramp down. But when they scale up, they pay the costs in dividends. The increased observability costs alone line the pockets of the “big cloud” vendors. I recently led a conference panel that covered the subject of microservices vs. monoliths. The consensus in the panel (even with the pro-monolith person), was that monoliths don’t scale as well as microservices. This is probably true for the monstrous monoliths of old that Amazon, eBay, et al. replaced. Those were indeed huge code bases in which every modification was painful and their scaling was challenging. But that isn’t a fair comparison. Newer approaches usually beat the old approaches. But what if we build a monolith with newer tooling, would we get better scalability? What would be the limitations and what does a modern monolith even look like? Modulith To get a sense of the latter part you can check out the Spring Modulith project. It’s a modular monolith that lets us build a monolith using dynamic isolated pieces. With this approach, we can separate testing, development, documentation, and dependencies. This helps with the isolated aspect of microservice development with little of the overhead involved. It removes the overhead of remote calls and the replication of functionality (storage, authentication, etc.). The Spring Modulith isn’t based on Java platform modularization (Jigsaw). They enforce the separation during testing and in runtime, this is a regular Spring Boot project. It has some additional runtime capabilities for modular observability, but it’s mostly an enforcer of “best practices." This value of this separation goes beyond what we’re normally used to with microservices but also has some tradeoffs. Let’s give an example. A traditional Spring monolith would feature a layered architecture with packages like this: com.debugagent.myapp com.debugagent.myapp.services com.debugagent.myapp.db com.debugagent.myapp.rest This is valuable since it can help us avoid dependencies between layers; e.g., the DB layer shouldn’t depend on the service layer. We can use modules like that and effectively force the dependency graph in one direction: downwards. But this doesn’t make much sense as we grow. Each layer will fill up with business logic classes and database complexities. With a Modulith, we’d have an architecture that looks more like this: com.debugagent.myapp.customers com.debugagent.myapp.customers.services com.debugagent.myapp.customers.db com.debugagent.myapp.customers.rest com.debugagent.myapp.invoicing com.debugagent.myapp.invoicing.services com.debugagent.myapp.invoicing.db com.debugagent.myapp.invoicing.rest com.debugagent.myapp.hr com.debugagent.myapp.hr.services com.debugagent.myapp.hr.db com.debugagent.myapp.hr.rest This looks pretty close to a proper microservice architecture. We separated all the pieces based on the business logic. Here the cross-dependencies can be better contained and the teams can focus on their own isolated area without stepping on each other's toes. 
That’s a lot of the value of microservices without the overhead. We can further enforce the separation deeply and declaratively using annotations. We can define which module uses which and force one-way dependencies, so the human resources module will have no relation to invoicing. Neither would the customer module. We can enforce a one-way relationship between customers and invoicing and communicate back using events. Events within a Modulith are trivial, fast, and transactional. They decouple dependencies between the modules without the hassle. This is possible to do with microservices but would be hard to enforce. Say invoicing needs to expose an interface to a different module. How do you prevent customers from using that interface? With modules, we can. Yes. A user can change the code and provide access, but this would need to go through a code review and that would present its own problems. Notice that with modules we can still rely on common microservice staples such as feature flags, messaging systems, etc. You can read more about the Spring Modulith in the docs and in Nicolas Fränkel's blog. Every dependency in a module system is mapped out and documented in code. The Spring implementation includes the ability to document everything automatically with handy up-to-date charts. You might think, dependencies are the reason for Terraform. Is that the right place for such “high-level” design? An Infrastructure as Code (IaC) solution like Terraform could still exist for a Modulith deployment, but they would be much simpler. The problem is the division of responsibilities. The complexity of the monolith doesn’t go away with microservices as you can see in the following image (taken from this thread). We just kicked that can of worms down to the DevOps team and made their lives harder. Worse, we didn’t give them the right tools to understand that complexity so they have to manage this from the outside. That’s why infrastructure costs are rising in our industry, where traditionally prices should trend downwards. When the DevOps team runs into a problem they throw resources at it. This isn’t the right thing to do in all cases. Other Modules We can use Standard Java Platform Modules (Jigsaw) to build a Spring Boot application. This has the advantage of breaking down the application and a standard Java syntax, but it might be awkward sometimes. This would probably work best when working with external libraries or splitting some work into common tools. Another option is the module system in Maven. This system lets us break our build into multiple separate projects. This is a very convenient process that saves us from the hassle of enormous projects. Each project is self-contained and easy to work with. It can use its own build process. Then as we build the main project everything becomes a single monolith. In a way, this is what many of us really want. What About Scale? We can use most of the microservice scaling tools to scale our monoliths. A great deal of the research related to scaling and clustering was developed with monoliths in mind. It’s a simpler process since there’s only one moving part: the application. We replicate additional instances and observe them. There’s no individual service that’s failing. We have fine-grained performance tools and everything works as a single unified release. I would argue that scaling is simpler than the equivalent microservices. We can use profiling tools and get a reasonable approximation of bottlenecks. 
Our team can easily (and affordably) set up staging environments to run tests. We have a single view of the entire system and its dependencies. We can test an individual module in isolation and verify performance assumptions. Tracing and observability tools are wonderful. But they also affect production and sometimes produce noise. When we try to follow through on a scaling bottleneck or a performance issue, they can send us down the wrong rabbit hole. We can use Kubernetes with monoliths just as effectively as we can use it with microservices. Image size would be larger but if we use tools like GraalVM, it might not be much larger. With this, we can replicate the monolith across regions and provide the same fail-over behavior we have with microservices. Quite a few developers deploy monoliths to Lambdas. I’m not a fan of that approach as it can get very expensive, but it works. The Bottleneck But there’s still one point where a monolith hits a scaling wall: the database. Microservices achieve a great deal of scale thanks to the fact that they inherently have multiple separate databases. A monolith typically works with a single data store. That is often the real bottleneck of the application. There are ways to scale a modern DB. Clustering and distributed caching are powerful tools that let us reach levels of performance that would be very difficult to match within a microservice architecture. There’s also no requirement for a single database within a monolith. It isn’t out of the ordinary to have an SQL database while using Redis for cache. But we can also use a separate database for time series or spatial data. We can use a separate database for performance as well, although in my experience this never happened. The advantages of keeping our data in the same database are tremendous. The Benefits The fact that we can complete a transaction without relying on “eventual consistency” is an amazing benefit. When we try to debug and replicate a distributed system, we might have an interim state that’s very hard to replicate locally or even understand fully from reviewing observability data. The raw performance removes a lot of the network overhead. With properly tuned level 2 caching we can further remove 80-90% of the read IO. This is possible in a microservice but would be much harder to accomplish and probably won’t remove the overhead of the network calls. As I mentioned before, the complexity of the application doesn’t go away in a microservice architecture. We just moved it to a different place. In my experience so far, this isn’t an improvement. We added many moving pieces into the mix and increased overall complexity. Returning to a smarter and simpler unified architecture makes more sense. Why Use Microservices The choice of programming language is one of the first indicators of affinity to microservices. The rise of microservices correlates with the rise of Python and JavaScript. These two languages are great for small applications. Not so great for larger ones. Kubernetes made scaling such deployments relatively easy, thus it added gasoline to the already growing trend. Microservices also have some capability of ramping up and down relatively quickly. This can control costs in a more fine-grained way. In that regard, microservices were sold to organizations as a way to reduce costs. This isn’t completely without merit. If the previous server deployment required powerful (expensive) servers this argument might hold some water. 
This might be true for cases where usage is extreme; a sudden very high load followed by no traffic. In these cases, resources might be acquired dynamically (cheaply) from hosted Kubernetes providers. One of the main selling points for microservices is the logistics aspect. This lets individual Agile teams solve small problems without fully understanding the “big picture." The problem is, it enables a culture where each team does “its own thing." This is especially problematic during downsizing where code rot sets in. Systems might still work for years but be effectively unmaintainable. Start With Monolith, Why Leave? One point of consensus in the panel was that we should always start with a monolith. It’s easier to build and we can break it down later if we choose to go with microservices. But why should we? The complexities related to individual pieces of software make more sense as individual modules, not as individual applications. The difference in resource usage and financial waste is tremendous. In this time of cutting down costs, why would people still choose to build microservices instead of a dynamic, modular monolith? I think we have a lot to learn from both camps. Dogmatism is problematic as is a religious attachment to one approach. Microservices did wonders for Amazon. To be fair their cloud costs are covered. On the other hand, the internet was built on monoliths. Most of them aren’t modular in any way. Both have techniques that apply universally. I think the right choice is to build a modular monolith with proper authentication infrastructure that we can leverage in the future if we want to switch to microservices. You can check out a video version of this post here:

By Shai Almog CORE
What Is Policy-as-Code? An Introduction to Open Policy Agent
What Is Policy-as-Code? An Introduction to Open Policy Agent

In the cloud-native era, we often hear that "security is job zero," which means it's even more important than any number one priority. Modern infrastructure and methodologies bring us enormous benefits, but, at the same time, since there are more moving parts, there are more things to worry about: How do you control access to your infrastructure? Between services? Who can access what? Etc.

There are many questions to be answered, including policies: a bunch of security rules, criteria, and conditions. Examples:

Who can access this resource?
Which subnet is egress traffic allowed from?
Which clusters must a workload be deployed to?
Which protocols are not allowed for servers reachable from the Internet?
Which registry can binaries be downloaded from?
Which OS capabilities can a container execute with?
Which times of day can the system be accessed?

All organizations have policies, since they encode important knowledge about complying with legal requirements, working within technical constraints, avoiding repeated mistakes, etc. Since policies are so important today, let's dive deeper into how to best handle them in the cloud-native era.

Why Policy-as-Code?

Policies are based on written or unwritten rules that permeate an organization's culture. So, for example, there might be a written rule in our organization explicitly saying: for servers accessible from the Internet on a public subnet, it's not a good practice to expose a port using the non-secure "HTTP" protocol. How do we enforce it? If we create infrastructure manually, a four-eye principle may help: always have a second person alongside when doing something critical. If we do Infrastructure as Code and create our infrastructure automatically with tools like Terraform, a code review could help. However, the traditional policy enforcement process has a few significant drawbacks:

You can't guarantee this policy will never be broken. People can't be aware of all the policies at all times, and it's not practical to manually check against a list of policies. For code reviews, even senior engineers will not likely catch all potential issues every single time.

Even if we had the best teams in the world enforcing policies with no exceptions, it's difficult, if not impossible, to scale. Modern organizations are more likely to be agile, which means the number of employees, services, and teams keeps growing. There is no way to physically staff a security team to protect all of those assets using traditional techniques.

Policies could be (and will be) breached sooner or later because of human error. It's not a question of "if" but "when." And that's precisely why most organizations (if not all) do regular security checks and compliance reviews before a major release, for example. We violate policies first and then create ex post facto fixes.

I know, this doesn't sound right. What's the proper way of managing and enforcing policies, then? You've probably already guessed the answer, and you are right. Read on.

What Is Policy-as-Code (PaC)?

As business, teams, and maturity progress, we'll want to shift from manual policy definition to something more manageable and repeatable at the enterprise scale. How do we do that? First, we can learn from successful experiments in managing systems at scale:

Infrastructure-as-Code (IaC): treat the content that defines your environments and infrastructure as source code.
DevOps: the combination of people, process, and automation to achieve "continuous everything," continuously delivering value to end users.

Policy-as-Code (PaC) is born from these ideas. Policy as code uses code to define and manage policies, which are rules and conditions. Policies are defined, updated, shared, and enforced using code, leveraging Source Code Management (SCM) tools. By keeping policy definitions in source code control, whenever a change is made, it can be tested, validated, and then executed. The goal of PaC is not to detect policy violations but to prevent them. This leverages DevOps automation capabilities instead of relying on manual processes, allowing teams to move more quickly and reducing the potential for mistakes due to human error.

Policy-as-Code vs. Infrastructure-as-Code

The "as code" movement isn't new anymore; it aims at "continuous everything." The concept of PaC may sound similar to Infrastructure as Code (IaC), but while IaC focuses on infrastructure and provisioning, PaC improves security operations, compliance management, data management, and beyond. PaC can be integrated with IaC to automatically enforce infrastructural policies. Now that we've got the PaC vs. IaC question sorted out, let's look at the tools for implementing PaC.

Introduction to Open Policy Agent (OPA)

The Open Policy Agent (OPA, pronounced "oh-pa") is a Cloud Native Computing Foundation (CNCF) project. It is an open-source, general-purpose policy engine that aims to provide a common framework for applying policy-as-code to any domain. OPA provides a high-level declarative language (Rego, pronounced "ray-go," purpose-built for policies) that lets you specify policy as code. As a result, you can define, implement, and enforce policies in microservices, Kubernetes, CI/CD pipelines, API gateways, and more. In short, OPA decouples decision-making from policy enforcement. When a policy decision needs to be made, you query OPA with structured data (e.g., JSON) as input, and OPA returns the decision:

Policy Decoupling

OK, less talk, more work: show me the code.

Simple Demo: Open Policy Agent Example

Prerequisite

To get started, download an OPA binary for your platform from GitHub releases. On macOS (64-bit):

curl -L -o opa https://openpolicyagent.org/downloads/v0.46.1/opa_darwin_amd64
chmod 755 ./opa

(Tested on an M1 Mac as well.)

Spec

Let's start with a simple example to achieve attribute-based access control (ABAC) for a fictional Payroll microservice. The rule is simple: you can only access your own salary information or your subordinates', not anyone else's. So, if you are bob, and john is your subordinate, then you can access the following:

/getSalary/bob
/getSalary/john

But accessing /getSalary/alice as user bob would not be possible.

Input Data and Rego File

Let's say we have the structured input data (input.json file):

{
  "user": "bob",
  "method": "GET",
  "path": ["getSalary", "bob"],
  "managers": {
    "bob": ["john"]
  }
}

And let's create a Rego file.
Here we won't go too deep into the syntax of Rego; the comments should give you a good understanding of what this piece of code does.

File example.rego:

package example

default allow = false                            # deny by default

allow = true {                                   # allow if:
    input.method == "GET"                        # the method is GET
    input.path = ["getSalary", person]
    input.user == person                         # and the input user is the person being queried
}

allow = true {                                   # allow if:
    input.method == "GET"                        # the method is GET
    input.path = ["getSalary", person]
    subordinates := input.managers[input.user]
    subordinates[_] == person                    # and the person reports to the input user
}

Run

The following should evaluate to true:

./opa eval -i input.json -d example.rego "data.example"

If we change the path in the input.json file to "path": ["getSalary", "john"], it still evaluates to true, since the second rule allows a manager to check their subordinates' salaries. However, if we change the path to "path": ["getSalary", "alice"], it evaluates to false. Here we go. Now we have a simple working solution of ABAC for microservices!

Policy as Code Integrations

The example above is very simple and only useful to grasp the basics of how OPA works. But OPA is much more powerful and can be integrated with many of today's mainstream tools and platforms, like Kubernetes, Envoy, AWS CloudFormation, Docker, Terraform, Kafka, Ceph, and more. To quickly demonstrate OPA's capabilities, imagine Terraform code defining an auto-scaling group and a server on AWS. With a corresponding Rego policy, we can calculate a score based on the Terraform plan and return a decision according to the policy. It's super easy to automate the process:

terraform plan -out tfplan to create the Terraform plan
terraform show -json tfplan | jq > tfplan.json to convert the plan into JSON format
opa exec --decision terraform/analysis/authz --bundle policy/ tfplan.json to get the result.
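In practice, a microservice usually doesn't shell out to the opa binary; it queries a running OPA server over its REST API. As a minimal sketch (assuming the policy above is served locally with ./opa run --server example.rego, listening on OPA's default port 8181), a Python service could ask for a decision like this:

import json
import urllib.request

# Assumes OPA is running locally: ./opa run --server example.rego (default port 8181).
OPA_URL = "http://localhost:8181/v1/data/example/allow"

def is_allowed(user, person):
    # Mirror the demo's input document; in a real deployment the manager data
    # would more likely be loaded into OPA as data, not passed with every query.
    payload = {
        "input": {
            "user": user,
            "method": "GET",
            "path": ["getSalary", person],
            "managers": {"bob": ["john"]},
        }
    }
    request = urllib.request.Request(
        OPA_URL,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response).get("result", False)

print(is_allowed("bob", "john"))   # expected: True, john reports to bob
print(is_allowed("bob", "alice"))  # expected: False, denied by default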

By Tiexin Guo
Event Driven 2.0
Event Driven 2.0

The amount of data that needs to be processed, filtered, connected, and stored is constantly growing, and companies that can process that data quickly have an advantage. Ideally, it should happen in real time. Event-driven architectures are used for this. Such architectures allow exchanging messages in real time, storing them in a single database, and using distributed systems for sending and processing messages. In this article, we will talk about how event-driven architecture differs from the other popular approach, message-driven architecture. In addition, we will cover the most popular event-driven architecture patterns and their main advantages and disadvantages.

Event-Driven vs. Message-Driven Architecture

An event is some action that took place at a certain point in time. It is generated by a service and does not have specific recipients; any system component can be an event consumer. A message is a fixed packet of data that is sent from one service to another. An event is a type of message that signals a change in the state of the system. In an event-driven architecture, the component that generates the event tells other components where it will be stored. That way, any component can save, process, and respond to an event, and the producer won't know who the consumer is. In message-driven systems, components that create messages send them to a specific address. After sending a message, the component immediately receives control and does not wait for the message to be processed. In a message-driven architecture, if a similar message needs to be sent to multiple recipients, the sender must send it to each recipient separately. In contrast, in an event-driven architecture, a producer generates an event once and sends it to a processing system. After that, the event can be consumed by any number of subscribers connecting to that system.

Event-Driven Patterns

There are different approaches to implementing an event-driven architecture, and several of them are often used together when designing a program. In this section, we will talk about the most popular patterns that allow you to implement an event-driven architecture, their advantages, and their application areas.

Global Event Streaming Platform

It is vital for today's companies to respond to events in real time. Customers expect the business to respond immediately to various events. So, there is a need for software architectures that meet modern business requirements and can process data as event streams, not only data at rest. A great solution is to use a global event streaming platform. It allows you to process business functions as event streams. In addition, such a platform is fault-tolerant and scalable. All events that occur in the system are recorded in the data streaming platform once; external systems read these events and process them in real time. Event streaming platforms consist of a diverse set of components, and building them requires significant resources and engineering experience. This pattern is quite popular and is used in many industries.

Central Event Store

The central event store ensures the publication and storage of events in a single database. It is a single endpoint for events of various types. This allows applications and services to respond to events in real time without delays or data loss. Applications can easily subscribe to a variety of events, reducing development costs.
The central event store is used for a variety of purposes: Publication of changes for consumer services. Reconstruction of past states and conducting business analytics. Search for events. Saving all program changes as a sequence of events. A single endpoint for notification based on application state changes. Monitoring system status. Using a central event store, you can create new applications and use existing events without having to republish them. Event-First and Event Streaming Applications Event streaming allows you to receive data from various sources, such as applications, databases, various sensors, Internet devices, etc., process, clean, and use them without first saving them. This method of event processing provides fast results and is very important for companies that transfer large amounts of data and require to receive information quickly. Key benefits of event streaming platforms: Improving the customer experience: Customers can instantly learn about changes in the status of their orders, which improves their experience and, accordingly, increases the company’s income. Risk reduction: Systems that use streaming events allow the detection of fraud on the Internet, can stop a suspicious transaction, or block a card. Reliability: Event streaming platforms allow for robust systems that can effectively handle subscriber failures. Feedback in real time: Users can see the results of their operations immediately after their execution, without waiting a minute. Event streaming systems require less infrastructure and data to support, so they are simple and fast to build. Applications that use this architecture pattern are used in various industries, such as financial trading, risk and fraud detection in financial systems, the Internet of Things, retail, etc. CQRS Command Query Responsibility Segregation (CQRS) is the principle of separating data structures for reading and writing information. It is used to increase the performance, security, and scalability of the software. The application of CQRS is quite useful in complex areas where a single model for reading and writing data is too complex. When it is separated, it is greatly simplified. This is especially noticeable when the number of reading and writing operations is significantly different. Different databases can be used for reading and storing data. In this case, they must be synchronized. To do this, the recording model must publish an event every time it updates the database. Advantages of CQRS: Independent scaling of reading and write workloads. Separate optimized data schemes for reading and writing. The use of more flexible and convenient models thanks to the separation of concerns. Ability to avoid complex queries by storing the materialized view in the read database. The CQRS template is useful for the following scenarios: In systems where many users simultaneously access the same data. When data read performance should be configured separately from data write performance. This is especially important when the number of reads greatly exceeds the number of writes. When the read-and-write models are created by different development teams. With frequent changes in business rules. In cases where the logic of the system and the user interface are quite simple, this template is not recommended. Event Sourcing An event sourcing stores the system state as a sequence of events. Whenever the system state changes, a new event is added to the event list. 
However, this system state can be restored by reprocessing the events in the future. All events are stored in the event store, which is an event database. A good example of such an architecture is a version control system: its event store is the log of all commits, and a working copy of the source tree is the system state.

The main advantages of event sourcing:

Events can be published each time the system state changes.
It ensures a reliable audit log of changes.
The state of the system can be determined at any point in time.
It makes it easier to move from a monolithic application to a microservice architecture through the use of loosely coupled objects that exchange events.

However, reconstructing the state of business objects by querying the event store directly is difficult and inefficient. Therefore, the system should use Command Query Responsibility Segregation (CQRS) to implement queries, which means applications have to work with eventually consistent data.

Automated Data Provisioning

This pattern provides fully self-service data provisioning. The user determines what data they need, in what format, and where it should be stored: for example, in a database, distributed cache, or microservice. The selected repository can be used together with the central repository. The system provides the client with infrastructure along with pre-loaded data and manages the flow of events. The processor processes and filters data streams according to user requirements. The use of cloud infrastructure makes such a system faster and more practical. Systems that use automated data provisioning are used in finance, retail, and the Internet, both on-premises and in the cloud.

Advantages and Disadvantages of Event-Driven Architecture

Even though event-driven architecture is quite popular now, is developing rapidly, and helps to solve many business problems, there are also some disadvantages to this approach. In this section, we list the main advantages and disadvantages of event-driven architecture.

Advantages

Autonomy: The loose coupling of components allows event producers and consumers to function independently of each other. This decoupling allows you to use different programming languages and technologies for different components. In addition, producers and consumers can be added to and removed from the system without affecting the other participants.

Fault tolerance: Events are published immediately after they occur, and various services and programs subscribe to these events. If a consumer shuts down, events continue to be published and queued; when the consumer reconnects, it can handle these events.

Real-time user interaction for a better user experience.

Economy: The consumer receives the message immediately after the producer has published it, eliminating the need for constant polling to check for the event. This reduces CPU consumption and network bandwidth usage.

Disadvantages

Error handling is difficult. Because event producers and consumers can be numerous and loosely connected, it is difficult to trace the actions between them and identify the root cause of a malfunction.

Inability to predict events occurring in different periods. Since events are asynchronous, the order in which they occur cannot be predicted. Additionally, there may be duplicates, each of which may require a contextual response. This requires additional testing time and deep system analysis to prevent data loss.
The weak connection between the systems that generate the events and the systems that receive them can be a point of vulnerability that attackers can exploit. Conclusion Event-driven architectures have been around for a long time and were used to pass messages. In connection with modern business needs, they are actively developing and offering better opportunities for connection and management. Event-driven architectures are indispensable for real-time event processing, as required by many modern systems. To implement such an architecture, there are several different patterns, each of which has its advantages and application features. When designing a system, you need to clearly define its functions and applications to choose the right pattern. Although event-driven architecture is widely used and has many advantages, its features can sometimes have a bad effect on the system, which must be taken into account during its design.

By Sveta Gimpelson
Multi-Tenant Architecture for a SaaS Application on AWS
Multi-Tenant Architecture for a SaaS Application on AWS

SaaS applications are the new normal nowadays, and software providers are looking to transform their applications into Software-as-a-Service applications. For this, the only solution is to build a multi-tenant SaaS architecture.

Have you ever wondered how Slack, Salesforce, AWS (Amazon Web Services), and Zendesk can serve multiple organizations? Does each one have unique, custom cloud software per customer? For example, have you ever noticed that, on Slack, you have your own URL, "yourcompanyname.slack.com?" Most people think that, in the background, they created a particular environment for each organization—application or codebase—and believe that Slack customers have their own server/app environment. If this is you, you might have assumed they have a repeatable process to run thousands of apps across all their customers. Well, no. The real solution is a multi-tenant architecture on AWS for a SaaS application.

Let's start with this impressive fact: 70% of all web apps are considered SaaS applications, according to IDC Research. So, if you know about SaaS architecture and multi-tenancy, you are probably covering 70% of the web app architecture landscape that will be available in the future.

"70% of all web apps are SaaS, but only a few of them are multi-tenant."

This research is intended to give an overview of the strategies, challenges, and constraints that DevOps and software developers are likely to face when architecting a SaaS multi-tenant application. There are two concepts that are important to understand before starting: single-tenant and multi-tenant architecture. The following sections explore what a multi-tenant architecture means for your SaaS application.

What Is Multi-Tenant Architecture?

First of all, you need to understand what single-tenant and multi-tenant architecture are:

Single-tenant architecture (siloed model): a single architecture per organization, where the application has its own infrastructure, hardware, and software ecosystem. Let's say you have ten organizations; in this case, you would need to create ten standalone environments, and your SaaS application, or company, would function as a single-tenant architecture. Additionally, it implies more costs, more maintenance, and a level of difficulty in updating across the environments.

Multi-tenant architecture: an ecosystem or model where a single environment can serve multiple tenants utilizing a scalable, available, and resilient architecture. The underlying infrastructure is completely shared, logically isolated, and with fully centralized services. The multi-tenant architecture evolves according to the organization or subdomain (organization.saas.com) that is logged into the SaaS application and is totally transparent to the end user.

Bear in mind that in this paper, we will discuss two multi-tenant architecture models, one for the application layer and one for the database layer.

Multi-Tenant Benefits

The adoption of a multi-tenant architecture approach will bring extensive, valuable benefits for your SaaS application. Let's go through the main contributions:

A reduction of server infrastructure costs utilizing a multi-tenant architecture strategy: Instead of creating a SaaS environment per customer, you include one application environment for all your customers. This enables your AWS hosting costs to be dramatically reduced from hundreds of servers to a single one.

One single source of trust: Let's say again you have a customer using your SaaS.
Imagine how many code repositories you would have per customer. At least 3-4 branches per customer, which would be a lot of overhead and misaligned code releases. Even worse, visualize the process of deploying your code to the entire farm of tenants; it is extremely complicated. This is unviable and time-consuming. With a multi-tenant SaaS architecture, you avoid this type of conflict, where you’ll have one codebase (source of trust), and a code repository with a few branches (dev/test/prod). By following the below practice—with a single command (one-click-deployment)—you will quickly perform the deployment process in a few seconds. Cost reductions of development and time-to-market: Cost reduction considers a sequence of decisions to make, such as having a single codebase, a SaaS platform environment, a multi-tenant database architecture, a centralized storage, APIs, and following The Twelve-Factor Methodology. All of them will allow you to reduce development labor costs, time-to-market, and operational efficiencies. SaaS Technology Stack for an Architecture on AWS To build a multi-tenant architecture, you need to integrate the correct AWS web stack, including OS, language, libraries, and services to AWS technologies. This is just the first step towards creating a next-generation multi-tenant architecture. Even though we will surface a few other multi-tenant architecture best practices, this article will be primarily oriented to this AWS SaaS web stack. Let’s dive into our SaaS Technology Stack on AWS: Programming Language It doesn’t really matter which language platform you select. What is vital is that your application can scale, utilize multi-tenant architecture best practices, cloud-native principles, and a well-known language by the open-source community. The latest trends to build SaaS applications are Python + React + AWS. Another “variant” is Node.js + React + AWS, but in the end, the common denominators are always AWS and React. If you are a financial company, ML or AI, with complex algorithms or backend work, I’ll say you should go for Python. On the other hand, if you are using modern technologies like real-time chats, mini feeds, streaming, etc. then go for Node.js. There is a market in the banking sector that is leveraging Java, but that’s for established enterprises. Any new SaaS application better goes with the mentioned web stack. Again, this is just what I’ve noticed as a trend, and what the community is demanding. Note: This data comes from a survey we performed a few months ago for financial services and SaaS companies. Ideal Languages Cloud Provider As a team of DevOps experts, I’ve noticed a cloud variation in the last two years, and which corresponds to these percentages: 70% of our DevOps implementations are based on AWS, 25% with Azure, and 5% go to GCP and digital ocean. Each year the trend is similar, with the exception that Azure is gradually growing with the years. Those are not only my words, but also ideas supported by multiple DevOps partners. So, I strongly recommend deploying your SaaS application under AWS. It has a number of benefits; every day there is a new service available for you, and a new feature that facilitates your development and deployment. Totally recommended to deploy your SaaS on AWS. Microservices If you are planning to leverage the cloud, you must leverage cloud-native principles. One of these principles is to incorporate microservices with Docker. 
Make sure your SaaS application is under microservices, which brings multiple benefits, including flexibility and standardization, easier to troubleshoot, problems isolation, and portability. Just like the cloud, Docker and microservices have transformed the IT ecosystem and will stay for a long while. Container Orchestration Platform This is a complicated and abstract decision; there are three options in AWS to manage, orchestrate, and create a microservice cluster environment: Amazon ECS: It is the natural Amazon container orchestration system in the AWS ecosystem. (Highly recommended for startups, small SaaS, and medium SaaS). Amazon Fargate: Almost serverless and price and management is per task. Minimal operational effort vs. ECS. There are some studies conducted by our DevOps team; in terms of performance. Fargate can be slower than ECS, so for this particular case, I would recommend Amazon ECS, instead of Fargate. Another thought is that if your team is pure developers and not planning to hire a DevOps engineer, perhaps Fargate is the way to go. Amazon EKS: It is a managed service that makes Kubernetes on AWS easy to manage. Use Amazon EKS instead of deploying a Kubernetes cluster on an EC2 instance, set up the Kubernetes networking, and worker nodes. (Recommended for large SaaS apps and a sophisticated DevOps and web development Team). Database The inherent database will be PostgreSQL with Amazon RDS. However, I strongly recommend that if you have a senior development team, and are projecting a high-traffic for your SaaS application—or even hundreds of tenants—you’d better architect your database with MongoDB. In addition to this, utilize the best practices that will be mentioned below about multi-tenant database. In this case, I would go for Amazon RDS with PostgreSQL or DynamoDB (MongoDB). “If you are projecting a high-traffic for your SaaS application, you’d better architect your database with MongoDB.” GraphQL or Amazon AppSync GraphQL is a query language and an alternative to a RESTful API for your database services. This new and modern ecosystem is adopted as a middleman among the client and the database server. It allows you to retrieve database data faster, mitigate the over-fetching in databases, retrieve the accurate data needed from the GraphQL schema, and maintaining the speed of development by iterating more quickly than a RESTful service. Adopting a monolithic backend application into a multi-tenant microservice architecture is the perfect time to leverage GraphQL or AppSync. Hence, when transforming your SaaS application, don’t forget to include GraphQL! Note: I didn’t include this service in the AWS SaaS architecture diagram, because it is implemented in multiple ways, and it would require an in-depth explanation on this topic. Automation You need a mechanism to trigger or launch new tenants/organizations and attach it to your multi-tenant SaaS architecture. Let’s say you have a new client that just subscribed to your SaaS application, how do you include this new organization inside your environment, database, and business logic? You need an automated process to launch new tenants; this is called Infrastructure as Code (IaC). This script/procedure should live within a git/bitbucket repository, one of the fundamental DevOps principles. A strong argument to leverage automation and IaC is that you need a mechanism to automate your SaaS application for your code deployments. 
In the same lines, automate the provisioning of new infrastructure for your Dev/Test environments. Infrastructure as Code and Automation Tools It doesn’t matter which Infrastructure as Code tool to use, they are both useful (Terraform and CloudFormation); they do the same job, and are highly known by the DevOps community. I don’t have a winner, they are both good. Terraform (from Hashicorp): A popular cloud-agnostic tool. Used widely for all DevOps communities. It is easier to find DevOps with this skill. Amazon CloudFormation: It is easier to integrate with Amazon Web Services, AWS built-in Automation tool. Whenever there is a new Amazon technology just released, the compatibility with AWS and CloudFormation is released sooner than Terraform. Trust on an AWS CloudFormation expert to automate and release in a secure manner. Message Queue System (MQS) The common MQS are Amazon SQS, RabbitMQ, or Celery. What I suggest here is to utilize the service that requires you less operation, in this case, is Amazon SQS. There are multiple times you need asynchronous communication. From delaying or scheduling a task, to increasing reliability and persistence with critical web transactions, decoupling your monolithic or micro-service application, and, most importantly: using a queue system to communicate event-driven serverless applications (Amazon Lambda functions). Caching System AWS ElastiCache is a caching and data storage system that is fully scalable, available, and managed. It aims to improve the application performance of distributed cache data and in-memory data structure stores. It’s an in-memory key-value store for Memcached and Redis engines. With a few clicks, you can run this AWS component entirely self-managed. It is essential to include a caching system for your SaaS application. Cloud Storage System Amazon S3 and Amazon CloudFront CDN for your static content. All static content, including images, media and HTML, will be hosted on Amazon S3—the cloud system with infinite storage and elasticity. In front of Amazon S3 we will include AWS CloudFront, integrating this pair of elements is vital, in order to cache the entire static content and reduce bandwidth costs. SaaS Web Stack: Multi-Tenant SaaS Architecture Example on AWS Types of Multi-Tenant SaaS Architectures One of the most important questions among the multi-tenant adoption would be which multi-tenant architecture suits better for your SaaS application on AWS. We will explore the two layers needed to enable your application to act as a real SaaS platform since it is paramount to decide which multi-tenant architecture you’ll incorporate in your SaaS platfrom, the application, and database layer. These two types of multi-tenant architectures are: The application layer multi-tenancy. The database layer multi-tenancy. The Application Layer Multi-Tenancy The application layer is an architectural design that enables hosting for tenants and is primarily delivered for Software as a Service applications (SaaS apps). In this first model, the application layer is commonly shared among multiple customers. Monolithic Architecture for SaaS If you haven’t seen this article before—or if you have already developed and architected your own SaaS application—I’m sure you have fallen into this approach. The monolithic components include EC2 instances in the web tier, app tier, and Amazon RDS with MySQL for your database. The monolithic architecture is not a bad approach, with the exception that you are wasting resources massively in the mentioned tiers. 
At least around 50% and 70% of your CPU/RAM usage is wasted due to the nature of the monolithic (cloud) architecture. Monolithic Architecture Diagram Microservices Architecture for SaaS With Containers and Amazon ECS Microservices are a recommended type of architecture since they provide a balance between modernization and maximum use of available cloud resources (EC2 instances and compute units). As well as it introduces a decomposed system with more granular services (microservices). We won’t touch on much about the microservice benefits since it’s widely expressed in the community. However, I’ll recommend you to utilize the formula of multi-tenant architecture + AWS Services + microservices + Amazon ECS as the container orchestrator; they can be the perfect match. Mainly, consider that Amazon ECS gives fewer efforts to configure your cluster and more NoOps for your DevOps team. “By 2022, 90% of all new apps will feature microservices architectures that improve the ability to design, debug, update, and leverage third-party code; 35% of all production apps will be cloud-native.” —Source: Forbes, 2019 With a talented team, the best multi-tenant SaaS architecture approach would be this use case scenario. Along the same lines, it covers the SaaS software and architecture’s main attributes, including agility, innovation, repeatability, reduced cycle time, cost efficiency, and manageability. The Perfect Match Multi-tenant architecture + AWS Services + microservices + Amazon ECS (as the container orchestrator). Microservices Architecture Diagram Kubernetes Architecture for SaaS With Amazon EKS You may be wondering: what about Kubernetes or Amazon EKS? Well, Kubernetes is another alternative of microservice architecture that adds an extra layer of complexity in the SaaS equation. However, you can overcome this complexity by leveraging Amazon EKS, Amazon Elastic Container Service for Kubernetes; the managed Kubernetes service from Amazon, which is a de facto service by the Kubernetes community. What highlights of this component from the rest of the architectures is that it provides the use of namespaces. This attribute aids to isolate every tenant and its own environment within the corresponding Kubernetes cluster. In this sense, you don’t have to create different clusters per each tenant (you could, but, to satisfy a different approach). By using ResourceQuota, you can limit the resources used per namespace and avoid creating noise to the other tenants. Another point to consider is that if you would like to isolate your namespaces, you need to include Kubernetes network policies because, by default, the networking is open and can communicate across namespaces and containers. Here is a comparison of Amazon ECS vs Kubernetes. If you have a SaaS enterprise, I’ll recommend better to control your microservice via Amazon EKS or Kubernetes since it allows you to have more granular changes. So, what would a Kubernetes multi-tenant architecture look like? Here is a simple Kubernetes multi-tenant architecture and siloed by its respective namespaces. Kubernetes Architecture Diagram A simple multi-tenant architecture with Kubernetes and siloed by Kubernetes namespaces. Serverless Architecture for SaaS on AWS The dream of any AWS architect is to create a multi-tenant SaaS architecture with a serverless approach. That’s a dream that can come true as a DevOps or SaaS architect, but it especially adds a fair amount of complexity as a tradeoff. 
Additionally, it requires a reasonable amount of collaboration time with your dev team, extensive changes of code application, and a transformative serverless mindset. Given that, in a few years, it will be the ultimate solution, and it all depends on the talent, capabilities, and use case. A Serverless SaaS architecture enables applications to obtain more agility, resilience, and fewer development efforts, a truly NoOps ecosystem. At a high-level, what are the new parts of this next-generation serverless SaaS architecture? Every call becomes an isolated tenant call, either going to a logical service (Lambda function) or going to the database data coming from the Amazon API Gateway as an entry point in the serverless SaaS application. Now that you have decoupled every logical service, the authentication and authorization module needs to be handled by a third-party service like Amazon Cognito, which will be the one in charge to identify the tenant, user, tier, IAM tenant role, and bring back an STS token with these aspects. Particularly, the API Gateway will route all the tenant functions to the correct Lambda functions matching the STS Token. Here is a diagram of a multi tenant architecture example for AWS SaaS applications that are using serverless. Serverless Architecture Diagram The Database Layer Multi-Tenancy The multi-tenancy concept comes with different architecture layers. We have already advocated the multi-tenancy application layer and its variants. Now, it is time to explore multi-tenancy in the database layer, which is another aspect to discover. Paradoxically, the easiest and cost-effective multi-tenant database architecture is the pure and real database multi-tenancy. The database layer is right the opposite of the previous model, the application layer. Over here, the DB layer is kept in common among tenants, and the application layer is isolated. As a next step, you need to evaluate what multi-tenant database architecture to pursue with tables, schemas, or siloed databases. When choosing your database architecture, there are multiple criterias to assess: Scalability: Number of tenants, storage per-tenant, workload. Tenant isolation Database costs: Per tenant costs. Development complexity: Changes in schemas, queries, etc. Operational complexity: Database clustering, update tenant data, database administration, and maintenance. Single Database: A Table Per Tenant A table per tenant single database refers to a pure database multi-tenancy and pooled model. This database architecture is the common and default solution by DevOps or software architects. It is very cost-effective when having a small startup or a few dozen organizations. It consists of leveraging a table per each organization within a database schema. There are specific trade-offs for this architecture, including the sacrifice of data isolation, noise among tenants, and performance degradation—meaning one tenant can overuse compute and ram resources from another. Lastly, every table name has its own tenantID, which is very straightforward to design and architect. In regard to data isolation and compliance, let’s say that one of your developers makes a mistake and brings the wrong tenant information to your customer. Imagine the data breach—please ensure never to expose information from more than one tenant. That’s why compliant SaaS applications, this architecture model is not recommended, however, is used widely because of its cost-effectiveness. 
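As a minimal sketch of the table-per-tenant idea (SQLite is used here purely for illustration; the table layout, tenant IDs, and helper names are hypothetical), every query is scoped to a table derived from the authenticated tenant:

import sqlite3

conn = sqlite3.connect(":memory:")

def table_for(tenant_id):
    # The tenant_id must come from the authenticated session, never from raw user input.
    assert tenant_id.isalnum(), "unexpected tenant identifier"
    return f"invoices_{tenant_id}"

def onboard_tenant(tenant_id):
    conn.execute(f"CREATE TABLE {table_for(tenant_id)} (id INTEGER PRIMARY KEY, amount REAL)")

def add_invoice(tenant_id, amount):
    conn.execute(f"INSERT INTO {table_for(tenant_id)} (amount) VALUES (?)", (amount,))

def list_invoices(tenant_id):
    return conn.execute(f"SELECT id, amount FROM {table_for(tenant_id)}").fetchall()

onboard_tenant("org1")
onboard_tenant("org2")
add_invoice("org1", 99.0)
print(list_invoices("org1"))  # org1 sees only its own rows
print(list_invoices("org2"))  # empty; a coding mistake here is exactly how tenant data leaks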
Alternative Single-Tenant Database Architecture

A shared table, in a single schema, in a single database. Perfect for DynamoDB. (FYI: we didn't cover this approach.)

Single Database: A Schema Per Tenant

A schema per tenant in a single database, also known as the bridge model, is a multi-tenant database approach that is still very cost-effective and more secure than pure tenancy (the DB pooled model): you remain on a single database, except that each tenant gets its own database schema. If you are concerned about data partitioning, this solution is slightly better than the previous one (a table per tenant). Similarly, it is simple to manage multiple schemas in your application code configuration. One important distinction to note is that more than 100 schemas or tenants within a database can cause a lag in database performance. Hence, it is recommended to split the database into two (adding the second database as a replica). The best database tool for this approach is PostgreSQL, which supports multiple schemas without much complexity. Lastly, this schema-per-tenant strategy shares resources, compute, and storage across all its tenants. As a result, it produces noisy tenants that use more resources than expected.

Database Server Per Tenant

Also called the siloed model, where you need a database instance per customer. Expensive, but the best for isolation and security compliance. This technique is significantly more costly than the rest of the multi-tenant database architectures, but it complies with security regulations and is the best for performance, scalability, and data isolation. This pattern uses one database server per tenant, meaning that if the SaaS app has 100 tenants, there will be 100 database servers, which is extremely costly. When PCI, HIPAA, or SOC2 compliance is needed, it is vital to use a siloed database model, or at least find a workaround with the correct IAM roles, the best container orchestration (either Kubernetes or Amazon ECS namespaces), a VPC per tenant, and encryption everywhere.

Multi-Tenant Database Architecture Tools

Amazon RDS with PostgreSQL (best option).
DynamoDB (a great option for a single database with a single shared table).
Amazon RDS with MySQL.
GraphQL: as described previously, use it in front of any of these databases to increase data-retrieval speed and development speed; as an alternative to a RESTful API, it helps offload requests from the backend servers to the client.

Application Code Changes

Once you have selected your multi-tenant strategy for every layer, let's consider what needs to change at the code level. If you have decided to adopt Django (Python) for your SaaS application, you need a few tweaks to align your current application with your multi-tenant architecture at the database and application layers. Fortunately, web application languages and frameworks are able to capture the URL or subdomain that comes with the request. The ability to obtain this information (the subdomain) at runtime is critical for handling dynamic subdomains in your multi-tenant architecture. We won't cover in depth which lines of code you need to add to your Django application, or to any other framework, but here are the items to consider in this section.

Python Django Multi-Tenancy in a Nutshell

Add an app called tenant.py: a class for tenantAwareModel with multiple pool classes.
How to identify tenants: give each tenant a subdomain; to do so, you need a few DNS changes, Nginx/Apache tweaks, and a utility method (utils.py). Whenever you have a request, you can use this method to get the tenant (see the sketch after this list).
Determine how to extract the tenant using the host header (subdomain).
Admin isolation.
Note: The previous code suggestions could change depending on the architecture.
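As a rough illustration of the utility method mentioned above, here is a minimal sketch of extracting the tenant from the host header in Django; the tenant registry and module layout are assumptions for the example.

# utils.py (sketch)
from django.http import Http404, HttpRequest

TENANTS = {"org1", "org2"}  # hypothetical registry of known tenant subdomains

def hostname_from_request(request: HttpRequest) -> str:
    # Normalize "org1.saas.com:8000" -> "org1.saas.com"
    return request.get_host().split(":")[0].lower()

def tenant_from_request(request: HttpRequest) -> str:
    subdomain = hostname_from_request(request).split(".")[0]
    if subdomain not in TENANTS:
        raise Http404("Unknown tenant")
    return subdomain

A view or middleware can then call tenant_from_request(request) to pick the right schema, database alias, or table prefix for the current tenant.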
Wildcard DNS Subdomain: URL-Based SaaS Platform

Basically, every organization must have its own subdomain, and subdomains are quite useful for identifying organizations. Per tenant, it is a unique dedicated space, environment, and custom application (at least logically); for example, "org1.saas.com," "org2.saas.com," and so on. This URL structure will dynamically provision your SaaS multi-tenant application, and this DNS change will facilitate the identification, authentication, and authorization of every tenant. Another workaround is path-based per-tenant routing, which is not recommended; for example, "app.saas.com/org1/…," "app.saas.com/org2/…," and so on.

So, the following is required in this particular section:
A wildcard record should be in place in your DNS management records. This wildcard subdomain redirects all routes to your multi-tenant architecture (either to the load balancer, application server, or cluster endpoint).
Similarly, a CNAME record labeled (*) pointing to your "app.saas.com" or "saas.com/login." An asterisk (*) means a wildcard for your app domain.
As a final step, another (A) record pointing your "app.saas.com" domain to your Amazon ECS cluster, ALB, or IP.

DNS Records Entries

"*.saas.com" CNAME "app.saas.com"
"app.saas.com" A 1.2.3.4
OR
"app.saas.com" A (alias) balancer.us-east-1.elb.amazonaws.co

Note: An (A) alias record is used when you are utilizing an ALB/ELB (load balancer) from AWS.

Web Server Setup With NGINX Configuration

Let's move down to your web server, specifically Nginx. At this stage, you will need to configure your nginx.conf and server blocks (virtual hosts). Set up a wildcard vhost for your Nginx web server. Make sure it is an alias (ServerAlias) and a catch-all wildcard site. You don't have to create a subdomain virtual host in Nginx per tenant; instead, set up a single wildcard virtual host for all your tenants. The wildcard pattern will match your subdomains and route accordingly to the correct and unique path of your SaaS app document root.

SSL Certificates

Don't forget to deal with the certificates for your tenant subdomains. You would need to add them either to the CloudFront CDN, the load balancer, or your web server.

Note: This solution can also be accomplished with the Apache web server.

Follow the 12-Factor Methodology Framework

Following the 12-factor methodology represents pure DevOps and cloud-native principles, including immutable infrastructure, dev/test and prod parity with Docker, CI/CD principles, a stateless SaaS application, and more.

Multi-Tenant SaaS Architecture Best Practices

How is your SaaS platform going to scale? The multi-tenant SaaS architecture best practices are:
Amazon Auto Scaling, either with EC2 instances or microservices.
Database replication with Amazon RDS, Amazon Aurora, or DynamoDB.
An Application Load Balancer.
A CloudFront CDN for your static content.
Amazon S3 for all your static/media content.
A caching system such as Redis/Memcached or its equivalent in the AWS cloud, Amazon ElastiCache.
A multi-availability-zone setup for redundancy and availability.

Code Deployments With CI/CD

Another crucial aspect to consider is how to deploy your code releases across tenants and your multiple environments (dev, test, and prod). You will need a continuous integration and continuous delivery (CI/CD) process to streamline your code releases across all environments and tenants. If you follow my previous best practices, it won't be difficult.

Tools to Embrace CI/CD

CI/CD tools: Jenkins, CircleCI, or AWS CodePipeline (along with CodeBuild and CodeDeploy).

My advice: If you want a sophisticated DevOps team and a widely known tool, go for Jenkins; otherwise, go for CircleCI. If you want to keep leveraging AWS technologies exclusively, go for AWS CodePipeline. But if you're working with compliance, banks, or regulated environments, go for GitLab.

DevOps Automation: Automate Your New Tenant Creation Process

How are you creating new tenants for each subscription? Identify the process for launching new tenants into your SaaS environment. You need to trigger a script to launch or attach the new tenant environment to your existing multi-tenant architecture, meaning you should automate the setup of new tenants. Consider whether it runs after your customer registers on your onboarding page, or whether you trigger the script manually.

Automation Tools

Terraform (recommended).
AWS CloudFormation (trust an AWS CloudFormation certified team).
Ansible.

Note: Ensure you apply Infrastructure as Code principles here.

Siloed Compute and Siloed Storage

How will your architecture be isolated from other tenants? Every layer of the SaaS application needs to be isolated. The customer workflow touches multiple layers: pages, backend, networking, front end, storage, and more. So, what is your isolation strategy? Keep in mind the following aspects:
IAM roles per function or microservice.
Amazon S3 security policies.
VPC isolation.
Amazon ECS/Kubernetes namespace isolation.
Database isolation (tenant per table/schema/siloed database).

Tenant Compute Capacity

Have you considered how many SaaS tenants each environment can support? Suppose you have 99 tenants and compute/database load is almost at its limits: do you have an environment ready to support new tenants? What about the databases? Or suppose a particular customer wants an isolated tenant environment for its SaaS application: how would you support an extra tenant environment separated from the rest of the multi-tenant architecture? Would you do it? What are the implications? Consider a scenario for this aspect.

Tenant Clean-Up

What are you doing with tenants that are idle or no longer used? Perhaps you run a clean-up process for any tenant that has been inactive for a prolonged period, or you remove unused resources and tenants by hand; either way, you need a process or an automation script.

Final Thoughts

Multi-tenant architecture and SaaS applications on AWS. What a topic we just covered! Now you understand the whole multi-tenant SaaS architecture cycle end to end, including server configuration, code, and which architecture to pursue at every IT layer. As you can see, there is no one-size-fits-all solution for this ecosystem. There are multiple variants for each IT layer, whether fully multi-tenant, partially multi-tenant, or siloed tenants. It comes down to what you need, your budget, the complexity, and the expertise of your DevOps team.
I strongly recommend going for microservices (ECS/EKS) and a partially multi-tenant SaaS approach in the application and database layers. Also include cloud-native principles and, finally, adopt the multi-tenant architecture best practices and considerations described in this article. That being said, brainstorm your AWS SaaS architecture first by thinking about how to gain agility and cost efficiency, reduce IT labor costs, and leverage a nearshore collaboration model, which adds another layer of cost savings. In this regard, automation with Terraform and CloudFormation is our best choice. Even better, most of our AWS/DevOps projects follow PCI, HIPAA, and SOC2 regulations. If you are a fintech, healthcare, or SaaS company, you know this type of requirement should be included in your processes.

By Alfonso Valdes
Introduction to Kubernetes Event-Driven Auto-Scaling (KEDA)

Manual scaling is slowly becoming a thing of the past. Autoscaling is now the norm, and organizations that deploy into Kubernetes clusters get built-in autoscaling features like HPA (Horizontal Pod Autoscaling) and VPA (Vertical Pod Autoscaling). But these solutions have limitations. For example, it's difficult for HPA to scale the number of pods back to zero or to (de)scale pods based on metrics other than memory or CPU usage. As a result, KEDA (Kubernetes Event-Driven Autoscaling) was introduced to address some of these challenges in autoscaling K8s workloads.

In this blog, we will delve into KEDA and discuss the following points:
What is KEDA?
KEDA architecture and components
KEDA installation and demo
Integrating KEDA in CI/CD pipelines

What Is KEDA?

KEDA is a lightweight, open-source Kubernetes event-driven autoscaler that DevOps, SRE, and Ops teams use to scale pods horizontally based on external events or triggers. KEDA extends the capability of native Kubernetes autoscaling solutions, which rely on standard resource metrics such as CPU or memory. You can deploy KEDA into a Kubernetes cluster and manage the scaling of pods using custom resource definitions (CRDs).

Tabular comparison between different Kubernetes autoscaling features: VPA, KEDA, and HPA

Built on top of Kubernetes HPA, KEDA scales pods based on information from event sources such as AWS SQS, Kafka, RabbitMQ, etc. These event sources are monitored using scalers, which activate or deactivate deployments based on the rules set for them. KEDA scalers can also feed custom metrics for a specific event source, helping DevOps teams observe the metrics relevant to them.

What Problems Does KEDA Solve?

KEDA helps SREs and DevOps teams with a few significant issues:

Freeing Up Resources and Reducing Cloud Cost

KEDA scales down the number of pods to zero when there are no events to process. This is harder to do using the standard HPA, and it helps ensure effective resource utilization and cost optimization, ultimately bringing down cloud bills.

Interoperability With the DevOps Toolchain

As of now, KEDA supports 59 built-in scalers and four external scalers. External scalers include KEDA HTTP, KEDA Scaler for Oracle DB, etc. Using external events as triggers aids efficient autoscaling, especially for message-driven microservices like payment gateways or order systems. Furthermore, since KEDA can be extended by developing integrations with any data source, it can easily fit into any DevOps toolchain.

KEDA interoperability

KEDA Architecture and Components

As mentioned in the beginning, KEDA and HPA work together to achieve autoscaling. Because of that, KEDA needs only a few components to get started.

KEDA Components

Refer to Fig. A and let us explore some of the components of KEDA.

Fig. A: KEDA architecture

Event Sources: These are the external event/trigger sources by which KEDA changes the number of pods. Prometheus, RabbitMQ, and Apache Pulsar are some examples of event sources.
Scalers: Event sources are monitored using scalers, which fetch metrics and trigger the scaling of Deployments or Jobs based on the events.
Metrics Adapter: The metrics adapter takes metrics from scalers and translates or adapts them into a form that the HPA/controller component can understand.
Controller: The controller/operator acts upon the metrics provided by the adapter and brings about the desired deployment state specified in the ScaledObject (see below).
KEDA CRDs

KEDA offers four custom resources to carry out its autoscaling functions: ScaledObject, ScaledJob, TriggerAuthentication, and ClusterTriggerAuthentication.

ScaledObject and ScaledJob: ScaledObject represents the mapping between event sources and objects and specifies the scaling rules for a Deployment, StatefulSet, Job, or any Custom Resource in a K8s cluster. Similarly, ScaledJob is used to specify scaling rules for Kubernetes Jobs.

Below is an example of a ScaledObject that configures KEDA autoscaling based on Prometheus metrics. Here, the deployment object 'keda-test-demo3' is scaled based on the trigger threshold (50) from Prometheus metrics. KEDA will scale the number of replicas between a minimum of 1 and a maximum of 10, and scale down to 0 replicas if the metric value drops below the threshold.

apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: prometheus-scaledobject
  namespace: demo3
spec:
  scaleTargetRef:
    apiVersion: argoproj.io/v1alpha1
    kind: Rollout
    name: keda-test-demo3
  triggers:
    - type: prometheus
      metadata:
        serverAddress: http://<prometheus-host>:9090
        metricName: http_request_total
        query: envoy_cluster_upstream_rq{appId="300", cluster_name="300-0", container="envoy", namespace="demo3", response_code="200" }
        threshold: "50"
  idleReplicaCount: 0
  minReplicaCount: 1
  maxReplicaCount: 10

TriggerAuthentication and ClusterTriggerAuthentication: They manage the authentication or secrets needed to monitor event sources.

Now let us see how all these KEDA components work together and scale K8s workloads.

How Do KEDA Components Work?

It is easy to deploy KEDA on any Kubernetes cluster, as it doesn't need to overwrite or duplicate existing functionality. Once it is deployed and the components are ready, event-based scaling starts with the external event source (refer to Fig. A). The scaler continuously monitors for events based on the source set in the ScaledObject and passes the metrics to the metrics adapter in case of any trigger events. The metrics adapter then adapts the metrics and provides them to the controller component, which scales the deployment up or down based on the scaling rules set in the ScaledObject.

Note that KEDA activates or deactivates a deployment by scaling the number of replicas to zero or one. It then triggers HPA to scale the number of workloads from one to n based on the cluster resources.

KEDA Deployment and Demo

KEDA can be deployed in a Kubernetes cluster through Helm charts, the Operator Hub, or YAML declarations. For example, the following method uses Helm to deploy KEDA.

# Add the Helm repo
helm repo add kedacore https://kedacore.github.io/charts

# Update the Helm repo
helm repo update

# Install the KEDA Helm chart
kubectl create namespace keda
helm install keda kedacore/keda --namespace keda

To check whether the KEDA operator and metrics API server are up after the deployment, you can use the following command:

kubectl get pod -n keda

Now, watch the below video to see a hands-on demo of autoscaling using KEDA. The demo uses a small application called TechTalks and uses RabbitMQ as the message broker.

Integrate KEDA in CI/CD Pipelines

KEDA makes autoscaling of K8s workloads very easy and efficient. In addition, the vendor-agnostic approach of KEDA ensures flexibility regarding event sources. As a result, it can help DevOps and SRE teams optimize the cost and resource utilization of their Kubernetes cluster by scaling up or down based on event sources and metrics of their choice.
Integrating KEDA in CI/CD pipelines enables DevOps teams to quickly respond to trigger events in their application's resource requirements, further streamlining the continuous delivery process. KEDA also supports events generated by different workloads such as StatefulSets, Jobs, and Custom Resources. All of this helps reduce downtime and improve applications' efficiency and user experience.

By Jyoti Sahoo
Event Streams Are Nothing Without Action

Each data point in a system that produces data on an ongoing basis corresponds to an Event. Event Streams are described as a continuous flow of events or data points. Event Streams are sometimes referred to as Data Streams within the developer community since they consist of continuous data points. Event Stream Processing refers to the action taken on generated Events.

This article discusses Event Streams and Event Stream Processing in depth, covering how Event Stream Processing works, the contrast between Event Stream Processing and Batch Processing, and its benefits and use cases, and it concludes with illustrative examples of Event Stream Processing.

Event Streams: An Overview

Coupling between services is one of the most significant difficulties associated with microservices. Conventional architecture is a "don't ask, don't tell" architecture in which data is collected only when requested. Suppose there are three services in question: A, B, and C. Service A asks the other services, "What is your present state?" and assumes they are always ready to respond. This puts the user in a difficult position if the other services are unavailable. Microservices use retries as a workaround to compensate for network failures or any negative impacts brought on by changes in the network topology. However, this ultimately adds another layer of complexity and increases the expense.

To address the problems of the conventional design, event-driven architecture adopts a "tell, don't ask" philosophy. In the example above, Services B and C publish continuous streams of data as Events, and Service A subscribes to these Event Streams. Service A may then evaluate the facts, aggregate the outcomes, and cache them locally. Utilizing Event Streams in this manner has various advantages, including:
Systems are capable of closely imitating actual processes.
Increased usage of scale-to-zero functions (serverless computing), as more services are able to stay idle until required.
Enhanced adaptability.

The Concept of Event Stream Processing

Event Stream Processing (ESP) is a collection of technologies that facilitate the development of an Event-driven architecture. As previously stated, Event Stream Processing is the process of reacting to Events created by an Event-driven architecture. One may act on Events in a variety of ways, including conducting calculations, transforming data, analyzing data, and enriching data. You may design a pipeline of such actions to convert Event data, which is the heart of Event Stream Processing and will be detailed in the following section.

The Basics of Event Stream Processing

Event Stream Processing consists of two separate technologies. The first is a system that logically stores Events, and the second is software used to process Events. The first component is responsible for data storage and saves information based on a timestamp. Recording the outside temperature every minute for a whole day is an excellent illustration of Streaming Data; in this scenario, each Event consists of the temperature measurement and the precise time of the measurement. Stream Processors or Stream Processing Engines constitute the second component. Most often, developers use Apache Kafka to store and process Events temporarily. Kafka also enables the creation of Event Stream-based pipelines in which processed Events are transferred to further Event Streams for additional processing.
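As a minimal, framework-free sketch of the idea (the event shape, window size, and threshold are assumptions made for the example), a small processing pipeline for the temperature stream described above could look like this:

from dataclasses import dataclass
from datetime import datetime, timedelta

@dataclass
class TemperatureEvent:
    timestamp: datetime  # when the measurement was taken
    celsius: float       # the measurement itself

def rolling_average(events, window=timedelta(seconds=30)):
    """Consume an (endless) iterator of events and yield the average over a time window."""
    buffer = []
    for event in events:
        buffer.append(event)
        # Keep only the readings that fall inside the window ending at this event
        buffer = [e for e in buffer if event.timestamp - e.timestamp <= window]
        yield event.timestamp, sum(e.celsius for e in buffer) / len(buffer)

def overheating_alerts(averages, threshold=45.0):
    # Downstream step of the pipeline: act only when the windowed average crosses a threshold
    for ts, avg in averages:
        if avg > threshold:
            yield ts, avg  # another op could publish this to a further Event Stream

In a production pipeline, the source would typically be a Kafka topic and each step would publish its output to a further stream, exactly as described above.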
Event Stream Processing vs. Batch Processing

With the development of technology, businesses deal with a much larger amount of data than they did ten years ago. Therefore, more sophisticated data processing technologies are necessary to keep up with this rate of change. A conventional application is responsible for the collection, storage, and processing of data, as well as the storage of the processed outputs. Typically, these procedures occur in batches, so your application must wait until it has sufficient data to begin processing. The amount of time your application may have to wait for data is unacceptable for time-sensitive or real-time applications that need quick data processing.

To solve this difficulty, Event Streams enter the fray. In Event Stream Processing, every single data point or Event is handled instantaneously, meaning there is no backlog of data points, which makes it perfect for real-time applications. In addition, Stream Processing enables the detection of patterns, the examination of data at different levels of attention, and the simultaneous examination of data from numerous Streams. Because it spreads the operations out across time, Event Stream Processing requires much less hardware than Batch Processing.

The Benefits of Using Event Stream Processing

Event Stream Processing is used when quick action must be taken on Event Streams. As a result, Event Stream Processing will emerge as the solution of choice for managing massive amounts of data, with the greatest impact on the prevalent high-speed technologies of today. Several advantages of incorporating Event Stream Processing into your workflow are as follows:
Event Stream Pipelines can be developed to fulfill advanced Streaming use cases. For instance, using an Event Stream Pipeline, one may enhance Event data with metadata and modify such objects for storage.
Utilizing Event Stream Processing in your workflow enables you to make choices in real time.
You can simply expand your infrastructure as the data volume grows.
Event Stream Processing offers continuous Event Monitoring, enabling the creation of alerts to discover trends and abnormalities.
You can examine and handle massive volumes of data in real time, allowing you to filter, aggregate, or enrich the data prior to storage.

Event Streams Use Cases

As the Internet of Things (IoT) evolves, so does the demand for real-time analysis. As data processing architecture becomes more Event-driven, ESP continues to grow in importance. Event Streaming is used in a variety of application cases that span several sectors and organizations. Let's examine a few industries that have profited from incorporating Event Stream Processing into their data processing methodologies. Besides helping big sectors, it also addresses specific problems we face on a daily basis. Here are some examples of how this can be used.

Use Case 1: Pushing GitHub Notifications Using Event Streams

Event streams are a great way to stay up to date on changes to your codebase in real time. By configuring an event stream and subscribing to the events you're interested in, you can receive push notifications whenever there is activity in your repository. We hope this use case will help you understand how to use event streams for GitHub push notifications. Here we take the example of a Chrome extension that makes use of event streams to provide real-time GitHub push notifications.
The GitHub Notifier extension for Google Chrome allows you to see notifications in real time whenever someone interacts with one of your GitHub repositories. This is a great way to stay on top of your project's activity and respond quickly to issues or pull requests. The extension is available for free from the Chrome Web Store. Simply install it and then sign in with your GitHub account. Once you've done that, you'll start receiving notifications whenever someone mentions you, comments on one of your repositories, or even stars one of your repositories. You can also choose to receive notifications for specific events, such as new releases or new pull requests. Stay up to date on all the latest activity on your GitHub repositories with GitHub Notifier!

Use Case 2: Internet of Things in Industry (IIoT)

In the context of automating industrial processes, businesses may incorporate an IIoT solution by including a number of sensors that communicate data streams in real time. These sensors may be installed in the hundreds, and their data streams are often pooled by IoT gateways, which can deliver a continuous stream of data further into the technology stack. Enterprises would need to apply an event stream processing approach in order to make use of the data, analyze it to detect trends, and swiftly take action on them. This stream of events would be consumed by the event streaming platform, which would then execute real-time analytics. For instance, we may be interested in tracking the average temperature over the course of 30 seconds. After that, we want the temperature to be surfaced only if it surpasses 45 °C. When this condition is satisfied, the warning may be utilized by other programs to alter their processes in real time and prevent overheating.

There are many technologies that can help automate these processes. Camunda's Workflow Engine is one of them; it implements this process automation and executes processes that are defined in Business Process Model and Notation (BPMN), the global standard for process modeling. BPMN provides an easy-to-use visual modeling language for automating your most complex business processes. If you want to get started with Camunda workflows, the Camunda connectors are a good starting point.

Use Case 3: Payment Processing

Rapid payment processing is an excellent use of event stream processing for mitigating user experience concerns and undesirable behaviors. For instance, if a person wishes to make a payment but encounters significant delays, they may refresh the page, causing the transaction to fail and leaving them uncertain as to whether their account has been debited. Similarly, when dealing with machine-driven payments, the delay may have a large ripple effect, particularly when hundreds of payments are backed up. This might result in repeated attempts or timeouts. To support the smooth processing of tens of thousands of concurrent requests, we may leverage event stream processing to guarantee a consistent user experience throughout. A payment request event may be sent from a topic to an initial payments processor, which updates the overall amount of payments being processed at the moment. A subsequent event is then created and forwarded to a different processor, which verifies that the payment can be completed. A final event is then generated, and the user's balance is updated by another processor.
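Here is a minimal sketch of the first hop in that payment chain, using the kafka-python client; the broker address, topic names, and message shape are assumptions made only for the example.

import json
from kafka import KafkaConsumer, KafkaProducer

consumer = KafkaConsumer(
    "payment-requests",                 # hypothetical input topic
    bootstrap_servers="localhost:9092",
    value_deserializer=lambda v: json.loads(v.decode("utf-8")),
)
producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)

in_flight = 0
for message in consumer:
    payment = message.value
    in_flight += 1                      # track how many payments are currently being processed
    payment["status"] = "accepted"
    # Hand the event off to the next processor in the chain for validation
    producer.send("payments-to-validate", payment)

Each subsequent processor in the chain would follow the same consume-transform-produce pattern on its own topic.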
Use Case 4: Cybersecurity

Cybersecurity systems collect millions of events in order to identify new risks and comprehend relationships between occurrences. To reduce false positives, cybersecurity technologies use event stream processing to augment threats and provide context-rich data. They do this by following a sequence of processes, including:
Collect events from diverse data sources, such as consumer settings, in real time.
Filter event streams so that only relevant data enters the topics, to eliminate false positives or benign attacks.
Leverage streaming apps in real time to correlate events across several source interfaces.
Forward priority events to other systems, such as security information and event management (SIEM) systems or security orchestration, automation, and response (SOAR) systems.

Use Case 5: Airline Optimization

We can create real-time apps to enhance the experience of passengers before, during, and after flights, as well as the overall efficiency of the process. We can coordinate and react effectively if we make crucial events, such as customers scanning their boarding passes at the gate, accessible across all the back-end platforms used by airlines and airports. For example, based on this one sort of event, we can enable three distinct use cases:
Accurately predicting take-off times and predicting delays.
Reducing the amount of assistance necessary for connecting passengers by providing real-time data.
Reducing a single flight's impact on the on-time performance of other flights.

Use Case 6: E-Commerce

Event stream processing can be used in an e-commerce application to facilitate "viewing through to purchasing." To do this, we may build an initial event stream to capture the events made by shoppers, with three separate event kinds feeding the stream:
A customer sees an item.
A customer adds an item to their shopping cart.
A customer puts in an order.

We may support our use cases by applying discrete processes or algorithms, such as:
An hourly sales calculator that parses the stream for 'Customer puts order' events and keeps a running tally of total revenue for each hour.
A product look-to-book tracker that reads 'Customer sees item' events from the stream and keeps track of the overall number of views for each product. Additionally, it parses 'Customer puts order' events from the stream and keeps track of the total number of units sold for each product.
An abandoned cart detector that reads all three kinds of events, identifies customers who have abandoned their shopping cart, and creates a new 'Customer abandons cart' event that is posted to a new topic.

Conclusion

In a world that is increasingly driven by events, Event Stream Processing (ESP) has emerged as a vital practice for enterprises. Event streams are becoming an increasingly important data source as more and more companies move to a streaming architecture. The benefits of using event streams include real-time analytics, faster response times, and an improved customer experience, and they offer many advantages over traditional batch processing. In addition, there are a number of use cases for event streams that can help you solve specific business problems. If you're looking for a way to improve your business performance, consider using event stream processing.

By Avraham Neeman
Dagster: A New Data Orchestrator To Bring Data Closer to Business Value

In this article, we are going to analyze a new data orchestrator: Dagster. In our opinion, this is the first generation of data orchestrators that brings data pipelines closer to critical business processes, turning them into real business data processes for mission-critical solutions. To describe Dagster's capabilities and use cases, we are going to provide some context about patterns, plus some historical information that is necessary to understand the business value it brings.

In the last decade, many trends have revolved around the orchestration and choreography patterns. Here is a simple description of each:

Orchestration: A well-defined workflow orchestrated and centralized by an orchestration system. An orchestration system is like a musical orchestra, where a conductor gives direction to the musicians to set the tempo and ensure correct entries. There are three main characteristics of orchestration:
It provides a centralized workflow that makes it easy to visualize the business or data process.
The workflow is managed by a single layer, which is the most critical component. If the orchestration system is down, there is no business service; without a conductor, there is no concert.
It is very flexible and can be integrated into different architecture patterns such as APIs, event-driven, RPC, or data processes.

Choreography: Based on an event-driven or streaming architecture, the goal is that every component in the architecture works decoupled and has its own responsibility to make decisions about the actions it has to take. There are three main characteristics of choreography:
It has to be based on an event-driven or streaming pattern.
There is no single, centralized layer, so there is no single point of failure unless you have a single message broker.
It provides more scalability and flexibility, but also more complexity when trying to understand the process.

Orchestrators and business process management software have always been close to the business layer, which increased their popularity in companies' strategic tech roadmaps. The first generation of BPM systems started around the year 2000 and were technologies for software engineers. A few years later, between 2014 and 2018, event-driven and choreography patterns started to increase in popularity, first with Netflix and then with the appearance of streaming platforms such as Apache Kafka.

The data world always lags a bit behind the software world, although in my opinion we are moving toward a scenario where the operational and analytical worlds will not be isolated. Architectures and team topologies where the analytical and operational layers work as two different minds do not work when companies need to apply a data-driven approach, where data is at the core of the decision-making process.

What Happened to the World of Data

When the concept of big data started to become popular, the first data orchestrators appeared, such as Apache Oozie (2010, by Yahoo): based on DAG XML configurations and scheduled workflows, and very focused on the Hadoop ecosystem. A little later, Apache Airflow (2015, by Airbnb) appeared, based on Python. It provided more capabilities, such as moving from DAG XML configuration to programmatic configuration and more integrations outside the Hadoop ecosystem, but it is also a scheduled workflow system. In between the two appeared Luigi (2012, by Spotify): based on Python and pipeline-oriented instead of DAG-oriented, but including interesting software best practices such as A/B testing.
<workflow-app name="useooziewf" xmlns="uri:oozie:workflow:0.1">
    ...
    <decision name="mydecision">
        <switch>
            <case to="reconsolidatejob">
                ${fs:fileSize(secondjobOutputDir) gt 10 * GB}
            </case>
            <case to="rexpandjob">
                ${fs:fileSize(secondjobOutputDir) lt 100 * MB}
            </case>
            <case to="recomputejob">
                ${ hadoop:counters('secondjob')[RECORDS][REDUCE_OUT] lt 1000000 }
            </case>
            <default to="end"/>
        </switch>
    </decision>
    ...
</workflow-app>

Apache Airflow was the first real evolution of the data orchestrator, but in our opinion it has some points of improvement that make it a product very focused on the traditional world of data and not on the new reality in which data is becoming the center of decision-making in companies:
It has a poor UI, totally oriented toward data engineers.
It is mainly oriented toward executing tasks, without knowing what those tasks do.
Everything is a task, increasing the complexity in terms of maintainability and comprehension.
At the end of 2021, it introduced the concept of sensors, which are a special type of operator designed to wait for an external event such as a Kafka event, a JMS message, or a time-based trigger.

import datetime

import pendulum
import requests
from airflow.decorators import dag, task

@dag(
    dag_id="example_complex",
    schedule_interval="0 0 * * *",
    start_date=pendulum.datetime(2022, 1, 1, tz="UTC"),
    catchup=False,
    dagrun_timeout=datetime.timedelta(minutes=60),
)
def ExampleComplex():
    ...

    @task
    def get_data():
        ...

    @task
    def merge_data():
        ...

dag = ExampleComplex()

All these developments in orchestration solutions have something in common: the challenges these companies (Yahoo, Airbnb, and Spotify) faced in managing the complexity of their data pipelines.

Data-Driven: The New Era of Data

This first generation of orchestration tools was very focused on the data engineer's experience and based on traditional analytical and operational platform architectures such as data lakes, data hubs, or experimental data science workspaces. David and I (Miguel) began our journey into the world of data around the year 2017. Before that, we had been working on operational, mission-critical solutions close to business processes and based on event-driven architectures. At that moment we found that tools such as Oozie or Airflow were ETL tools focused on scheduled tasks/workflows, with no cloud solution offering, complex enterprise implementation and maintenance, poor scalability, and a poor user experience. Our first thought at that moment was that these were the tools we would have to be content with for now, but that we would not use them in the coming years.

Nowadays, the data-driven approach has changed everything; every day, the border between analytical and operational workloads is more diffuse than ever. There are many business-critical processes based on analytical and operational workloads. Many of these critical processes would probably have looked very similar to a non-critical data pipeline a few years ago. In 2020, Zhamak Dehghani published an article about the principles of data mesh, some of them quite well-known. She wrote one sentence particularly significant for us regarding the operational and analytical planes: "I do believe that at some point in the future, our technologies will evolve to bring these two planes even closer together, but for now, I suggest we keep their concerns separate." Our opinion is that these planes are closer than we think, and in terms of business value, they are more achievable on a small scale than the logical architecture of data mesh itself.
For instance, consider the fashion retail sector and critical processes such as allocation, e-commerce, or logistics solutions. All these processes were operational years ago, and many of them used traditional applications like SAP or Oracle; today, they need sophisticated data processes that include big data ingestion, transformation, analytics, and machine learning models to provide real-time recommendations and demand forecasts that allow data-driven decision-making. Of course, traditional data pipelines/workflows are still needed, and traditional solutions based on isolated operational and analytics platforms will continue to provide value for some reports, batch analytical processes, experiments, and other analytical innovations. But today there are other kinds of data processes: business data processes that have different needs and provide more business value. We need data solutions that provide the following capabilities:
A better software development experience and management of data as code, applying all the best practices such as isolated environments, unit testing, A/B testing, data contracts, etc.
A friendly and rich integration ecosystem, not only with the big data world but with the data world in general.
Cloud-based solutions that provide easy scalability, low maintenance, and easy integration with IAM solutions.
A much better user experience, not only for developers but also for data scientists, data analysts, business analysts, and operational teams.

Introducing Dagster

Dagster is a platform for data flow orchestration that goes beyond what we understand as a traditional data orchestrator. The project was started in 2018 by Nick Schrock and was conceived as a result of a need he identified while working at Facebook. One of the goals of Dagster has been to provide a tool that removes the barrier between pipeline development and pipeline operation, but along this journey, it came to link the world of data processing with business processes. Dagster provides significant improvements over previous solutions:
It is oriented to data engineers, developers, and data/business operations engineers: its versatility and abstraction allow us to design pipelines in a more developer-oriented way, applying software best practices and managing data and data pipelines as code.
It complies with a first-principles approach to data engineering, covering the full development lifecycle: development, deployment, monitoring, and observability.
It includes a new and differentiating concept, the software-defined asset. An asset is a data object or machine learning model defined in Dagster and persisted in a data repository.
The Dagit UI is a web-based interface for viewing and interacting with Dagster objects: a very intuitive, user-friendly interface that makes operation very simple.
It is an open-source solution that also offers a SaaS cloud offering that accelerates implementation.
It has a really fast learning curve that enables development teams to deliver value very early on.

Let's Talk a Little About the Dagster Concepts

We will explain the basic concepts with simple definitions and examples that will allow us to build a simple business data pipeline in the next article of this series.
Common Components of an Orchestrator

At a high level, these are the basic components of any orchestrator:
Job: The main unit of execution and monitoring; an instantiation of a Graph with configurations and parameters.
Ops: Basically, the tasks we want to execute. They contain the operations and should perform simple tasks such as executing a database query (to ingest or retrieve data), initiating a remote job (Spark, Python, etc.), or sending an event.
Graph: A set of interconnected ops or sub-graphs; ops can be assembled into a graph to accomplish complex tasks.

Software-Defined Assets

An asset is an object in persistent storage, such as a table, a file, or a persisted machine learning model. A software-defined asset is a Dagster object that couples an asset to the function and upstream assets used to produce its contents. This is a declarative data definition, in Python, that allows us to:
Define data contracts as code and how they will be used in our pipelines.
Define the composition of our entity, regardless of its physical structure, through a projection of the data.
Decouple the business logic that computes assets from the I/O logic that reads from and writes to persistent storage.
Apply testing best practices, including the ability to use mocks when writing a unit test.
The ability to define partitions opens up a world of possibilities for running our processes and also for re-launching them. Sometimes we only have to reprocess a partition that, for some reason, has been processed incorrectly or incompletely.
An amazing capability is being able to use external tools such as dbt, Snowflake, Airbyte, Fivetran, and many other tools to define our assets. It is amazing because it allows us to integrate with our global platform and not only with the big data ecosystem.

Options for Launching Jobs

In this case, the great differentiator is the set of capabilities these launch options provide:
Schedules: Used to execute a job at a fixed interval.
Sensors: Allow us to run based on some external event, such as a Kafka event, new files in S3, a specific asset materialization, or a data partition update.
Partitions: Allow us to run based on changes in a subset of the data of an asset; for example, records in a data set that fall within a particular time window.
Backfills: Provide the ability to re-execute the data pipeline on only the set of partitions we are interested in. For example, relaunch the pipeline that calculates the aggregation of store sales per country, but only for the partition with the data of the USA stores.

The combination of the capabilities offered by sensors, partitions, backfills, IO managers, and assets represents a very significant paradigm shift in the world of data pipeline orchestration.

IO Managers

IO managers provide integration components for persistent repositories that allow us to persist and load assets and op outputs to/from S3, Snowflake, or other data repositories. These components contain all the integration logic for each of the external channels we use in our architecture. We can use existing integrations, extend them to include specific logic, or develop new ones.
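To tie these concepts together, here is a minimal sketch of two software-defined assets and an in-process materialization; the asset names and logic are hypothetical and only illustrate the declarative style.

from dagster import asset, materialize

@asset
def raw_sales():
    # In a real pipeline an IO manager would load this from S3, Snowflake, etc.
    return [{"store": "madrid", "units": 12}, {"store": "paris", "units": 7}]

@asset
def sales_per_store(raw_sales):
    # Dagster infers the upstream dependency from the argument name
    totals = {}
    for row in raw_sales:
        totals[row["store"]] = totals.get(row["store"], 0) + row["units"]
    return totals

if __name__ == "__main__":
    # Materialize both assets in process; in practice you would use Dagit,
    # a schedule, or a sensor to launch the run.
    materialize([raw_sales, sales_per_store])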
Lineage

The use of assets provides a base layer of lineage, data observability, and data quality monitoring. In our opinion, data lineage is a very complex and key aspect that nowadays goes beyond traditional tables and includes APIs, topics, and other data sources. This is why we believe that although Dagster provides great capabilities, it should be one more source in the overall lineage of our platform and not the source of truth.

Debugging and Observability

Another differentiating capability of Dagster is that it provides data observability when we are using software-defined assets. Data operators or data engineers have several features for analyzing a data pipeline:
Pipeline status, ops status, and timing.
Logs with error and info traces.
Assets can include metadata that displays information with a link to access the data materialization.
It even provides the capability to send relevant information to different channels, such as Slack, events, or a report persisted in S3.

These capabilities allow engineers to be self-sufficient and not treat the orchestrator as a black box.

Metadata

Dagster allows meta-information to be added at practically all levels and, most importantly, it is very accessible to data ops or any other operational user. Having metadata is very important, but it must also be usable, and that is where Dagster makes a difference. Operations teams have less process context and a higher cognitive load because they do not participate in development but at the same time manage multiple production flows. As soon as our data workflow becomes part of a business-critical mission, providing this meta-information is a must.

Dagster for Critical Business Processes

Dagster gives us a data-processing-oriented tool that we can integrate into our business processes on the critical path to provide the greatest business value. Let's think about stock replenishment retail processes from warehouses to the stores or other channels such as partners or e-commerce. This is the process of replenishing items from the distribution warehouses to the stores; the goal is that the right products are available in an optimal quantity and in the right place to meet the demand from customers. It:
Improves the customer experience by ensuring that products are available across all channels and avoiding "out of stock."
Increases profitability by avoiding replenishment of products with a low sales probability.

It is a business process where analytics and machine learning models have a lot of impact. We can define this as a data-driven business process. Stock replenishment is a complex, operational, business-critical process that has demand forecasting and stock inventory as its main pillars:
Demand forecasting requires an advanced machine-learning process based on historical data to project future demand.
The stock inventory provides the quantity of stock available at each location, such as stores, warehouses, distributors, etc.
Sales provide information on how demand for the products is behaving based on indicators such as pricing, markdowns, etc.

These processes can be launched weekly, daily, or several times a day depending on whether the replenishment is done from a central warehouse or, for example, a warehouse close to the physical stores. This process requires some information updated in near real time, but it is a data process that runs in batch or micro-batch mode. Dagster is the first data orchestrator that truly enables delivering business value from a purely operational point of view at the operational layer.
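As a sketch of how the replenishment process above could be modeled with software-defined assets and a daily schedule (the asset names, cron expression, and empty bodies are placeholders, not a real implementation):

from dagster import Definitions, ScheduleDefinition, asset, define_asset_job

@asset
def stock_inventory():
    ...  # load available stock per store/warehouse from the inventory system

@asset
def demand_forecast():
    ...  # run or fetch the machine-learning demand forecast

@asset
def replenishment_plan(stock_inventory, demand_forecast):
    ...  # compare the forecast against stock and compute the quantities to ship per store

replenishment_job = define_asset_job("replenishment_job")  # selects every asset by default

defs = Definitions(
    assets=[stock_inventory, demand_forecast, replenishment_plan],
    jobs=[replenishment_job],
    schedules=[ScheduleDefinition(job=replenishment_job, cron_schedule="0 5 * * *")],
)

A sensor on a partition update or an inventory event could launch the same job for near-real-time refreshes instead of, or in addition to, the schedule.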
Dagster Common Use Cases

There are other, more traditional use cases for Dagster, such as:
Data pipelines with different sources and destinations for analytical systems such as data warehouses or data lakes.
Machine learning model training.
Analytical processes that integrate machine learning models.

Of course, Dagster represents an evolution for this type of process for the same reasons mentioned above.

Conclusions

The next few years are expected to be really exciting in the world of data, and we are evolving especially in the construction of tools that allow us to generate a more real and closer impact on the business. Dagster is another step in that direction. The challenges are not new data architectures or complex analytical systems; the challenge is to provide real business value as soon as possible. The risk is thinking that tools like Dagster and architectures like data mesh will bring value by themselves. These tools provide us with capabilities that we did not have years ago and allow us to design features that meet the needs of our customers. We need to learn from the mistakes we have made, applying continuous improvement and a critical-thinking approach to get better. Though there is no "one ring to rule them all," Dagster is a fantastic tool and a great innovation, like other tools such as dbt, Apache Kafka, the DataHub Data Catalog, and many more; but if we believe that one tool can solve all our needs, we will build a new generation of monoliths. Dagster, although a great product, is just another piece that complements our solutions to add value.

By Miguel Garcia

Top Microservices Experts

expert thumbnail

Nuwan Dias

VP and Deputy CTO,
WSO2

expert thumbnail

Christian Posta

VP, Global Field CTO,
Solo.io

Christian Posta (@christianposta) is Field CTO at solo.io and well known in the community for being an author (Istio in Action, Manning; Microservices for Java Developers, O'Reilly 2016), frequent blogger, speaker, open-source enthusiast, and committer on various open-source projects including Istio and Kubernetes. Christian has spent time at web-scale companies and now helps companies create and deploy large-scale, resilient, distributed architectures, much of what we now call Serverless and Microservices. He enjoys mentoring, training, and leading teams to be successful with distributed systems concepts, microservices, DevOps, and cloud-native application design.
expert thumbnail

Rajesh Bhojwani

Development Architect,
Sap Labs

You can reach out to me at rajesh.bhojwani@gmail.com or https://www.linkedin.com/in/rajesh-bhojwani
expert thumbnail

Ray Elenteny

Solution Architect,
SOLTECH


The Latest Microservices Topics

article thumbnail
7 Ways of Containerizing Your Node.js Application
This article lists seven ways to containerize your Node.js application, so let's look at them briefly.
March 23, 2023
by Nikunj Shingala
· 666 Views · 1 Like
article thumbnail
Introduction to Containerization
This article will explore containerization, how it works, and its drawbacks and benefits.
March 23, 2023
by Aditya Bhuyan
· 1,141 Views · 2 Likes
article thumbnail
Spring Boot, Quarkus, or Micronaut?
REST API frameworks play a crucial role in developing efficient and scalable microservices. Compare three frameworks, their features, and their pros and cons.
March 23, 2023
by Farith Jose Heras García
· 2,316 Views · 1 Like
article thumbnail
Introduction to Container Orchestration
In this article, we will discuss what container orchestration is, why it is important, and some of the popular container orchestration tools available today.
March 22, 2023
by Aditya Bhuyan
· 1,459 Views · 2 Likes
article thumbnail
Microservices Testing
This article will discuss the importance of microservices testing, its challenges, and best practices.
March 22, 2023
by Aditya Bhuyan
· 1,826 Views · 2 Likes
article thumbnail
A Gentle Introduction to Kubernetes
K8s' architecture and components may seem complex, but they offer unparalleled power, flexibility, and features in the open-source world.
March 22, 2023
by Saqib Jan
· 2,263 Views · 1 Like
article thumbnail
Building Micronaut Microservices Using MicrostarterCLI
This article demonstrates building a simple Micronaut microservices application using MicrostarterCLI.
March 21, 2023
by Ahmed Al-Hashmi
· 7,357 Views · 4 Likes
article thumbnail
How To Build a Spring Boot GraalVM Image
In this article, readers will use a tutorial to learn how to build Spring Boot GraalVM images and use reflection, including guided code block examples.
March 21, 2023
by Gunter Rotsaert CORE
· 2,248 Views · 1 Like
article thumbnail
Master Spring Boot 3 With GraalVM Native Image
This article covers the intricacies associated with Spring Boot Native Image development.
March 21, 2023
by Dmitry Chuyko
· 1,788 Views · 1 Like
article thumbnail
How Elasticsearch Works
Discover what Elasticsearch is, how Elasticsearch works, and how you can configure, install, and run Elasticsearch. Also, understand its benefits and major use cases.
March 21, 2023
by Ruchita Varma
· 2,304 Views · 1 Like
article thumbnail
Cloud Modernization Strategies for Coexistence with Monolithic and Multi-Domain Systems: A Target Rollout Approach
This article discusses a strategy for modernizing complex domains with monolithic databases through four phases.
March 21, 2023
by Neeraj Kaushik
· 1,435 Views · 3 Likes
article thumbnail
5 Best Python Testing Frameworks
In this article, readers will find an honest comparison of the top 5 Python frameworks for test automation. Discover all their advantages and disadvantages.
March 21, 2023
by Arnab Roy
· 2,041 Views · 1 Like
article thumbnail
Building Microservice in Golang
Go's performance, concurrency, and simplicity make it an increasingly popular option for building scalable and reliable microservices architectures.
March 21, 2023
by Rama Krishna Panguluri
· 2,220 Views · 3 Likes
article thumbnail
Introduction to Spring Cloud Kubernetes
In this article, we will explore the various features of Spring Cloud Kubernetes, its benefits, and how it works.
March 21, 2023
by Aditya Bhuyan
· 3,038 Views · 2 Likes
article thumbnail
Integrate AWS Secrets Manager in Spring Boot Application
A guide for integration of AWS Secrets Manager in Spring Boot. This service will load the secrets at runtime and keep the sensitive information away from the code.
March 21, 2023
by Aakash Jangid
· 3,720 Views · 1 Like
article thumbnail
Tracking Software Architecture Decisions
Readers will learn a method for systematically tracking software architecture decisions through ADRs, introducing a lifecycle that will support this process.
March 20, 2023
by David Cano
· 1,983 Views · 1 Like
article thumbnail
Strategies for Kubernetes Cluster Administrators: Understanding Pod Scheduling
This guide will equip you with the knowledge and skills necessary to master the art of pod scheduling.
March 20, 2023
by shishir khandelwal
· 2,562 Views · 1 Like
article thumbnail
HTTP vs Messaging for Microservices Communications
This article compares HTTP and messaging to see which means of communication works best for microservices communications.
March 20, 2023
by Dennis Mwangi
· 2,371 Views · 1 Like
article thumbnail
Use AWS Controllers for Kubernetes To Deploy a Serverless Data Processing Solution With SQS, Lambda, and DynamoDB
Discover how to use AWS Controllers for Kubernetes to create a Lambda function, SQS, and DynamoDB table and wire them together to deploy a solution.
March 20, 2023
by Abhishek Gupta CORE
· 2,713 Views · 1 Like
article thumbnail
Building a Real-Time App With Spring Boot, Cassandra, Pulsar, React, and Hilla
In this article, readers will learn how to create a Spring Boot application that connects to Pulsar and Cassandra and displays live data in a React frontend.
March 20, 2023
by Marcus Hellberg
· 2,450 Views · 7 Likes

ABOUT US

  • About DZone
  • Send feedback
  • Careers
  • Sitemap

ADVERTISE

  • Advertise with DZone

CONTRIBUTE ON DZONE

  • Article Submission Guidelines
  • Become a Contributor
  • Visit the Writers' Zone

LEGAL

  • Terms of Service
  • Privacy Policy

CONTACT US

  • 600 Park Offices Drive
  • Suite 300
  • Durham, NC 27709
  • support@dzone.com
  • +1 (919) 678-0300
