If you're a Java software developer and you weren't living on the planet Mars during these last years, then you certainly know what Quarkus is. And just in case you don't, you may find it out here. With Quarkus, the field of enterprise cloud-native applications development has never been so comfortable and it never took advantage of such a friendly and professional working environment. The Internet abounds with posts and articles explaining why and how Quarkus is a must for the enterprise, cloud-native software developer. And of course, CDK applications aren't on the sidelines: on the opposite, they can greatly take advantage of the Quarkus features to become smaller, faster, and more aligned with requirements nowadays. CDK With Quarkus Let's look at our first CDK with Quarkus example in the code repository. Go to the Maven module named cdk-quarkus and open the file pom.xml to see how to combine specific CDK and Quarkus dependencies and plugins. XML ... <dependency> <groupId>io.quarkus.platform</groupId> <artifactId>quarkus-bom</artifactId> <version>${quarkus.platform.version}</version> <type>pom</type> <scope>import</scope> </dependency> <dependency> <groupId>io.quarkiverse.amazonservices</groupId> <artifactId>quarkus-amazon-services-bom</artifactId> <version>${quarkus-amazon-services.version}</version> <type>pom</type> <scope>import</scope> </dependency> ... In addition to the aws-cdk-lib artifact which represents the CDK API library and is inherited from the parent Maven module, the dependencies above are required in order to develop CDK Quarkus applications. The first one, quarkus-bom, is the Bill of Material (BOM) which includes all the other required Quarkus artifacts. Here, we're using Quarkus 3.11 which is the most recent release as of this writing. The second one is the BOM of the Quarkus extensions required to interact with AWS services. Another mandatory requirement of Quarkus applications is the use of the quarkus-maven-plugin which is responsible for running the build and augmentation process. Let's recall that as opposed to more traditional frameworks like Spring or Jakarta EE where the application's initialization and configuration steps happen at the runtime, Quarkus performs them at build time, in a specific phase called "augmentation." Consequently, Quarkus doesn't rely on Java introspection and reflection, which is one of the reasons it is much faster than Spring, but needs to use the jandex-maven-plugin to build an index helping to discover annotated classes and beans in external modules. This is almost all as far as the Quarkus master POM is concerned. Let's look now at the CDK submodule. But first, we need to recall that, in order to synthesize and deploy a CDK application, we need a specific working environment defined by the cdk.json file. Hence, trying to use CDK commands in a project not having at its root this file will fail. One of the essential functions of the cdk.json file aims to define how to run the CDK application. By default, the cdk init app --language java command, used to scaffold the project's skeleton, will generate the following JSON statement: JSON ... "app": "mvn -e -q compile exec:java" ... This means that whenever we run a cdk deploy ... command, such that to synthesize a CloudFormation stack and deploy it, the maven-exec-plugin will be used to compile and package the code, before starting the associated main Java class. This is the most general case, the one of a classical Java CDK application. 
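For comparison, here is roughly what such a classical CDK main class looks like, i.e., the kind of class that the default exec:java command would start. This is a minimal illustration, not code from the repository, and the class and stack names are made up.
Java
import software.amazon.awscdk.App;
import software.amazon.awscdk.Stack;
import software.amazon.awscdk.StackProps;

// Minimal "classical" CDK entry point: no framework involved, just a main method
// that creates the App, declares a Stack, and synthesizes the CloudFormation template.
public final class ClassicCdkApp {

    public static void main(final String[] args) {
        App app = new App();
        new Stack(app, "ClassicStack", StackProps.builder().build());
        app.synth();
    }
}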
But to run a Quarkus application, we need to satisfy some special conditions. Quarkus packages an application as either a fast or a thin JAR and, if you aren't familiar with these terms, please don't hesitate to consult the documentation, which explains them in detail. What interests us here is the fact that, by default, a fast JAR will be generated under the name quarkus-run.jar in the target/quarkus-app directory, unless we're using Quarkus extensions for AWS, in which case a thin JAR is generated in the target/$finalName-runner.jar file, where $finalName is the value of the same element in pom.xml. In our case, we're using Quarkus extensions for AWS and, hence, a thin JAR will be created by the Maven build process. In order to run a Quarkus thin JAR, we need to manually modify the cdk.json file to replace the line above with the following one: JSON ... "app": "java -jar target/quarkus-app/quarkus-run.jar" ... The other important point to notice here is that, in general, a Quarkus application exposes a REST API whose endpoint is started by the command above. But in our case, that of a CDK application, there isn't any REST API and, hence, this endpoint needs to be specified in a different way. Look at our main class in the cdk-quarkus-api-gateway module. Java @QuarkusMain public class CdkApiGatewayMain { public static void main(String... args) { Quarkus.run(CdkApiGatewayApp.class, args); } } Here, the @QuarkusMain annotation flags the subsequent class as the application's main endpoint and, further, using the io.quarkus.runtime.Quarkus.run() method will execute the mentioned class until it receives a signal like Ctrl-C, or until one of the exit methods of the same API is called. So, we just saw how the CDK Quarkus application is started and that, once started, it runs the CdkApiGatewayApp until it exits. This class is our CDK one, which implements the App and which we've already seen in the previous post. But this time it looks different, as you may see: Java @ApplicationScoped public class CdkApiGatewayApp implements QuarkusApplication { private CdkApiGatewayStack cdkApiGatewayStack; private App app; @Inject public CdkApiGatewayApp (App app, CdkApiGatewayStack cdkApiGatewayStack) { this.app = app; this.cdkApiGatewayStack = cdkApiGatewayStack; } @Override public int run(String... args) throws Exception { Tags.of(app).add("project", "API Gateway with Quarkus"); Tags.of(app).add("environment", "development"); Tags.of(app).add("application", "CdkApiGatewayApp"); cdkApiGatewayStack.initStack(); app.synth(); return 0; } } The first thing to notice is that this time, we're using the CDI (Contexts and Dependency Injection) implementation provided by Quarkus, also called ArC, which is a subset of the Jakarta CDI 4.1 specification. It also has another particularity: it's a build-time CDI, as opposed to the runtime Jakarta EE one. The difference lies in the augmentation process, as explained previously. Another important point to observe is that the class implements the io.quarkus.runtime.QuarkusApplication interface, which allows it to customize and perform specific actions in the context bootstrapped by the CdkApiGatewayMain class. As a matter of fact, it isn't recommended to perform such operations directly in CdkApiGatewayMain since, at that point, Quarkus isn't completely bootstrapped and started yet. We need to define our class as @ApplicationScoped, so that it is instantiated only once. 
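The App instance injected above, and the StackProps that the stack's constructor will need, have to be produced somewhere. The sketch below is a plausible shape for such a CDI producer; the actual CdkApiGatewayProducer class in the repository may differ, so treat the bean scopes and the empty StackProps as assumptions.
Java
import jakarta.enterprise.context.ApplicationScoped;
import jakarta.enterprise.inject.Produces;
import software.amazon.awscdk.App;
import software.amazon.awscdk.StackProps;

@ApplicationScoped
public class CdkApiGatewayProducer {

    // A single CDK App instance shared by the whole application.
    @Produces
    @ApplicationScoped
    App app() {
        return new App();
    }

    // Empty StackProps by default; account and region could be set here instead.
    @Produces
    @ApplicationScoped
    StackProps stackProps() {
        return StackProps.builder().build();
    }
}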
We also used constructor injection and took advantage of the producer pattern, as you may see in the CdkApiGatewayProducer class. We override the io.quarkus.runtime.QuarkusApplication.run() method such that to customize our App object by tagging it, as we already did in the previous example, and to invoke CdkApiGatewayStack, responsible to instantiate and initialize our CloudFormation stack. Last but not least, the app.synth() statement is synthesizing this stack and, once executed, our infrastructure, as defined by the CdkApiGatewayStack, should be deployed on the AWS cloud. Here is now the CdkApiGatewayStack class: Java @Singleton public class CdkApiGatewayStack extends Stack { @Inject LambdaWithBucketConstructConfig config; @ConfigProperty(name = "cdk.lambda-with-bucket-construct-id", defaultValue = "LambdaWithBucketConstructId") String lambdaWithBucketConstructId; @Inject public CdkApiGatewayStack(final App scope, final @ConfigProperty(name = "cdk.stack-id", defaultValue = "QuarkusApiGatewayStack") String stackId, final StackProps props) { super(scope, stackId, props); } public void initStack() { String functionUrl = new LambdaWithBucketConstruct(this, lambdaWithBucketConstructId, config).getFunctionUrl(); CfnOutput.Builder.create(this, "FunctionURLOutput").value(functionUrl).build(); } } This class has changed as well, compared to its previous release. It's a singleton that uses the concept of construct, which was introduced formerly. As a matter of fact, instead of defining the stack structure here, in this class, as we did before, we do it by encapsulating the stack's elements together with their configuration in a construct that facilitates easily assembled cloud applications. In our project, this construct is a part of a separate module, named cdk-simple-construct, such that we could reuse it repeatedly and increase the application's modularity. Java public class LambdaWithBucketConstruct extends Construct { private FunctionUrl functionUrl; public LambdaWithBucketConstruct(final Construct scope, final String id, LambdaWithBucketConstructConfig config) { super(scope, id); Role role = Role.Builder.create(this, config.functionProps().id() + "-role") .assumedBy(new ServicePrincipal("lambda.amazonaws.com")).build(); role.addManagedPolicy(ManagedPolicy.fromAwsManagedPolicyName("AmazonS3FullAccess")); role.addManagedPolicy(ManagedPolicy.fromAwsManagedPolicyName("CloudWatchFullAccess")); IFunction function = Function.Builder.create(this, config.functionProps().id()) .runtime(Runtime.JAVA_21) .role(role) .handler(config.functionProps().handler()) .memorySize(config.functionProps().ram()) .timeout(Duration.seconds(config.functionProps().timeout())) .functionName(config.functionProps().function()) .code(Code.fromAsset((String) this.getNode().tryGetContext("zip"))) .build(); functionUrl = function.addFunctionUrl(FunctionUrlOptions.builder().authType(FunctionUrlAuthType.NONE).build()); new Bucket(this, config.bucketProps().bucketId(), BucketProps.builder().bucketName(config.bucketProps().bucketName()).build()); } public String getFunctionUrl() { return functionUrl.getUrl(); } } This is our construct which encapsulates our stack elements: a Lambda function with its associated IAM role and an S3 bucket. 
As you can see, it extends the software.constructs.Construct class and its constructor, in addition to the standard scope and id parameters, takes a configuration object named LambdaWithBucketConstructConfig which defines, among others, properties related to the Lambda function and the S3 bucket belonging to the stack. Please notice that the Lambda function needs the IAM-managed policy AmazonS3FullAccess in order to read, write, delete, etc., to/from the associated S3 bucket. And since, for tracing purposes, we need to log messages to the CloudWatch service, the IAM-managed policy CloudWatchFullAccess is required as well. These two policies are associated with a role whose naming convention consists of appending the suffix -role to the Lambda function name. Once this role is created, it will be attached to the Lambda function. As for the Lambda function body, please notice how it is created from an asset dynamically extracted from the deployment context. We'll come back in a few moments with more details concerning this point. Last but not least, please notice how, after the Lambda function is created, a URL is attached to it and cached so that it can be retrieved by the consumer. This way we completely decouple the infrastructure logic (i.e., the Lambda function itself) from the business logic, i.e., the Java code executed by the Lambda function; in our case, a REST API implemented as a Quarkus JAX-RS (RESTEasy) endpoint, acting as a proxy for the API Gateway exposed by AWS. Coming back to the CdkApiGatewayStack class, we can see how, thanks to the Quarkus CDI implementation, we inject the configuration object LambdaWithBucketConstructConfig declared externally, as well as how we use the Eclipse MicroProfile Config API to define its ID. Once the LambdaWithBucketConstruct is instantiated, the only thing left to do is to display the Lambda function URL so that we can call it with different consumers, whether JUnit integration tests, the curl utility, or Postman. We have just seen the whole mechanism that allows us to decouple the two fundamental CDK building blocks, App and Stack. We have also seen how to abstract the Stack building block by making it an external module which, once compiled and built as a standalone artifact, can simply be injected wherever needed. Additionally, we have seen that the code executed by the Lambda function in our stack can be plugged in as well by providing it as an asset, in the form of a ZIP file, for example, stored in the CDK deployment context. This code, too, is an external module, named quarkus-api, and consists of a REST API with a couple of endpoints allowing us to get some information, like the host IP address or the S3 bucket's associated attributes. It's interesting to notice how Quarkus takes advantage of Qute templates to render HTML pages. For example, the following endpoint displays the attributes of the S3 bucket that has been created as a part of the stack. Java ... @Inject Template s3Info; @Inject S3Client s3; ... 
@GET @Path("info/{bucketName}") @Produces(MediaType.TEXT_HTML) public TemplateInstance getBucketInfo(@PathParam("bucketName") String bucketName) { Bucket bucket = s3.listBuckets().buckets().stream().filter(b -> b.name().equals(bucketName)) .findFirst().orElseThrow(); TemplateInstance templateInstance = s3Info.data("bucketName", bucketName, "awsRegionName", s3.getBucketLocation(GetBucketLocationRequest.builder().bucket(bucketName).build()) .locationConstraintAsString(), "arn", String.format(S3_FMT, bucketName), "creationDate", LocalDateTime.ofInstant(bucket.creationDate(), ZoneId.systemDefault()), "versioning", s3.getBucketVersioning(GetBucketVersioningRequest.builder().bucket(bucketName).build())); return templateInstance.data("tags", s3.getBucketTagging(GetBucketTaggingRequest.builder().bucket(bucketName).build()).tagSet()); } This endpoint returns a TemplateInstance whose structure is defined in the file src/main/resources/templates/s3info.htmland which is filled with data retrieved by interrogating the S3 bucket in our stack, on behalf of the S3Client class provided by the AWS SDK. A couple of integration tests are provided and they take advantage of the Quarkus integration with AWS, thanks to which it is possible to run local cloud services, on behalf of testcontainers and localstack. In order to run them, proceed as follows: Shell $ git clone https://github.com/nicolasduminil/cdk $ cd cdk/cdk-quarkus/quarkus-api $ mvn verify Running the sequence of commands above will produce a quite verbose output and, at the end, you'll see something like this: Shell [INFO] [INFO] Results: [INFO] [INFO] Tests run: 3, Failures: 0, Errors: 0, Skipped: 0 [INFO] [INFO] [INFO] --- failsafe:3.2.5:verify (default) @ quarkus-api --- [INFO] ------------------------------------------------------------------------ [INFO] BUILD SUCCESS [INFO] ------------------------------------------------------------------------ [INFO] Total time: 22.344 s [INFO] Finished at: 2024-07-04T17:18:47+02:00 [INFO] ------------------------------------------------------------------------ That's not a big deal - just a couple of integration tests executed against a localstack running in testcontainers to make sure that everything works as expected. But if you want to test against real AWS services, meaning that you fulfill the requirements, then you should proceed as follows: Shell $ git clone https://github.com/nicolasduminil/cdk $ cd cdk $ ./deploy.sh cdk-quarkus/cdk-quarkus-api-gateway cdk-quarkus/quarkus-api/ Running the script deploy.sh with the parameters shown above will synthesize and deploy your stack. These two parameters are: The CDK application module name: This is the name of the Maven module where your cdk.json file is. The REST API module name: This is the name of the Maven module where the function.zip file is. If you look at the deploy.sh file, you'll see the following: Shell ...cdk deploy --all --context zip=~/cdk/$API_MODULE_NAME/target/function.zip... This command deploys the CDK app after having set in the zip context variable the function.zip location. Do you remember that the Lambda function has been created in the stack (LambdaWithBucketConstruct class) like this? Java IFunction function = Function.Builder.create(this, config.functionProps().id()) ... .code(Code.fromAsset((String) this.getNode().tryGetContext("zip"))) .build(); The statement below gets the asset stored in the deployment context under the context variable zip and uses it as the code that will be executed by the Lambda function. 
The output of the deploy.sh execution (quite verbose as well) will finish by displaying the Lambda function URL: Shell ... Outputs: QuarkusApiGatewayStack.FunctionURLOutput = https://...lambda-url.eu-west-3.on.aws/ Stack ARN: arn:aws:cloudformation:eu-west-3:...:stack/QuarkusApiGatewayStack/... ... Now, in order to test your stack, you may point your preferred browser at https://<generated>.lambda-url.eu-west-3.on.aws/s3/info/my-bucket-8701 and you should see the S3 bucket's attributes rendered as an HTML page by the Qute template. Conclusion Your test is successful and you now know how to use CDK constructs to create standalone infrastructure modules and assemble them into AWS CloudFormation stacks. But there is more, so stay tuned!
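As a small aside, if you prefer a scriptable check over a browser, the same endpoint can be exercised with the JDK's built-in HTTP client. The sketch below is hypothetical: pass the function URL printed in the outputs above as the first argument and the bucket name created by your stack as the second.
Java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public final class FunctionUrlSmokeTest {

    public static void main(String[] args) throws Exception {
        // args[0]: the Lambda function URL, args[1]: the bucket name
        String functionUrl = args[0].replaceAll("/$", "");
        String bucketName = args[1];

        HttpRequest request = HttpRequest.newBuilder(
                URI.create(functionUrl + "/s3/info/" + bucketName)).GET().build();
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());

        System.out.println("HTTP " + response.statusCode());
        System.out.println(response.body());
    }
}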
These are 10 strategies for reducing Kubernetes costs. We’ve split them into pre-deployment, post-deployment, and ongoing cost optimization techniques to help people at the beginning and middle of their cloud journeys, as well as those who have fully adopted the cloud and are just looking for a few extra pointers. So, let’s get started. 10 Kubernetes Cost Optimization Strategies Pre-Deployment Strategies These pre-deployment strategies are for those just starting out. Some will be more relevant to teams at the beginning of their cloud journeys; some will apply to those with an existing environment yet to deploy Kubernetes. 1. Pick A Cloud Provider: Not Cloud Providers Although multi-cloud architectures offer great flexibility in general, they can often incur a greater cost when it comes to Kubernetes. This is related to the different ways Kubernetes is provisioned. On AWS, EKS is the primary way users access Kubernetes; on Azure, it’s AKS. Each is built on top of the core Kubernetes architecture and utilizes it in different ways. Each cloud provider has their implementations, extensions, best practices, and unique features, which means what might work well for cost optimization on EKS works less well (or simply isn’t an option) on AKS. Add to this the operational cost of managing Kubernetes through multiple services, and you begin to understand the cost optimization issues presented by a multi-cloud environment. So, in cost (and complexity) terms it’s better to choose a single provider. 2. Choose the Right Architecture For those at the next step of their cloud and Kubernetes journey, cloud costs (along with everything else) will be significantly impacted by the type of architecture you choose. And when it comes to Kubernetes, there are a few dos and don’ts. As you’ll likely know, microservice-based architectures are a great fit if you’re using Kubernetes clusters or containers more generally. And monolithic applications will fail to leverage the full benefits of containerization. However, there are other considerations which aren’t as well known. Stateful applications such as SQL databases aren’t a great fit for containers. Likewise, applications that require custom hardware (like heavy-use AI/ML) aren’t ideal Kubernetes candidates either. After you’ve picked your cloud provider, it’s best to consider the degree to which you’ll be adopting Kubernetes and containers and then make an informed choice about your architecture. Post-Deployment Strategies These strategies apply to those already using Kubernetes and looking for new methods to reach peak cost efficiency. 3. Set the Right Resource Limits and Quotas, Alongside Proper Scaling Methods Resource limits and quotas put brakes on how you consume, without these any Kubernetes cluster can behave in unpredictable ways. If you set no limit to any pod within a cluster, one pod could easily run away on memory and CPU. For example, if you have a front-end pod, a spike in user traffic will mean a spike in consumption. And while you certainly don’t want your application to crash, unlimited resource consumption is not the answer. Instead, you need sensible resource limits and other strategies for dealing with heavy usage. In this case, optimizing your application’s performance would be a better way of ensuring you meet customer demand without incurring extra costs. The same is true of quotas, although these apply at the namespace level and to other types of resources. 
In essence, it's about setting limits based on what's prudent, and ensuring you deliver by having other methods in place for scaling. 4. Set Smart Autoscaling Rules When it comes to auto-scaling in Kubernetes, you have two options: horizontal scaling and vertical scaling. You will determine which to do under what conditions using a rule-based system. Horizontal scaling means increasing the total number of pods, while vertical scaling means increasing pods' memory and CPU capacity without increasing their total number. Each method has its place when it comes to ideal resource usage and avoiding unnecessary costs. Horizontal scaling is the better choice when you need to scale quickly. Also, because the more pods you have, the less chance that a single point of failure will result in a crash, horizontal scaling is preferable when distributing heavy traffic. It's also the better option when running stateless applications, as extra pods are better able to handle multiple concurrent requests. Vertical scaling is more beneficial to stateful applications, as it's easier to preserve a state by adding resources to a pod as opposed to spreading that state across new pods. Vertical scaling is also preferable when you have other constraints on scaling such as limited IP address space or a limit on the number of nodes imposed by your license. When it comes to defining your scaling rules, you need to know the use case of each, the features of your application, and which types of scaling demands you're likely to meet. 5. Use Rightsizing Rightsizing simply means ensuring proper resource utilization (CPU and memory) across each pod and node in your Kubernetes environment. If you fail to rightsize, a few things can happen that will impact your application's performance and cost optimization efforts. In the case of overprovisioning, paid-for CPU and memory will go unused. These are idle resources that could have been made use of elsewhere. In the case of underprovisioning, although it does not impact Kubernetes' costs directly, it will lead to performance issues, which will ultimately lead to costs down the line. When it comes to rightsizing, there are a few approaches. It can be done manually by engineers or completely automated using a tool (more on this to come). Ultimately, rightsizing is a continuous process requiring dynamic adjustments, but when done right, it's an essential part of your cost optimization strategy. 6. Make the Best Use of Spot Instances Spot instances are a great fit for some. If your application can handle unpredictability, you can obtain huge discounts on instances (up to 90% on AWS) for a limited amount of time. However, those looking to reduce costs using spot instances on Kubernetes should bear in mind there may be some additional configuration. For example, you'll need to adjust pod disruption budgets and set up readiness probes to prepare your Kubernetes clusters for the sudden removal of instances. The same goes for node management – you'll need to diversify your instance types and pods to plan around disruption. Put simply, spot instances are a great way to reduce costs for the right application, but integrating that unpredictability into Kubernetes requires extra know-how. 7. Be Strategic About Regional Resources To Reduce Traffic One often overlooked cost optimization strategy is reducing traffic between different geographical regions. When nodes cover multiple regions, data transfer charges can mount quickly as you'll be using the public internet to send and receive data. 
Here, tools like AWS PrivateLink and Azure Private Link can help you optimize costs by providing an alternative route. The regional distribution and data transfer strategy of your clusters can be an involved job, and some will make use of a tool to help them, but once finalized, it's a great way of cutting your monthly bill. Ongoing Improvements These are Kubernetes cost optimization techniques for those who may have already addressed the most common problems and want to implement ongoing improvements. If you're well aware of practices like auto-scaling and rightsizing, here are a few down-the-road cost management techniques for more experienced Kubernetes users. 8. Tool Up to Monitor Costs and Improve Efficiency Kubernetes, EKS, AKS, and GKE all offer their own cost-monitoring and optimization functions. But to gain truly granular insights, it's often better to invest in a third-party tool. There are plenty of Kubernetes cost optimization tools to choose from, as well as a few more general cloud cost management tools that can work well with Kubernetes infrastructure. As a general tip, when you're choosing a tool, it pays to think about what you're missing most. Some tools work best for generating insights. Some have a heavy focus on AI, meaning less control and user input, which is great for teams that lack staff resources. In short, consider what's missing in your Kubernetes cost optimization process and pick the right tool for the job. 9. Integrate Cost Controls Into Your CI/CD Pipeline If your organization is using DevOps in conjunction with Kubernetes, you can build Kubernetes cost monitoring and controls into your CI/CD pipeline at various points. For example, when properly integrated, Kubecost can be used to predict the costs of changes before deployment. It can also be used to automate cost-related controls, even failing a build if the predicted costs are deemed too high. In more general terms, integrating Kubecost (or scripts with a similar function) can make Kubernetes costs a monitorable data point to feed into future CI/CD decisions. So, if you're using Kubernetes, and your organization has adopted DevOps, there's scope for building cost optimization into the heart of your processes. 10. Build an Environment Where Cost Optimization Is Embraced Through Tooling and Culture Although this goes for cloud costs in general, it's worth taking the time to lay out some key points. First off, adopting the right mindset across an organization is going to be easier if you're already doing some of the post-deployment and ongoing work. This culture needs data. So having the right cost monitoring and optimization tools is a good start. We've already mentioned Kubecost. But there are other more comprehensive platforms out there like CAST AI or Densify. Second, this data needs to be accessible and meaningful to multiple stakeholders. If you have adopted DevOps, this should present less difficulty. But if you haven't, you may face a little resistance. Tools like Apptio Cloudability can help with this, providing clear insights into cost with a specific focus on connecting non-technical stakeholders to key stats. Last, whether you're looking to cut costs on Kubernetes or the cloud in general, you need to foster an environment that rewards continuous improvement. Teams succeed when they have clear reporting across the business, and when each member feels invested in the continuous process of making things better.
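Circling back to strategy #3, here is what sensible requests and limits look like in practice. The sketch below uses the Fabric8 Kubernetes Java client purely for illustration; the container name, image, and values are hypothetical, and the equivalent resources section of a plain YAML pod spec expresses exactly the same thing.
Java
import io.fabric8.kubernetes.api.model.Container;
import io.fabric8.kubernetes.api.model.ContainerBuilder;
import io.fabric8.kubernetes.api.model.Quantity;
import io.fabric8.kubernetes.api.model.ResourceRequirements;
import io.fabric8.kubernetes.api.model.ResourceRequirementsBuilder;

public final class FrontEndContainerSpec {

    public static Container frontEnd() {
        // Requests are what the scheduler reserves for the pod;
        // limits are the hard ceiling a runaway container cannot exceed.
        ResourceRequirements resources = new ResourceRequirementsBuilder()
                .addToRequests("cpu", new Quantity("250m"))
                .addToRequests("memory", new Quantity("256Mi"))
                .addToLimits("cpu", new Quantity("500m"))
                .addToLimits("memory", new Quantity("512Mi"))
                .build();

        return new ContainerBuilder()
                .withName("front-end")
                .withImage("example/front-end:1.0")
                .withResources(resources)
                .build();
    }
}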
In this post, let’s explore a key performance metric studied during garbage collection analysis: "GC throughput." We’ll understand what it means, its significance in Java applications, and how it impacts overall performance. Additionally, we’ll delve into actionable strategies to improve GC throughput, unlocking its benefits for modern software development. What Is Garbage Collection Throughput? Whenever an automatic garbage collection event runs, it pauses the application to identify unreferenced objects from memory and evict them. During that pause period, no customer transactions will be processed. Garbage collection throughput indicates what percentage of the application’s time is spent in processing customer transactions and what percentage of time is spent in the garbage collection activities. For example, if someone says his application’s GC throughput is 98%, it means his application spends 98% of its time processing customer transactions and the remaining 2% of the time processing Garbage Collection activities. A high GC throughput is desirable as it indicates that the application is efficiently utilizing system resources, leading to minimal interruptions and improved overall performance. Conversely, low GC throughput can lead to increased garbage collection pauses, impacting application responsiveness and high computing costs. Monitoring and optimizing GC throughput are vital to ensure smooth application execution and responsiveness. Reasons for Poor Garbage Collection Throughput Reasons for garbage collection throughput degradation can be categorized into 3 buckets: Performance problems Wrong GC tuning Lack of resources Let’s review each of these categories in detail in this section. 1. Performance Problems When there is a performance problem in the application, GC throughput will degrade. Below are the potential performance reasons that would cause degradation in the application’s performance. Memory Leaks Figure 1: GC events running repeatedly because of memory leak When an application suffers from a memory leak, Garbage Collection events keep running repeatedly without effectively reclaiming memory. In the figure above, you can notice the cluster of red triangles towards the right corner, indicating that GC events are repeatedly running. However, memory utilization does not decrease, which is a classic indication of a memory leak. In such cases, GC events consume most of the application’s time, resulting in a significant degradation of GC throughput and overall performance. To troubleshoot memory leaks, you may find this video clip helpful: Troubleshooting Memory Leaks. Consecutive GC Pauses Figure 2: GC events running repeatedly because of high traffic volume During peak hours of the day or when running batch processes, your application might experience a high traffic volume. As a result, GC events may run consecutively to clean up the objects created by the application. The figure above shows GC events running consecutively (note the red arrow in the above figure). This scenario leads to a dramatic degradation of GC throughput during that time period. To address this problem, you can refer to the blog post: Eliminate Consecutive Full GCs. Heavy Object Creation Rate There is a famous Chinese proverb in the "Art of War" book: "The greatest victory is that which requires no battle." Similarly, instead of trying to focus on tuning the GC events, it would be more efficient if you could prevent the GC events from running. 
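As a trivial, hypothetical illustration of what "preventing GC events" means in code, consider building a large report: concatenating strings in a loop allocates a fresh String on every iteration, while a StringBuilder reuses one growing buffer and so puts far less pressure on the collector.
Java
import java.util.List;

public final class ReportBuilder {

    // Allocation-heavy: every += creates a brand-new String and discards the old one.
    static String buildReportNaive(List<String> lines) {
        String report = "";
        for (String line : lines) {
            report += line + System.lineSeparator();
        }
        return report;
    }

    // Allocation-friendly: a single StringBuilder grows in place.
    static String buildReport(List<String> lines) {
        StringBuilder report = new StringBuilder(lines.size() * 32);
        for (String line : lines) {
            report.append(line).append(System.lineSeparator());
        }
        return report;
    }
}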
The amount of time spent in garbage collection is directly proportional to the number of objects created by the application. If the application creates more objects, GC events are triggered more frequently. Conversely, if the application creates fewer objects, fewer GC events will be triggered. By profiling your application's memory using tools, you can identify the memory bottlenecks and fix them. Reducing memory consumption will, in turn, reduce the GC impact on your application. However, reducing the object creation rate is a tedious and time-consuming process as it involves studying your application, identifying the bottlenecks, refactoring the code, and thoroughly testing it. However, it's well worth the effort in the long run, as it leads to significant improvements in application performance and more efficient resource usage. 2. Wrong GC Tuning Another significant reason for degradation in an application's GC throughput is incorrect garbage collection (GC) tuning. Various factors can contribute to this issue: Wrong GC Algorithm Choice The garbage collection algorithm plays a pivotal role in influencing the GC pause times. Choosing the wrong GC algorithm can substantially decrease the application's GC throughput. As of now, there are 7 GC algorithms in OpenJDK: Serial GC, Parallel GC, CMS GC, G1 GC, Shenandoah GC, ZGC, and Epsilon. This brings up the question: how do I choose the right GC algorithm for my application? Figure 3: Flow chart to help you arrive at the right GC algorithm The above flow chart will help you identify the right GC algorithm for your application. You may also refer to this detailed post which highlights the capabilities, advantages, and disadvantages of each GC algorithm. Here is a real-world case study of an application used in warehouses to control the robots for shipments. This application was running with the CMS GC algorithm and suffered from long GC pause times of up to 5 minutes. Yes, you read that correctly: it's 5 minutes, not 5 seconds. During this 5-minute window, robots weren't receiving instructions from the application and a lot of chaos ensued. When the GC algorithm was switched from CMS GC to G1 GC, the pause time instantly dropped from 5 minutes to 2 seconds. This GC algorithm change made a big difference in improving the warehouse's delivery. Lack of (or Incorrect) GC Tuning Incorrectly configuring GC arguments or failing to tune the application appropriately can also lead to a decline in GC throughput. Be advised there are 600+ JVM arguments related to JVM memory and garbage collection. It's a tedious task for anyone to choose the right GC arguments from a poorly documented argument list. Thus, we have curated fewer than a handful of JVM arguments for each GC algorithm and listed them below. Use the arguments pertaining to your GC algorithm and optimize the GC pause time. Serial GC Tuning Parameters Parallel GC Tuning Parameters CMS GC Tuning Parameters G1 GC Tuning Parameters Shenandoah Tuning Parameters ZGC Tuning Parameters For a detailed overview of GC tuning, you can watch this insightful video talk. Wrong Internal Memory Regions Size JVM memory has the following internal memory regions: Young Generation Old Generation MetaSpace Others You may visit this video post to learn about different JVM memory regions. Changing the internal memory region size can also result in positive GC pause time improvements. Here is a real case study of an application that was suffering from a 12.5-second average GC pause time. 
This application’s Young Generation Size was configured at 14.65GB, and Old Gen size was also configured at the same 14.65 GB. Upon reducing the Young Gen size to 1GB, the average GC pause time remarkably reduced to 138 ms, which is a 98.9% improvement. 3. Lack of Resources Insufficient system and application-level resources can contribute to the degradation of an application’s garbage collection (GC) throughput. Insufficient Heap Size In most applications, heap size is either under-allocated or over-allocated. When heap size is under-allocated, GCs will run more frequently, resulting in the degradation of the application’s performance. Here is a real case study of an insurance application that was configured to run with an 8 GB heap size (-Xmx). This heap size wasn’t sufficient enough to handle the incoming traffic, due to which the garbage collector was running back-to-back. As we know, whenever a GC event runs, it pauses the application. Thus, when GC events run back-to-back, pause times were getting stretched and the application became unresponsive in the middle of the day. Upon observing this behavior, the heap size was increased from 8 GB to 12 GB. This change reduced the frequency of GC events and significantly improved the application’s overall availability. Insufficient System Resources A scarcity of CPU cycles or heavy I/O activity within the application can significantly degrade GC performance. Ensuring sufficient CPU availability on the server, virtual machine (VM), or container hosting your application is crucial. Additionally, minimizing I/O activity can help maintain optimal GC throughput. Garbage Collection performance can sometimes suffer due to insufficient system-level resources such as threads, CPU, and I/O. GC log analysis tools like GCeasy identify these limitations by examining the following two patterns in your GC log files: Sys time > User Time: This pattern indicates that the GC event is spending more time on kernel-level operations (system time) compared to executing user-level code. This could be a sign that your application is facing high contention for system resources, which can hinder GC performance. Sys time + User Time > Real Time: This pattern suggests that the combined CPU time (system time plus user time) exceeds the actual elapsed wall clock time. This discrepancy indicates that the system is overburdened, possibly due to insufficient CPU resources or a lack of GC threads. To address these system-level limitations, consider taking one of the following actions: Increase GC threads: Allocate more GC threads to your application by adjusting the relevant JVM parameters. Add CPU resources: If your application is running on a machine with limited CPU capacity, consider scaling up by adding more CPU cores. This can provide the additional processing power needed to handle GC operations more efficiently. I/O bandwidth: Ensure that your application’s I/O operations are optimized and not creating bottlenecks. Poor I/O performance can lead to increased system time, negatively impacting GC performance. Old Version of JDK Continual improvements are made to GC performance by JDK development teams. Operating on an outdated JDK version prevents you from benefiting from the latest enhancements. To maximize GC throughput, it’s recommended to keep your JDK up to date. You can access the latest JDK release information here. Conclusion Garbage Collection (GC) throughput is a critical metric in ensuring the efficient operation of Java applications. 
By understanding its significance and the factors that influence it, you can take actionable steps to optimize GC throughput and enhance overall performance. To achieve high GC throughput: Address performance problems: Identify and resolve memory leaks, manage heavy object creation rates, and avoid consecutive GC pauses during high-traffic periods. Optimize GC tuning: Select the appropriate GC algorithm, correctly configure GC tuning parameters, and adjust internal memory region sizes to improve GC pause times. Ensure adequate resources: Allocate sufficient heap size, provide enough CPU resources, and minimize I/O activity to prevent system-level bottlenecks. Keep your JDK updated: Regularly update your JDK to benefit from the latest GC performance improvements. By implementing these strategies, you can significantly reduce garbage collection pauses, leading to better application responsiveness and resource utilization.
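As a final, practical note, you can get a rough feel for this metric at runtime without a GC log analyzer. The sketch below reads the standard JMX garbage collector beans; treat the result as an approximation, since the reported collection time can include concurrent phases for some collectors and will differ from what a log-based tool computes.
Java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public final class GcThroughputProbe {

    public static void main(String[] args) {
        long gcTimeMs = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long collectionTime = gc.getCollectionTime(); // -1 if not supported
            if (collectionTime > 0) {
                gcTimeMs += collectionTime;
            }
        }
        long uptimeMs = ManagementFactory.getRuntimeMXBean().getUptime();
        double throughput = 100.0 * (uptimeMs - gcTimeMs) / uptimeMs;
        System.out.printf("Uptime: %d ms, GC time: %d ms, approximate GC throughput: %.2f%%%n",
                uptimeMs, gcTimeMs, throughput);
    }
}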
Managing database connection strings securely for any microservice is critical; often, we secure the username and password using the environment variables and never factor in masking or hiding the database hostname. In reader and writer database instances, there would be a mandate in some organizations not to disclose the hostname and pass that through an environment variable at runtime during the application start. This article discusses configuring the hostname through environment variables in the properties file. Database Configurations Through Environment Variables We would typically configure the default connection string for Spring microservices in the below manner, with the database username and password getting passed as the environment variables. Java server.port=8081 server.servlet.context-path=/api/e-sign/v1 spring.esign.datasource.jdbc-url=jdbc:mysql://localhost:3306/e-sign?allowPublicKeyRetrieval=true&useSSL=false spring.esign.datasource.username=${DB_USER_NAME} spring.esign.datasource.password=${DB_USER_PASSWORD} spring.esign.datasource.driver-class-name=com.mysql.cj.jdbc.Driver spring.esign.datasource.minimumIdle=5 spring.esign.datasource.maxLifetime=120000 If our microservice connects to a secure database with limited access and the database administrator or the infrastructure team does not want you to provide the database hostname, then we have an issue. Typically, the production database hostname would be something like below: Java spring.esign.datasource.jdbc-url=jdbc:mysql://prod-db.fabrikam.com:3306/e-sign?allowPublicKeyRetrieval=true&useSSL=false spring.esign.datasource.username=${DB_USER_NAME} spring.esign.datasource.password=${DB_USER_PASSWORD} Using @Configuration Class In this case, the administrator or the cloud infrastructure team wants them to provide the hostname as an environment variable at runtime when the container starts. One of the options is to build and concatenate the connection string in the configuration class as below: Java @Configuration public class DatabaseConfig { private final Environment environment; public DatabaseConfig(Environment environment) { this.environment = environment; } @Bean public DataSource databaseDataSource() { String hostForDatabase = environment.getProperty("ESIGN_DB_HOST", "localhost:3306"); String dbUserName = environment.getProperty("DB_USER_NAME", "user-name"); String dbUserPassword = environment.getProperty("DB_USER_PASSWORD", "user-password"); String url = String.format("jdbc:mysql://%s/e-sign?allowPublicKeyRetrieval=true&useSSL=false", hostForDatabase); DriverManagerDataSource dataSource = new DriverManagerDataSource(); dataSource.setDriverClassName("com.mysql.cj.jdbc.Driver"); dataSource.setUrl(url); dataSource.setUsername(dbUserName); // Replace with your actual username dataSource.setPassword(dbUserPassword); // Replace with your actual password return dataSource; } } The above approach would work, but we need to use the approach with application.properties, which is easy to use and quite flexible. The properties file allows you to collate all configurations in a centralized manner, making it easier to update and manage. It also improves readability by separating configuration from code. The DevOps team can update the environment variable values without making code changes. Environment Variable for Database Hostname Commonly, we use environment variables for database username and password and use the corresponding expression placeholder expressions ${} in the application properties file. 
Java spring.esign.datasource.username=${DB_USER_NAME} spring.esign.datasource.password=${DB_USER_PASSWORD} However, for the database URL, we need to use the environment variable only for the hostname and not for the connection string, as each connection string for different microservices would have different parameters. So, to address this, Spring allows you to have the placeholder expression within the connection string shown below; this gives flexibility and the ability to stick with the approach of using the application.properties file instead of doing it through the database configuration class. Java spring.esign.datasource.jdbc-url=jdbc:mysql://${ESIGN_DB_HOST}:3306/e-sign?allowPublicKeyRetrieval=true&useSSL=false Once we have decided on the above approach and if we need to troubleshoot any issue for whatever reason in lower environments, we can then use the ApplicationListener interface to see the resolved URL: Java @Component public class ApplicationReadyLogger implements ApplicationListener<ApplicationReadyEvent> { private final Environment environment; public ApplicationReadyLogger(Environment environment) { this.environment = environment; } @Override public void onApplicationEvent(ApplicationReadyEvent event) { String jdbcUrl = environment.getProperty("spring.esign.datasource.jdbc-url"); System.out.println("Resolved JDBC URL: " + jdbcUrl); } } If there is an issue with the hostname configuration, it will show as an error when the application starts. However, after the application has been started, thanks to the above ApplicationReadyLogger implementation, we can see the database URL in the application logs. Please note that we should not do this in production environments where the infrastructure team wants to maintain secrecy around the database writer hostname. Using the above steps, we can configure the database hostname as an environment variable in the connection string inside the application.properties file. Conclusion Using environment variables for database hostnames to connect to data-sensitive databases can enhance security and flexibility and give the cloud infrastructure and DevOps teams more power. Using the placeholder expressions ensures that our configuration remains clear and maintainable.
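As a small complement to the ApplicationReadyLogger above, lower environments can also fail fast with a clear message when the variable is missing. The sketch below assumes Spring Boot 3 (jakarta namespace) and the ESIGN_DB_HOST name used in this article; since bean creation order relative to the datasource is not guaranteed, treat it as an illustration of the idea rather than a hard guarantee.
Java
import jakarta.annotation.PostConstruct;
import org.springframework.core.env.Environment;
import org.springframework.stereotype.Component;

@Component
public class DatabaseHostSanityCheck {

    private final Environment environment;

    public DatabaseHostSanityCheck(Environment environment) {
        this.environment = environment;
    }

    @PostConstruct
    void verifyHostIsConfigured() {
        String host = environment.getProperty("ESIGN_DB_HOST");
        if (host == null || host.isBlank()) {
            throw new IllegalStateException(
                    "ESIGN_DB_HOST is not set; spring.esign.datasource.jdbc-url cannot be resolved.");
        }
    }
}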
In modern software development, containerization offers an isolated and consistent environment, which is crucial for maintaining parity between development and production setups. This guide provides a comprehensive walkthrough on creating a local development environment using IntelliJ IDEA, DevContainers, and Amazon Linux 2023 for Java development. Why Use DevContainers? What Are DevContainers? DevContainers are a feature provided by Visual Studio Code and other IDEs like IntelliJ IDEA through extensions. They allow you to define a consistent and reproducible development environment using Docker containers. By encapsulating the development environment, you ensure that all team members work in an identical setup, avoiding the "it works on my machine" problem. Benefits of DevContainers Consistency: Every developer uses the same development environment, eliminating discrepancies due to different setups. Isolation: Dependencies and configurations are isolated from the host machine, preventing conflicts. Portability: Easily share development environments through version-controlled configuration files. Scalability: Quickly scale environments by creating new containers or replicating existing ones. Diagram of DevContainers Workflow Plain Text +-------------------+ | Developer Machine | +-------------------+ | | Uses v +-----------------------+ | Development Tools | | (IntelliJ, VS Code) | +-----------------------+ | | Connects to v +-----------------------+ | DevContainer | | (Docker Container) | +-----------------------+ | | Runs v +-----------------------+ | Development Project | | (Amazon Linux 2023, | | Java, Dependencies) | +-----------------------+ Step-By-Step Guide Prerequisites Before starting, ensure you have the following installed on your machine: Docker: Install Docker IntelliJ IDEA: Download IntelliJ IDEA Visual Studio Code (optional, for DevContainer configuration): Download VS Code Step 1: Setting Up Docker and Amazon Linux 2023 Container 1. Pull Amazon Linux 2023 Image Open a terminal and pull the Amazon Linux 2023 image from Docker Hub: Shell docker pull amazonlinux:2023 2. Create a Dockerfile Create a directory for your project and inside it, create a Dockerfile: Dockerfile FROM amazonlinux:2023 # Install necessary packages RUN yum update -y && \ yum install -y java-17-openjdk-devel git vim # Set environment variables ENV JAVA_HOME /usr/lib/jvm/java-17-openjdk ENV PATH $JAVA_HOME/bin:$PATH # Create a user for development RUN useradd -ms /bin/bash developer USER developer WORKDIR /home/developer 3. Build the Docker Image Build the Docker image using the following command: Shell docker build -t amazonlinux-java-dev:latest . Step 2: Configuring DevContainers 1. Create a DevContainer Configuration Inside your project directory, create a .devcontainer directory. Within it, create a devcontainer.json file: JSON { "name": "Amazon Linux 2023 Java Development", "image": "amazonlinux-java-dev:latest", "settings": { "java.home": "/usr/lib/jvm/java-17-openjdk", "java.jdt.ls.java.home": "/usr/lib/jvm/java-17-openjdk" }, "extensions": [ "vscjava.vscode-java-pack" ], "postCreateCommand": "git clone https://github.com/your-repo/your-project ." } 2. Optional: Configure VS Code for DevContainers If using VS Code, ensure the DevContainers extension is installed. Open your project in VS Code and select "Reopen in Container" when prompted. Step 3: Setting Up IntelliJ IDEA 1. Open IntelliJ IDEA Open IntelliJ IDEA and navigate to File > New > Project from Existing Sources.... 
Select your project directory. 2. Configure Remote Development IntelliJ offers remote development capabilities, but since we're using DevContainers, we'll set up the project to work with our local Docker container. 3. Configure Java SDK Navigate to File > Project Structure > Project. Click New... under Project SDK, then select JDK and navigate to /usr/lib/jvm/java-17-openjdk within your Docker container. Alternatively, you can configure this through the terminal by running: Shell docker exec -it <container_id> /bin/bash ... and then configuring the path inside the container. 4. Import Project IntelliJ should automatically detect the project settings. Make sure the project SDK is set to the Java version inside the container. Step 4: Running and Debugging Your Java Application 1. Run Configuration Navigate to Run > Edit Configurations.... Click the + button and add a new Application configuration. Set the main class to your main application class. Set the JRE to the one configured inside the container. 2. Run the Application You should now be able to run and debug your Java application within the containerized environment directly from IntelliJ. Step 5: Integrating With Git 1. Clone Repository If not already cloned, use the following command to clone your repository inside the container: Shell git clone https://github.com/your-repo/your-project . 2. Configure Git in IntelliJ Navigate to File > Settings > Version Control > Git. Ensure the path to the Git executable is correctly set, usually /usr/bin/git within the container. Conclusion By following this guide, you now have a robust, isolated development environment for Java development using IntelliJ, DevContainers, and Amazon Linux 2023. This setup ensures consistency across development and production, reducing the "it works on my machine" syndrome and improving overall development workflow efficiency. Remember, containerization and DevContainers are powerful tools that can significantly streamline your development process. Happy coding!
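A quick, hypothetical way to confirm that your run configurations really execute against the container's JDK rather than a host installation is to run a tiny check class from the IDE and compare its output with the values baked into the Dockerfile above.
Java
public final class EnvironmentCheck {

    public static void main(String[] args) {
        // Expect a 17.x version, a java.home under /usr/lib/jvm, and the "developer" user
        // if the class is really running inside the Amazon Linux 2023 container.
        System.out.println("java.version = " + System.getProperty("java.version"));
        System.out.println("java.home    = " + System.getProperty("java.home"));
        System.out.println("os.name      = " + System.getProperty("os.name"));
        System.out.println("user.name    = " + System.getProperty("user.name"));
    }
}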
A microservice system could have a high number of components with complex interactions. It is important to reduce this complexity, at least from the standpoint of the clients interacting with the system. A gateway hides the microservices from the external world. It represents a single entry point and implements common cross-cutting requirements. In this article, you will learn how to configure a gateway component for a Spring Boot application, using the Spring Cloud Gateway package. Spring Cloud Gateway Spring Cloud provides a gateway implementation through the Spring Cloud Gateway project. It is based on Spring Boot, Spring WebFlux, and Reactor. Since it is based on Spring WebFlux, it must run on a Netty environment, not a usual servlet container. The main function of a gateway is to route requests from external clients to the microservices. Its main components are: Route: This is the basic entity. It is configured with an ID, a destination URI, one or more predicates, and filters. Predicate: This is based on a Java function predicate. It represents a condition that must be matched on headers or request parameters. Filter: It allows you to change the request or the response. We can identify the following sequence of events: A client makes a call through the gateway. The gateway decides if the request matches a configured route. If there is a match, the request is sent to a gateway web handler. The web handler sends the request to a chain of filters that can execute logic relative to the request or the response, and apply changes to them. The target service is executed. Spring Cloud Gateway Dependencies To implement our Spring Boot application as a gateway, we must first provide the spring-cloud-starter-gateway dependency after having defined the release train as in the configuration fragment below: XML <dependencyManagement> <dependencies> <dependency> <groupId>org.springframework.cloud</groupId> <artifactId>spring-cloud-dependencies</artifactId> <version>2023.0.0</version> <type>pom</type> <scope>import</scope> </dependency> </dependencies> </dependencyManagement> XML <dependencies> <dependency> <groupId>org.springframework.cloud</groupId> <artifactId>spring-cloud-starter-gateway</artifactId> </dependency> ... </dependencies> Spring Cloud Gateway Configuration We can configure our gateway component using the application.yaml file. We can specify fully expanded arguments or shortcuts to define predicates and filters. In the first case, we define a name and an args field. The args field can contain one or more key-value pairs: YAML spring: cloud: gateway: routes: - id: route-example uri: https://example.com predicates: - name: PredicateExample args: name: predicateName regexp: predicateRegexpValue In the example above, we define a route with an ID value of "route-example", a destination URI "https://example.com," and a predicate with two args, "name" and "regexp." With the shortcut mode, we write the predicate name followed by the "=" character, and then a list of names and values separated by commas. We can rewrite the previous example as follows: YAML spring: cloud: gateway: routes: - id: route-example uri: https://example.com predicates: - Cookie=predicateName,predicateRegexpValue A specific factory class implements each predicate and filter type. There are several built-in predicate and filter factories available. The Cookie predicate shown above is an example. We will list some of them in the following sub-sections. 
Predicate Built-In Factories The After Predicate Factory matches requests that happen after a specific time. After=2007-12-03T10:15:30+01:00 Europe/Paris The Before Predicate Factory matches requests that happen before a specific time. Before=2007-12-03T10:15:30+01:00 Europe/Paris The Method Route Predicate Factory specifies the HTTP method types to match. Method=GET,POST Filter Built-In Factories The AddRequestHeader GatewayFilter Factory allows the addition of an HTTP header to the request by its name and value. AddRequestHeader=X-Request-Foo, Bar The AddRequestParameter GatewayFilter Factory allows the addition of a parameter to the request by its name and value. AddRequestParameter=foo, bar The AddResponseHeader GatewayFilter Factory allows the addition of an HTTP header to the response by its name and value. AddResponseHeader=X-Response-Foo, Bar To implement a custom predicate or filter factory, we have to provide an implementation of a specific factory interface. The following sections show how. Custom Predicate Factories To create a custom predicate factory we can extend the AbstractRoutePredicateFactory, an abstract implementation of the RoutePredicateFactory interface. In the example below we define an inner static class Configuration, to pass its properties to the apply method and compare them to the request. Java @Component public class CustomPredicateFactory extends AbstractRoutePredicateFactory<CustomPredicateFactory.Configuration> { public CustomPredicateFactory() { super(Configuration.class); } @Override public Predicate<ServerWebExchange> apply(Configuration config) { return exchange -> { ServerHttpRequest request = exchange.getRequest(); //compare request with config properties return matches(config, request); }; } private boolean matches(Configuration config, ServerHttpRequest request) { //implement matching logic return false; } public static class Configuration { //implement custom configuration properties } } Custom Filter Factories To create a custom filter factory we can extend the AbstractGatewayFilterFactory, an abstract implementation of the GatewayFilterFactory interface. In the examples below, you can see a filter factory that modifies the request and another one that changes the response, using the properties passed by a Configuration object. Java @Component public class PreCustomFilterFactory extends AbstractGatewayFilterFactory<PreCustomFilterFactory.Configuration> { public PreCustomFilterFactory() { super(Configuration.class); } @Override public GatewayFilter apply(Configuration config) { return (exchange, chain) -> { ServerHttpRequest.Builder builder = exchange.getRequest().mutate(); //use builder to modify the request return chain.filter(exchange.mutate().request(builder.build()).build()); }; } public static class Configuration { //implement the configuration properties } } @Component public class PostCustomFilterFactory extends AbstractGatewayFilterFactory<PostCustomFilterFactory.Configuration> { public PostCustomFilterFactory() { super(Configuration.class); } @Override public GatewayFilter apply(Configuration config) { return (exchange, chain) -> { return chain.filter(exchange).then(Mono.fromRunnable(() -> { ServerHttpResponse response = exchange.getResponse(); //Change the response })); }; } public static class Configuration { //implement the configuration properties } } Spring Cloud Gateway Example We will show a practical and simple example to see how the gateway works in a real scenario. You will find a link to the source code at the end of the article.
The example is based on the following stack: Spring Boot: 3.2.1 Spring Cloud: 2023.0.0 Java 17 We consider a minimal microservice system that implements a library with only two services: a book service and an author service. The book service calls the author service to retrieve an author's information by passing an authorName parameter. The implementation of the two applications is based on an embedded in-memory H2 database and uses JPA ORM to map and query the Book and Author tables. From the standpoint of this demonstration, the most important part is the /getAuthor REST endpoint exposed by a BookController class of the Book service: Java @RestController @RequestMapping("/library") public class BookController { Logger logger = LoggerFactory.getLogger(BookController.class); @Autowired private BookService bookService; @GetMapping(value = "/getAuthor", params = { "authorName" }) public Optional<Author> getAuthor(@RequestParam("authorName") String authorName) { return bookService.getAuthor(authorName); } } The two applications register themselves with a Eureka discovery server and are configured as discovery clients. The final component is the gateway. The gateway should not register itself with the service discovery server. This is because it is called only by external clients, not internal microservices. On the other hand, it can be configured as a discovery client to fetch the other services automatically and implement more dynamic routing. We don't do it here, though, to keep things simple. In this example, we want to show two things: See how the routing mechanism based on the predicate value works Show how to modify the request with a filter by adding a header The gateway's configuration is the following: YAML spring: application: name: gateway-service cloud: gateway: routes: - id: add_request_header_route uri: http://localhost:8082 predicates: - Path=/library/** filters: - AddRequestHeader=X-Request-red, red We have defined a route with id "add_request_header_route," and a URI value of "http://localhost:8082," the base URI of the book service. We then have a Path predicate with a "/library/**" value. Every call starting with "http://localhost:8080/library/" will be matched and routed to the book service with URIs starting with "http://localhost:8082/library/." Running the Example To run the example you can start each component by executing the "mvn spring-boot:run" command from the component's base directory. You can then test it by requesting the "http://localhost:8080/library/getAuthor?authorName=Goethe" URL. The result will be a JSON value containing info about the author. If you check the browser developer tools you will also find that an X-Request-red header has been added to the request with a value of "red." Conclusion Implementing a gateway with the Spring Cloud Gateway package is the natural choice in the Spring Boot framework. It reduces the complexity of a microservice environment by placing a single facade component in front of it. It also gives you great flexibility in implementing cross-cutting concerns like authentication, authorization, aggregate logging, tracing, and monitoring. You can find the source code of the example of this article on GitHub.
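As a brief illustration of one such cross-cutting concern, the sketch below shows what a global logging filter could look like in the gateway application; it is not part of the example project, and the class name is purely illustrative. A GlobalFilter applies to every route, so no per-route configuration is needed.
Java
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.cloud.gateway.filter.GatewayFilterChain;
import org.springframework.cloud.gateway.filter.GlobalFilter;
import org.springframework.stereotype.Component;
import org.springframework.web.server.ServerWebExchange;
import reactor.core.publisher.Mono;

@Component
public class RequestLoggingGlobalFilter implements GlobalFilter {

    private static final Logger logger = LoggerFactory.getLogger(RequestLoggingGlobalFilter.class);

    @Override
    public Mono<Void> filter(ServerWebExchange exchange, GatewayFilterChain chain) {
        // Log the incoming URI before routing, and the status code once the response completes
        logger.info("Routing request: {}", exchange.getRequest().getURI());
        return chain.filter(exchange)
                .then(Mono.fromRunnable(() ->
                        logger.info("Completed with status: {}", exchange.getResponse().getStatusCode())));
    }
}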
In this third part of our CDK series, the project cdk-quarkus-s3, in the same Git repository, will be used to illustrate a couple of advanced Quarkus-to-AWS integration features, together with several tricks specific to RESTeasy which is, as everyone knows, Red Hat's implementation of the Jakarta REST specifications. Let's start by looking at the project's pom.xml file which drives the Maven build process. You'll see the following dependencies: ... <dependency> <groupId>io.quarkiverse.amazonservices</groupId> <artifactId>quarkus-amazon-s3</artifactId> </dependency> <dependency> <groupId>io.quarkus</groupId> <artifactId>quarkus-amazon-lambda-http</artifactId> </dependency> <dependency> <groupId>io.quarkus</groupId> <artifactId>quarkus-rest-jackson</artifactId> </dependency> <dependency> <groupId>io.quarkus</groupId> <artifactId>quarkus-rest-client</artifactId> </dependency> ... <dependency> <groupId>software.amazon.awssdk</groupId> <artifactId>netty-nio-client</artifactId> </dependency> <dependency> <groupId>software.amazon.awssdk</groupId> <artifactId>url-connection-client</artifactId> </dependency> ... The first dependency in the listing above, quarkus-amazon-s3, is a Quarkus extension allowing your code to act as an AWS S3 client and to store and delete objects in buckets or implement backup and recovery strategies, archive data, etc. The next dependency, quarkus-amazon-lambda-http, is another Quarkus extension that aims at supporting the AWS HTTP Gateway API. As the reader already knows from the two previous parts of this series, with Quarkus, one can deploy a REST API as AWS Lambda using either the AWS HTTP Gateway API or the AWS REST Gateway API. Here we'll be using the former, which is less expensive, hence the mentioned extension. If we wanted to use the AWS REST Gateway API, then we would have had to replace the quarkus-amazon-lambda-http extension with the quarkus-amazon-lambda-rest one. What To Expect In this project, we'll be using Quarkus 3.11 which, at the time of this writing, is the most recent release. Some of the RESTeasy dependencies have changed, compared with former versions, hence the quarkus-rest-jackson dependency, which now replaces the quarkus-resteasy one, used in 3.10 and before. Also, the quarkus-rest-client extension, implementing the Eclipse MP REST Client specifications, is needed for test purposes, as we will see in a moment. Last but not least, the url-connection-client Quarkus extension is needed because the MP REST Client implementation uses it by default and, consequently, it has to be included in the build process. Now, let's look at our new REST API. Open the Java class S3FileManagementApi in the cdk-quarkus-s3 project and you'll see that it defines three operations: download file, upload file, and list files. All three use the same S3 bucket created as a part of the CDK application's stack. Java @Path("/s3") public class S3FileManagementApi { @Inject S3Client s3; @ConfigProperty(name = "bucket.name") String bucketName; @POST @Path("upload") @Consumes(MediaType.MULTIPART_FORM_DATA) public Response uploadFile(@Valid FileMetadata fileMetadata) throws Exception { PutObjectRequest request = PutObjectRequest.builder() .bucket(bucketName) .key(fileMetadata.filename) .contentType(fileMetadata.mimetype) .build(); s3.putObject(request, RequestBody.fromFile(fileMetadata.file)); return Response.ok().status(Response.Status.CREATED).build(); } ... } Explaining the Code The code fragment above reproduces only the upload file operation, the other two being very similar.
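For illustration, here is a hedged sketch of what the list and download operations might look like inside the same S3FileManagementApi class, using the injected S3Client and bucket name together with the AWS SDK v2 classes ListObjectsV2Request and GetObjectRequest. The resource paths and the FileObject DTO (with objectKey and size fields) are assumptions based on the tests shown later in this article, not a verbatim copy of the project's code.
Java
// Additional operations inside the same S3FileManagementApi class shown above
@GET
@Path("list")
@Produces(MediaType.APPLICATION_JSON)
public List<FileObject> listFiles() {
    ListObjectsV2Request request = ListObjectsV2Request.builder().bucket(bucketName).build();
    // Map each S3 object to a small DTO exposing objectKey and size, as the unit test expects
    // (FileObject is an assumed DTO, not necessarily the name used in the repository)
    return s3.listObjectsV2(request).contents().stream()
            .map(obj -> new FileObject(obj.key(), obj.size()))
            .collect(Collectors.toList());
}

@GET
@Path("download/{objectKey}")
@Produces(MediaType.APPLICATION_OCTET_STREAM)
public Response downloadFile(@PathParam("objectKey") String objectKey) {
    GetObjectRequest request = GetObjectRequest.builder().bucket(bucketName).key(objectKey).build();
    // Return the raw object bytes in the response body
    return Response.ok(s3.getObjectAsBytes(request).asByteArray()).build();
}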
Observe how simple the instantiation of the S3Client is by taking advantage of Quarkus CDI, which avoids the need for several boilerplate lines of code. Also, we're using the Eclipse MP Config specification to define the name of the destination S3 bucket. Our endpoint uploadFile() accepts POST requests and consumes MULTIPART_FORM_DATA MIME data, which is structured in two distinct parts, one for the payload and the other one containing the file to be uploaded. The endpoint takes an input parameter of the class FileMetadata, shown below: Java public class FileMetadata { @RestForm @NotNull public File file; @RestForm @PartType(MediaType.TEXT_PLAIN) @NotEmpty @Size(min = 3, max = 40) public String filename; @RestForm @PartType(MediaType.TEXT_PLAIN) @NotEmpty @Size(min = 10, max = 127) public String mimetype; ... } This class is a data object grouping the file to be uploaded together with its name and MIME type. It uses the RESTeasy-specific @RestForm annotation to handle HTTP requests that have multipart/form-data as their content type. The jakarta.validation.constraints annotations are very practical as well for validation purposes. To come back to our endpoint above, it creates a PutObjectRequest having as input arguments the destination bucket name; a key that uniquely identifies the stored file in the bucket (in this case, the file name); and the associated MIME type, for example, TEXT_PLAIN for a text file. Once the PutObjectRequest is created, it is sent to the AWS S3 service via an HTTP PUT request. Please notice how easily the file to be uploaded is inserted into the request body using the RequestBody.fromFile(...) statement. That's all as far as the REST API exposed as an AWS Lambda function is concerned. Now let's look at what's new in our CDK application's stack: Java ... HttpApi httpApi = HttpApi.Builder.create(this, "HttpApiGatewayIntegration") .defaultIntegration(HttpLambdaIntegration.Builder.create("HttpApiGatewayIntegration", function).build()).build(); httpApiGatewayUrl = httpApi.getUrl(); CfnOutput.Builder.create(this, "HttpApiGatewayUrlOutput").value(httpApi.getUrl()).build(); ... These lines have been added to the LambdaWithBucketConstruct class in the cdk-simple-construct project. We want the Lambda function we're creating in the current stack to sit behind an HTTP Gateway and serve as its backend. This might have some advantages. So we need to create an integration for our Lambda function. The notion of integration, as defined by AWS, means providing a backend for an API endpoint. In the case of the HTTP Gateway, one or more backends should be provided for each of the API Gateway's endpoints. The integrations have their own requests and responses, distinct from the ones of the API itself. There are two integration types: Lambda integrations where the backend is a Lambda function; HTTP integrations where the backend might be any deployed web application; In our example, we're using Lambda integration, of course. There are two types of Lambda integrations as well: Lambda proxy integration where the definition of the integration's request and response, as well as their mapping to/from the original ones, aren't required as they are automatically provided; Lambda non-proxy integration where we need to explicitly specify how the incoming request data is mapped to the integration request and how the resulting integration response data is mapped to the method response; For simplicity's sake, we're using the first case in our project. This is what the statement .defaultIntegration(...)
above is doing. Once the integration is created, we need to display the URL of the newly created API Gateway, for which our Lambda function is the backend. This way, in addition to being able to directly invoke our Lambda function, as we did previously, we'll be able to do it through the API Gateway. And in a project with several dozens of REST endpoints, it's very important to have a single contact point where security policies, logging, journaling, and other cross-cutting concerns can be applied. The API Gateway is ideal as a single contact point. The project comes with a couple of unit and integration tests. For example, the class S3FileManagementTest performs unit testing using REST Assured, as shown below: Java @QuarkusTest @TestMethodOrder(MethodOrderer.OrderAnnotation.class) public class S3FileManagementTest { private static File readme = new File("./src/test/resources/README.md"); @Test @Order(10) public void testUploadFile() { given() .contentType(MediaType.MULTIPART_FORM_DATA) .multiPart("file", readme) .multiPart("filename", "README.md") .multiPart("mimetype", MediaType.TEXT_PLAIN) .when() .post("/s3/upload") .then() .statusCode(HttpStatus.SC_CREATED); } @Test @Order(20) public void testListFiles() { given() .when().get("/s3/list") .then() .statusCode(200) .body("size()", equalTo(1)) .body("[0].objectKey", equalTo("README.md")) .body("[0].size", greaterThan(0)); } @Test @Order(30) public void testDownloadFile() throws IOException { given() .pathParam("objectKey", "README.md") .when().get("/s3/download/{objectKey}") .then() .statusCode(200) .body(equalTo(Files.readString(readme.toPath()))); } } This unit test starts by uploading the file README.md to the S3 bucket defined for the purpose. Then it lists all the files present in the bucket and finishes by downloading the file just uploaded. Please notice the following lines in the application.properties file: Plain Text bucket.name=my-bucket-8701 %test.quarkus.s3.devservices.buckets=${bucket.name} The first one defines the name of the destination bucket and the second one automatically creates it. This only works while executed via the Quarkus mock server. While this unit test is executed in the Maven test phase, against a LocalStack instance run by Testcontainers and automatically managed by Quarkus, the integration test, S3FileManagementIT, is executed against the real AWS infrastructure, once our CDK application is deployed. The integration tests use a different paradigm and, instead of REST Assured, which is very practical for unit tests, they take advantage of the Eclipse MP REST Client specifications, implemented by Quarkus, as shown in the following snippet: Java @QuarkusTest @TestMethodOrder(MethodOrderer.OrderAnnotation.class) public class S3FileManagementIT { private static File readme = new File("./src/test/resources/README.md"); @Inject @RestClient S3FileManagementClient s3FileManagementTestClient; @Inject @ConfigProperty(name = "base_uri/mp-rest/url") String baseURI; @Test @Order(40) public void testUploadFile() throws Exception { Response response = s3FileManagementTestClient.uploadFile(new FileMetadata(readme, "README.md", MediaType.TEXT_PLAIN)); assertThat(response).isNotNull(); assertThat(response.getStatusInfo().toEnum()).isEqualTo(Response.Status.CREATED); } ... } We inject S3FileManagementClient, which is a simple interface defining our API endpoints, and Quarkus does the rest. It generates the required client code. We just have to invoke endpoints on this interface, for example uploadFile(...), and that's all.
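For reference, such an MP REST Client interface typically looks like the hedged sketch below; the method list and annotations are assumptions based on the endpoints and tests discussed here, so the actual S3FileManagementClient in the repository may differ in the details.
Java
import org.eclipse.microprofile.rest.client.inject.RegisterRestClient;

import jakarta.ws.rs.Consumes;
import jakarta.ws.rs.GET;
import jakarta.ws.rs.POST;
import jakarta.ws.rs.Path;
import jakarta.ws.rs.PathParam;
import jakarta.ws.rs.core.MediaType;
import jakarta.ws.rs.core.Response;

@RegisterRestClient(configKey = "base_uri") // matches the base_uri/mp-rest/url property injected in the test
@Path("/s3")
public interface S3FileManagementClient {

    @POST
    @Path("/upload")
    @Consumes(MediaType.MULTIPART_FORM_DATA)
    Response uploadFile(FileMetadata fileMetadata);

    @GET
    @Path("/download/{objectKey}")
    Response downloadFile(@PathParam("objectKey") String objectKey);
}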
Have a look at S3FileManagementClient, in the cdk-quarkus-s3 project, to see how everything works and please notice how the annotation @RegisterRestClient defines a configuration key, named base_uri, used further in the deploy.sh script. Now, to test against the real AWS infrastructure, you need to execute the deploy.sh script, as follows: Shell $ cd cdk $ ./deploy.sh cdk-quarkus/cdk-quarkus-api-gateway cdk-quarkus/cdk-quarkus-s3 This will compile and build the application, execute the unit tests, deploy the CloudFormation stack on AWS, and execute the integration tests against this infrastructure. At the end of the execution, you should see something like: Plain Text Outputs: QuarkusApiGatewayStack.FunctionURLOutput = https://<generated>.lambda-url.eu-west-3.on.aws/ QuarkusApiGatewayStack.LambdaWithBucketConstructIdHttpApiGatewayUrlOutput = https://<generated>.execute-api.eu-west-3.amazonaws.com/ Stack ARN: arn:aws:cloudformation:eu-west-3:...:stack/QuarkusApiGatewayStack/<generated> Now, in addition to the Lambda function URL that you've already seen in our previous examples, you can see the HTTP API Gateway URL, which you can now use for testing purposes instead of the Lambda one. An E2E test case, exported from Postman (S3FileManagementPostmanIT), is provided as well. It is executed via the Docker image postman/newman:latest, running in Testcontainers. Here is a snippet: Java @QuarkusTest public class S3FileManagementPostmanIT { ... private static GenericContainer<?> postman = new GenericContainer<>("postman/newman") .withNetwork(Network.newNetwork()) .withCopyFileToContainer(MountableFile.forClasspathResource("postman/AWS.postman_collection.json"), "/etc/newman/AWS.postman_collection.json") .withStartupCheckStrategy(new OneShotStartupCheckStrategy().withTimeout(Duration.ofSeconds(10))); @Test public void run() { String apiEndpoint = System.getenv("API_ENDPOINT"); assertThat(apiEndpoint).isNotEmpty(); postman.withCommand("run", "AWS.postman_collection.json", "--global-var base_uri=" + apiEndpoint.substring(8).replaceAll(".$", "")); postman.start(); LOG.info(postman.getLogs()); assertThat(postman.getCurrentContainerInfo().getState().getExitCodeLong()).isZero(); postman.stop(); } } Conclusion As you can see, after starting the postman/newman:latest image with Testcontainers, we run the E2E test case exported from Postman by passing it the --global-var option to initialize the global variable named base_uri to the value of the REST API URL saved by the deploy.sh script in the API_ENDPOINT environment variable. Unfortunately, probably due to a bug, the postman/newman image doesn't recognize this option; accordingly, while waiting for this issue to be fixed, this test is disabled for now. You can, of course, import the file AWS.postman_collection.json into Postman and run it this way after having replaced the global variable {{base_uri}} with the current value of the API URL generated by AWS. Enjoy!
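As a final quick check, you can also call the freshly deployed HTTP API Gateway from plain Java, outside the provided test classes. The sketch below uses the JDK's built-in HTTP client; the URL is a placeholder for the HttpApiGatewayUrlOutput value printed by deploy.sh, and the /s3/list path matches the endpoints exercised by the tests above.
Java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class ApiGatewaySmokeTest {
    public static void main(String[] args) throws Exception {
        // Placeholder: replace with the HttpApiGatewayUrlOutput value printed by deploy.sh
        String apiGatewayUrl = "https://<generated>.execute-api.eu-west-3.amazonaws.com";

        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create(apiGatewayUrl + "/s3/list"))
                .GET()
                .build();

        // Expect a JSON array describing the objects currently stored in the bucket
        HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode() + " " + response.body());
    }
}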
Over the past decade, I have presented many times and written numerous blogs and source code on sagas and event-driven microservices. In those blogs, I’ve discussed the need for sagas in microservices architectures, the preferred and increased use of event-driven patterns and communication for microservices, and the difficulties in implementing sagas, particularly around developing saga participant code for compensating transactions. These are addressed in the product solution I will describe, including an example source code here, and soon, an update of the beta version of the saga workshop showing the same. The features are in the Oracle Database Free Docker container and soon in the Oracle Autonomous Database. Part of what makes the new Oracle Saga Framework so powerful is its combined usage of other features in the Oracle Database including the TxEventQ transactional messaging system and reservation-less locking; therefore, I will describe them and how they contribute to the overall comprehensive solution as well. Quick Background on Sagas The saga pattern is used to provide data integrity between multiple services and to do so for potentially long-running transactions. There are many cursory blogs, as they tend to be, written on sagas and long-running transactions. In short, XA and 2PC require distributed locks (con) which manage ACID properties so that the user can simply execute rollback or commit (pro). In contrast, sagas use local transactions only and do not require distributed locks (pro) but require the user to implement compensation logic, etc. (con). My previous blog showed examples of compensation logic and the need to explicitly maintain journals and handle a number of often subtle, but important complexities. Indeed, most agree that data integrity is perhaps the most difficult and critical of challenges when taking advantage of a microservices architecture. The Transfer Service receives a request to transfer money from one account to another. The transfer method that is called is annotated with @LRA(value = LRA.Type.REQUIRES_NEW, end = false); therefore, the underlying LRA client library makes a call to the Coordinator/Orchestrator service which creates a new LRA/saga and passes the LRA/saga ID back to the Transfer Service. The Transfer Service makes a call to the (Bank)Account (for account 66) service to make the withdraw call. The LRA/saga ID is propagated as a header as part of this call. The withdraw method that is called is annotated with @LRA(value = LRA.Type.MANDATORY, end = false); therefore, the underlying client library makes a call to the Coordinator/Orchestrator service, which recognizes the LRA/saga ID and enlists/joins the Account Service endpoint (address) to the LRA/saga started by the Transfer Service. This endpoint has a number of methods, including the complete and compensate methods that will be called when the saga/LRA is terminated/ended. The withdraw method is executed and control returns to the Transfer Service. This is repeated with a call from the Transfer Service to the Account service (for account 67) to make the deposit call. Depending on the returns from the Account Service calls, the Transfer Service determines if it should close or cancel the saga/LRA. close and cancel are somewhat analogous to commit or rollback. The Transfer Service issues the close or cancel call to the Coordinator (I will get into more details on how this is done implicitly when looking closer at the application). 
The Coordinator in turn issues complete (in the case of close) or compensate calls on the participants that were previously joined to the saga/LRA. Oracle TxEventQ (Formerly AQ) Messaging System in the Database There are several advantages to event-driven microservices — particularly those that are close to the (potentially critical) data — including scalability and QoS levels, reliability, transactionality, integration and routing, versioning, etc. Of course, there needs to be a messaging system/broker in order to provide event-driven sagas. The TxEventQ messaging system (formerly called AQ) has been part of the Oracle database for decades (long before Kafka existed). It provides some key differentiators not available in other messaging systems; in particular, the ability to do messaging and data operations in the same local transaction — which is required to provide the transactional outbox, idempotent producers and consumers, and, in particular, the robust saga behavior that other messaging systems simply can't deliver. These are described in the blog "Apache Kafka vs. Oracle Transactional Event Queues as Microservices Event Mesh," but the following table gives an idea of a common scenario that would require extra developer and admin handling, or is simply not possible, in Kafka and other messaging and database systems. The scenario involves an Order microservice inserting an order in the database and sending a message to an Inventory microservice. The Inventory microservice receives the message, updates the Inventory table, and sends a message back to the Order service, which receives that message and updates the Order in the database. Notice how the Oracle Database and TxEventQ handle all failure scenarios automatically. Auto-Compensating Data Types via Lock-Free Reservations Saga and Escrow History There is little debate that the saga pattern is currently the best approach to data integrity between microservices and long-running transactions/activities. This comes as little surprise, as it has a long history starting with the original paper, published in 1987, which also states that a simplified and optimal implementation of the saga pattern is one where the coordinator is implemented in the database(s). The concept of escrow concurrency and compensation-aware transactions was described even earlier, in 1985. The new Oracle database feature is named "Lock-free Reservations," as a reservation journal acts as an intermediary to the actual data table for any fields marked with the keyword RESERVABLE. Here is an example of how easy it is to simply label a column/field as reservable: CREATE TABLE bankA ( ucid VARCHAR2(50), account_number NUMBER(20) PRIMARY KEY, account_type VARCHAR2(15) CHECK (account_type IN ('CHECKING', 'SAVING')), balance_amount decimal(10,2) RESERVABLE constraint balance_con check(balance_amount >= 0), created_at TIMESTAMP DEFAULT SYSTIMESTAMP ); An internal reservation journal table is created and managed automatically by the database (with nomenclature SYS_RESERVJRNL_<object_number_of_base_table>) which tracks the actions made on the reservable field by concurrent transactions. Changes requested by each transaction are verified against the journal value (not the actual database table), and thus, promises of the change are made to the transactions based on the reservation/journal. The changes are not flushed/processed on the underlying table until the commit of the transaction(s).
The modifications made on these fields must be commutative; that is, relative increment/decrement operations such as quantity = quantity + 1, not absolute assignments such as quantity = 2. This is the case in the vast majority of data hot spots and indeed, even state machines work on this principle. Along with the fine-grained/column-level nature of escrow/reservations, high throughput for hot spots of concurrent transactions is attained. Likewise, transactions do not block for long-running transactions. A customer in a store no longer locks all of a particular type of an item just because one of the items is in their cart, nor can they take items from another person's cart. A good way to understand is to compare and contrast lock-less reservations/escrow with the concurrency mechanisms, drawbacks, and benefits of pessimistic and optimistic locking. Pessimistic Locking Optimistic Locking Escrow Locking What is extremely interesting is the fact that the journaling, etc. conducted by lock-free reservations is also used by the Oracle Saga Framework to provide auto-compensating/compensation-aware data. The Saga framework performs compensating actions during a Saga rollback. Reservation journal entries provide the data that is required to take compensatory actions for Saga transactions. The Saga framework sequentially processes the saga_finalization$ table for a Saga branch and executes the compensatory actions using the reservation journal. In other words, it removes the burden of coding the compensation logic, as described in the Developing Saga Participant Code For Compensating Transactions blog. Quick Feature Comparison in Saga Implementations In my previous blog, I used the versatile Oracle MicroTx product, written by the same team that wrote the famous Tuxedo transaction processing monitor. I've provided this table of comparison features to show what is provided by LRA in general and what unique features currently exist (others are in development) between the two Oracle Saga coordinator implementations. Without LRA LRA MicroTX Sagas/LRA Oracle Database Sagas/LRA Automatic propagation of saga/LRA ID and participant enlistment X X X Automatic coordination of completion protocol (commit/rollback). X X X Automatic timeout and recovery logic X X X REST support X X Messaging support X Automatic recovery state maintained in participants X Automatic Compensating Data via Lock-free Reservations X XA and Try-Cancel-Commit support X Coordinator runs in... and HA, Security, ... is supported by... Kubernetes Oracle Database Languages directly supported Java, JavaScript Java, PL/SQL Application Setup and Code Setup There are a few simple setup steps on the database side that need to be issued just once to initialize the system. The full doc can be found here. It is possible to use a number of different configurations for microservices, all of which are supported by the Oracle Database Saga Framework. For example, there can be schema or another isolation level between microservices, or there can be a strict database-per-service isolation. We will show the latter here and use a Pluggable Database (PDB) per service. A PDB is a devoted database that can be managed as a unit/CDB for HA, etc., making it perfect for microservices per service. Create database links between each database for message propagation, forming an event mesh. 
The command looks like this: CREATE PUBLIC DATABASE LINK PDB2_LINK CONNECT TO admin IDENTIFIED BY test USING 'cdb1_pdb2'; CREATE PUBLIC DATABASE LINK PDB3_LINK CONNECT TO admin IDENTIFIED BY test USING 'cdb1_pdb3'; CREATE PUBLIC DATABASE LINK PDB4_LINK CONNECT TO admin IDENTIFIED BY test USING 'cdb1_pdb4'; 2. Grant saga-related privileges to the saga coordinator/admin. grant saga_adm_role to admin; grant saga_participant_role to admin; grant saga_connect_role to admin; grant all on sys.saga_message_broker$ to admin; grant all on sys.saga_participant$ to admin; grant all on sys.saga$ to admin; grant all on sys.saga_participant_set$ to admin; 3. add_broker and add_coordinator: exec dbms_saga_adm.add_broker(broker_name => 'TEST', broker_schema => 'admin'); exec dbms_saga_adm.add_coordinator(coordinator_name => 'CloudBankCoordinator', mailbox_schema => 'admin', broker_name => 'TEST', dblink_to_coordinator => 'pdb1_link'); exec dbms_saga_adm.add_participant(participant_name => 'CloudBank', coordinator_name => 'CloudBankCoordinator' , dblink_to_broker => 'pdb1_link' , mailbox_schema => 'admin' , broker_name => 'TEST', dblink_to_participant => 'pdb1_link'); 4. add_participant(s): exec dbms_saga_adm.add_participant(participant_name=> 'BankB' ,dblink_to_broker => 'pdb1_link',mailbox_schema=> 'admin',broker_name=> 'TEST', dblink_to_participant=> 'pdb3_link'); Application Dependencies On the Java application side, we just need to add these two dependencies to the maven pom.xml: <dependency> <groupId>com.oracle.database.saga</groupId> <artifactId>saga-core</artifactId> <version>[23.3.0,)</version> </dependency> <dependency> <groupId>com.oracle.database.saga</groupId> <artifactId>saga-filter</artifactId> <version>[23.3.0,)</version> </dependency> Application Source Code As the Oracle Database Saga Framework implements the MicroProfile LRA (Long Running Actions) specification, much of the code, annotations, etc. that I've presented in previous blogs apply to this one. However, though future support has been discussed, the LRA specification does not support messaging/eventing — only REST (it supports Async REST, but that is of course not messaging/eventing) — and so a few additional annotations have been furnished to provide such support and take advantage of the TxEventQ transactional messaging and auto-compensation functionality already described. Full documentation can be found here, but the key two additions are @Request in the saga/LRA participants and @Response in the initiating service/participant as described in the following. Note that the Oracle Database Saga Framework also provides access to the same saga functionality via direct API calls (e.g., SagaInitiator beginSaga(), Saga sendRequest, commitSaga, rollbackSaga, etc.), and so can be used not only in JAX-RS clients but any Java client, and, of course, in PL/SQL as well. As shown in the previous blog and code repos, JAX-RS can also be used in Spring Boot. The example code snippets below show the classic TravelAgency saga scenario (with Airline, etc., as participants) while the example I've been using in the previous blog and the GitHub repos provided continues the bank transfer scenario. The same principles apply, of course; just using different use cases to illustrate. The Initiator (The Initiating Participant) @LRA for demarcation of the saga/LRA indicating whether the method should start, end, or join an LRA. 
@Response is an Oracle Saga-specific annotation indicating that the method collects responses from a Saga participant that was enrolled into the saga using the sendRequest() API; it identifies that participant by name (Airline, in this case). Java @Participant(name = "TravelAgency") /* @Participant declares the participant’s name to the saga framework */ public class TravelAgencyController extends SagaInitiator { /* TravelAgencyController extends the SagaInitiator class */ @LRA(end = false) /* @LRA annotates the method that begins a saga and invites participants */ @POST @Path("booking") @Consumes(MediaType.TEXT_PLAIN) @Produces(MediaType.APPLICATION_JSON) public jakarta.ws.rs.core.Response booking( @HeaderParam(LRA_HTTP_CONTEXT_HEADER) URI lraId, String bookingPayload) { Saga saga = this.getSaga(lraId.toString()); /* The application can access the sagaId via the HTTP header and instantiate the Saga object using it */ jakarta.ws.rs.core.Response response; try { /* The TravelAgency sends a request to the Airline, with a JSON payload, using the Saga.sendRequest() method */ saga.sendRequest("Airline", bookingPayload); response = Response.status(Response.Status.ACCEPTED).build(); } catch (SagaException e) { response = Response.status(Response.Status.INTERNAL_SERVER_ERROR).build(); } return response; } @Response(sender = "Airline.*") /* @Response annotates the method to receive responses from a specific Saga participant */ public void responseFromAirline(SagaMessageContext info) { if (info.getPayload().equals("success")) { info.getSaga().commitSaga(); /* The TravelAgency commits the saga if a successful response is received */ } else { /* Otherwise, the TravelAgency performs a Saga rollback */ info.getSaga().rollbackSaga(); } } } The Participant Services @Request is an Oracle Saga-specific annotation to indicate the method that receives incoming requests from Saga initiators. @Complete: The completion callback (called by the coordinator) for the saga/LRA @Compensate: The compensate callback (called by the coordinator) for the saga/LRA The Saga framework provides a SagaMessageContext object as an input to the annotated method which includes convenience methods to get the Saga, SagaId, Sender, Payload, and Connection (to use transactionally and as an auto-compensating data type as part of the saga as described earlier). Java @Participant(name = "Airline") /* @Participant declares the participant’s name to the saga framework */ public class Airline extends SagaParticipant { /* Airline extends the SagaParticipant class */ @Request(sender = "TravelAgency") /* @Request annotates the method that handles incoming requests from a given sender, in this example the TravelAgency */ public String handleTravelAgencyRequest(SagaMessageContext info) { /* Perform all DML with this connection to ensure everything is in a single transaction */ FlightService fs = new FlightService(info.getConnection()); fs.bookFlight(info.getPayload(), info.getSagaId()); return "success"; /* Local commit is automatically performed by the saga framework. The response is returned to the initiator, which checks it in its @Response method */ } @Compensate /* @Compensate annotates the method automatically called to roll back a saga */ public void compensate(SagaMessageContext info) { FlightService fs = new FlightService(info.getConnection()); fs.deleteBooking(info.getPayload(), info.getSagaId()); } @Complete /* @Complete annotates the method automatically called to commit a saga */ public void complete(SagaMessageContext info) { FlightService fs = new FlightService(info.getConnection()); fs.sendConfirmation(info.getSagaId()); } } APEX Workflow With Oracle Saga Framework Oracle's new APEX Workflow product has been designed to include and account for sagas.
More blogs with details are coming, but to give you an idea, the following shows the same bank transfer saga we've been discussing, but defined in a workflow and with the inclusion of a manual step in the flow for approval of the transfer (a common use case in finance and other workflows). You can read more about the workflow product in the blogs here and here. Other Topics: Observability, Optimizations, and Workshop Event-driven applications are, of course, different from blocking/sync/REST applications, and lend themselves to different patterns and advantages, particularly as far as parallelism is concerned. Therefore, settings for pool size, the number of publishers and listeners, etc., are part of the saga framework to allow such optimization. As the journaling and bookkeeping are stored in the database along with the data and messaging, completion and compensation can be conducted locally there, which in many cases removes the need for the callbacks to the application code that would otherwise be necessary. This again greatly simplifies development and also drastically cuts down on the costs and considerations of network calls. Microservices, and especially those containing sagas, require effective observability, not only in the application but also in the saga coordinator, the communication infrastructure, and the database. Oracle has an OpenTelemetry-based solution for this that is coordinated across all tiers. A "DevOps meets DataOps" video explains this Unified Observability architecture and how it can be used with entirely open-source products such as Kubernetes (including eBPF), Prometheus, Loki and Promtail, the ELK stack, Jaeger and Zipkin, Grafana, etc. Finally, note that the existing beta workshop will soon be updated to include the new GA release of the Saga Framework, which will be announced in this same blog space. Conclusion Thank you for reading, and of course, please feel free to reach out to me with any questions or feedback. I want to give credit to the Oracle TxEventQ and Transaction Processing teams for all the amazing work they've done to conquer some of, if not the most, difficult areas of data-driven microservices and simplify them for developers. I'd like to give special credit to Oracle's Dieter Gawlick, who started the work on both escrow (lock-free reservations) and compensation-aware datatypes and sagas 40 years ago, and who is also the original architect of TxEventQ (formerly AQ) with its ability to do messaging and data manipulation in the same local transaction.
Recently in one of my projects, there was a requirement to create JWT within the MuleSoft application and send that as an OAuth token to the backend for authentication. After doing some research, I got to know several ways to create JWT like Java code, DataWeave code, JWT sign module, etc. Java code can be complex to implement, Dataweave code does not work for the RSA algorithm and the client didn’t want to use a custom module like the JWT sign module. Finally, I got to know about the DataWeave JWT Library available in MuleSoft Exchange. In this blog, I will be describing the process of creating JWT using the Dataweave JWT Library available in Mulesoft Exchange which supports both HMAC and RSA algorithms. Background JSON Web Token JSON Web Token (JWT) is an open standard that provides a mechanism for securely transmitting data between parties as a JSON object. JWTs are relatively smaller in size and because of that they can be sent through a URL, POST parameter, or HTTP header, and it is transmitted quickly. JSON Web Token Structure JSON Web Tokens are made of three parts separated by dots (.), which are: 1. Header The first part is the Base64-URL encoded header which typically consists of two fields: the type of the token, which is JWT, and the signing algorithm being used (HMAC or RSA). For example: Plain Text { "alg": "RS256", "typ": "JWT" } 2. Payload The second part of the token is the payload, which are Base64-URL encoded claims. Claims are details about the user and some additional data. For example: Plain Text { "sub": "jwt-demo@test.com", "aud": "https://test.mulesoft.com", "exp": "1661508617" } Please note that for signed tokens, this information is protected against tampering, but it is readable by anyone. Do not put sensitive information in the payload or header parts of a JWT unless it is encrypted. 3. Signature To create the signature part we need to take the Base64-URL encoded header, the Base64-URL encoded payload, a private key or secret key based on the algorithm type (HMAC or RSA), and the algorithm specified in the header, and sign that. In the case of an HMAC signature, a secret key is used to sign the JWT by the client, and the same secret key is used to validate the JWT by the server. In the case of an RSA signature, the private key is used to sign the JWT by the client, and the public key is used to validate the JWT by the server. This ensures that the message is not changed along the way and that the sender of the JWT is who it says it is. Below is an example of JWT for the above-encoded header and payload, which is signed with a private key: eyJhbGciOiJSUzI1NiIsInR5cCI6IkpXVCJ9.eyJzdWIiOiJqd3QtZGVtb0B0ZXN0LmNvbSIsImF1ZCI6Imh0dHBzOi8vdGVzdC5tdWxlc29mdC5jb20iLCJleHAiOjE2NjE1MDg2MTd9.WnXOmrIv2SRF940x5qGuiRUkPJ14rBMnDRc53NLCf8LbJXEiwiSlKaulQGwRwBsBBG1C2DcANVqabC1KkeCen5D1dKaaabGo8BtV83qiP9FyKIhRgl81ldzOZ0QuybqBF78-Tq8LpjAX6W4HIlU5Im6MhgARnWKillxPbnwK8t_AVxIFxl2JW_h0gNbqT9tnOR2YDFm3gNlfLvHEu01FgI8LW9VQLvEuCsEMSCaz7-t1JsQ9nH8wGoVnmU0NgCyRBMd3F0hoCDzIP1PMJSceOHVdlK4hsmsjmDLsVUT0aInhoWqeyVcJkoULmBB34VUazV0yjXLzup26jUvfFxkwlA Walkthrough Step 1 Go to Dataweave JWT Library in MuleSoft Exchange. Step 2 Copy the dependency snippet from the Exchange and add it to your project’s pom.xml dependencies section. This will import the Dataweave JWT Library from MuleSoft Exchange to your MuleSoft application. Step 3 Add a transform message in your flow to create JWT. 
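Before continuing with the algorithm-specific steps, it may help to see what any JWT library does conceptually when it signs a token. The following is a hedged, plain-Java illustration (independent of MuleSoft and the DataWeave library) of computing the RS256 signature over the Base64-URL-encoded header and payload described above; key parsing is simplified to an unencrypted PKCS#8 key.
Java
import java.nio.charset.StandardCharsets;
import java.security.KeyFactory;
import java.security.PrivateKey;
import java.security.Signature;
import java.security.spec.PKCS8EncodedKeySpec;
import java.util.Base64;

public class Rs256JwtSketch {

    public static String sign(String headerJson, String payloadJson, byte[] pkcs8PrivateKeyBytes) throws Exception {
        Base64.Encoder urlEncoder = Base64.getUrlEncoder().withoutPadding();

        // 1. Base64-URL encode the header and the payload
        String encodedHeader = urlEncoder.encodeToString(headerJson.getBytes(StandardCharsets.UTF_8));
        String encodedPayload = urlEncoder.encodeToString(payloadJson.getBytes(StandardCharsets.UTF_8));
        String signingInput = encodedHeader + "." + encodedPayload;

        // 2. Sign "header.payload" with the RSA private key using SHA-256 (RS256)
        PrivateKey privateKey = KeyFactory.getInstance("RSA")
                .generatePrivate(new PKCS8EncodedKeySpec(pkcs8PrivateKeyBytes));
        Signature signature = Signature.getInstance("SHA256withRSA");
        signature.initSign(privateKey);
        signature.update(signingInput.getBytes(StandardCharsets.UTF_8));

        // 3. Assemble header.payload.signature
        return signingInput + "." + urlEncoder.encodeToString(signature.sign());
    }
}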
Create JWT With RSA Algorithm Step 1: Add the transform message to read the private key in the Mule application and store it in a variable. Plain Text output application/json --- readUrl("classpath://pkcs1-rsa256-privatekey.pem","text/plain") replace "\r" with "" Here, the private key is present under the src/main/resources folder and we are reading the private key into Mule flow using the DataWeave readUrl function. You can also load the private key using the file read connector or any other way like loading from Azure Key Vault, AWS Secret Manager, etc. as per use case requirement. Also, you need to check the new line character in your private key. If it is “\r\n” instead of “\n” then we need to remove the extra “\r” like above. The JWT library supports private keys with the new line character “\n”. Step 2: Add the transform message to create the JWT. Here, we need to pass the 4 parameters below to the JWT function. Sr. No Parameter Datatype Description Example 1 header Object JWT Header { "alg": "RS256", "typ": "JWT" } 2 payload Object JWT Payload { iss: "jwt-rsa-demo@test.com", aud: 'https://test.mulesoft.com', iat: (now()) as Number { unit: 'seconds' }, exp: (now() + |PT7200S|) as Number { unit: 'seconds' } } 3 key String RSA private keys. JWT library supports PKCS#1 or PKCS#8 formatted private keys. vars.privateKey 4 algorithm String The supported RSA algorithms are: RS256: Sha256withRSA RS384: Sha384withRSA RS512: Sha512withRSA Sha256withRSA DataWeave: Plain Text %dw 2.0 import * from jwt::RSA output application/json --- { token: JWT( { "alg": "RS256", "typ": "JWT" }, { iss: "jwt-rsa-demo@test.com", aud: 'https://test.mulesoft.com', iat: (now()) as Number { unit: 'seconds' }, exp: (now() + |PT7200S|) as Number { unit: 'seconds' } }, vars.privateKey as String, 'Sha256withRSA' ), expiration: (now() + |PT7150S|) } Create JWT With HMAC Algorithm Step 1: Add the transform message to create the JWT. Here, we need to pass the 4 parameters below to the JWT function. Sr. No. Parameter Datatype Description Example 1 header Object JWT Header { "alg": "HS256", "typ": "JWT" } 2 payload Object JWT Payload { iss: "jwt-hmac-demo@test.com", aud: 'https://test.mulesoft.com', iat: (now()) as Number { unit: 'seconds' }, exp: (now() + |PT7200S|) as Number { unit: 'seconds' } } 3 signingKey String Secret Key "MuleJWTPassword@2023" 4 algorithm String The supported HMAC algorithms are: HS256: HmacSHA256 HS384: HmacSHA384 HS512: HmacSHA512 HmacSHA256 DataWeave: Plain Text %dw 2.0 import * from jwt::HMAC output application/json var secretKey = "MuleJWTPassword@2023" --- { token: JWT( { "alg": "HS256", "typ": "JWT" }, { iss: "jwt-hmac-demo@test.com", aud: 'https://test.mulesoft.com', iat: (now()) as Number { unit: 'seconds' }, exp: (now() + |PT7200S|) as Number { unit: 'seconds' } }, secretKey as String, 'HmacSHA256' ), expiration: (now() + |PT7150S|) } Step 4 Trigger the request and it will generate JWT. 
JWT with RSA output: Plain Text { "token": "eyJhbGciOiAiUlMyNTYiLCJ0eXAiOiAiSldUIn0.eyJpc3MiOiAiand0LXJzYS1kZW1vQHRlc3QuY29tIiwiYXVkIjogImh0dHBzOi8vdGVzdC5tdWxlc29mdC5jb20iLCJpYXQiOiAxNjk2NDkzODQwLCJleHAiOiAxNjk2NDk3NDQwfQ.q50ao_-1_ke7ZIizZYgz_914q8JcISWk8uCC0h08FtzlUJYWU0ss7M0gtBJSnDa3e1hAsJ2MlmKhVjL7wXbkYNRVtdCk6N1RC6dEJ2xLOPKMObvcSHvt9e5sTWOPqCBW4sZOQm9xMkCqWqkHAJ5wZzvDGOlo7K0I-23b2AhqESDqVGXNXdWKvgwVGtH1okL7PKy9aQw7grJ9iB6iV_yaFgGX82gu0m1QilF83VHvAy7sWq7RYk54FmI09U45-CXYtX_tpaq3Y1vjaGjHmkKqPfJnqO4ysBiRICvxhRcRqQgONqUSu7YpV59JoUG66r2ONnS9NFJXQSBVq7-GQl0g4A", "expiration": "2023-10-05T14:46:30.505+05:30" } JWT with HMAC output: Plain Text { "token": "eyJhbGciOiAiSFMyNTYiLCJ0eXAiOiAiSldUIn0.eyJpc3MiOiAiand0LWhtYWMtZGVtb0B0ZXN0LmNvbSIsImF1ZCI6ICJodHRwczovL3Rlc3QubXVsZXNvZnQuY29tIiwiaWF0IjogMTY5NjQ5MzIzOSwiZXhwIjogMTY5NjQ5NjgzOX0.sAiK-Bto_8JS4WLJa3nFoSYCIIv3IiXYLyL0QKXB-hQ", "expiration": "2023-10-05T14:36:29.62+05:30" } Validation Validate RSA JWT using public key: Validate HMAC JWT using a secret key: Limitations DataWeave JWT Library does not support encrypted private keys to generate JWT using the RSA algorithm. If the requirement is to create JWT using the RSA algorithm with an encrypted private key, you can use either custom Java code or JWT Sign Module.
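Where an encrypted (password-protected) PKCS#8 private key is unavoidable, a small piece of custom Java can decrypt it before signing, along the lines of the hedged sketch below; it relies only on standard JDK classes, and the surrounding Mule integration (for example, exposing it as a callable class) is left out.
Java
import java.security.KeyFactory;
import java.security.PrivateKey;
import java.security.spec.PKCS8EncodedKeySpec;
import javax.crypto.Cipher;
import javax.crypto.EncryptedPrivateKeyInfo;
import javax.crypto.SecretKey;
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.PBEKeySpec;

public class EncryptedKeyLoader {

    /** Decrypts an encrypted PKCS#8 (DER) private key with the given password and returns an RSA PrivateKey. */
    public static PrivateKey load(byte[] encryptedPkcs8Der, char[] password) throws Exception {
        EncryptedPrivateKeyInfo keyInfo = new EncryptedPrivateKeyInfo(encryptedPkcs8Der);

        // Derive the password-based key using the algorithm recorded in the encrypted key itself
        SecretKeyFactory keyFactory = SecretKeyFactory.getInstance(keyInfo.getAlgName());
        SecretKey pbeKey = keyFactory.generateSecret(new PBEKeySpec(password));

        // Decrypt the wrapped key material and rebuild the plain PKCS#8 key spec
        Cipher cipher = Cipher.getInstance(keyInfo.getAlgName());
        cipher.init(Cipher.DECRYPT_MODE, pbeKey, keyInfo.getAlgParameters());
        PKCS8EncodedKeySpec keySpec = keyInfo.getKeySpec(cipher);

        return KeyFactory.getInstance("RSA").generatePrivate(keySpec);
    }
}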
Guardrails for Amazon Bedrock enables you to implement safeguards for your generative AI applications based on your use cases and responsible AI policies. You can create multiple guardrails tailored to different use cases and apply them across multiple foundation models (FMs), providing a consistent user experience and standardizing safety and privacy controls across generative AI applications. Until now, Guardrails supported four policies — denied topics, content filters, sensitive information filters, and word filters. The Contextual grounding check policy (the latest one added at the time of writing) can detect and filter hallucinations in model responses that are not grounded in enterprise data or are irrelevant to the user’s query. Contextual Grounding To Prevent Hallucinations The generative AI applications that we build depend on LLMs to provide accurate responses. This might be based on the LLM's inherent capabilities or on techniques such as RAG (Retrieval Augmented Generation). However, it's a known fact that LLMs are prone to hallucination and can end up responding with inaccurate information, which impacts application reliability. The Contextual grounding check policy evaluates hallucinations using two parameters: Grounding — This checks if the model response is factually accurate based on the source and is grounded in the source. Any new information introduced in the response will be considered un-grounded. Relevance — This checks if the model response is relevant to the user query. Score Based Evaluation The result of the contextual grounding check is a set of confidence scores corresponding to grounding and relevance for each model response processed, based on the source and user query provided. You can configure thresholds to filter (block) model responses based on the generated scores. These thresholds determine the minimum confidence score for the model response to be considered grounded and relevant. For example, if your grounding threshold and relevance threshold are each set at 0.6, all model responses with a grounding or relevance score of less than that will be detected as hallucinations and blocked. You may need to adjust the threshold scores based on the accuracy tolerance for your specific use case. For example, a customer-facing application in the finance domain may need a high threshold due to lower tolerance for inaccurate content. Keep in mind that a higher threshold for the grounding and relevance scores will result in more responses being blocked. Getting Started With Contextual Grounding To get an understanding of how contextual grounding checks work, I would recommend using the Amazon Bedrock Console since it makes it easy to test your Guardrail policies with different combinations of source data and prompts. Start by creating a Guardrails configuration. For this example, I have set the grounding check threshold to 0.85, the relevance score threshold to 0.5, and configured the messages for blocked prompts and responses: As an example, I took this snippet of text from the 2023 Amazon shareholder letter PDF and used it as the Reference source. For the Prompt, I used: What is Amazon doing in the field of quantum computing? The nice part about using the AWS console is that not only can you see the final response (pre-configured in the Guardrail), but also the actual model response (that was blocked). In this case, the model response was relevant since it came back with information about Amazon Braket.
But the response was un-grounded since it wasn’t based on the source information, which had no data about quantum computing, or Amazon Braket. Hence the grounding score was 0.01 — much lower than the configured threshold of 0.85, which resulted in the model response getting blocked. Use Contextual Grounding Check for RAG Applications With Knowledge Bases Remember, Contextual grounding check is yet another policy and it can be leveraged anywhere Guardrails can be used. One of the key use cases is combining it with RAG applications built with Knowledge Bases for Amazon Bedrock. To do this, create a Knowledge Base. I created it using the 2023 Amazon shareholder letter PDF as the source data (loaded from Amazon S3) and the default vector database (OpenSearch Serverless collection). After the Knowledge Base has been created, sync the data source, and you should be ready to go! Let's start with a question that I know can be answered accurately: What is Amazon doing in the field of generative AI? This went well, as expected — we got a relevant and grounded response. Let's try another one: What is Amazon doing in the field of quantum computing? As you can see, the model response got blocked, and the pre-configured response (in Guardrails) was returned instead. This is because the source data does not actually contain information about quantum computing (or Amazon Braket), and a hallucinated response was prevented by the Guardrails. Combine Contextual Grounding Checks With RetrieveAndGenerate API Let’s go beyond the AWS console and see how to apply the same approach in a programmatic way. Here is an example using the RetrieveAndGenerate API, which queries a knowledge base and generates responses based on the retrieved results. I have used the AWS SDK for Python (boto3), but it will work with any of the SDKs. Before trying out the example, make sure you have configured and set up Amazon Bedrock, including requesting access to the Foundation Model(s). Python import boto3 guardrailId = "ENTER_GUARDRAIL_ID" guardrailVersion= "ENTER_GUARDRAIL_VERSION" knowledgeBaseId = "ENTER_KB_ID" modelArn = 'arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-instant-v1' def main(): client = boto3.client('bedrock-agent-runtime') response = client.retrieve_and_generate( input={ 'text': 'what is amazon doing in the field of quantum computing?' }, retrieveAndGenerateConfiguration={ 'knowledgeBaseConfiguration': { 'generationConfiguration': { 'guardrailConfiguration': { 'guardrailId': guardrailId, 'guardrailVersion': guardrailVersion } }, 'knowledgeBaseId': knowledgeBaseId, 'modelArn': modelArn, 'retrievalConfiguration': { 'vectorSearchConfiguration': { 'overrideSearchType': 'SEMANTIC' } } }, 'type': 'KNOWLEDGE_BASE' }, ) action = response["guardrailAction"] print(f'Guardrail action: {action}') finalResponse = response["output"]["text"] print(f'Final response:\n{finalResponse}') if __name__ == "__main__": main() You can also refer to the code in this Github repo. Run the example (don’t forget to enter the Guardrail ID, version, Knowledge Base ID): Python pip install boto3 python grounding.py You should get an output as such: Python Guardrail action: INTERVENED Final response: Response blocked - Sorry, the model cannot answer this question. Conclusion Contextual grounding check is a simple yet powerful technique to improve response quality in applications based on RAG, summarization, or information extraction. 
It can help detect and filter hallucinations in model responses if they are not grounded in the source information (that is, factually inaccurate or introducing new information) or are irrelevant to the user’s query. Contextual grounding check is made available to you as a policy/configuration in Guardrails for Amazon Bedrock and can be plugged in anywhere you may be using Guardrails to enforce responsible AI for your applications. For more details, refer to the Amazon Bedrock documentation for Contextual grounding. Happy building!
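As a recap of the score-based evaluation described earlier, the decision Guardrails makes can be thought of as the small, purely illustrative check below; this is not an AWS SDK call, it only mirrors the documented threshold semantics.
Java
public final class GroundingCheck {

    /** Returns true if the response should be blocked under the configured thresholds. */
    public static boolean shouldBlock(double groundingScore, double relevanceScore,
                                      double groundingThreshold, double relevanceThreshold) {
        // A response is blocked if either score falls below its threshold
        return groundingScore < groundingThreshold || relevanceScore < relevanceThreshold;
    }

    public static void main(String[] args) {
        // Values from the article: grounding score 0.01 vs threshold 0.85, relevance threshold 0.5
        System.out.println(shouldBlock(0.01, 0.9, 0.85, 0.5)); // true: the response is blocked
    }
}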