Are you a software developer or other tech professional? If you're reading this, chances are pretty good that the answer is "yes." Long story short — we want DZone to work for you! We're asking that you take our annual community survey so we can better serve you. You can also enter the drawing for a chance to receive an exclusive DZone Swag Pack!

The software development world moves fast, and we want to keep up! Across our community, we found that readers come to DZone for various reasons, including to learn about new development trends and technologies, find answers to help solve problems they have, connect with peers, publish their content, and expand their personal brand's audience. In order to continue helping the DZone Community reach goals such as these, we need to know more about you, your learning preferences, and your overall experience on dzone.com and with the DZone team.

For this year's DZone Community research, our primary goals are to:
Learn about developer tech preferences and habits
Identify content types and topics that developers want to get more information on
Share this data for public consumption!

To support our Community research, we're focusing on several primary areas in the survey:
You, including your experience, the types of software you work on, and the tools you use
How you prefer to learn and what you want to learn more about on dzone.com
The ways in which you engage with DZone, your content likes vs. dislikes, and your overall journey on dzone.com

As a community-driven site, our relationships with our members and contributors are invaluable, and we want to make sure that we continue to serve our audience to the best of our ability. If you're curious to see the report from the 2023 Community Survey, feel free to check it out here! Thank you in advance for your participation!

—Your favorite DZone Content and Community team
As I promised in Part 2, it is time to build something substantial with our Semantic Kernel so far. If you are new to Semantic Kernel and must dive into code/head first, I highly recommend starting with Part 1 of this series. There is a lot of theory out there, but we explore these articles with a GitHub sample you can easily download and play with to understand the core concepts. I wanted to use Agent Smith from The Matrix, but I can't seem to find one without copyrights. So, DALL-E 3 to the rescue. Semantic Kernel’s agents aren’t just your typical AI assistants — they’re the multitasking powerhouses that bring advanced automation to your fingertips. By leveraging AI models, plugins, and personas, these agents can perform complex tasks that go beyond mere question-answering and light automation. This article will guide you through building agents with Semantic Kernel, focusing on the key components and offering practical examples to illustrate how to create an agent that plans a trip using various plugins. In this part, we will start looking into AI agents, expand on our example from Part 2, and plan an entire day trip with our newly minted Agent. What Are Agents in Semantic Kernel? Agents in Semantic Kernel are intelligent orchestrators designed to handle complex tasks by interacting with multiple plugins and AI models. They work like a highly organized manager who knows exactly which team members (plugins) to call upon and when to get the job done. Whether it’s planning a road trip, providing weather updates, or even helping you pack for a vacation, agents can combine all these functionalities into a cohesive, efficient flow. Fundamental Building Blocks of an Agent AI Models: The core decision-making unit of an agent, AI models can be Large Language Models like OpenAI’s GPT-4/Mistral AI or small language models like Microsoft's Phi-3. The models interpret user input and generate appropriate responses or actions. Plugins: We explored these in Part 2. These specialized tools allow the agent to perform actions like data retrieval, computation, or API communication. Think of plugins as the agent’s Swiss Army knife, each tool ready for a specific purpose. Simply put, plugins are just existing code callable by an agent. Plans: Plans define the flow of tasks the agent should follow. They map out each step the agent takes, determining which plugins to activate and in what sequence — this part we haven't discussed yet. We will go over plans in this article. Personas: A persona is simply the agent's role in a given context. In the general AI world, it is often called a meta prompt or system prompt. These instructions set the tone for the Agent and give it ground rules for what to do when in doubt. Memory: Memory helps agents retain information across interactions, allowing them to maintain context and remember user preferences. In other words, a simple chat history is part of memory, giving the agent a conversation context. Even if you provide a simple input like "yes" to an Agent's question, the Agent can tie your "yes" to the rest of the conversation and understand what you are answering, much like the humans. There are a few more small components that belong to Agents, such as connectors, etc.; we will omit them here to focus on what matters. It’s Time To Plan for Our Spontaneous Day Trip Let's build an agent capable of planning a day trip by car. 
Where I live, I have access to the mountains by the Poconos, Jersey Shore beaches, and the greatest city of New York, all within an hour to two-hour drive. I want to build an Agent capable of planning my entire day trip, considering the weather, what to pack, whether my car is fully charged, etc. Let's dive code/head first onto our Agent. C# using Microsoft.SemanticKernel; using Microsoft.SemanticKernel.ChatCompletion; using Microsoft.SemanticKernel.Connectors.OpenAI; using System.ComponentModel; var builder = Kernel.CreateBuilder(); builder.AddAzureOpenAIChatCompletion( deploymentName: "<YOUR_DEPLOYMENT_NAME>", endpoint: "<YOUR_ENDPOINT>", apiKey: "<YOUR_AZURE_OPENAI_API_KEY>" ); builder.Plugins.AddFromType<TripPlanner>(); // <----- This is anew fellow on this Part 3 - TripPlanner. Let's add it to the Kernel builder.Plugins.AddFromType<TimeTeller>(); // <----- This is the same fellow plugin from Part 2 builder.Plugins.AddFromType<ElectricCar>(); // <----- This is the same fellow plugin from Part 2 builder.Plugins.AddFromType<WeatherForecaster>(); // <----- New plugin. We don't want to end up in beach with rain, right? var kernel = builder.Build(); IChatCompletionService chatCompletionService = kernel.GetRequiredService<IChatCompletionService>(); ChatHistory chatMessages = new ChatHistory(""" You are a friendly assistant who likes to follow the rules. You will complete required steps and request approval before taking any consequential actions. If the user doesn't provide enough information for you to complete a task, you will keep asking questions until you have enough information to complete the task. """); while (true) { Console.Write("User > "); chatMessages.AddUserMessage(Console.ReadLine()!); OpenAIPromptExecutionSettings settings = new() { ToolCallBehavior = ToolCallBehavior.AutoInvokeKernelFunctions }; var result = chatCompletionService.GetStreamingChatMessageContentsAsync( chatMessages, executionSettings: settings, kernel: kernel); Console.Write("Assistant > "); // Stream the results string fullMessage = ""; await foreach (var content in result) { Console.Write(content.Content); fullMessage += content.Content; } Console.WriteLine("\n--------------------------------------------------------------"); // Add the message from the agent to the chat history chatMessages.AddAssistantMessage(fullMessage); } public class TripPlanner // <------------ Trip planner plugin. An expert on planning trips { [KernelFunction] [Description("Returns back the required steps necessary to plan a one day travel to a destination by an electric car.")] [return: Description("The list of steps needed to plan a one day travel by an electric car")] public async Task<string> GenerateRequiredStepsAsync( Kernel kernel, [Description("A 2-3 sentence description of where is a good place to go to today")] string destination, [Description("The time of the day to start the trip")] string timeOfDay) { // Prompt the LLM to generate a list of steps to complete the task var result = await kernel.InvokePromptAsync($""" I'm going to plan a short one day vacation to {destination}. I would like to start around {timeOfDay}. Before I do that, can you succinctly recommend the top 2 steps I should take in a numbered list? I want to make sure I don't forget to pack anything for the weather at my destination and my car is sufficiently charged before I start the journey. 
""", new() { { "destination", destination }, { "timeOfDay", timeOfDay } }); // Return the plan back to the agent return result.ToString(); } } public class TimeTeller // <------------ Time teller plugin. An expert on time, peak and off-peak periods { [KernelFunction] [Description("This function retrieves the current time.")] [return: Description("The current time.")] public string GetCurrentTime() => DateTime.Now.ToString("F"); [KernelFunction] [Description("This function checks if the current time is off-peak.")] [return: Description("True if the current time is off-peak; otherwise, false.")] public bool IsOffPeak() => DateTime.Now.Hour < 7 || DateTime.Now.Hour >= 21; } public class WeatherForecaster // <------------ Weather plugin. An expert on weather. Can tell the weather at a given destination { [KernelFunction] [Description("This function retrieves weather at given destination.")] [return: Description("Weather at given destination.")] public string GetTodaysWeather([Description("The destination to retrieve the weather for.")] string destination) { // <--------- This is where you would call a fancy weather API to get the weather for the given <<destination>>. // We are just simulating a random weather here. string[] weatherPatterns = { "Sunny", "Cloudy", "Windy", "Rainy", "Snowy" }; Random rand = new Random(); return weatherPatterns[rand.Next(weatherPatterns.Length)]; } } public class ElectricCar // <------------ Car plugin. Knows about states and conditions of the electric car. Also can charge the car. { private bool isCarCharging = false; private int batteryLevel = 0; private CancellationTokenSource source; // Mimic charging the electric car, using a periodic timer. private async Task AddJuice() { source = new CancellationTokenSource(); var timer = new PeriodicTimer(TimeSpan.FromSeconds(5)); while (await timer.WaitForNextTickAsync(source.Token)) { batteryLevel++; if (batteryLevel == 100) { isCarCharging = false; Console.WriteLine("\rBattery is full."); source.Cancel(); return; } //Console.WriteLine($"Charging {batteryLevel}%"); Console.Write("\rCharging {0}%", batteryLevel); } } [KernelFunction] [Description("This function checks if the electric car is currently charging.")] [return: Description("True if the car is charging; otherwise, false.")] public bool IsCarCharging() => isCarCharging; [KernelFunction] [Description("This function returns the current battery level of the electric car.")] [return: Description("The current battery level.")] public int GetBatteryLevel() => batteryLevel; [KernelFunction] [Description("This function starts charging the electric car.")] [return: Description("A message indicating the status of the charging process.")] public string StartCharging() { if (isCarCharging) { return "Car is already charging."; } else if (batteryLevel == 100) { return "Battery is already full."; } Task.Run(AddJuice); isCarCharging = true; return "Charging started."; } [KernelFunction] [Description("This function stops charging the electric car.")] [return: Description("A message indicating the status of the charging process.")] public string StopCharging() { if (!isCarCharging) { return "Car is not charging."; } isCarCharging = false; source?.Cancel(); return "Charging stopped."; } } We will dissect the code later. For now, let's ask our Agent to plan our day trip for us. Kinda cool, isn't it? We didn't tell the Agent we wanted to charge the electric car. 
We only told the Agent to plan a trip; it knows intuitively that: The electric car needs to be charged, and The weather needs to be checked. Cool, indeed! We have a small charging simulator using .NET's PeriodicTimer. It is irrelevant for SK, but it would give an exciting update on the console, showing that the charging and battery juice levels are ongoing. As you can see in the screenshot below, I asked the Agent to stop charging the car when the battery level was 91%, which is sufficient for the trip. Did you also notice an interesting thing? When I first asked the question, I only said to plan a trip to the beach. I didn't mention when I was planning to go or which beach. The Agent was aware of this and asked us clarifying questions to get answers to these questions. This is where the persona+memory and the planner come into the picture. Let's start dissecting the code sideways with the Planner first. Planner: The Manager of Everything Think of a planner as a manager of some sort. It can identify the course of action, or "simple steps," to achieve what the user wants. In the above example, planner identifies two steps. Check the weather and pack accordingly: This is where the WeatherForecaster plugin comes into play later. Ensure the car is ready for the trip: This is where the ElectricCar plugin comes into play later. C# public class TripPlanner // <------------ Trip planner plugin. An expert on planning trips { [KernelFunction] [Description("Returns back the required steps necessary to plan a one day travel to a destination by an electric car.")] [return: Description("The list of steps needed to plan a one day travel by an electric car")] public async Task<string> GenerateRequiredStepsAsync( Kernel kernel, [Description("A 2-3 sentence description of where is a good place to go to today")] string destination, [Description("The time of the day to start the trip")] string timeOfDay) { // Prompt the LLM to generate a list of steps to complete the task var result = await kernel.InvokePromptAsync($""" I'm going to plan a short one day vacation to {destination}. I would like to start around {timeOfDay}. Before I do that, can you succinctly recommend the top 2 steps I should take in a numbered list? I want to make sure I don't forget to pack anything for the weather at my destination and my car is sufficiently charged before I start the journey. """, new() { { "destination", destination }, { "timeOfDay", timeOfDay } }); // Return the plan back to the agent return result.ToString(); } } Look at the parameters of the GenerateRequiredStepsAsync KernelFunction. It also needs to take in destination and timeOfDay. These are necessary to plan the trip. Without knowing when and to where, there can be no trips. Now, take a closer look at the prompt. This is where we tell the planner that I want to plan for the following: A day trip To the given destination At the specified time I am using my electric car. I haven't packed for the weather at the destination. Now our Agent knows through the planner that we need to come up with steps to satisfy all of these to plan the trip. The Agent is also aware of available plugins and has the authority to invoke them to provide me with a pleasant trip. Persona: Who Am I? This is where we tell the Agent who it is. The agent's persona is important as it helps the model act within character and take instructions from the user to decide what to do in a dilemma, what steps are to be taken before an action etc. 
In short, personas define the ground rules of behavior of an Agent. C# ChatHistory chatMessages = new ChatHistory(""" You are a friendly assistant who likes to follow the rules. You will complete required steps and request approval before taking any consequential actions. If the user doesn't provide enough information for you to complete a task, you will keep asking questions until you have enough information to complete the task. """); Here, we clearly define the character and role of our agent. We told it that you: Are an assistant Will follow given rules Take steps. Ask for approval before any major actions. Get clarification if the user doesn't give enough input. Iterations and Memory A new CharHistory instance is created with meta prompt/persona instruction as the first message. This history, later added by the user's input and LLM's responses, serves as a context memory of the conversation. This helps the Agent choose the correct action based on the context derived from the conversation history. C# while (true) { Console.Write("User > "); chatMessages.AddUserMessage(Console.ReadLine()!); OpenAIPromptExecutionSettings settings = new() { ToolCallBehavior = ToolCallBehavior.AutoInvokeKernelFunctions }; var result = chatCompletionService.GetStreamingChatMessageContentsAsync( chatMessages, executionSettings: settings, kernel: kernel); Console.Write("Assistant > "); // Stream the results string fullMessage = ""; await foreach (var content in result) { Console.Write(content.Content); fullMessage += content.Content; } Console.WriteLine("\n--------------------------------------------------------------"); // Add the message from the agent to the chat history chatMessages.AddAssistantMessage(fullMessage); } As you can see, we are setting ToolCallBehavior to ToolCallBehavior.AutoInvokeKernelFunctions. This gives our Agent enough authority to invoke plugins when necessary. Each user's input and the model's response are added to the chatMessages. This will help set the context for further interactions. When I say, "That's enough charging," the agent would know that the car is being charged based on previous conversations. An agent's memory gear is nothing but chat history here. Augmented data would also serve as memory (part of the fancy RAG); we wouldn't touch on that for now. Plugins: The Robotic Arms We have already discussed plugins in detail in Part 2. We have added a WeatherForecaster plugin to the mix to help us plan the trip. In a real-world scenario, we would call a real weather API to get the actual weather. We are picking a random weather pattern for this example, which should suffice. We have also added a batteryLevel variable into our ElectricCar plugin. This helps us simulate the charging behavior using a simple timer. We wouldn't be getting into the details of each of these plugins here. Please revisit Part 2 to have a deeper understanding of how plugins work. As usual, this article includes a working GitHub sample. Clone the code and enjoy playing with it. Wrap Up We started harnessing the power of the Semantic Kernel. Once we start mixing plugins with persona, planner, and memory, the resulting Agents can automate tasks, ask leading questions, take actions on your behalf, get confirmation before executing essential tasks, and more. Agents in Semantic Kernel are not just tools; they’re dynamic assistants that combine the power of AI, plugins, and orchestrated plans to solve complex problems. 
By understanding their building blocks — AI models, plugins, plans, memory, and connectors — you can create competent agents tailored to your specific needs. The possibilities are vast, from managing travel plans to automating tedious tasks, making Semantic Kernel a powerful ally in your AI toolkit. What's Next? Now that we have connected all the pieces of the Semantic Kernel puzzle through Part 1, Part 2, and Part 3, it is time to start thinking beyond a console application. In the following parts of our series, we will add an Agent to an ASP.NET Core API and use dependency injection to create more than one kernel instance to help us navigate our trip planning. We are not going to stop there. We will integrate Semantic Kernel to a locally downloaded Small Language Model (SLM) and make it work for us. Once that works, we aren't far from a .NET MAUI app that can do the AI dance without internet connectivity or GPT-4. I am not going to spoil most of the surprises, keep going through this series to learn more and more!
There are 9 types of java.lang.OutOfMemoryError, each signaling a unique memory-related issue within Java applications. Among these, java.lang.OutOfMemoryError: Metaspace is a challenging error to diagnose. In this post, we'll delve into the root causes behind this error, explore potential solutions, and discuss effective diagnostic methods to troubleshoot this problem. Let's equip ourselves with the knowledge and tools to conquer this common adversary.

JVM Memory Regions
To better understand OutOfMemoryError, we first need to understand the different JVM memory regions. Here is a video clip that gives a good introduction to the different JVM memory regions. But in a nutshell, the JVM has the following memory regions:

Figure 1: JVM memory regions

Young Generation: Newly created application objects are stored in this region.
Old Generation: Application objects that live for a longer duration are promoted from the Young Generation to the Old Generation. Basically, this region holds long-lived objects.
Metaspace: Class definitions, method definitions, and other metadata required to execute your program are stored in the Metaspace region. This region was added in Java 8. Before that, metadata definitions were stored in PermGen; since Java 8, PermGen has been replaced by Metaspace.
Threads: Each application thread requires a thread stack. The space allocated for thread stacks, which contain method call information and local variables, is stored in this region.
Code cache: Memory areas where the compiled native code (machine code) of methods is stored for efficient execution.
Direct buffer: ByteBuffer objects are used by modern frameworks (e.g., Spring WebClient) for efficient I/O operations. They are stored in this region.
GC (Garbage Collection): Memory required for automatic garbage collection to work is stored in this region.
JNI (Java Native Interface): Memory for interacting with native libraries and code written in other languages is stored in this region.
misc: Areas specific to certain JVM implementations or configurations, such as internal JVM structures or reserved memory spaces, are classified as 'misc' regions.

What Is java.lang.OutOfMemoryError: Metaspace?

Figure 2: java.lang.OutOfMemoryError: Metaspace

When more class definitions and method definitions are created in the Metaspace region than the allocated Metaspace memory limit (i.e., -XX:MaxMetaspaceSize) allows, the JVM will throw java.lang.OutOfMemoryError: Metaspace.

What Causes java.lang.OutOfMemoryError: Metaspace?
java.lang.OutOfMemoryError: Metaspace is triggered by the JVM under the following circumstances:

Creating a large number of dynamic classes: Your application uses a scripting language such as Groovy, or Java Reflection, to create new classes at runtime.
Loading a large number of classes: Either your application itself has a lot of classes, or it uses a lot of 3rd-party libraries/frameworks that have a lot of classes in them.
Loading a large number of class loaders: Your application is loading a lot of class loaders.

Solutions for OutOfMemoryError: Metaspace
The following are the potential solutions to fix this error:

Increase Metaspace size: If the OutOfMemoryError surfaced due to an increase in the number of classes loaded, then increase the JVM's Metaspace size (-XX:MetaspaceSize and -XX:MaxMetaspaceSize). This solution is sufficient to fix most OutOfMemoryError: Metaspace errors, because memory leaks rarely happen in the Metaspace region.
Fix memory leak: Analyze memory leaks in your application using the approach given in this post. Ensure that class definitions are properly dereferenced when they are no longer needed to allow them to be garbage collected, as sketched below.
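To make the "properly dereferenced" advice concrete, the sketch below shows the common case where dynamically loaded classes live in their own URLClassLoader: once the loader is closed and every reference to it, and to the classes and objects it produced, is dropped, the class metadata becomes eligible for unloading and its Metaspace can be reclaimed. This sketch is not from the original article, and the plugin JAR path and entry class name are hypothetical.

Java
import java.lang.management.ManagementFactory;
import java.net.URL;
import java.net.URLClassLoader;

public class PluginRunner {

    public static void main(String[] args) throws Exception {
        // Hypothetical plugin location and entry class, purely for illustration.
        URL pluginJar = new URL("file:/opt/plugins/sample-plugin.jar");

        URLClassLoader pluginLoader = new URLClassLoader(new URL[] { pluginJar });
        try {
            Class<?> entryClass = Class.forName("com.example.PluginMain", true, pluginLoader);
            Object plugin = entryClass.getDeclaredConstructor().newInstance();
            // ... use the plugin while it is needed ...
            System.out.println("Loaded plugin: " + plugin);
        } finally {
            // Close the loader and drop all references to it and to the classes it defined,
            // so their class metadata becomes eligible for unloading from Metaspace.
            pluginLoader.close();
            pluginLoader = null;
        }

        // Rough visibility into class-loading activity while you verify the fix.
        System.out.println("Classes currently loaded: "
                + ManagementFactory.getClassLoadingMXBean().getLoadedClassCount());
    }
}

The same principle applies to frameworks that generate classes on the fly: where possible, cache and reuse generated classes instead of regenerating them for every request.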
Sample Program That Generates OutOfMemoryError: Metaspace
To better understand java.lang.OutOfMemoryError: Metaspace, let's try to simulate it. Let's leverage BuggyApp, a simple open-source chaos engineering project. BuggyApp can generate various sorts of performance problems such as memory leaks, thread leaks, deadlocks, multiple BLOCKED threads, etc. Below is the Java program from the BuggyApp project that simulates java.lang.OutOfMemoryError: Metaspace when executed.

Java
import java.util.UUID;
import javassist.ClassPool;

public class OOMMetaspace {

    public static void main(String[] args) throws Exception {
        ClassPool classPool = ClassPool.getDefault();
        while (true) {
            // Keep creating classes dynamically!
            String className = "com.buggyapp.MetaspaceObject" + UUID.randomUUID();
            classPool.makeClass(className).toClass();
        }
    }
}

In the above program, the OOMMetaspace class's main() method contains an infinite while (true) loop. Within the loop, the thread uses the open-source library Javassist to create dynamic classes whose names start with com.buggyapp.MetaspaceObject. Class names generated by this program will look something like this: com.buggyapp.MetaspaceObjectb7a02000-ff51-4ef8-9433-3f16b92bba78. When so many such dynamic classes are created, the Metaspace memory region will reach its limit and the JVM will throw java.lang.OutOfMemoryError: Metaspace.

How To Troubleshoot OutOfMemoryError: Metaspace
To diagnose OutOfMemoryError: Metaspace, we need to inspect the contents of the Metaspace region. Upon inspecting the contents, you can figure out the leaking area of the application code. Here is a blog post that describes a few different approaches to inspecting the contents of the Metaspace region. You can choose the approach that suits your requirements. My favorite options are:

1. -verbose:class
If you are running on Java version 8 or below, then you can use this option. When you pass the -verbose:class option to your application during startup, it will print all the classes that are loaded into memory. Loaded classes will be printed in the standard error stream (i.e., the console, if you aren't routing your error stream to a log file).
Example: java {app_name} -verbose:class When we passed the -verbose:class flag to the above program, in the console we started to see the following lines to be printed: [Loaded com.buggyapp.MetaspaceObjecta97f62c5-0f71-4702-8521-c312f3668f47 from __JVM_DefineClass__] [Loaded com.buggyapp.MetaspaceObject70967d20-609f-42c4-a2c4-b70b50592198 from __JVM_DefineClass__] [Loaded com.buggyapp.MetaspaceObjectf592a420-7109-42e6-b6cb-bc5635a6024e from __JVM_DefineClass__] [Loaded com.buggyapp.MetaspaceObjectdc7d12ad-21e6-4b17-a303-743c0008df87 from __JVM_DefineClass__] [Loaded com.buggyapp.MetaspaceObject01d175cc-01dd-4619-9d7d-297c561805d5 from __JVM_DefineClass__] [Loaded com.buggyapp.MetaspaceObject5519bef3-d872-426c-9d13-517be79a1a07 from __JVM_DefineClass__] [Loaded com.buggyapp.MetaspaceObject84ad83c5-7cee-467b-a6b8-70b9a43d8761 from __JVM_DefineClass__] [Loaded com.buggyapp.MetaspaceObject35825bf8-ff39-4a00-8287-afeba4bce19e from __JVM_DefineClass__] [Loaded com.buggyapp.MetaspaceObject665c7c09-7ef6-4b66-bc0e-c696527b5810 from __JVM_DefineClass__] [Loaded com.buggyapp.MetaspaceObject793d8aec-f2ee-4df6-9e0f-5ffb9789459d from __JVM_DefineClass__] : : This is a clear indication that classes with the com.buggyapp.MetaspaceObject prefixes are loaded so frequently into the memory. This is a great clue/hint to let you know where the leak is happening in the application. 2. -Xlog:class+load If you are running on Java version 9 or above, then you can use this option. When you pass the -Xlog:class+load option to your application during startup, it will print all the classes that are loaded into memory. Loaded classes will be printed in the file path you have configured. Example: java {app_name} -Xlog:class+load=info:/opt/log/loadedClasses.txt If you are still unable to determine the origination of the leak based on the class name, then you can do a deep dive by taking a heap dump from the application. You can capture a heap dump using one of the 8 options discussed in this post. You might choose the option that fits your needs. Once a heap dump is captured, you need to use tools like HeapHero, JHat, etc. to analyze the dumps. What Is Heap Dump? Heap Dump is basically a snapshot of your application memory. It contains detailed information about the objects and data structures present in the memory. It will tell what objects are present in the memory, whom they are referencing, who is referencing, what is the actual customer data stored in them, what size of they occupy, whether they are eligible for garbage collection, etc. They provide valuable insights into the memory usage patterns of an application, helping developers identify and resolve memory-related issues. How To Analyze Metaspace Memory Leak Through Heap Dump HeapHero is available in two modes: Cloud: You can upload the dump to the HeapHero cloud and see the results. On-Prem: You can register here, get the HeapHero installed on your local machine, and then do the analysis.Note: I prefer using the on-prem installation of the tool instead of using the cloud edition because heap dump tends to contain sensitive information (such as SSN, Credit Card Numbers, VAT, etc.), and I don’t want the dump to be analyzed in external locations. Once the heap dump is captured, from the above program, we upload it to the HeapHero tool. The tool analyzed the dump and generated the report. In the report go to the ‘Histogram’ view. This view will show all the classes that are loaded into the memory. 
In this view, you will notice the classes with the prefix com.buggyapp.MetaspaceObject. Right-click on the … that is next to the class name. Then click on the List Object(s) with > incoming references as shown in the below figure. Figure 3: Histogram view of showing all the loaded classes in memory Once you do it, the tool will display all the incoming references of this particular class. This will show the origin point of these classes as shown in the below figure. It will clearly show which part of the code is creating these class definitions. Once we know which part of the code is creating these class definitions, then it will be easy to fix the problem. Figure 4: Incoming references of the class Video Summary Here’s a video summary of the article: Conclusion In this post, we’ve covered a range of topics, from understanding JVM memory regions to diagnosing and resolving java.lang.OutOfMemoryError: Metaspace. We hope you’ve found the information useful and insightful. But our conversation doesn’t end here. Your experiences and insights are invaluable to us and to your fellow readers. We encourage you to share your encounters with java.lang.OutOfMemoryError: Metaspace in the comments below. Whether it’s a unique solution you’ve discovered, a best practice you swear by, or even just a personal anecdote, your contributions can enrich the learning experience for everyone.
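One practical addendum to the troubleshooting workflow above: heap dumps can also be captured programmatically, which is convenient in test environments where you want a dump at a well-defined point (for example, just before shutting the application down). This is a generic, HotSpot-specific sketch and not part of the original walkthrough; the output path is an assumption.

Java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

public class HeapDumper {

    public static void main(String[] args) throws Exception {
        HotSpotDiagnosticMXBean diagnosticBean = ManagementFactory.newPlatformMXBeanProxy(
                ManagementFactory.getPlatformMBeanServer(),
                "com.sun.management:type=HotSpotDiagnostic",
                HotSpotDiagnosticMXBean.class);

        // Write a dump of live objects only (second argument = true) to the given file path.
        diagnosticBean.dumpHeap("/tmp/metaspace-investigation.hprof", true);

        System.out.println("Heap dump written to /tmp/metaspace-investigation.hprof");
    }
}

Note that dumpHeap will not overwrite an existing file, so use a fresh path for each capture; the resulting .hprof file can then be loaded into HeapHero or a similar analyzer as described above.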
Regarding contemporary software architecture, distributed systems have been widely recognized for quite some time as the foundation for applications with high availability, scalability, and reliability goals. When systems shifted from a centralized structure, it became increasingly important to focus on the components and architectures that support a distributed structure. Regarding the choice of frameworks, Spring Boot is a widely adopted framework encompassing many tools, libraries, and components to support these patterns. This article will focus on the specific recommendations for implementing various distributed system patterns regarding Spring Boot, backed by sample code and professional advice. Spring Boot Overview One of the most popular Java EE frameworks for creating apps is Spring. The Spring framework offers a comprehensive programming and configuration mechanism for the Java platform. It seeks to make Java EE programming easier and increase developers' productivity in the workplace. Any type of deployment platform can use it. It tries to meet modern industry demands by making application development rapid and straightforward. While the Spring framework focuses on giving you flexibility, the goal of Spring Boot is to reduce the amount of code and give developers the most straightforward approach possible to create web applications. Spring Boot's default codes and annotation setup lessen the time it takes to design an application. It facilitates the creation of stand-alone applications with minimal, if any, configuration. It is constructed on top of a module of the Spring framework. With its layered architecture, Spring Boot has a hierarchical structure where each layer can communicate with any layer above or below it. Presentation layer: The presentation layer converts the JSON parameter to an object, processes HTTP requests (from the specific Restful API), authenticates the request, and sends it to the business layer. It is made up, in brief, of views or the frontend section. Business layer: All business logic is managed by this layer. It employs services from data access layers and is composed of service classes. It also carries out validation and permission. Persistence layer: Using various tools like JDBC and Repository, the persistence layer translates business objects from and to database rows. It also houses all of the storage logic. Database layer: CRUD (create, retrieve, update, and delete) actions are carried out at the database layer. The actual scripts that import and export data into and out of the database This is how the Spring Boot flow architecture appears: Table 1: Significant differences between Spring and Spring Boot 1. Microservices Pattern The pattern of implementing microservices is arguably one of the most used designs in the current software world. It entails breaking down a complex, monolithic application into a collection of small, interoperable services. System-dependent microservices execute their processes and interconnect with other services using simple, lightweight protocols, commonly RESTful APIs or message queues. The first advantages of microservices include that they are easier to scale, separate faults well, and can be deployed independently. Spring Boot and Spring Cloud provide an impressive list of features to help implement a microservices architecture. 
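Before moving through the individual patterns, here is a minimal, illustrative slice of the layered architecture described earlier (presentation, business, and persistence layers, with the database layer stubbed by an in-memory map). All class names and the in-memory store are assumptions for this sketch, not code from the article.

Java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.stereotype.Repository;
import org.springframework.stereotype.Service;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;

@SpringBootApplication
public class LayeredDemoApplication {
    public static void main(String[] args) {
        SpringApplication.run(LayeredDemoApplication.class, args);
    }
}

// Presentation layer: handles the HTTP request and delegates to the business layer.
@RestController
@RequestMapping("/products")
class ProductController {
    private final ProductService productService;

    ProductController(ProductService productService) {
        this.productService = productService;
    }

    @GetMapping("/{id}")
    public String getProduct(@PathVariable Long id) {
        return productService.describeProduct(id);
    }
}

// Business layer: holds the business logic and uses the persistence layer for storage.
@Service
class ProductService {
    private final ProductRepository productRepository;

    ProductService(ProductRepository productRepository) {
        this.productRepository = productRepository;
    }

    public String describeProduct(Long id) {
        return productRepository.findNameById(id)
                .map(name -> "Product " + id + ": " + name)
                .orElse("Product " + id + " not found");
    }
}

// Persistence layer: in a real application this would be a JPA/JDBC repository backed by the database layer.
@Repository
class ProductRepository {
    private final Map<Long, String> table = new ConcurrentHashMap<>(Map.of(1L, "Keyboard", 2L, "Monitor"));

    public Optional<String> findNameById(Long id) {
        return Optional.ofNullable(table.get(id));
    }
}

The distributed patterns below describe how responsibilities like these are split and coordinated across many such services.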
Services from Spring Cloud include service registry, provided by Netflix Eureka or Consul; configuration offered by Spring Cloud config; and resilience pattern offered through either Hystrix or recently developed Resilience4j. Let’s, for instance, take a case where you’re creating an e-commerce application. This application can be split into several microservices covering different domains, for example, OrderService, PaymentService, and InventoryService. All these services can be built, tested, and implemented singularly in service-oriented systems. Java @RestController @RequestMapping("/orders") public class OrderController { @Autowired private OrderService orderService; @PostMapping public ResponseEntity<Order> createOrder(@RequestBody Order order) { Order createdOrder = orderService.createOrder(order); return ResponseEntity.status(HttpStatus.CREATED).body(createdOrder); } @GetMapping("/{id}") public ResponseEntity<Order> getOrder(@PathVariable Long id) { Order order = orderService.getOrderById(id); return ResponseEntity.ok(order); } } @Service public class OrderService { // Mocking a database call private Map<Long, Order> orderRepository = new HashMap<>(); public Order createOrder(Order order) { order.setId(System.currentTimeMillis()); orderRepository.put(order.getId(), order); return order; } public Order getOrderById(Long id) { return orderRepository.get(id); } } In the example above, OrderController offers REST endpoints for making and retrieving orders, while OrderService manages the business logic associated with orders. With each service operating in a separate, isolated environment, this pattern may be replicated for the PaymentService and InventoryService. 2. Event-Driven Pattern In an event-driven architecture, the services do not interact with each other in a request-response manner but rather in a loosely coupled manner where some services only produce events and others only consume them. This pattern is most appropriate when there is a need for real-time processing while simultaneously fulfilling high scalability requirements. It thus establishes the independence of the producers and consumers of events — they are no longer tightly linked. An event-driven system can efficiently work with large and unpredictable loads of events and easily tolerate partial failures. Implementation With Spring Boot Apache Kafka, RabbitMQ, or AWS SNS/SQS can be effectively integrated with Spring Boot, greatly simplifying the creation of event-driven architecture. Spring Cloud Stream provides developers with a higher-level programming model oriented on microservices based on message-driven architecture, hiding the specifics of different messaging systems behind the same API. Let us expand more on the e-commerce application. Consider such a scenario where the order is placed, and the OrderService sends out an event. This event can be consumed by other services like InventoryService to adjust the stock automatically and by ShippingService to arrange delivery. 
Java // OrderService publishes an event @Autowired private KafkaTemplate<String, String> kafkaTemplate; public void publishOrderEvent(Order order) { kafkaTemplate.send("order_topic", "Order created: " + order.getId()); } // InventoryService listens for the order event @KafkaListener(topics = "order_topic", groupId = "inventory_group") public void consumeOrderEvent(String message) { System.out.println("Received event: " + message); // Update inventory based on the order details } In this example, OrderService publishes an event to a Kafka topic whenever a new order is created. InventoryService, which subscribes to this topic, consumes and processes the event accordingly. 3. CQRS (Command Query Responsibility Segregation) The CQRS pattern suggests the division of the handling of commands into events that change the state from the queries, which are events that retrieve the state. This can help achieve a higher level of scalability and maintainability of the solution, especially when the read and write operations within an application are significantly different in the given area of a business domain. As for the support for implementing CQRS in Spring Boot applications, let’s mention the Axon Framework, designed to fit this pattern and includes command handling, event sourcing, and query handling into the mix. In a CQRS setup, commands modify the state in the write model, while queries retrieve data from the read model, which could be optimized for different query patterns. A banking application, for example, where account balances are often asked, but the number of transactions that result in balance change is comparatively less. By separating these concerns, a developer can optimize the read model for fast access while keeping the write model more consistent and secure. Java // Command to handle money withdrawal @CommandHandler public void handle(WithdrawMoneyCommand command) { if (balance >= command.getAmount()) { balance -= command.getAmount(); AggregateLifecycle.apply(new MoneyWithdrawnEvent(command.getAccountId(), command.getAmount())); } else { throw new InsufficientFundsException(); } } // Query to fetch account balance @QueryHandler public AccountBalance handle(FindAccountBalanceQuery query) { return new AccountBalance(query.getAccountId(), this.balance); } In this code snippet, a WithdrawMoneyCommand modifies the account balance in the command model, while a FindAccountBalanceQuery retrieves the balance from the query model. 4. API Gateway Pattern The API Gateway pattern is one of the critical patterns used in a microservices architecture. It is the central access point for every client request and forwards it to the right microservice. The following are the cross-cutting concerns: Authentication, logging, rate limiting, and load balancing, which are all handled by the gateway. Spring Cloud Gateway is considered the most appropriate among all the available options for using an API Gateway in a Spring Boot application. It is developed on Project Reactor, which makes it very fast and can work with reactive streams. Let us go back to our first e-commerce example: an API gateway can forward the request to UserService, OrderService, PaymentService, etc. It can also have an authentication layer and accept subsequent user requests to be passed to the back-end services. 
Java
@Bean
public RouteLocator customRouteLocator(RouteLocatorBuilder builder) {
    return builder.routes()
        .route("order_service", r -> r.path("/orders/**")
            .uri("lb://ORDER-SERVICE"))
        .route("payment_service", r -> r.path("/payments/**")
            .uri("lb://PAYMENT-SERVICE"))
        .build();
}

In this example, the API Gateway routes requests to the appropriate microservice based on the request path. The lb:// prefix indicates that these services are registered with a load balancer (such as Eureka).

5. Saga Pattern
The Saga pattern maintains transactions across multiple services in a distributed transaction environment. With multiple microservices involved, it becomes challenging to keep data consistent in a distributed system where each service can have its own database. The Saga pattern makes it possible for all the operations across services to be successfully completed, or for the system to perform compensating transactions that reverse the effects of a failure across services.

The Saga pattern can be implemented in Spring Boot using either choreography — where services coordinate and interact directly through events (see the sketch at the end of this section) — or orchestration, where a central coordinator oversees the saga. Each strategy has advantages and disadvantages depending on the intricacy of the transactions and the degree of service coupling.

Imagine a scenario where placing an order involves multiple services, among them PaymentService, InventoryService, and ShippingService. Every service has to execute successfully for the order to be confirmed. If any service fails, compensating transactions must be performed to bring the system back to its initial state. The original orchestration snippet is truncated after the shipping call; the catch block below is an illustrative completion, and scheduleShipping, releaseItems, and refundPayment are assumed method names.

Java
public void processOrder(Order order) {
    try {
        paymentService.processPayment(order.getPaymentDetails());
        inventoryService.reserveItems(order.getItems());
        shippingService.scheduleShipping(order); // assumed name; the original snippet breaks off at this call
    } catch (Exception e) {
        // Compensating transactions (illustrative): in a real saga, only the steps that
        // actually completed before the failure would be compensated.
        inventoryService.releaseItems(order.getItems());
        paymentService.refundPayment(order.getPaymentDetails());
        throw new IllegalStateException("Order could not be completed; compensations applied", e);
    }
}

Figure 2: Amazon's Saga Pattern Functions Workflow

The Saga pattern is a failure management technique that assists in coordinating transactions across several microservices to preserve data consistency in distributed systems. Every transaction in a microservice publishes an event, and the subsequent transaction is started based on the event's result. Depending on whether the transactions are successful or unsuccessful, they can proceed in one of two ways. As demonstrated in Figure 2, the Saga pattern uses AWS Step Functions to construct an order processing system. Every step (like "ProcessPayment") has a separate step to manage the process's success (like "UpdateCustomerAccount") or failure (like "SetOrderFailure").

A company or developer ought to think about implementing the Saga pattern if:
The program must provide data consistency amongst several microservices without tightly coupling them together.
Some transactions take a long time to complete, and the blocking of other microservices due to the prolonged operation of one microservice must be avoided.
If an operation in the sequence fails, it must be possible to roll the earlier steps back.

It is important to remember that the Saga pattern becomes more complex as the number of microservices increases and that debugging is challenging. The pattern necessitates the creation of compensatory transactions for reversing and undoing modifications, which requires a sophisticated programming methodology.
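For completeness, here is what the choreography-style variant mentioned above could look like: instead of one orchestrator calling every service, each service reacts to events published by the others. The sketch below reuses the Kafka-style listener shown in the event-driven section; the topic name, event payload format, and refundPayment method are assumptions for illustration.

Java
import org.springframework.kafka.annotation.KafkaListener;
import org.springframework.stereotype.Service;

// Hypothetical collaborator; in the article's scenario this would be the existing PaymentService.
interface PaymentService {
    void refundPayment(String orderId);
}

@Service
public class PaymentCompensationListener {

    private final PaymentService paymentService;

    public PaymentCompensationListener(PaymentService paymentService) {
        this.paymentService = paymentService;
    }

    // If a later saga step fails, its owner publishes a failure event instead of calling anyone back directly.
    // PaymentService reacts by compensating the step it already completed (the payment).
    @KafkaListener(topics = "order_failed_topic", groupId = "payment_group")
    public void onOrderFailed(String message) {
        // Assumed payload format: "Order failed: <orderId>"
        String orderId = message.substring(message.lastIndexOf(' ') + 1);
        paymentService.refundPayment(orderId);
        System.out.println("Compensated payment for order " + orderId);
    }
}

The trade-off versus the orchestrated version above is that no single component knows the whole flow: services stay loosely coupled, but the overall saga becomes harder to reason about and debug, as noted above.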
6. Circuit Breaker Pattern
Circuit Breaker is yet another fundamental design pattern in distributed systems, and it assists in overcoming the domino effect, thereby enhancing the system's reliability. It operates by enclosing potentially failing operations in a circuit breaker object that watches for failures. When failures exceed the specified limit, the circuit "trips" (opens), and subsequent calls to the operation simply return an error or a fallback response without performing the task. This enables the system to fail fast and protects other services that may otherwise be overwhelmed.

In Spring, you can apply the Circuit Breaker pattern with the help of Spring Cloud Circuit Breaker with Resilience4j. Here's a concise implementation:

Java
// Add dependency in build.gradle or pom.xml
// implementation 'org.springframework.cloud:spring-cloud-starter-circuitbreaker-resilience4j'

import io.github.resilience4j.circuitbreaker.annotation.CircuitBreaker;
import org.springframework.stereotype.Service;

@Service
public class ExampleService {

    @CircuitBreaker(name = "exampleBreaker", fallbackMethod = "fallbackMethod")
    public String callExternalService() {
        // Simulating an external service call that might fail
        if (Math.random() < 0.7) { // 70% chance of failure
            throw new RuntimeException("External service failed");
        }
        return "Success from external service";
    }

    public String fallbackMethod(Exception ex) {
        return "Fallback response: " + ex.getMessage();
    }
}

// In application.properties or application.yml
resilience4j.circuitbreaker.instances.exampleBreaker.failureRateThreshold=50
resilience4j.circuitbreaker.instances.exampleBreaker.waitDurationInOpenState=5000ms
resilience4j.circuitbreaker.instances.exampleBreaker.slidingWindowSize=10

In this implementation:
The developer adds the @CircuitBreaker annotation to the callExternalService method.
A fallback method is specified that will be called when the circuit is open.
The circuit breaker properties are configured in the application configuration file.

This configuration enhances system stability by eliminating cascade failures and allowing the service to handle errors gracefully when the external service call fails.

Conclusion
By applying the microservices pattern, event-driven pattern, command query responsibility segregation, API gateway pattern, saga pattern, and circuit breaker pattern with the help of Spring Boot, developers can build distributed systems that are scalable, recoverable, easily maintainable, and able to evolve. Spring Boot's extensive ecosystem makes it possible to address the challenges associated with distributed computing, which makes this framework an excellent choice for developers who want to create cloud applications. The essential examples and explanations in this article are intended to help the reader begin using distributed system patterns while developing applications with Spring Boot. However, in order to better optimize and evolve systems and make sure they can withstand the demands of today's complex and dynamic software environments, developers can investigate more patterns and sophisticated methodologies as they gain experience.

References
Newman, S. (2015). Building Microservices: Designing Fine-Grained Systems. O'Reilly Media.
Richards, M. (2020). Software Architecture Patterns. O'Reilly Media.
AWS Documentation. (n.d.). AWS Step Functions - Saga Pattern Implementation.
Nygard, M. T. (2007). Release It!: Design and Deploy Production-Ready Software. Pragmatic Bookshelf.
Resilience4j Documentation. (n.d.). Spring Cloud Circuit Breaker with Resilience4j.
Red Hat Developer. (2020). Microservices with the Saga Pattern in Spring Boot.
MuleSoft Anypoint is an enterprise platform for API-led and event-driven architectures that allows developers to design and build APIs and, among other capabilities, share API templates and application assets. It also has a central web interface to manage, integrate, secure, and monitor API performance.

Open API Specification
MuleSoft API development supports the Open API Specification. The Open API Specification (OAS) is a standard, programming-language-agnostic interface description for HTTP APIs that allows service capabilities to be discovered without access to source code, additional documentation, or network traffic inspection. For this tutorial, we won't be covering the creation of the Open API specification, but just to note: an Open API spec file needs to be created and defined inside the folder content.

MuleSoft Versioning
MuleSoft uses a three-digit semantic versioning scheme that denotes the major, minor, and patch release: {Major}.{Minor}.{Patch}

The Major version is for when you make incompatible or breaking API changes.
The Minor version is for when you add functionality in a backward-compatible manner.
The Patch version is for when you make changes like a documentation update without altering the spec.

The API version is different from the above versioning scheme and is specified in the OAS definition. An API contains asset versions signifying major/minor updates. Version v1 can contain a number of asset versions, for example 1.0.0, 1.0.1, 1.0.2, etc., and version v2 can contain asset versions such as 2.0.0, 2.0.1, 2.0.2, etc.

Asset Lifecycle States
For the asset development lifecycle, we will be using the MuleSoft lifecycle states below.

Development State
This is an iterative process of design and development. Sample asset config file in the development state:

JSON
{
    "assetId": "hello-worldsample-v1",
    "status": "development",
    "version": "1.0.1",
    "classifier": "oas"
}

Stable State
The asset is ready for consumption; in MuleSoft this is denoted by setting the status to "published," which effectively means stable. Sample asset config file in the stable state:

JSON
{
    "assetId": "hello-worldsample-v1",
    "status": "published",
    "version": "1.0.1",
    "classifier": "oas"
}

Deprecated State
Deprecated flags the API and can be used to earmark the API for deletion. We won't be using this state in this tutorial.

Asset Id: ID of an asset (API) published to the Exchange (API catalog)
Asset version: Version of the asset (API) published to the Exchange (API catalog)
Classifier: The type of API specification; in this case, it is OAS (Open API Specification)

Automated release tags allocated according to asset state:

Asset State | Folder | Custom Release Versioning Logic | Sample Release Tag
Development | global config file | develop-mulesoftAPIVersion.timestamp | develop-1.0.1.166720192
Stable | global config file | stable-mulesoftAPIVersion.timestamp | stable-1.0.1.166720192

Custom Release Tagging Scheme
In this tutorial, we have devised a custom tagging scheme that we can utilize during our release phase to tag the codebase. This helps us to track a particular MuleSoft release from development all the way to the deployment of an API to a particular environment as it gets promoted to the production environment. You don't have to follow the same logic for tagging; you can devise your own custom scheme based on the asset and environment-related deployments in MuleSoft.
Since we will be configuring a custom tag based on a particular environment deployment, the MuleSoft asset version will be used inside the tag, indicated below as mulesoftAssetVersion, with "1.0.1" as the provided sample.

Automated release tags according to environment deployment:

Environment | Folder | Custom Release Versioning Logic | Sample Release Tag
Development | dev | dev-deploy-mulesoftAssetVersion.timestamp | dev-deploy-1.0.1.166720192
Quality Assurance | qa | qa-deploy-mulesoftAssetVersion.timestamp | qa-deploy-1.0.1.166720192
Production | prod | prod-deploy-mulesoftAssetVersion.timestamp | prod-deploy-1.0.1.166720192

For this tutorial, we have set up the above automated custom release versioning and tagging scheme. The code development is based on trunk-based development with a single stream of development. The sequence of changes, as indicated in the diagram below, starts when a developer cuts a feature branch for their development purposes. They have the flexibility to amend the changes in the global config (which sits outside the environment folders) or the OAS file for development and publishing of the particular asset to the MuleSoft Exchange. At the end of the development, they can set the asset state to published or development and create a pull request to merge onto the master branch.

Once the asset is available on Exchange, the developer can configure the environment-related asset config files or API policy files, which are environment-specific and enable the developer to consume any of the published asset versions in the environment-specific asset config for API deployment purposes only. For example, the dev environment can deploy 1.0.3 of an asset, while the QA environment can use 1.0.2 and prod can deploy 1.0.1. API policies can be utilized in a similar fashion: dev and QA can use one set of API policies while prod uses a different set.

The Azure CI/CD pipeline has been set up so that once the pull request to merge to master has been approved, it automatically senses the changes to the states in the global asset config file and the environment-specific files, triggers the deployment of an asset or MuleSoft API based on the changes applied to certain files, and then creates a tag based on the set of changes that the developer has initiated.

MuleSoft Custom Release Version Tagging Scheme
Since the Azure CI/CD pipeline is outside the scope of this tutorial, we won't be covering it here.

Create Folder Structure
To create a folder structure according to your environment requirements, we have created the folder structure below.

Parent Folder Name
You can set this to the name of your API and major version; for example, helloWorldSample-V1 (note that V1 in this instance indicates the major version of the API).

config.json (sitting outside the environment folders) - Used for the development and publishing of the particular asset to the MuleSoft Exchange
hello-oas.yaml (sitting outside the environment folders) - API OAS file for development and publishing of the particular asset to the MuleSoft runtime
README.md - Readme file for the description and tutorial notes about the repository content

dev env
config.json - Used for an API deployment on MuleSoft, with the particular associated asset name and version specified inside this file
policy.json - Used to define the custom API policy payload that will be applied to the API in the environment
tls-context.json - Used to define an environment-level custom TLS context for binding to inbound/outbound API requests.
qa env - Same folder structure as dev env; Note: the actual content of individual JSON files will vary according to environment-specific requirements. prod env - Same folder structure as dev env; Note: the actual content of individual JSON files will vary according to environment-specific requirements. Content of Files Config.json (sitting outside the environment folders): JSON { "assetId": "helloWorldSample-v1", "status": "published", #change this to "development" if you are developing the asset. "version": "1.0.1", "classifier": "oas" } hello-oas.yaml (sitting outside the environment folders): You can create the sample hello-oas.yaml file or an example file by referring to this link. dev env(folder) config.json { "assetId": "helloWorldSample-v1", "version": "1.0.1", "classifier": "oas" } policy.json (sample policy file): JSON { "policy": [ { "configurationData": { "clusterizable": true, "exposeHeaders": true, "rateLimits": [ { "timePeriodInMilliseconds": 5000, "maximumRequests": 1000 } ] }, "order": 1, "pointcutData": null, "assetId": "rate-limiting", "assetVersion": "1.0.0", "groupId": "$(groupId)" }, { "configurationData": { "credentialsOriginHasHttpBasicAuthenticationHeader": "customExpression", "clientIdExpression": "#[attributes.headers['client_id']]", "clientSecretExpression": "#[attributes.headers['client_secret']]" }, "order": 2, "disabled": false, "pointcutData": null, "groupId": "$(groupId)", "assetId": "client-id-enforcement", "assetVersion": "1.3.2" } ] } tls-context.json (sample JSON file) - Note: "masked $(variables)" to be passed via pipeline; you can change this to any other valid variable or value. JSON { "tlsContext_InputParameters": { "technology": "mule4", "endpoint": { "type": "rest", "deploymentType": "CH", "proxyUri": "https://0.0.0.0:8092/api", "isCloudHub": true, "referencesUserDomain": false, "responseTimeout": null, "muleVersion4OrAbove": true, "tlsContexts": { "inbound": { "tlsContextId": "$(inboundtlsContextId)", "name": "$(inboundTlsname)", "secretGroupId": "$(inboundsecretGroupId)" }, "outbound": { "tlsContextId": "$(outboundtlsContextId)", "name": "$(outboundTlsname)", "secretGroupId": "$(outboundsecretGroupId)" } } } } } Capture Config Changes For this, we used Python code to capture the list of changes and then generate a JSON file based on the list of changes. Import Libraries Python import subprocess; from subprocess import Popen, PIPE; import sys; import getopt; import os; Create JSON Functions Python import json; def get_json_key_value(filename, key): with open(filename, 'r') as config_file: config_data = json.load(config_file) return (config_data[''+key+'']) def create_json_file(json_data, json_file): jsonString = json.dumps(json_data) jsonFile = open(json_file, "w") jsonFile.write(jsonString) jsonFile.close() Create cmd Functions Python from subprocess import *; from os import chdir def run_cmd(git_command, use_shell=True): """Run's the given git command, throws exception on failure""" return check_output(git_command, shell=use_shell) def create_folder(path): '''Check if directory exists, if not, create it''' import os # You should change 'test' to your preferred folder. CHECK_FOLDER = os.path.isdir(path) # If folder doesn't exist, then create it. if not CHECK_FOLDER: os.makedirs(path) print("created folder : ", path) else: print(path, "folder already exists.") Create an Array Create an array for the modified files and a tags dictionary inside an array. 
Python
modified_files = []
tag_dict = []
counter = 0
# Output folder path
outputFolder = ".output"

Create Timestamp Function

Python
import calendar
import time

def get_timestamp():
    return calendar.timegm(time.gmtime())

Get and append the array of modified files, using a Git command to gather the list of file changes.

Python
modified_file_list = (subprocess.Popen(['git show --pretty="format:" --name-only'],
                                       shell=True, stdout=subprocess.PIPE)
                      .communicate()[0].decode('utf-8').strip())
modified_files.append(modified_file_list.split())

Python
# Create a dictionary for the modified files and assign tags
for modified_file in modified_files:
    for file in modified_file:
        counter += 1
        ## For asset status set to development
        if (((file == "config.json") or (file.endswith('.yaml'))) and ((get_json_key_value("config.json", "status")) == 'development')):
            tag_dict.append({'order': counter, 'fileChanged': file,
                             'tag': 'develop-' + (get_json_key_value("config.json", "version")) + '.' + str(get_timestamp())})
        ## For asset status set to published
        if (((file == "config.json") or (file.endswith('.yaml'))) and ((get_json_key_value("config.json", "status")) == 'published')):
            tag_dict.append({'order': counter, 'fileChanged': file,
                             'tag': 'stable-' + (get_json_key_value("config.json", "version")) + '.' + str(get_timestamp())})
        ## For Development environment deployment
        if (file.startswith('dev/')):
            tag_dict.append({'order': counter, 'fileChanged': file,
                             'tag': 'dev-deploy-' + (get_json_key_value("config.json", "version")) + '.' + str(get_timestamp())})
        ## For QA environment deployment
        if (file.startswith('qa/')):
            tag_dict.append({'order': counter, 'fileChanged': file,
                             'tag': 'qa-deploy-' + (get_json_key_value("config.json", "version")) + '.' + str(get_timestamp())})
        ## For Prod environment deployment
        if (file.startswith('prod/')):
            tag_dict.append({'order': counter, 'fileChanged': file,
                             'tag': 'prod-deploy-' + (get_json_key_value("config.json", "version")) + '.' + str(get_timestamp())})

Create Output Folder

Python
# Create the .output folder if it does not exist
create_folder(outputFolder)

Create JSON Tags File

Python
# Create the tags JSON file
create_json_file(tag_dict, outputFolder + "/git_tags.json")

If everything is configured properly, the script will generate a JSON tags file like the one below.

JSON
[
  { "order": 1, "fileChanged": "config.json", "tag": "stable-1.0.0.1663027933" },
  { "order": 2, "fileChanged": "dev/config.json", "tag": "dev-deploy-1.0.0.1663027933" },
  { "order": 3, "fileChanged": "qa/config.json", "tag": "qa-deploy-1.0.0.1663027933" },
  { "order": 4, "fileChanged": "prod/config.json", "tag": "prod-deploy-1.0.0.1663027933" }
]

We then capture the appropriate tags using a bash script and assign them to variables using jq. The variables can then be used in the pipeline to automatically trigger further deployment tasks.
Shell
# Use jq to read the JSON tags and assign them to variables
API_STATUS=`${JQ_TOOL_PATH}jq -r '.[]| select(.tag | startswith("develop-") or startswith("stable-")) |.tag' ${TAG_FILE}`
DEV_DEPLOY_STATUS=`${JQ_TOOL_PATH}jq -r '.[]| select(.tag | startswith("dev-deploy-"))|.tag' ${TAG_FILE}`
QA_DEPLOY_STATUS=`${JQ_TOOL_PATH}jq -r '.[]| select(.tag | startswith("qa-deploy-"))|.tag' ${TAG_FILE}`
PROD_DEPLOY_STATUS=`${JQ_TOOL_PATH}jq -r '.[]| select(.tag | startswith("prod-deploy-"))|.tag' ${TAG_FILE}`

# Assign Azure variables
echo "##vso[task.setvariable variable=TAG_API_STATUS;]${API_STATUS}"
echo "##vso[task.setvariable variable=TAG_dev_DEPLOY_STATUS;]${DEV_DEPLOY_STATUS}"
echo "##vso[task.setvariable variable=TAG_qa_DEPLOY_STATUS;]${QA_DEPLOY_STATUS}"
echo "##vso[task.setvariable variable=TAG_prod_DEPLOY_STATUS;]${PROD_DEPLOY_STATUS}"

We used an Azure DevOps pipeline task to create a tag at the end of the deployment tasks, assigning the tag from the appropriate variable at the end of the deployment.

Conclusion

Once the pipeline has run successfully, you can test this by changing any of the environment-level files and merging them to master via a pull request. The rest of the steps are automated. You can then browse to the code repository's tags section, where you should see tags like the examples above.
Domain-Driven Design (DDD) is an important strategic approach to software development. It involves deeply understanding and modeling a business domain, and it is particularly beneficial in complex domains with intricate business rules, processes, and interactions. However, effectively implementing DDD requires discipline, a strong grasp of the domain, and the avoidance of common pitfalls that can lead to suboptimal designs and technical debt. In this article, we'll explore 10 things to avoid in DDD, with examples to illustrate these pitfalls.

1. Focusing Too Much on Technical Patterns

Sample Scenario
A team begins a project by excessively creating repositories, aggregates, and value objects without fully grasping the business domain. For example, they develop a complicated repository for managing Customer entities without understanding how customers are represented and utilized within the business. Consequently, the repository contains numerous unnecessary methods that do not align with the domain's actual use cases and requirements.

Avoidance Tip
DDD's main focus should be comprehending the domain. This involves a collaborative effort, with the team working closely with domain experts to establish a shared understanding and a common language that accurately represents the fundamental business concepts. Technical patterns like repositories and aggregates should emerge naturally from the domain model, rather than being prematurely implemented or excessively emphasized at the beginning.

2. Over-Engineering the Model

Sample Scenario
A team strictly adheres to DDD principles by creating a detailed domain model that includes separate classes for every conceivable domain concept. For instance, they create individual classes for CustomerName, CustomerEmail, and CustomerAddress when these could have been combined into a more straightforward Customer value object. The resulting model becomes overly complex and difficult to maintain, with little added value.

Avoidance Tip
Keep the domain model simple and accurate. Focus on modeling the components of the domain that offer strategic importance, and streamline or exclude less critical elements. Remember, DDD is primarily concerned with strategic design, not with burdening the domain model with unnecessary intricacies.

3. Ignoring the Ubiquitous Language

Sample Scenario
In a collaborative environment, it's common for developers and domain experts to employ distinct technical vocabulary. For instance, while domain experts commonly use the term "Purchase Orders," developers might utilize OrderEntity within their codebase. This terminology disparity can lead to various challenges, including misunderstandings, mishandled implementations, and a growing mismatch between the code and the actual business requirements. Such discrepancies can impede effective communication and hinder the accurate translation of business logic into technical implementations. Therefore, ensuring a shared understanding of terminology between developers and domain experts is crucial for fostering effective collaboration and aligning technical solutions with business needs.

Avoidance Tip
Establishing and upholding a ubiquitous language is important to ensure clear communication and understanding across all aspects of the project.
This process involves close collaboration with domain experts to develop and maintain a consistent vocabulary. The language should be applied uniformly in code, documentation, conversations, and all forms of communication. By doing so, we can prevent misunderstandings and guarantee that the model accurately reflects the business requirements. This approach fosters alignment between all stakeholders and promotes cohesion throughout the project lifecycle.

4. Misunderstanding Bounded Contexts

Sample Scenario
When a team tries to utilize a single "Customer" entity across various subdomains like "Billing," "Customer Service," and "Marketing," it leads to ambiguity and unwarranted interconnections between different sections of the application. For instance, modifications made to the "Customer" entity within the "Billing" context can unintentionally impact the "Marketing" context, resulting in unforeseen behavior and data discrepancies.

Avoidance Tip
When defining bounded contexts, clearly delineating different areas of the domain with distinct responsibilities is crucial. Each bounded context should possess its own model and clearly defined boundaries. To maintain the integrity of each context's model, it's essential to facilitate integration between contexts through well-defined interfaces or anti-corruption layers. These measures ensure that the responsibilities and integrity of each bounded context are preserved.

5. Not Aligning With Business Strategy

Sample Scenario
The team's approach to DDD is purely technical, focusing on modeling all aspects of the domain without considering the business strategy. Their extensive efforts have been invested in modeling peripheral elements of the domain, such as "Employee Attendance" and "Office Supplies Management," while overlooking the core business process that provides the most value: "Order Fulfillment." As a result, the model is complex yet fails to align with the business's strategic goals.

Avoidance Tip
It's crucial to leverage DDD to deeply analyze and concentrate on the domain's most vital and influential parts. Identify the aspects that deliver the highest value to the business and ensure that your modeling efforts are closely aligned with the business's overarching priorities and strategic objectives. Actively collaborate with key business stakeholders to gain a comprehensive understanding of what holds the greatest value to them, and prioritize these areas in your modeling efforts. This approach will reflect the business's critical needs and contribute to the successful realization of strategic goals.

6. Overusing Entities Instead of Value Objects

Sample Scenario
Many software developers conceptualize "Currency" as an entity with a unique identifier, essentially treating it as a standalone object. However, it is usually better to view "Currency" as a value object defined by its attributes, such as "USD" or "EUR." By persisting with the former approach, the team inadvertently introduces unnecessary complexity, such as the need to manage the lifecycle, state, and identity of "Currency" entities, which needlessly bloats the codebase.

Avoidance Tip
When designing your software, it is beneficial to utilize value objects as much as possible, especially for types and objects that remain unchanged and do not require a unique identity. Value objects are advantageous as they are more straightforward to manage and more predictable, often leading to more maintainable code. They can effectively represent domain-specific values such as dates, monetary values, measurements, and other essential concepts, as the sketch below illustrates.
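As a rough illustration (the class and method names here are hypothetical, not taken from a specific codebase), a monetary amount can be modeled as an immutable value object whose equality is based purely on its attributes:

Java
import java.math.BigDecimal;

// Hypothetical sketch: money as an immutable value object.
// Equality is attribute-based — there is no identifier and no lifecycle to manage.
public record Money(BigDecimal amount, String currencyCode) {

    public Money {
        if (amount == null || currencyCode == null) {
            throw new IllegalArgumentException("amount and currencyCode are required");
        }
    }

    // Operations return new instances instead of mutating state.
    public Money add(Money other) {
        if (!currencyCode.equals(other.currencyCode)) {
            throw new IllegalArgumentException("Cannot add amounts in different currencies");
        }
        return new Money(amount.add(other.amount), currencyCode);
    }
}

Two Money objects with the same amount and currency are interchangeable, which is exactly the property that makes value objects simpler to reason about than identity-bearing entities.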
7. Neglecting Aggregates and Their Boundaries

Sample Scenario
Aggregates represent clusters or groups of related objects treated as a single unit for data changes in the context of domain-driven design. When a team models a "Product" as a standalone entity, they may overlook its aggregate boundaries, allowing multiple services to modify it independently. This can lead to conflicting updates by different services to the same "Product," resulting in inconsistent data and business rule violations. Defining and respecting aggregate boundaries is crucial for maintaining data integrity and ensuring consistency across different system parts.

Avoidance Tip
An aggregate is a cluster of related objects treated as a unit for data changes, with one designated object, the aggregate root, serving as the entry point for all modifications within the aggregate. Encapsulating related objects within an aggregate and making modifications through the aggregate root makes it easier to enforce business rules and maintain data consistency and integrity. This approach helps ensure that all operations and changes occur within the defined boundaries of the aggregate, allowing for better control and management of complex data structures.

8. Failing To Use Domain Events Effectively

Sample Scenario
One team ignores domain events and directly invokes services. This causes their system to become tightly coupled, increasing dependencies between different system parts. As a result, the system becomes more challenging to modify or extend because any change in one service directly impacts other services, leading to a domino effect of changes and making the system more fragile. In a different scenario, another team overuses domain events by emitting an event for every minor change, such as "CustomerCreated," "CustomerUpdated," and "CustomerDeleted," even when other parts of the system don't need these events. This results in excessive events being generated and processed, causing performance degradation, increased complexity, and unnecessary resource consumption. The system becomes cluttered with events that serve no real purpose, leading to event fatigue and making it harder to identify and respond to critical events.

Avoidance Tip
When developing your system, use domain events to capture significant changes within the domain. These events should be designed to serve a clear purpose and communicate meaningful state changes within the system. By employing domain events, you can effectively decouple various parts of the system, facilitating scalability and improved maintainability. However, exercise caution and avoid overusing domain events, as doing so may lead to unnecessary complexity and potential performance issues. Strike a balance and only employ genuinely beneficial domain events rather than incorporating them indiscriminately throughout the system. This approach will ultimately contribute to a more streamlined and manageable system architecture. A minimal sketch of a domain event and an in-process publisher follows.
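As a rough, simplified sketch (the names and the in-process publisher are hypothetical, chosen only for illustration), a domain event is an immutable fact that interested parts of the system subscribe to, instead of being called directly:

Java
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical sketch: an immutable domain event describing a meaningful state change.
record OrderPlaced(long orderId, long customerId, Instant occurredAt) {}

// A deliberately simple in-process publisher; real systems often use a message broker instead.
class DomainEventPublisher {
    private final List<Consumer<OrderPlaced>> subscribers = new ArrayList<>();

    void subscribe(Consumer<OrderPlaced> subscriber) {
        subscribers.add(subscriber);
    }

    void publish(OrderPlaced event) {
        // The publisher does not know who reacts to the event — that is the decoupling.
        subscribers.forEach(s -> s.accept(event));
    }
}

Billing or notification logic can subscribe to OrderPlaced without the ordering code ever referencing those services, which is the loose coupling the avoidance tip describes.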
9. Ignoring the Importance of Collaboration With Domain Experts

Sample Scenario
The development team embarked on creating a "Loan Approval" process without soliciting insights from loan officers or other experts in the field. Consequently, the model omitted critical business rules, including specific risk assessment criteria, verification steps, and regulatory requirements. This significant oversight resulted in a software solution that failed to align with business needs and was rejected by stakeholders.

Avoidance Tip
It's crucial to establish close collaboration with domain experts at every stage of the design and development process. Regularly engage with them to confirm that your understanding of the domain is accurate. Involve domain experts in discussions, design sessions, and model reviews to gather their valuable insights. Techniques such as Event Storming and Domain Storytelling can facilitate collaborative modeling, ensuring that the model faithfully represents the domain.

10. Treating DDD as a Silver Bullet

Sample Scenario
The team in question mistakenly believes that DDD is universally applicable to all software projects, regardless of the domain's complexity. This leads them to apply DDD principles to a simple CRUD application, such as a "To-Do List" or "Contact Management" system. The result is a needlessly complex codebase that is difficult to maintain and incurs high development costs, far exceeding the project's needs.

Avoidance Tip
DDD is most suitable for domains known for their complexity, particularly those characterized by intricate business rules and processes. In such complex domains, a close alignment between business and technical teams is crucial for success. In contrast, DDD may not be the best fit for simple applications or domains where a more straightforward approach would suffice. Carefully assess the domain's complexity and the project's specific requirements to determine the most appropriate approach.

Conclusion
Domain-driven design is a powerful methodology for building software that is aligned with complex business domains. However, like any powerful tool, it must be used wisely. By avoiding these ten common pitfalls, you can leverage DDD to its fullest potential, creating software that accurately reflects and supports your business goals. Remember that DDD is not just about patterns and practices; it is about fostering collaboration, creating a shared understanding of the domain, and building solutions that provide real strategic value. By focusing on understanding the domain, avoiding over-engineering, aligning with business strategy, and maintaining clear, consistent language, teams can create models that are not only technically sound but also deeply aligned with the business's needs.
Java Floating Numbers Look Familiar

In Java, we have two types of floating-point numbers: float and double. All Java developers know them, but many can't answer the simple question described in the following meme: Are you robot enough?

What Do You Already Know About Float and Double?

float and double represent floating-point numbers. float uses 32 bits, while double uses 64 bits, and those bits have to encode both the integer and the fractional part of the value. But what does that actually mean? To understand, let's look at what happens with seemingly simple expressions: results such as 0.1 + 0.2 printing 0.30000000000000004 look contradictory and seem to contain mistakes. But how is this possible? Where do the extra digits at the end of the numbers come from? To understand, let's review how these numbers are actually created.

Don't Trust Your Eyes: Reviewing How 0.1 Is Converted to the IEEE Standard

float and double follow the IEEE standard, which defines how to use 32/64 bits to represent a floating-point number. But how is a number like 0.1 converted to a bit array? Without diving too much into the details, the logic is similar to the following:

Converting the Floating-Point Value 0.1 to an Array of Bits

In the first stage, we need to convert the decimal representation of 0.1 to binary using the following steps:

Multiply 0.1 by 2 and write down the integer part.
Take the fractional part, multiply it by 2, and note the integer part again.
Repeat with the new fractional part.

So for 0.1, we get the following results:

Step | Operation | Integer Part | Fraction
1 | 0.1 * 2 | 0 | 0.2
2 | 0.2 * 2 | 0 | 0.4
3 | 0.4 * 2 | 0 | 0.8
4 | 0.8 * 2 | 1 | 0.6
5 | 0.6 * 2 | 1 | 0.2
6 | 0.2 * 2 | 0 | 0.4

After repeating these steps, we get a binary sequence like 0.0001100110011... (in fact, it's a repeating infinite sequence).

Converting the Binary Array to the IEEE Standard

Inside float/double, we don't keep the binary array as it is. float/double follow the IEEE 754 standard. This standard splits the number into three parts:

Sign (0 for positive and 1 for negative)
Exponent (defines the position of the floating point, with an offset of 127 for float or 1023 for double)
Mantissa (the part that comes after the floating point, limited by the number of remaining bits)

So, converting 0.0001100110011... to IEEE format for a float, we get:

Sign: 0 for positive
Exponent: the first 1 bit appears four places after the point, so the unbiased exponent is -4; adding the offset of 127 gives 123 (or 01111011 in binary)
Mantissa: the bits starting from that first 1 are 1100110011...; the leading 1 is implicit and not stored, so the stored mantissa starts with 100110011...

The final 32-bit representation is therefore the sign bit, followed by the 8 exponent bits, followed by the 23 stored mantissa bits.

So What? How Do These Numbers Explain the Weird Results?

After all these conversions, we lose precision due to two factors: we lose precision when cutting off the infinite repeating binary representation (the repeating 1100110011... pattern), and we lose precision when converting to IEEE format because we keep only the first 32 or 64 bits. This means that the value we have in float or double doesn't represent exactly 0.1. If we convert the IEEE bit array of the float back to a "real" decimal number, we get a different value. More precisely, instead of 0.1, we get 0.100000001490116119384765625. How can we verify this? There are a couple of ways.
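One way is to print the stored values with more precision than the default formatting shows. A minimal sketch (assuming we only want to inspect the literals 0.1f and 0.1d and the sum 0.1 + 0.2) uses the BigDecimal(double) constructor, which exposes the exact binary value:

Java
import java.math.BigDecimal;

public class FloatInspection {
    public static void main(String[] args) {
        // Default formatting hides the problem ...
        System.out.println(0.1f);                  // 0.1
        System.out.println(0.1d);                  // 0.1

        // ... but the exact stored values are not 0.1.
        // (0.1f is widened to double without loss before the conversion.)
        System.out.println(new BigDecimal(0.1f));  // 0.100000001490116119384765625
        System.out.println(new BigDecimal(0.1d));  // 0.1000000000000000055511151231257827021181583404541015625

        // The rounding error becomes visible once we start calculating.
        System.out.println(0.1 + 0.2);             // 0.30000000000000004
    }
}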
Running a check like this confirms the expected result: the exact stored values differ from 0.1. If we want to go deeper, we can also write reverse-engineering code that rebuilds the decimal value from the raw IEEE 754 bits (for example, starting from Float.floatToIntBits), and it confirms the same idea.

Answering the Question and Drawing Conclusions

Now we know that the value we see at initialization is different from what is actually stored in float/double: we expect 0.1, but we actually initialize the variable with 0.100000001490116119384765625. Knowing this, it's clear that when we perform actions such as adding or multiplying values, this difference becomes more pronounced, until it becomes visible during printing.

Conclusions
Here are the conclusions we can draw:

Don't use floating-point values for precise calculations, such as in finance, medicine, or complex science.
Don't compare two double values for equality directly; instead, check whether the difference between them is within a small delta. For example: boolean isEqual = Math.abs(a - b) < 0.0000001;
Use BigDecimal or similar classes for precise calculations.

I hope you now understand why 0.1 + 0.2 returns 0.30000000000000004. Thanks for reading!
Efficient data synchronization is crucial in high-performance computing and multi-threaded applications. This article explores an optimization technique for scenarios where frequent writes to a container occur in a multi-threaded environment. We'll examine the challenges of traditional synchronization methods and present an advanced approach that significantly improves performance for write-heavy environments. The method in question is beneficial because it is easy to implement and versatile, unlike pre-optimized containers that may be platform-specific, require special data types, or bring additional library dependencies.

Traditional Approaches and Their Limitations

Imagine a scenario where we have a cache of user transactions:

C++
struct TransactionData
{
    long transactionId;
    long userId;
    unsigned long date;
    double amount;
    int type;
    std::string description;
};

std::map<long, std::vector<TransactionData>> transactionCache; // key - userId

In a multi-threaded environment, we need to synchronize access to this cache. The traditional approach might involve using a mutex:

C++
class SimpleSynchronizedCache
{
public:
    void write(const TransactionData& transaction)
    {
        std::lock_guard<std::mutex> lock(cacheMutex);
        transactionCache[transaction.userId].push_back(transaction);
    }

    std::vector<TransactionData> read(const long& userId)
    {
        std::lock_guard<std::mutex> lock(cacheMutex);
        try
        {
            return transactionCache.at(userId);
        }
        catch (const std::out_of_range& ex)
        {
            return std::vector<TransactionData>();
        }
    }

    std::vector<TransactionData> pop(const long& userId)
    {
        std::lock_guard<std::mutex> lock(cacheMutex);
        auto userNode = transactionCache.extract(userId);
        return userNode.empty() ? std::vector<TransactionData>() : std::move(userNode.mapped());
    }

private:
    std::map<long, std::vector<TransactionData>> transactionCache;
    std::mutex cacheMutex;
};

As system load increases, especially with frequent reads, we might consider using a shared_mutex:

C++
class CacheWithSharedMutex
{
public:
    void write(const TransactionData& transaction)
    {
        std::lock_guard<std::shared_mutex> lock(cacheMutex);
        transactionCache[transaction.userId].push_back(transaction);
    }

    std::vector<TransactionData> read(const long& userId)
    {
        std::shared_lock<std::shared_mutex> lock(cacheMutex);
        try
        {
            return transactionCache.at(userId);
        }
        catch (const std::out_of_range& ex)
        {
            return std::vector<TransactionData>();
        }
    }

    std::vector<TransactionData> pop(const long& userId)
    {
        std::lock_guard<std::shared_mutex> lock(cacheMutex);
        auto userNode = transactionCache.extract(userId);
        return userNode.empty() ? std::vector<TransactionData>() : std::move(userNode.mapped());
    }

private:
    std::map<long, std::vector<TransactionData>> transactionCache;
    std::shared_mutex cacheMutex;
};

However, when the load is primarily generated by writes rather than reads, the advantage of a shared_mutex over a regular mutex becomes minimal. The lock will often be acquired exclusively, negating the benefits of shared access. Moreover, let's imagine that we don't use read() at all — instead, we frequently write incoming transactions and periodically flush the accumulated transaction vectors using pop(). As pop() involves reading with extraction, both write() and pop() operations would modify the cache, necessitating exclusive access rather than shared access.
Thus, the shared_lock becomes entirely useless in terms of optimization over a regular mutex, and may even perform worse — its more intricate implementation is now used for the same exclusive locks that a faster regular mutex provides. Clearly, we need something else.

Optimizing Synchronization With the Sharding Approach

Given the following conditions:

A multi-threaded environment with a shared container
Frequent modification of the container from different threads
Objects in the container can be divided for parallel processing by some member variable.

Regarding point 3, in our cache, transactions from different users can be processed independently. While creating a mutex for each user might seem ideal, it would lead to excessive overhead in maintaining so many locks. Instead, we can divide our cache into a fixed number of chunks based on the user ID, in a process known as sharding. This approach reduces the overhead and still allows parallel processing, thereby optimizing performance in a multi-threaded environment.

C++
class ShardedCache
{
public:
    ShardedCache(size_t shardSize):
        _shardSize(shardSize),
        _transactionCaches(shardSize)
    {
        std::generate(
            _transactionCaches.begin(),
            _transactionCaches.end(),
            []() { return std::make_unique<SimpleSynchronizedCache>(); });
    }

    void write(const TransactionData& transaction)
    {
        _transactionCaches[transaction.userId % _shardSize]->write(transaction);
    }

    std::vector<TransactionData> read(const long& userId)
    {
        return _transactionCaches[userId % _shardSize]->read(userId);
    }

    std::vector<TransactionData> pop(const long& userId)
    {
        return std::move(_transactionCaches[userId % _shardSize]->pop(userId));
    }

private:
    const size_t _shardSize;
    std::vector<std::unique_ptr<SimpleSynchronizedCache>> _transactionCaches;
};

This approach allows for finer-grained locking without the overhead of maintaining an excessive number of mutexes. The division can be adjusted based on system architecture specifics, such as the size of the thread pool that works with the cache or the hardware concurrency. Let's run tests where we check how sharding accelerates cache performance by testing different partition sizes.

Performance Comparison

In these tests, we aim to do more than just measure the maximum number of operations the processor can handle. We want to observe how the cache behaves under conditions that closely resemble real-world scenarios, where transactions occur randomly. Our optimization goal is to minimize the processing time for these transactions, which enhances system responsiveness in practical applications. The implementation and tests are available in the GitHub repository.

C++
#include <thread>
#include <functional>
#include <condition_variable>
#include <random>
#include <chrono>
#include <iostream>
#include <fstream>
#include <array>

#include "SynchronizedContainers.h"

const auto hardware_concurrency = (size_t)std::thread::hardware_concurrency();

class TaskPool
{
public:
    template <typename Callable>
    TaskPool(size_t poolSize, Callable task)
    {
        for (auto i = 0; i < poolSize; ++i)
        {
            _workers.emplace_back(task);
        }
    }

    ~TaskPool()
    {
        for (auto& worker : _workers)
        {
            if (worker.joinable())
                worker.join();
        }
    }

private:
    std::vector<std::thread> _workers;
};

template <typename CacheImpl>
class Test
{
public:
    template <typename ... CacheArgs>
    Test(const int testrunsNum, const size_t writeWorkersNum,
         const size_t popWorkersNum, const std::string& resultsFile,
         CacheArgs&& ...
         cacheArgs) :
        _cache(std::forward<CacheArgs>(cacheArgs)...),
        _writeWorkersNum(writeWorkersNum),
        _popWorkersNum(popWorkersNum),
        _resultsFile(resultsFile),
        _testrunsNum(testrunsNum),
        _testStarted(false)
    {
        std::random_device rd;
        _randomGenerator = std::mt19937(rd());
    }

    void run()
    {
        for (auto i = 0; i < _testrunsNum; ++i)
        {
            runSingleTest();
            logResults();
        }
    }

private:
    void runSingleTest()
    {
        {
            std::lock_guard<std::mutex> lock(_testStartSync);
            _testStarted = false;
        }

        // these pools won't just fire as many operations as they can,
        // but will emulate requests occurring in real time to the cache in a multithreaded environment
        auto writeTestPool = TaskPool(_writeWorkersNum, std::bind(&Test::writeTransactions, this));
        auto popTestPool = TaskPool(_popWorkersNum, std::bind(&Test::popTransactions, this));

        _writeTime = 0;
        _writeOpNum = 0;
        _popTime = 0;
        _popOpNum = 0;

        {
            std::lock_guard<std::mutex> lock(_testStartSync);
            _testStarted = true;
            _testStartCv.notify_all();
        }
    }

    void logResults()
    {
        std::cout << "===============================================" << std::endl;
        std::cout << "Writing operations number per sec:\t" << _writeOpNum / 60. << std::endl;
        std::cout << "Writing operations avg time (mcsec):\t" << (double)_writeTime / _writeOpNum << std::endl;
        std::cout << "Pop operations number per sec: \t" << _popOpNum / 60. << std::endl;
        std::cout << "Pop operations avg time (mcsec): \t" << (double)_popTime / _popOpNum << std::endl;

        std::ofstream resultsFilestream;
        resultsFilestream.open(_resultsFile, std::ios_base::app);
        resultsFilestream << _writeOpNum / 60. << "," << (double)_writeTime / _writeOpNum << ","
            << _popOpNum / 60. << "," << (double)_popTime / _popOpNum << std::endl;

        std::cout << "Results saved to file " << _resultsFile << std::endl;
    }

    void writeTransactions()
    {
        {
            std::unique_lock<std::mutex> lock(_testStartSync);
            _testStartCv.wait(lock, [this] { return _testStarted; });
        }

        std::chrono::steady_clock::time_point start = std::chrono::steady_clock::now();

        // hypothetical system has around 100k currently active users
        std::uniform_int_distribution<> userDistribution(1, 100000);

        // delay of up to 5 ms so that the threads do not all start simultaneously
        std::uniform_int_distribution<> waitTimeDistribution(0, 5000);
        std::this_thread::sleep_for(std::chrono::microseconds(waitTimeDistribution(_randomGenerator)));

        for (
            auto iterationStart = std::chrono::steady_clock::now();
            iterationStart - start < std::chrono::minutes(1);
            iterationStart = std::chrono::steady_clock::now())
        {
            auto generatedUser = userDistribution(_randomGenerator);
            TransactionData dummyTransaction = {
                5477311,
                generatedUser,
                1824507435,
                8055.05,
                0,
                "regular transaction by " + std::to_string(generatedUser)};

            std::chrono::steady_clock::time_point operationStart = std::chrono::steady_clock::now();
            _cache.write(dummyTransaction);
            std::chrono::steady_clock::time_point operationEnd = std::chrono::steady_clock::now();

            ++_writeOpNum;
            _writeTime += std::chrono::duration_cast<std::chrono::microseconds>(operationEnd - operationStart).count();

            // make the span between iterations at least 5 ms
            std::this_thread::sleep_for(iterationStart + std::chrono::milliseconds(5) - std::chrono::steady_clock::now());
        }
    }

    void popTransactions()
    {
        {
            std::unique_lock<std::mutex> lock(_testStartSync);
            _testStartCv.wait(lock, [this] { return _testStarted; });
        }

        std::chrono::steady_clock::time_point start = std::chrono::steady_clock::now();

        // hypothetical system has around 100k currently active users
        std::uniform_int_distribution<> userDistribution(1, 100000);

        // delay of up to 100 ms so that the threads do not all start simultaneously
        std::uniform_int_distribution<> waitTimeDistribution(0, 100000);
        std::this_thread::sleep_for(std::chrono::microseconds(waitTimeDistribution(_randomGenerator)));

        for (
            auto iterationStart = std::chrono::steady_clock::now();
            iterationStart - start < std::chrono::minutes(1);
            iterationStart = std::chrono::steady_clock::now())
        {
            auto requestedUser = userDistribution(_randomGenerator);

            std::chrono::steady_clock::time_point operationStart = std::chrono::steady_clock::now();
            auto userTransactions = _cache.pop(requestedUser);
            std::chrono::steady_clock::time_point operationEnd = std::chrono::steady_clock::now();

            ++_popOpNum;
            _popTime += std::chrono::duration_cast<std::chrono::microseconds>(operationEnd - operationStart).count();

            // make the span between iterations at least 100 ms
            std::this_thread::sleep_for(iterationStart + std::chrono::milliseconds(100) - std::chrono::steady_clock::now());
        }
    }

    CacheImpl _cache;

    std::atomic<long> _writeTime;
    std::atomic<long> _writeOpNum;
    std::atomic<long> _popTime;
    std::atomic<long> _popOpNum;

    size_t _writeWorkersNum;
    size_t _popWorkersNum;
    std::string _resultsFile;
    int _testrunsNum;
    bool _testStarted;
    std::mutex _testStartSync;
    std::condition_variable _testStartCv;
    std::mt19937 _randomGenerator;
};

void testCaches(const size_t testedShardSize, const size_t workersNum)
{
    if (testedShardSize == 1)
    {
        auto simpleImplTest = Test<SimpleSynchronizedCache>(
            10, workersNum, workersNum,
            "simple_cache_tests(" + std::to_string(workersNum) + "_workers).csv");

        simpleImplTest.run();
    }
    else
    {
        auto shardedImplTest = Test<ShardedCache>(
            10, workersNum, workersNum,
            "sharded_cache_" + std::to_string(testedShardSize) + "_tests(" + std::to_string(workersNum) + "_workers).csv",
            testedShardSize);

        shardedImplTest.run();
    }
}

int main()
{
    std::cout << "Hardware concurrency: " << hardware_concurrency << std::endl;

    std::array<size_t, 7> testPlan = { 1, 4, 8, 32, 128, 4096, 100000 };

    for (auto i = 0; i < testPlan.size(); ++i)
    {
        testCaches(testPlan[i], 4 * hardware_concurrency);
    }

    // additional tests with diminished load to show the limits of the optimization advantage
    std::array<size_t, 4> additionalTestPlan = { 1, 8, 128, 100000 };

    for (auto i = 0; i < additionalTestPlan.size(); ++i)
    {
        testCaches(additionalTestPlan[i], hardware_concurrency);
    }
}

We observe that with 2,000 writes and 300 pops per second (with a concurrency of 8) — which are not very high numbers for a high-load system — optimization using sharding significantly accelerates cache performance, by orders of magnitude. However, evaluating the significance of this difference is left to the reader, as, in both scenarios, operations took less than a millisecond. It's important to note that the tests used a relatively lightweight data structure for transactions, and synchronization was applied only to the container itself. In real-world scenarios, data is often more complex and larger, and synchronized processing may require additional computations and access to other data, which can significantly increase the time of the operation itself. Therefore, we aim to spend as little time on synchronization as possible. The tests do not show a significant difference in processing time as the shard size increases. But the greater the number of shards, the greater the overhead of maintaining them, so how low can we go? I suspect that the minimal effective value is tied to the system's concurrency, so for modern server machines with much greater concurrency than my home PC, a shard size that is too small won't yield the most optimal results.
I would love to see results from other machines with different concurrency that may confirm or disprove this hypothesis, but for now I assume it is optimal to use a shard size that is several times larger than the concurrency. You can also note that the largest size tested — 100,000 — effectively matches the earlier-mentioned approach of assigning a mutex to each user (in the tests, user IDs were generated within the range of 100,000). As can be seen, this did not provide any advantage in processing speed, and this approach is obviously more demanding in terms of memory.

Limitations and Considerations

So, we determined an optimal shard size, but this is not the only thing that should be considered for the best results. It's important to remember that such a difference compared to a simple implementation exists only because we are attempting to perform a sufficiently large number of transactions at the same time, causing a "queue" to build up. If the system's concurrency and the speed of each operation (within the mutex lock) allow operations to be processed without bottlenecks, the effectiveness of sharding optimization decreases. To demonstrate this, let's look at the test results with reduced load: at 500 writes and 75 pops (with a concurrency of 8), the difference is still present, but it is no longer as significant. This is yet another reminder that premature optimizations can complicate code without significantly impacting results. It's crucial to understand the application requirements and expected load. Also, it's important to note that the effectiveness of sharding heavily depends on the distribution of values of the chosen key (in this case, user ID). If the distribution becomes heavily skewed, we may revert to performance more similar to that of a single mutex — imagine all of the transactions coming from a single user.

Conclusion

In scenarios with frequent writes to a container in a multi-threaded environment, traditional synchronization methods can become a bottleneck. By leveraging the fact that the data can be processed in parallel and distributed predictably by a specific key, and by implementing a sharded synchronization approach, we can significantly improve performance without sacrificing thread safety. This technique can prove effective for systems dealing with user-specific data, such as transaction processing systems, user session caches, or any scenario where data can be logically partitioned based on a key attribute. As with any optimization, it's crucial to profile your specific use case and adjust the implementation accordingly. The approach presented here provides a starting point for tackling synchronization challenges in write-heavy, multi-threaded applications. Remember, the goal of optimization is not just to make things faster, but to make them more efficient and scalable. By thinking critically about your data access patterns and leveraging the inherent structure of your data, you can often find innovative solutions to performance bottlenecks.
Securing distributed systems is a complex challenge due to the diversity and scale of components involved. With multiple services interacting across potentially unsecured networks, the risk of unauthorized access and data breaches increases significantly. This article explores a practical approach to securing distributed systems using an open-source project. The project demonstrates how to integrate several security mechanisms and technologies to tackle common security challenges such as authentication, authorization, and secure communication. Understanding Security Challenges in Distributed Systems Distributed systems involve multiple services or microservices that must communicate securely across a network. Key security challenges in such architectures include: Secure communication: Ensuring that data transmitted between services is encrypted and safe from eavesdropping or tampering Authentication: Verifying the identities of users and services to prevent unauthorized access Authorization: Controlling what authenticated users and services are allowed to do, based on their roles and permissions Policy enforcement: Implementing fine-grained access controls and policies that govern service-to-service and user interactions Certificate management: Managing digital certificates for encrypting data and establishing trust between services This open-source project addresses these challenges using several integrated technologies and solutions. Project Setup and Configuration The project begins with setting up a secure environment using shell scripts and Docker. The setup involves provisioning digital certificates and starting the necessary services to ensure all components are ready for secure communication. Steps to Set Up the Environment 1. Provisioning Certificates The project uses a shell script (provisioning.sh) to simulate a Certificate Authority (CA) and generate the necessary certificates for the services. ./provisioning.sh 2. Launching Services Docker Compose is used to start all services defined in the project, ensuring they are configured correctly for secure operation. docker-compose up 3. Testing Service-to-Service Communication To validate service-to-service communication using certificates and JWT tokens, the test_services.sh script is provided. This script demonstrates how different services interact securely using their assigned certificates. Solving Security Challenges in Distributed Systems The project integrates several key technologies to address the primary security challenges mentioned earlier. Here's how each challenge is tackled: 1. Secure Communication With Mutual TLS (mTLS) Challenge In a distributed system, services must communicate securely to prevent unauthorized access and data breaches. Solution The project uses Mutual TLS (mTLS) to secure communication between services. mTLS ensures that both the client and server authenticate each other using their respective certificates. This mutual authentication prevents unauthorized services from communicating with legitimate services. Implementation Nginx is configured as a reverse proxy to handle mTLS. It requires both client and server certificates for establishing a secure connection, ensuring that data transmitted between services remains confidential and tamper-proof. 2. Authentication With Keycloak Challenge Properly authenticating users and services is critical to prevent unauthorized access. Solution The project leverages Keycloak, an open-source identity and access management solution, to manage authentication. 
Keycloak supports multiple authentication methods, including OpenID Connect and client credentials, making it suitable for both user and service authentication. User Authentication: Users are authenticated using OpenID Connect. Keycloak is configured with a client (appTest-login-client) that handles user authentication flows, including login, token issuance, and callback handling. Service Authentication: For service-to-service authentication, the project uses a Keycloak client (client_credentials-test) configured for the client credentials grant type. This method is ideal for authenticating services without user intervention. Authentication Flow Example Users navigate to the login page. After successful login, Keycloak redirects the user to a callback page with an authorization code. The authorization code is then exchanged for a JWT token, which is used for subsequent requests. The authn.js file in the nginx/njs directory provides a detailed implementation of this flow. Service Authentication Example Using Client Credentials curl -X POST "http://localhost:9000/realms/tenantA/protocol/openid-connect/token" \ -H "Content-Type: application/x-www-form-urlencoded" \ -d "grant_type=client_credentials" \ -d "client_id=client_credentials-test" \ -d "client_secret=your-client-secret-here" 3. User Authorization With Open Policy Agent (OPA) and JWT Challenge Enforcing fine-grained access controls to ensure that authenticated users and services only have access to authorized resources Solution The project utilizes a combination of Open Policy Agent (OPA) and JWT tokens to enforce authorization policies. The project demonstrates three different strategies for JWT validation to ensure robust security: Retrieving certificates from Keycloak: Fetches the certificates dynamically from Keycloak to validate the token. Using x5t (Thumbprint): Uses the thumbprint embedded in the token to retrieve the public key from a local trust store. Embedded certificate validation: Validates the token using an embedded certificate, ensuring the certificate is validated against a trusted Certificate Authority (CA). Refer to the nginx/njs/token.js file for the detailed implementation of these strategies. 4. Policy Enforcement With Open Policy Agent (OPA) Challenge Implementing dynamic and flexible access control policies for both services and users Solution OPA is used to enforce fine-grained policies for access control. Policies are written in a declarative language (Rego) and stored in the opa/ directory. These policies dictate the conditions under which services can communicate and users can access resources, ensuring that access controls are consistently applied across the system. 5. Certificate Management Challenge Managing digital certificates for services to establish trust and secure communications Solution The project includes a robust certificate management system. A shell script (provisioning.sh) is used to simulate a Certificate Authority (CA) and generate certificates for each service. This approach simplifies certificate management and ensures that all services have the necessary credentials for secure communication. We also added an endpoint to update the service certificate without needing an Nginx restart. 
curl --insecure https://localhost/certs --cert certificates/gen/serviceA/client.crt --key certificates/gen/serviceA/client.key -F cert=@certificates/gen/serviceA/client.crt -F key=@certificates/gen/serviceA/client.key Conclusion Building a secure distributed system requires careful consideration of various security aspects, including secure communication, authentication, authorization, policy enforcement, and certificate management. This open-source project provides a comprehensive example of how to integrate multiple security mechanisms to address these challenges effectively. By following the setup and configurations demonstrated in this project, developers can leverage mutual TLS, Keycloak, Open Policy Agent, and Nginx to build a robust security architecture. These technologies, when combined, provide a strong foundation for securing distributed systems against a wide range of threats, ensuring both data protection and secure access control.
“DX”, aka Developer Experience One of the goals of Stalactite is to make developers aware of the impact of the mapping of their entities onto the database, and, as a consequence, onto performances. To fulfill this goal, the developer's experience, as a user of the Mapping API, is key to helping him express his intention. The idea is to guide the user-developer in the choices he can make while he describes its persistence. As you may already know, Stalactite doesn’t use annotation or XML files for that. It proposes a fluent API that constrains user choices according to the context. To clarify: available methods after a call to mapOneToOne(..) are not the same as the ones after mapOneToMany(..). This capacity can be done in different ways. Stalactite chose to leverage Java proxies for it and combines it with the multiple-inheritance capability of interfaces. Contextualized Options Let’s start with a simple goal: we want to help a developer express the option of aliasing a column in a request, and also the option of casting it. Usually, we would find something like: Java select() .add("id", new ColumnOptions().as("countryId").cast("int")) .from(..); It would be smarter to have this: Java select() .add("id").as("countryId").cast("int") .add("name").as("countryName") .add("population").cast("long") .from(..); As the former is kind of trivial to implement and many examples can be found on the Internet, in particular Spring with its Security DSL or its MockMVC DSL, the latter is trickier because we have to locally mix the main API (select().from(..).where(..)) with some local one (as(..).cast(..)) on the return type of the add(..) method. This means that if the main API is brought by the FluentSelect interface, and the column options by the FluentColumnOptions interface, the method add(String) must return a third one that inherits from both: the FluentSelectColumnOptions interface. Java /** The main API */ interface FluentSelect { FluentSelect add(String columnName); FluentFrom from(String tableName); } /** The column options API */ interface FluentColumnOptions { FluentColumnOptions as(String alias); FluentColumnOptions cast(String alias); } /** The main API with column options as an argument to make it more fluent */ interface EnhancedFluentSelect extends FluentSelect { /** The returned type of this method is overwritten to return and enhanced version */ FluentSelectColumnOptions add(String columnName); FluentFrom from(String tableName); } /** The mashup between main API and column options API */ interface FluentSelectColumnOptions extends EnhancedFluentSelect, // we inherit from it to be capable of coming back to from(..) or chain with another add(..) FluentColumnOptions { /** we overwrite return types to make it capable of chaining with itself */ FluentSelectColumnOptions as(String alias); FluentSelectColumnOptions cast(String alias); } This can be done with standard Java code but brings some boilerplate code which is cumbersome to maintain. An elegant way to address it is to create a “method dispatcher” that will redirect main methods to the object that supports the main API, and redirect the options ones to the object that supports the options. Creating a Method Dispatcher: Java Proxy Luckily, Java Proxy API helps in being aware of method invocations on an object. 
As a reminder, a Proxy can be created as such: Proxy.newProxyInstance(classLoader, interfaces, methodHandler) It returns an instance (a "magic" one, thanks to the JVM) that can be typed in one of the interfaces passed as a parameter at any time (the Proxy implements all given interfaces). All methods of all interfaces are intercepted by InvocationHandler.invoke(..) (even equals/hashCode/toString !). So, our goal can be fulfilled if we’re capable of returning a Proxy that implements our interfaces (or the mashup one) and creates an InvocationHandler that propagates calls of the main API to the “main object” and calls the options to the “options object." Since InvocationHandler.invoke(..) gets the invoked Method as an argument, we can easily check if it belongs to one or another of the aforementioned interfaces. This gives the following naïve implementation for our example : Java public static void main(String[] args) { String mySelect = MyFluentQueryAPI.newQuery() .select("a").as("A").cast("char") .add("b").as("B").cast("int") .toString(); System.out.println(mySelect); // will print "[a as A cast char, b as B cast int]", see createFluentSelect() on case "toString" } interface Select { Select add(String s); Select select(String s); } interface ColumnOptions { ColumnOptions as(String s); ColumnOptions cast(String s); } interface FluentSelect extends Select, ColumnOptions { FluentSelect as(String s); FluentSelect cast(String s); FluentSelect add(String s); FluentSelect select(String s); } public static class MyFluentQueryAPI { public static FluentSelect newQuery() { return new MyFluentQueryAPI().createFluentSelect(); } private final SelectSupport selectSupport = new SelectSupport(); public FluentSelect createFluentSelect() { return (FluentSelect) Proxy.newProxyInstance(getClass().getClassLoader(), new Class[] { FluentSelect.class }, new InvocationHandler() { @Override public Object invoke(Object proxy, Method method, Object[] args) throws Throwable { switch (method.getName()) { case "as": case "cast": // we look for "as" or "cast" method on ColumnOptions class Method optionMethod = ColumnOptions.class.getMethod(method.getName(), method.getParameterTypes()); // we apply the "as" or "cast" call on element being created optionMethod.invoke(selectSupport.getCurrentElement(), args); break; case "add": case "select": // we look for "add" or "select" method on Select class Method selectMethod = Select.class.getMethod(method.getName(), method.getParameterTypes()); // we apply the "add" or "select" call on the final result (select instance) selectMethod.invoke(selectSupport, args); break; case "toString": return selectSupport.getElements().toString(); } return proxy; } }); } } /** Basic implementation of Select */ static class SelectSupport implements Select { private final List<SelectedElement> elements = new ArrayList<>(); private SelectedElement currentElement; @Override public Select add(String s) { this.currentElement = new SelectedElement(s); this.elements.add(currentElement); return this; } @Override public Select select(String s) { return add(s); } public SelectedElement getCurrentElement() { return currentElement; } public List<SelectedElement> getElements() { return elements; } } /** Basic representation of an element of the select clause, implements ColumnOptions */ static class SelectedElement implements ColumnOptions { private String clause; public SelectedElement(String clause) { this.clause = clause; } @Override public ColumnOptions as(String s) { clause += " as " + s; return this; } 
@Override public ColumnOptions cast(String s) { clause += " cast " + s; return this; } @Override public String toString() { return clause; } } This proof of concept needs to take inheritance into account, as well as argument type compatibility, and even more to make it a robust solution. Stalactite invested that time and created the MethodDispatcher class in the standalone library named Reflection, the final DX for an SQL Query definition is available here, and its usage is here. Stalactite DSL for persistence mapping definition is even more complex; that’s the caveat of all this: creating all the composite interfaces and redirecting correctly all the method calls is a bit complex. That’s why, for the last stage of the rocket, the MethodReferenceDispatcher has been created: it lets one redirect a method reference to some lambda expression to avoid extra code for small interfaces. Its usage can be seen here. Conclusion Implementing a naïve DSL can be straightforward but doesn’t really guide the user-developer. On the other hand, implementing a robust DSL can be cumbersome, Stalactite helped itself by creating an engine for it. While it's not easy to master, it really helps to meet the user-developer experience. Since the engine is the library Reflection, which is out of Stalactite, it can be used in other projects.