DZone
DZone Spotlight

Wednesday, May 6
AI in Software Architecture: Hype, Reality, and the Engineer’s Role


By Otavio Santana
There’s a recurring pattern in software engineering: every few years, a new wave arrives promising to redefine everything, and the conversation quickly drifts to extremes. The current wave — driven by advances in machine learning and large language models — is no different. Some argue that AI will replace engineers entirely; others reduce it to just another tool in the IDE. Both views oversimplify what is actually happening.

If you look historically — from assembly to high-level languages, from monoliths to distributed systems — each shift didn’t eliminate complexity; it relocated it. The word architecture itself, from the Greek arkhitekton (“chief builder”), is revealing. It never referred to the act of placing bricks, but to the decision about how the structure holds together.

That distinction becomes sharper now. As AI reduces the effort required to produce code, it does not remove the need for design, trade-offs, or systems thinking. Instead, it amplifies them. When the cost of generating code drops, the cost of making poor decisions increases — because you can produce more, faster, and propagate mistakes at scale.

This article takes a pragmatic stance: not debating whether AI will replace engineers, but examining how it shifts the role of those designing systems. What actually changes when code is no longer the main constraint? What remains stable despite the hype? And where should senior engineers, staff engineers, and architects focus their attention? Because, as in every previous shift, the advantage lies with those who understand where the real constraints have moved.

Software Development and the Hype Around AI

Technology has always lived close to hype. The word hype comes from hyperbole, the Greek idea of “excess” or “throwing beyond,” and that is exactly what often happens in software: a real innovation appears, but expectations are projected far beyond its current maturity.
Gartner describes this pattern through the Hype Cycle, which follows five phases: first, the Innovation Trigger, when a breakthrough creates attention; then the Peak of Inflated Expectations, when success stories dominate the narrative; after that, the Trough of Disillusionment, when limitations become visible; then the Slope of Enlightenment, where practical use cases emerge; and finally the Plateau of Productivity, where the technology becomes boring, useful, and integrated into real work (Gartner).

Software development has seen this many times. We saw it with object-oriented programming, service-oriented architecture, cloud computing, microservices, blockchain, low-code platforms, and even DevOps. Each of these movements carried a promise: faster delivery, better scalability, more autonomy, less complexity. Some of those promises were partially true, but never magically true. Microservices did not eliminate architecture; they moved complexity into communication, deployment, observability, and organizational design. Cloud did not remove infrastructure concerns; it changed who manages them and how much financial discipline matters.

The same skeptical lens must be applied to AI. AI itself is not new. The term Artificial Intelligence was formally introduced at the Dartmouth workshop in 1956, one of the field's foundational moments (Dartmouth). What is new is not the existence of AI, but its accessibility, visibility, and integration into everyday software work. For the first time, many engineers, managers, executives, and non-technical professionals can interact with AI directly through natural language. That popularity creates excitement, fear, and confusion simultaneously.

So before asking whether AI will replace developers, destroy architecture, or make fundamentals irrelevant, we need a better question: where does AI actually change software development, and where does it simply underscore the importance of engineering fundamentals?
That is what the next sections will explore.

Will AI Replace the Software Engineer?

The short answer is no — but that answer deserves a bit of skepticism rather than blind acceptance. If you look at the history of engineering, every leap in productivity has triggered the same narrative. Compilers didn’t replace programmers; frameworks didn’t eliminate design; cloud didn’t remove infrastructure thinking. Each abstraction removed friction, but also shifted where complexity lives. AI follows the same path. It accelerates code generation, but software engineering has never been about code alone. As John Ousterhout argues in A Philosophy of Software Design, the core problem is managing complexity, not producing lines of code. That part remains stubbornly human.

Still, it would be a mistake to treat AI as just another autocomplete. It clearly changes the economics of development. Studies such as McKinsey & Company's “The Economic Potential of Generative AI” highlight how AI can significantly increase developer productivity on specific tasks. Similarly, GitHub research and product insights on Copilot show measurable gains in code-generation speed. But speed is only one dimension of engineering. Surveys from Stack Overflow consistently reveal developers' concerns about correctness, maintainability, and trust in AI-generated outputs. These signals point to a familiar pattern: local efficiency improves, while global system complexity often increases.

What changes, then, is not the existence of engineers, but their focus. We are already seeing the emergence of new responsibilities — engineers acting as reviewers of machine output, curators of codebases, and guardians of architectural integrity. You could argue that “writing code” becomes less central, while “ensuring that the right code exists” becomes more critical. This includes cleaning up technical debt introduced by generated code, validating domain alignment, and enforcing consistency across systems.
In that sense, AI does not replace engineers; it raises the bar. The more code we can generate, the more discipline we need to sustain it — and that is exactly where experienced engineers become even more valuable.

One of the most interesting aspects of AI in software engineering is that the data does not fully support the popular narrative of “always faster and cheaper.” In fact, several recent studies suggest the opposite in specific contexts. For example, a controlled study by Model Evaluation & Threat Research found that experienced developers using AI tools were 19% slower than those without AI assistance, largely due to time spent prompting, reviewing, and correcting outputs (Business Insider). This contradicts developers' perceptions of their own productivity, revealing a gap between how they feel and how they actually deliver.

Beyond speed, the long-term cost is even more critical. Research analyzing large-scale codebases shows that AI-generated code tends to increase code churn, duplication, and technical debt, making systems harder to maintain over time (DevOps.com). This aligns with academic findings that AI-assisted development can shift the burden toward experienced engineers, who end up spending more time reviewing and fixing code rather than producing new value. In one study, senior developers saw a drop in their own productivity while their review workload increased, highlighting that AI may redistribute effort rather than eliminate it (arXiv).

There is also a structural cost often ignored in hype-driven discussions: maintainability and system evolution. While AI can generate working code quickly, it lacks a deep understanding of the system, leading to inconsistencies and hidden coupling. Studies comparing AI-generated and human-written code show different defect profiles, including increased security risks and structural issues that require additional quality assurance (arXiv).
In practice, this means teams may move faster initially but pay the price later when evolving the system — exactly the kind of trade-off that experienced engineers have recognized for decades in software development. There is growing evidence that, in the long term, teams that deeply understand their systems may outperform those that rely heavily on AI without proper control. Not because AI is bad, but because software engineering is not about generating code — it is about sustaining systems over time. AI optimizes the former; engineers are still required for the latter.

Should I Use AI — or Is It an Enemy?

Framing AI as an enemy is the wrong abstraction. In engineering terms, it’s a tool that shifts constraints — and ignoring it is not a neutral decision. Throughout history, engineers who refused to understand new tools did not preserve craftsmanship; they simply became less effective. AI follows the same trajectory. It introduces a new layer of abstraction that can increase productivity and impact, but only if approached critically. As with Docker or Kubernetes before it, understanding how and when to use it becomes part of the job.

Used well, AI can remove friction from repetitive tasks. It can generate boilerplate code, assist with test creation, improve documentation, and even help explore alternative implementations. Many engineers already use assistants in their IDEs or more advanced agentic workflows to accelerate delivery. But there is a catch: generated code often lacks coherence with the system as a whole. It may solve the local problem while introducing global inconsistencies. That is why your role shifts — you are no longer just writing code; you are curating, validating, and aligning it. The real skill becomes prompt design, critical review, and architectural enforcement. AI can write code, but it does not understand your system. You still need to be the adult in the room.

There are also real, documented risks.
Studies show that AI does not always improve productivity — especially for experienced engineers. A study published on arXiv found that developers using AI tools were actually slower in certain scenarios, a result also discussed by Reuters. Research from the Massachusetts Institute of Technology highlights how generative AI changes how engineers spend time and can reduce deep engagement with problem-solving. Additional studies indicate that senior engineers often absorb the cost of reviewing and fixing AI-generated code, shifting — not eliminating — the workload. Meanwhile, industry analysis, such as that from DevOps.com, points to increased technical debt and maintainability challenges. There is also a behavioral dimension: over-reliance on AI can reduce attention and critical thinking, reinforcing the need for deliberate usage.

So, should you use AI? Yes — but deliberately. Not using it means falling behind in understanding a tool that is already reshaping the industry. Using it blindly, however, is equally problematic. The goal is not to replace your thinking, but to amplify it. AI should handle the mechanical aspects of development, while you remain responsible for structure, correctness, and long-term sustainability. Or, to adapt a well-known idea often attributed to Bill Gates: the real advantage does not belong to the machine, nor to the human alone, but to the human who knows how to use the machine well.
Setting Up Claude Code With Ollama: A Guide


By Gunter Rotsaert
Nowadays, there are quite a lot of AI coding assistants. In this blog, you will take a closer look at Claude Code, a terminal-based AI coding assistant. Since mid-January 2026, Claude Code can also be used in combination with Ollama, a local inference engine. Enjoy!

Introduction

There are many AI models and also many AI coding assistants, and which one to choose is a hard question. It also depends on whether you run the models locally or in the cloud. When running locally, Qwen3-Coder is a very good AI model for programming tasks. In previous posts, DevoxxGenie, a JetBrains IDE plugin, was often used as an AI coding assistant. DevoxxGenie is nicely integrated within the JetBrains IDEs, but it is also a good thing to take a look at other AI coding assistants. In a previous blog, Qwen Code was used; now it is time to take a look at Claude Code, the popular AI coding assistant from Anthropic. Using a local inference engine, you are 100% sure your data is not shared with third parties like Anthropic. In contrast to Qwen Code, which is based on Gemini CLI, Claude Code is not entirely open source; the core executable is closed source. In this blog, you will take a closer look at Claude Code, how to configure it, and how to use it. Sources used in this blog can be found on GitHub.

Prerequisites

Prerequisites for reading this blog are:

- Some experience with AI coding assistants.
- If you want to compare to DevoxxGenie, take a look at a previous post.
- You will need to have at least Ollama v0.14.0 installed.

Installation

Installation instructions for Claude Code can be found here. Execute the following bash script:

```shell
curl -fsSL https://claude.ai/install.sh | bash
```

Setup

Claude Code is installed now, but first, some configuration needs to be done. See also the official documentation for Claude Code.

1. Disable Data Usage

By default, Claude Code monitors data usage. It is not very clear whether this applies only when you use cloud models or cloud APIs. Nevertheless, it is advised to disable this by means of an environment variable. You can configure settings at different scopes: managed, user, project, and local. For convenience, the user scope is used here. Navigate to your home directory, where you will find a .claude directory, and create a file settings.json in it. Disable the usage statistics as follows (a full list of the environment variables can be found here):

```json
{
  "env": {
    "CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC": "1"
  }
}
```

Do note that this will also disable the auto-updates. In order to avoid this, you can instead disable the other traffic environment variables individually:

```json
{
  "env": {
    "DISABLE_TELEMETRY": "1",
    "DISABLE_BUG_COMMAND": "1",
    "DISABLE_ERROR_REPORTING": "1"
  }
}
```

2. Configure Model

In this blog, a local model setup is used, with Ollama as the inference engine and Qwen3-Coder running as a local model. In order to create this setup, add the following to the settings.json file:

```json
{
  "env": {
    "CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC": "1",
    "ANTHROPIC_AUTH_TOKEN": "ollama",
    "ANTHROPIC_API_KEY": "",
    "ANTHROPIC_BASE_URL": "http://localhost:11434"
  }
}
```

And if you want to set the default model, you can do so as follows:

```json
{
  "model": "qwen3-coder:30b",
  "env": {
    "CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC": "1",
    "ANTHROPIC_AUTH_TOKEN": "ollama",
    "ANTHROPIC_API_KEY": "",
    "ANTHROPIC_BASE_URL": "http://localhost:11434"
  }
}
```

3. System Prompt

It is good practice to add a system prompt to your AI coding assistant, containing instructions for the model. You can add it by creating a CLAUDE.md file in the .claude directory. This will ensure a default system prompt for all your projects.
If you want to use a more specific system prompt for a particular repository, you can add a CLAUDE.md file in the repository itself. If you are developing in Java, Spring Boot, etc., the following system prompt can be used as an example:

```
You are an expert code assistant for a professional Java developer. All code examples, reviews, and explanations must be idiomatic to the following tech stack:
* Backend: Java (latest LTS), Spring Boot (latest stable), PostgreSQL.
* Frontend: Vue.js (latest stable), Angular (latest stable).
* Follow modern best practices for RESTful APIs, object-relational mapping, unit testing (JUnit), and frontend-backend integration.
* Prefer Maven for Java dependency management.
* Whenever database code is required, use PostgreSQL syntax and conventions.
* For frontend, use Vue composition API where applicable.
* Always explain your reasoning, and reference documentation when giving architectural advice.
* When unsure, ask clarifying questions before producing code.
```

4. Default Editor

The default editor seems to be Visual Studio Code, but you can change this by overriding the EDITOR environment variable. In the example below, it is set to vim. Add this to your .bashrc file:

```shell
export EDITOR='vim'
```

5. Fix Slow Inference

Claude Code prepends a Claude Code Attribution header, which invalidates the KV cache, making inference 90% slower with local models. See this article for more information. You can solve this by setting CLAUDE_CODE_ATTRIBUTION_HEADER to zero in the settings.json:

```json
{
  "model": "qwen3-coder:30b",
  "env": {
    "CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC": "1",
    "CLAUDE_CODE_ATTRIBUTION_HEADER": "0",
    "ANTHROPIC_AUTH_TOKEN": "ollama",
    "ANTHROPIC_API_KEY": "",
    "ANTHROPIC_BASE_URL": "http://localhost:11434"
  }
}
```

First Startup

If you haven't done so already, now is the time to clone the GitHub repository. Be sure to check out the claude-code branch.
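Before starting Claude Code against the local endpoint, it can help to confirm that Ollama itself is reachable. The following is a minimal sketch, assuming Ollama runs on its default port 11434 (matching the ANTHROPIC_BASE_URL value used in the setup) and that you want the qwen3-coder:30b model; the actual pull/list/curl commands are shown as comments since they require a machine with Ollama installed.

```shell
# Sketch: verify the local Ollama endpoint before pointing Claude Code at it.
# Assumes Ollama's default port 11434, as used for ANTHROPIC_BASE_URL above.
OLLAMA_URL="http://localhost:11434"

# On a machine with Ollama installed, you would run:
#   ollama pull qwen3-coder:30b     # fetch the model if it is not present yet
#   ollama list                     # confirm the model shows up locally
#   curl -s "$OLLAMA_URL/api/tags"  # confirm the HTTP API answers

echo "Claude Code will connect to: $OLLAMA_URL"
```

If the /api/tags call does not answer, fix the Ollama side first; Claude Code will otherwise fail on its first prompt.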
If you want to execute the commands from this blog, first delete the CLAUDE.md file and the src/test directory. Claude Code is a terminal application, so you have some different options here:

- Open a terminal and navigate to the repository.
- Open your IDE, e.g., IntelliJ, and open a terminal from within IntelliJ (ALT+F12). Do note that the IntelliJ terminal will react differently than expected to certain commands. It is advised to use a regular terminal instead.

Start Claude Code by typing claude in the terminal. The first time you start Claude Code in a repository, it will ask you whether you trust this folder. After confirmation, the Claude prompt is shown.

A first simple command is to show the model that is loaded. As expected, Qwen3-Coder is loaded. When you want to show the memory content, you are asked to choose between the user and project settings. When you choose user, the CLAUDE.md file is opened.

Now, in order to verify whether the connection with the model is functioning correctly, just enter a simple prompt like "how are you?" The response is the following:

```
● Hello! I'm functioning properly and ready to help with your Java development tasks. I'm set up to assist with Spring Boot backend development, PostgreSQL database integration, and Vue.js/Angular frontend work as specified in your preferences.

Is there something specific you'd like me to help you with today? For example, I can help with:
- Spring Boot application development
- PostgreSQL database queries or schema design
- Java code reviews
- Vue.js/Angular frontend integration
- RESTful API design
- Unit testing with JUnit

What would you like to work on?

No tasks found

● I don't see any existing tasks in the task list. Would you like me to help you with a specific Java development task? I'm ready to assist with Spring Boot backend development, PostgreSQL database work, or Vue.js/Angular frontend integration as per your preferences.

You can ask me to:
- Review or improve Java code
- Help with Spring Boot application development
- Assist with PostgreSQL database queries or schema design
- Help with RESTful API implementation
- Assist with unit testing using JUnit
- Help with frontend-backend integration

What would you like to work on today?
```

Create a Test

Let's continue with something useful and create a test for the CustomersController. Using the @ character, you can add files to the context. When typing, a search is executed; using the arrows, you can easily select the file you need, and the tab key selects it. After that, you can complete the prompt. The prompt used is:

```
@src/main/java/com/mydeveloperplanet/myaicodeprojectplanet/controller/CustomersController.java
Write a unit test for this code using JUnit. Use WebMvcTest. Use MockMvc. Use AssertJ assertions. Add the test in this repository
```

Claude Code starts analyzing the file and writes the test, asking you whether it may execute bash. The test created can be seen below. Just like with Qwen Coder, a similar error occurs: a type OpenAPICustomer is used, which does not exist.
```java
package com.mydeveloperplanet.myaicodeprojectplanet.controller;

import com.mydeveloperplanet.myaicodeprojectplanet.model.Customer;
import com.mydeveloperplanet.myaicodeprojectplanet.openapi.model.Customer;
import com.mydeveloperplanet.myaicodeprojectplanet.service.CustomerService;
import org.junit.jupiter.api.BeforeEach;
import org.junit.jupiter.api.Test;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.test.autoconfigure.web.servlet.WebMvcTest;
import org.springframework.boot.test.mock.mockito.MockBean;
import org.springframework.http.MediaType;
import org.springframework.test.web.servlet.MockMvc;

import java.util.Arrays;
import java.util.List;
import java.util.Optional;

import static org.mockito.ArgumentMatchers.any;
import static org.mockito.Mockito.*;
import static org.springframework.test.web.servlet.request.MockMvcRequestBuilders.*;
import static org.springframework.test.web.servlet.result.MockMvcResultMatchers.*;

@WebMvcTest(controllers = CustomersController.class)
class CustomersControllerTest {

    @Autowired
    private MockMvc mockMvc;

    @MockBean
    private CustomerService customerService;

    private Customer customer1;
    private Customer customer2;
    private OpenAPICustomer openAPICustomer1;
    private OpenAPICustomer openAPICustomer2;

    @BeforeEach
    void setUp() {
        customer1 = new Customer(1L, "John", "Doe");
        customer2 = new Customer(2L, "Jane", "Smith");
        openAPICustomer1 = new OpenAPICustomer()
                .id(1L)
                .firstName("John")
                .lastName("Doe");
        openAPICustomer2 = new OpenAPICustomer()
                .id(2L)
                .firstName("Jane")
                .lastName("Smith");
    }

    @Test
    void customersGet_shouldReturnAllCustomers() throws Exception {
        // Given
        List<Customer> customers = Arrays.asList(customer1, customer2);
        when(customerService.getAllCustomers()).thenReturn(customers);

        // When & Then
        mockMvc.perform(get("/customers"))
                .andExpect(status().isOk())
                .andExpect(jsonPath("$.length()").value(2))
                .andExpect(jsonPath("$[0].id").value(1L))
                .andExpect(jsonPath("$[0].firstName").value("John"))
                .andExpect(jsonPath("$[0].lastName").value("Doe"))
                .andExpect(jsonPath("$[1].id").value(2L))
                .andExpect(jsonPath("$[1].firstName").value("Jane"))
                .andExpect(jsonPath("$[1].lastName").value("Smith"));

        verify(customerService, times(1)).getAllCustomers();
    }

    @Test
    void customersPost_shouldCreateNewCustomer() throws Exception {
        // Given
        when(customerService.createCustomer(any(Customer.class))).thenReturn(customer1);

        // When & Then
        mockMvc.perform(post("/customers")
                        .contentType(MediaType.APPLICATION_JSON)
                        .content("{\"firstName\":\"John\",\"lastName\":\"Doe\"}"))
                .andExpect(status().isCreated());

        verify(customerService, times(1)).createCustomer(any(Customer.class));
    }

    @Test
    void customersIdGet_shouldReturnCustomerWhenFound() throws Exception {
        // Given
        when(customerService.getCustomerById(1L)).thenReturn(Optional.of(customer1));

        // When & Then
        mockMvc.perform(get("/customers/{id}", 1L))
                .andExpect(status().isOk())
                .andExpect(jsonPath("$.id").value(1L))
                .andExpect(jsonPath("$.firstName").value("John"))
                .andExpect(jsonPath("$.lastName").value("Doe"));

        verify(customerService, times(1)).getCustomerById(1L);
    }

    @Test
    void customersIdGet_shouldReturnNotFoundWhenCustomerNotFound() throws Exception {
        // Given
        when(customerService.getCustomerById(1L)).thenReturn(Optional.empty());

        // When & Then
        mockMvc.perform(get("/customers/{id}", 1L))
                .andExpect(status().isNotFound());

        verify(customerService, times(1)).getCustomerById(1L);
    }

    @Test
    void customersIdPut_shouldUpdateCustomer() throws Exception {
        // Given
        when(customerService.updateCustomer(anyLong(), any(Customer.class))).thenReturn(customer1);

        // When & Then
        mockMvc.perform(put("/customers/{id}", 1L)
                        .contentType(MediaType.APPLICATION_JSON)
                        .content("{\"firstName\":\"John\",\"lastName\":\"Doe\"}"))
                .andExpect(status().isOk());

        verify(customerService, times(1)).updateCustomer(anyLong(), any(Customer.class));
    }

    @Test
    void customersIdDelete_shouldDeleteCustomer() throws Exception {
        // When & Then
        mockMvc.perform(delete("/customers/{id}", 1L))
                .andExpect(status().isNoContent());

        verify(customerService, times(1)).deleteCustomer(1L);
    }
}
```

When you fix the import issue, the test is successful. When you generate the mutation test results using mvn verify and check the report in the target/pit-reports directory, you notice that this test has a line coverage of 100% and a mutation coverage of 93%, which is quite good. Generating this test with the Claude models of Anthropic results in similar numbers, and the test generated using Qwen Coder also has similar results.

Commands

Claude Code supports commands, a convenient way of interacting with the model. Command /clear clears the history. When using the /init command, Claude Code analyses your repository and creates a CLAUDE.md file in your repository with project-specific information. Executing this command for this repository results in the following CLAUDE.md file. The result is really good.

```markdown
# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Project Overview

This is a Spring Boot 3.5 application that implements a RESTful API for managing customer data. The application uses:
- Spring Boot 3.5 with Java 21
- PostgreSQL database with Liquibase for database migrations
- JOOQ for database access
- OpenAPI/Swagger for API documentation
- Maven for build management

## Architecture

The application follows a layered architecture pattern:
1. **Controller Layer**: REST endpoints in `CustomersController`
2. **Service Layer**: Business logic in `CustomerService` and `CustomerServiceImpl`
3. **Repository Layer**: Database access in `CustomerRepository` using JOOQ
4. **Model Layer**: Domain objects in `Customer` class
5. **OpenAPI Layer**: Generated API interfaces and models from OpenAPI spec

## Key Files

- `src/main/java/com/mydeveloperplanet/myaicodeprojectplanet/MyAiCodeProjectPlanetApplication.java` - Main application class
- `src/main/java/com/mydeveloperplanet/myaicodeprojectplanet/controller/CustomersController.java` - REST endpoints
- `src/main/java/com/mydeveloperplanet/myaicodeprojectplanet/service/CustomerServiceImpl.java` - Business logic
- `src/main/java/com/mydeveloperplanet/myaicodeprojectplanet/repository/CustomerRepository.java` - Database operations
- `src/main/resources/db/changelog/migration/db.changelog-1.xml` - Database schema definition
- `src/main/resources/static/customers.yaml` - OpenAPI specification

## Development Commands

### Building
- `./mvnw clean compile` - Compile the application
- `./mvnw clean package` - Build a jar file
- `./mvnw clean install` - Install dependencies and build

### Running Tests
- `./mvnw test` - Run all tests
- `./mvnw test -Dtest=CustomersControllerTest` - Run specific test class
- `./mvnw test -Dtest=CustomersControllerTest#customersGet_shouldReturnAllCustomers` - Run specific test method

### Running Application
- `./mvnw spring-boot:run` - Run the application
- `./mvnw spring-boot:run -Dspring-boot.run.profiles=dev` - Run with specific profile

### Code Generation
- `./mvnw generate-sources` - Regenerate JOOQ and OpenAPI code
- `./mvnw compile` - Compile with generated code

### Mutation Testing
- `./mvnw org.pitest:pitest-maven:mutationCoverage` - Run mutation tests

## Database Setup

The application uses Liquibase for database migrations. The database schema is defined in `src/main/resources/db/changelog/migration/db.changelog-1.xml`. The application will automatically create the database schema on startup.

## API Endpoints

The API provides standard CRUD operations for customers:
- GET `/customers` - Get all customers
- POST `/customers` - Create a new customer
- GET `/customers/{id}` - Get a specific customer
- PUT `/customers/{id}` - Update a customer
- DELETE `/customers/{id}` - Delete a customer

## Key Implementation Details

1. The application uses JOOQ for type-safe database access
2. OpenAPI specification is used to generate API interfaces and models
3. The application uses Spring Boot's auto-configuration for database setup
4. Tests are written using Spring Boot's test framework with MockMvc for web layer testing
```

A really nice feature is the option to create custom commands with predefined prompts. This is very useful when you want to use prompts repetitively, and you can share them easily with someone else. Create a directory commands in the .claude directory of your home directory. Using extra directories inside this commands directory, you can create namespaces. As an example, consider the following directory tree:

```shell
$ tree
├── general
│   ├── explain.md
│   └── javadoc.md
├── review
│   ├── extended.md
│   └── simple.md
└── test
    ├── controller.md
    ├── integration.md
    ├── repositoryjooq.md
    └── service.md
```

The controller.md file contains the prompt you used for creating the test:

```
Write a unit test for this code using JUnit. Use WebMvcTest. Use MockMvc. Use AssertJ assertions.
```

MCP

With Model Context Protocol (MCP) servers, you can enhance the capabilities of the model. The configuration of an MCP server can be added to the .claude.json file in your home directory. The Context7 MCP server can be added as follows:

```json
"mcpServers": {
  "context7": {
    "type": "stdio",
    "command": "npx",
    "args": ["-y", "@upstash/context7-mcp"],
    "env": {}
  }
}
```

Start Claude Code in the repository and verify whether the MCP server is configured correctly.
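As an aside, the custom-command layout described above can be scaffolded from the shell. This is a sketch, assuming the user-scope ~/.claude directory from the setup section; only the test/controller.md command is created here, and the CLAUDE_DIR variable is introduced purely for illustration.

```shell
# Sketch: create one namespaced custom command under the user scope.
# CLAUDE_DIR defaults to ~/.claude but can be overridden for experimentation.
CLAUDE_DIR="${CLAUDE_DIR:-$HOME/.claude}"
mkdir -p "$CLAUDE_DIR/commands/test"

# The file name (controller) and directory (test) together form the
# command name, i.e., /test:controller inside Claude Code.
cat > "$CLAUDE_DIR/commands/test/controller.md" <<'EOF'
Write a unit test for this code using JUnit. Use WebMvcTest. Use MockMvc. Use AssertJ assertions.
EOF
```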
```
/mcp
───────────────────────────────────────────────────────────────────
Manage MCP servers

1 server

User MCPs (/home/<userdir>/.claude.json)

❯ context7 · ✔ connected

https://code.claude.com/docs/en/mcp for help
```

Remove the previously created test, add the CustomersController to the context, and use the following prompt:

```
@src/main/java/com/mydeveloperplanet/myaicodeprojectplanet/controller/CustomersController.java /test:controller create the test in this repository, I am using spring boot 3.5, use the context7 mcp server to retrieve uptodate documentation, do not use deprecated functionality
```

This should result in invoking the Context7 MCP server, and as a result @MockBean should not be used in the test; @MockitoBean should be used instead. The Context7 MCP server is indeed invoked, but the test is still generated with @MockBean. When you check the source of Context7 that is consulted, you will notice that neither @MockBean nor @MockitoBean is mentioned. Still, this already works better than Qwen Coder, where the Context7 MCP server was not invoked at all, even after many attempts.

Conclusion

Claude Code offers quite a few nice features. There is a lot more to discover, but the first impressions are good. It is also good to experiment with other AI coding assistants now and then, in order to see how they compare to the ones you are using. Compared to Qwen Coder, Claude Code seems to do a better job.

Trend Report

Security by Design

Security teams are dealing with faster release cycles, increased automation across CI/CD pipelines, a widening attack surface, and new risks introduced by AI-assisted development. As organizations ship more code and rely heavily on open-source and third-party services, security can no longer live at the end of the pipeline. It must shift to a model that is enforced continuously — built into architectures, workflows, and day-to-day decisions — with controls that scale across teams and systems rather than relying on one-off reviews. This report examines how teams are responding to that shift, from AI-powered threat detection to identity-first and zero-trust models for supply chain hardening, quantum-safe encryption, and SBOM adoption and strategies. It also explores how organizations are automating governance across build and deployment systems, and what changes when AI agents begin participating directly in DevSecOps workflows. Leaders and practitioners alike will gain a grounded view of what is working today, what is emerging next, and what security-first software delivery looks like in practice in 2026.

Security by Design

Refcard #403

Shipping Production-Grade AI Agents

By Vidyasagar (Sarath Chandra) Machupalli FBCS DZone Core CORE
Shipping Production-Grade AI Agents

Refcard #388

Threat Modeling Core Practices

By Apostolos Giannakidis DZone Core CORE
Threat Modeling Core Practices

More Articles

Integrating AI-Driven Decision-Making in Agile Frameworks: A Deep Dive into Real-World Applications and Challenges
Integrating AI-Driven Decision-Making in Agile Frameworks: A Deep Dive into Real-World Applications and Challenges

The integration of AI-driven decision-making within Agile frameworks presents a transformative opportunity for optimized workflows and enhanced decision-making processes. This article delves into the real-world applications and challenges of combining AI's analytical prowess with Agile methodologies. Key topics include the benefits of contextual adaptability, AI-augmented retrospectives, and the necessity of human oversight to balance AI autonomy with human intuition. Additionally, industry-specific insights from healthcare and retail demonstrate significant efficiency improvements, while technical implementations such as AI-enhanced CI/CD pipelines and story point estimations offer tangible advantages. However, challenges like the skills gap and lack of standardized methodologies highlight areas for growth and development. The article underscores the importance of a balanced approach, leveraging both AI and human insight for sustainable innovation.

Introduction

I remember a chilly morning in Woodland Hills, sipping my too-hot coffee and staring at my screen, puzzled by an intricate issue in our latest MuleSoft project. Our team was caught in the weeds, struggling with manual decision-making processes that just weren't cutting it. That's when it hit me — like many organizations, we were at the cusp of a digital transformation wave, but our adaptation rate felt sluggish, like a hesitant swimmer at the edge of a pool. The solution, as it turned out, was not merely adopting AI but integrating its decision-making capabilities seamlessly into our Agile framework. As someone who has spent years weaving technology threads together, the idea intrigued me, and the journey since then has been nothing short of eye-opening.

The AI and Agile Convergence: An Unfolding Opportunity

Contextual Adaptability: The New Frontier

In today's fast-paced tech environments, AI systems — particularly those that adapt in real-time — are becoming indispensable.
Contextual adaptability is critical. For example, during a significant project with Farmers Insurance, I noticed that traditional systems couldn't adjust quickly enough to the dynamic needs of stakeholders. AI-driven solutions, however, offered us the flexibility to modify decision-making processes on the fly, taking into account the shifting team dynamics and requirements. It was like having a seasoned project manager who never tired and was always a step ahead. Imagine an AI that not only identifies bottlenecks but also proposes immediate remedies based on historical data and current team performance.

AI-Augmented Retrospectives: An Unexpected Ally

The retrospective has always been a cornerstone of Agile — an opportunity for teams to reflect and improve. But what if we could leverage AI to turbocharge this process? On a whim, we developed a prototype that analyzed past sprint data using machine learning algorithms. It highlighted workflow inefficiencies and even suggested potential areas of improvement. Skeptical colleagues soon turned advocates as they saw AI providing actionable insights that would have taken hours to deduce manually. The AI didn't just look at defects or missed deadlines; it correlated them with team moods and external factors, presenting a holistic view that we, as humans, often missed.

The Great Debate: Autonomy vs. Oversight

Why Human Oversight Is Crucial

The allure of fully autonomous AI systems is strong. Imagine a project where AI makes decisions independently, freeing up human resources for more creative tasks. But — and there's always a 'but' — in our experience, complete autonomy isn't always advantageous. One incident stands out: our AI recommended a drastic change in resource allocation during a critical sprint based purely on quantitative data, ignoring some unquantifiable team morale factors. The oversight nearly caused a rebellion within the team.
This underlined the need for a balanced hybrid approach — AI for the number crunching, humans for the intuition and oversight. After all, as much as we credit AI with intelligence, it still lacks the nuanced understanding of human emotions and the unpredictability of team dynamics.

Bias: The Invisible Culprit

While working on a healthcare project, we ran into an unexpected hurdle. Our AI model for decision-making inadvertently exhibited biases — stemming from pre-existing skewed data patterns. This revelation was a wake-up call, reminding us that AI is only as unbiased as the data it feeds on. We faced a dilemma: how to integrate AI's precision with the necessity for equitable decision-making in Agile frameworks. Our solution was implementing regular audits of AI outcomes, pairing AI decisions with human judgment to ensure fairness — a process that was both enlightening and humbling.

AI Across Industries: Lessons from Healthcare and Retail

Healthcare: A Case Study in Balancing Precision and Care

In the healthcare sector, AI integration into Agile frameworks has delivered some remarkable efficiencies in project management. I recall an instance where AI helped optimize resource allocation during a project aimed at enhancing patient care systems. By analyzing patient intake data and resource availability in real time, AI allowed us to efficiently plan sprints and allocate development resources where they were most needed. The result? A 20% reduction in project delivery time and an increase in patient satisfaction scores. It was a perfect example of AI's ability to handle the nitty-gritty, leaving the strategic decisions to Agile teams.

Retail: Personalization Meets Agile

Retail is where AI truly shines in Agile applications. In one retail project, we utilized AI to refine inventory management, dynamically adjusting stock levels based on predictive modeling. The system learned from past sales data to predict future demand — a boon during peak shopping periods.
Additionally, AI-driven personalization of the customer experience became a seamless part of our Agile processes, significantly enhancing customer engagement metrics.

Technical Deep Dives: Practical Applications of AI in Agile

Integrating AI into CI/CD Pipelines

One of the most impactful areas in which I've seen AI enhance Agile practices is the CI/CD pipeline. Using AI to predict deployment risks and optimize testing processes is akin to having a crystal ball. In my experience, integrating these capabilities reduced deployment-related failures by approximately 30%. Specific tools like Jenkins with AI plugins, or proprietary solutions, allowed us to predict which builds might fail, vastly improving our time-to-market.

AI-Enhanced Story Point Estimation: A Remarkable Time Saver

An often overlooked but powerful application of AI is in improving story point estimation accuracy. Traditionally, estimation can be more guesswork than science. However, by training AI models on historical project data, we were able to achieve estimations with minimal discrepancies. This not only helped in better resource planning but also empowered our teams to deliver more reliably within set timelines.

Challenges and Insights: A Personal Reflection

Bridging the Skills Gap

Despite the rapid advances in technology, there's a notable skills gap in AI integration within Agile frameworks. On numerous occasions, I've witnessed teams struggle simply due to a lack of expertise in either domain. The solution, in my opinion, lies in targeted education and training, promoting cross-functional skills that allow teams to bridge this gap effectively.

Standardization: The Missing Element

I must admit, one of the most frustrating aspects of integrating AI in Agile is the absence of standardized methodologies. Every organization seems to reinvent the wheel, leading to inconsistent results.
The industry needs a unified framework that outlines best practices for AI adoption within Agile environments. This standardization would not only streamline processes but also facilitate faster innovation.

Conclusion: The Path Ahead

As AI continues to evolve, its integration into Agile frameworks will undoubtedly expand, offering even more sophisticated decision-making capabilities. This journey has taught me the significance of balance — leveraging AI for its unparalleled analytical prowess while maintaining human oversight to provide ethical and empathetic context. As I look forward, sipping another cup of coffee, I envision a future where AI and Agile coexist not as separate elements but as seamless parts of every project, complementing each other's strengths. My advice to fellow professionals is simple: embrace AI's potential, but never lose sight of the human element that truly drives innovation.
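As a toy illustration of the story-point estimation idea discussed above (not the article's actual model), historical stories can be used as nearest neighbors to produce a first-cut estimate. All names, features, and data below are hypothetical.

```java
import java.util.Comparator;
import java.util.List;

// Toy story-point estimator: predicts points for a new story by averaging the
// points of the k most similar historical stories. "Similarity" is Euclidean
// distance over two made-up features: affected components and acceptance criteria.
public class StoryPointEstimator {

    record Story(double components, double criteria, double points) {}

    static double estimate(List<Story> history, double components, double criteria, int k) {
        return history.stream()
                .sorted(Comparator.comparingDouble((Story s) ->
                        Math.hypot(s.components() - components, s.criteria() - criteria)))
                .limit(k)                      // k nearest historical stories
                .mapToDouble(Story::points)
                .average()
                .orElseThrow();
    }

    public static void main(String[] args) {
        List<Story> history = List.of(
                new Story(1, 2, 2),   // small, isolated change
                new Story(2, 3, 3),
                new Story(4, 6, 8),   // cross-cutting change
                new Story(5, 8, 13));
        // A new story touching 4 components with 5 acceptance criteria:
        System.out.println(estimate(history, 4, 5, 2)); // → 5.5 (average of 8 and 3)
    }
}
```

A real system would learn from many more features (team, epic, historical velocity), but the principle is the same: let past data anchor the estimate instead of gut feeling.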

By Abhijit Roy
Building Fault-Tolerant Kafka Consumers in Spring Boot Using Retry, DLQ, and Idempotent Code Patterns
Building Fault-Tolerant Kafka Consumers in Spring Boot Using Retry, DLQ, and Idempotent Code Patterns

Apache Kafka is a robust distributed streaming platform, but building a fault-tolerant consumer requires careful handling of errors and duplicates. In this article, we focus on Spring Boot 3 with Spring Kafka 3.x to implement resilient Kafka consumers using retry mechanisms, dead-letter queues (DLQs), and idempotent processing patterns. We'll walk through how to configure retries, route problematic messages to a DLQ, and ensure that even if the same message is consumed multiple times, it is processed only once.

Challenges in Kafka Consumer Fault Tolerance

Kafka consumers usually operate in an at-least-once delivery mode, which means a message might be delivered multiple times if it is not acknowledged properly. Transient errors can cause message processing failures, and without proper handling, such failures might lead to data loss or duplicate processing. If a consumer fails after processing a message but before committing the offset, Kafka will resend that message to another consumer, leading to a duplicate delivery. A fault-tolerant consumer design addresses these scenarios by:

Retrying transient failures so that temporary issues don't result in lost opportunities to process the message.
Using a dead-letter queue (DLQ) to hold messages that repeatedly fail processing, so they can be examined or retried later without blocking the main consumer flow.
Implementing idempotent processing to gracefully handle duplicate messages, ensuring each message's effect occurs only once.

By combining these patterns, we can build consumers that are resilient to errors and avoid unwanted side effects from reprocessing.

Implementing a Retry Mechanism in Spring Kafka

When a consumer fails to process a message, a common approach is to retry a few times before giving up. Spring Kafka provides flexible retry configuration via its error handling mechanisms. The DefaultErrorHandler can automatically retry a message a fixed number of times with a delay between attempts.
After retries are exhausted, it can either drop the message or forward it to a recoverer for further handling. Let's configure a listener container with a DefaultErrorHandler using fixed retry logic. In Spring Boot, we can customize the ConcurrentKafkaListenerContainerFactory to set our error handler:

Java
@Configuration
public class KafkaConsumerConfig {

    @Bean
    public ConcurrentKafkaListenerContainerFactory<String, MyEvent> kafkaListenerContainerFactory(
            ConsumerFactory<String, MyEvent> consumerFactory,
            KafkaTemplate<String, MyEvent> kafkaTemplate) {
        ConcurrentKafkaListenerContainerFactory<String, MyEvent> factory =
                new ConcurrentKafkaListenerContainerFactory<>();
        factory.setConsumerFactory(consumerFactory);

        // Define a DeadLetterPublishingRecoverer to publish failed messages to a ".DLT" topic
        DeadLetterPublishingRecoverer recoverer = new DeadLetterPublishingRecoverer(kafkaTemplate,
                (record, ex) -> new TopicPartition(record.topic() + ".DLT", record.partition()));

        // Configure the error handler: FixedBackOff(1000L, 2) means a 1-second backoff
        // and 2 retries, i.e., 3 total delivery attempts, then send to the DLQ
        DefaultErrorHandler errorHandler = new DefaultErrorHandler(recoverer, new FixedBackOff(1000L, 2));

        // (Optional) Consider certain exceptions as non-retriable
        errorHandler.addNotRetryableExceptions(IllegalArgumentException.class);

        factory.setCommonErrorHandler(errorHandler);
        return factory;
    }
}

In this configuration, if message processing throws an exception, the DefaultErrorHandler will retry up to 2 times with 1 second between retries. If the message still fails after the retries, the handler invokes the DeadLetterPublishingRecoverer, which publishes the bad record to a dead-letter topic. We also mark IllegalArgumentException as a non-retriable exception in this example, so such errors are handed to the recoverer immediately, without retries.
By default, Spring's error handler treats certain exceptions as fatal and skips retries for them, since they are unlikely to succeed on a second attempt. Additionally, it's possible to handle retries manually by using Spring Kafka's acknowledgment mechanism: by setting the container AckMode to MANUAL and catching exceptions in the listener, you can nack a message to have it redelivered after a delay.

Dead Letter Queue (DLQ) for Failed Messages

A dead-letter queue is a designated topic where messages that cannot be processed after all retries are sent. Rather than blocking the main consumer on a poison message, or losing it, the DLQ acts as a safety net. As Baeldung defines it, a DLQ is used to store messages that cannot be correctly processed for various reasons. These messages can later be taken from the DLQ for analysis or reprocessing. In our configuration above, we used a DeadLetterPublishingRecoverer, which automatically sends the record to <topic>.DLT after the final failure. To leverage this, we must ensure that the DLQ topic exists (Kafka does not auto-create topics by default in many setups).

DLQ Handling in Spring Kafka: By default, the recoverer will publish the message with the same key and value, and include headers such as the original topic and partition. We can customize the target topic name, or even route to different topics based on the exception type, using a lambda in the recoverer. After publishing to the DLQ, the DefaultErrorHandler commits the offset of the failed message in the main topic, preventing it from being redelivered endlessly. This design effectively offloads problematic records to a side queue and allows the main consumer to continue with subsequent messages.

One important consideration: if message order in the primary topic is critical, moving one message to a DLQ means it will be processed out of band, which can break strict ordering guarantees in the overall system. Use DLQs judiciously in such cases.
In most scenarios, though, a DLQ greatly improves system resiliency by preventing one bad message from holding up the entire queue.

Idempotent Consumer Code Patterns (Handling Duplicates)

Even with retries and DLQs, duplicate message deliveries can occur. An idempotent consumer ensures that processing the same message more than once has the same effect as processing it once. In other words, the consumer can consume the same message any number of times but actually processes it only once. This is crucial for avoiding inconsistent state or side effects in systems where the consumer might crash or reprocess messages. The recommended way to implement the idempotent consumer pattern is to use a persistent store to track processed message IDs. Typically, the producing system should include a unique identifier in each message. The consumer can then use this ID to decide whether it has seen the message before. A common approach is to maintain a database table of processed message IDs, using Spring Data JPA, for example:

Java
@Entity
@Table(name = "processed_events")
public class ProcessedEvent {
    @Id
    private String eventId;
    // ... other fields like timestamp if needed
}

public interface ProcessedEventRepository extends JpaRepository<ProcessedEvent, String> {}

Here, eventId serves as the primary key, ensuring uniqueness. Now, in the Kafka listener, we can implement idempotency logic using this repository.
We attempt to insert a record for the new message ID and only proceed if it was not already present:

Java
@Component
public class OrderEventsListener {

    @Autowired
    private ProcessedEventRepository processedRepo;

    @Autowired
    private OrderService orderService; // hypothetical service to process the event

    @KafkaListener(topics = "orders", groupId = "orders-group")
    @Transactional // ensure atomicity between DB operations
    public void onMessage(OrderEvent event) {
        String eventId = event.getId();
        try {
            // Try to record this event as processed
            processedRepo.saveAndFlush(new ProcessedEvent(eventId));
        } catch (DataIntegrityViolationException e) {
            // The event ID already exists in the processed_events table.
            // This is a duplicate, so skip processing.
            return; // exiting without error will ack the message
        }
        // If we reach here, this event ID was not seen before.
        // Proceed with the main business logic.
        orderService.processOrder(event);
    }
}

In the above code, we use saveAndFlush() to insert the new ProcessedEvent into the database immediately. If the event ID already exists, the database throws a DataIntegrityViolationException, which we catch to detect a duplicate message. Upon catching such an exception, we simply return without processing the event again. Because we did not throw an error in this case, the Kafka listener acknowledges the message offset as processed. Thus, the duplicate message is effectively skipped with no side effects in the downstream system. A few important notes on this idempotent pattern:

Wrapping the listener logic in a transaction (@Transactional) ensures that if orderService.processOrder(event) fails and throws an exception, the insertion of the ProcessedEvent is rolled back as well. This prevents a scenario where we mark an event as processed but fail to actually perform the business logic. If an exception occurs after the insert, the whole transaction is rolled back, and the Kafka message will be retried.
On the next attempt, since the prior insert was rolled back, we can try again. This keeps the processing logic and the tracking table in sync.
If the processing succeeds, both the processed-event record and any side effects are committed. If the application crashes after that but before the Kafka offset is committed, Kafka will deliver the message again on restart. In that case, the ProcessedEvent table already contains the ID, so our code detects it and skips orderService.processOrder on the second delivery. We then acknowledge the message immediately. This achieves at-least-once processing with an idempotency guarantee, which is effectively exactly-once from the perspective of the business logic.
It's wise to periodically clean up or partition the tracking table if it grows large, or to use a TTL strategy if reprocessing old duplicates is not a concern after a certain period. The storage and lookup overhead should be considered, but for moderate volumes this pattern is very manageable. Alternatives include using an external cache or key-value store for tracking, but a relational DB with a primary key or unique index works well when using JPA.

Conclusion

Building a fault-tolerant Kafka consumer in Spring Boot involves orchestrating retries, dead-letter handling, and idempotent processing. By using Spring Kafka's DefaultErrorHandler with a backoff policy, we can gracefully handle transient failures via retries. Integrating a dead-letter queue ensures that messages which consistently fail are routed to a side topic for inspection rather than blocking the main consumer or getting lost. Finally, employing an idempotent consumer pattern with a simple JPA-backed deduplication table guarantees that even if a message is delivered multiple times, our business logic runs only once for each unique event. Through these patterns, our Kafka consumers become significantly more resilient to errors.
We prevent data loss by not silently dropping messages, we prevent infinite reprocessing loops by isolating bad messages in a DLQ, and we maintain data consistency by avoiding duplicate processing. Implementing these best practices in a Spring Boot 3 application with Spring Kafka 3 can greatly increase the reliability of event-driven microservices in production. By combining retry, DLQ, and idempotency techniques, engineers can ensure their Kafka consumers are truly fault-tolerant and robust in the face of real-world issues.
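The deduplication guarantee at the heart of the article can be illustrated without Kafka or a database. In this minimal sketch, an in-memory set stands in for the processed_events table (Set.add() returns false for a duplicate key, just as the real insert raises a constraint violation); all names here are hypothetical stand-ins for the article's components.

```java
import java.util.HashSet;
import java.util.Set;

// Minimal simulation of the idempotent consumer pattern: a redelivered
// event is detected via its ID and the business logic runs only once.
public class IdempotentConsumerDemo {

    static final Set<String> processedEvents = new HashSet<>(); // stands in for the DB table
    static int ordersProcessed = 0;                             // observable side effect

    static void onMessage(String eventId) {
        if (!processedEvents.add(eventId)) {
            return; // duplicate delivery: acknowledge and skip the business logic
        }
        ordersProcessed++; // stands in for orderService.processOrder(event)
    }

    public static void main(String[] args) {
        onMessage("order-42");
        onMessage("order-42"); // redelivery after a simulated crash/rebalance
        onMessage("order-43");
        System.out.println(ordersProcessed); // → 2: each unique event processed once
    }
}
```

The production version replaces the set with a durable store inside the same transaction as the side effect, which is what makes the guarantee survive restarts.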

By Mallikharjuna Manepalli
Understanding MCP Architecture: LLM + API vs Model Context Protocol
Understanding MCP Architecture: LLM + API vs Model Context Protocol

Suppose you want a chatbot that works with PDFs: extract text, search across documents, summarize sections. You can build it two ways: by calling an LLM API directly and wiring tools yourself, or by exposing those tools through the Model Context Protocol (MCP). Same user experience — different architecture. This article uses a PDF example to walk through both routes and explain what MCP adds.

The Goal

User asks in natural language → chatbot reads/searches PDFs → returns an answer. Example prompts:

"What's in the introduction of report.pdf?"
"Search all PDFs in ./docs for 'quarterly revenue'"
"Summarize the key points from these three documents"

Same behavior either way. The difference is how the chatbot gets PDF capabilities.

Route 1: LLM + API (You Own the Loop)

In this approach, your app talks directly to the LLM API (Claude, GPT, etc.), defines tools as part of each request, and runs the agentic loop yourself. You implement the PDF logic and you decide when to call it.

Architecture

Plain Text
┌─────────────────────────────────────────────────────────┐
│ Your App (single process)                               │
│                                                         │
│  ┌──────────────┐   tools + messages    ┌──────────┐    │
│  │ Agentic Loop │ ────────────────────► │ LLM API  │    │
│  └──────┬───────┘                       └──────────┘    │
│         │                                               │
│         │ tool_use                                      │
│         ▼                                               │
│  ┌──────────────┐                                       │
│  │ executeTool()│ ──► read_pdf, search_pdf, extract_text│
│  └──────────────┘     (your code, same process)         │
└─────────────────────────────────────────────────────────┘

You define the tools, send them with every API call, and when the model returns tool_use, you run the matching function and feed the result back.
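The loop just described can be sketched end to end with a stubbed model. Here callLLM and the tool implementation are hypothetical stand-ins (a real app would call the provider's SDK); the point is the shape of the loop: call the model, dispatch tools, feed results back, stop at end_turn.

```javascript
// Minimal agentic loop with a stubbed LLM: the app owns dispatch and looping.
function callLLM(messages) {
  // Pretend the model asks for a tool on the first turn, then finishes.
  const usedTool = messages.some((m) => m.role === "tool_result");
  return usedTool
    ? { stop_reason: "end_turn", text: "The PDF mentions quarterly revenue." }
    : { stop_reason: "tool_use", name: "read_pdf", input: { path: "report.pdf" } };
}

const toolImpls = {
  read_pdf: ({ path }) => `Fake text extracted from ${path}`, // stand-in PDF logic
};

function runLoop(userQuestion) {
  const messages = [{ role: "user", content: userQuestion }];
  for (;;) {
    const reply = callLLM(messages);
    if (reply.stop_reason !== "tool_use") return reply.text; // done
    // Dispatch the requested tool and feed the result back to the model.
    const result = toolImpls[reply.name](reply.input);
    messages.push({ role: "tool_result", content: result });
  }
}

console.log(runLoop("What's in report.pdf?"));
```

Everything in this sketch lives in one process, which is exactly the coupling Route 2 removes.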
PDF Tools (Inline)

Plain Text
const tools = [
  {
    name: "read_pdf",
    description: "Extract full text from a PDF file",
    input_schema: {
      type: "object",
      properties: {
        path: { type: "string", description: "Path to the PDF" }
      },
      required: ["path"]
    }
  },
  {
    name: "search_pdf",
    description: "Search for a keyword across PDFs in a directory",
    input_schema: {
      type: "object",
      properties: {
        directory: { type: "string", description: "Directory containing PDFs" },
        keyword: { type: "string", description: "Search term" }
      },
      required: ["directory", "keyword"]
    }
  },
  {
    name: "list_pdfs",
    description: "List all PDF files in a directory",
    input_schema: {
      type: "object",
      properties: {
        path: { type: "string", description: "Directory path" }
      },
      required: ["path"]
    }
  }
];

// You maintain the dispatch
async function executeTool(name, input) {
  if (name === "read_pdf") return await extractTextFromPdf(input.path);
  if (name === "search_pdf") return await searchPdfs(input.directory, input.keyword);
  if (name === "list_pdfs") return await listPdfFiles(input.path);
  throw new Error(`Unknown tool: ${name}`);
}

In the agentic loop, when the API returns stop_reason: "tool_use", you call executeTool(block.name, block.input) and append the result as a tool_result message. Loop until stop_reason: "end_turn".

What This Route Gives You

Single process — one app, no subprocesses
Full control — you own the loop, tool definitions, and execution
Straightforward — just the LLM SDK and a PDF library (e.g. pdf-parse)
Tight coupling — only this app can use these PDF tools

Route 2: MCP (Protocol + Tool Server)

In this approach, you build a PDF MCP server that exposes the same operations as tools. Your chatbot (or Cursor, Claude Desktop, etc.) connects to it, discovers the tools at runtime, and sends tool calls over the protocol. The server runs the PDF logic; the client only orchestrates.
Architecture

Plain Text
┌─────────────────────────────────────────────────────────────────────────┐
│ Client (chatbot, Cursor, Claude Desktop, etc.)                          │
│                                                                         │
│  ┌──────────────┐  tools/list, tools/call    ┌──────────────────────┐   │
│  │  MCP Client  │ ◄────────────────────────► │   PDF MCP Server     │   │
│  └──────┬───────┘  (JSON-RPC over stdio)     │  (separate process)  │   │
│         │                                    └──────────┬───────────┘   │
└─────────┼───────────────────────────────────────────────┼───────────────┘
          │                                               │
          │ messages + tool_use                           │ read_pdf,
          ▼                                               │ search_pdf,
┌─────────────────────────────────┐                       │ list_pdfs
│ LLM API (Claude, GPT, etc.)     │                       ▼
└─────────────────────────────────┘             ┌───────────────────┐
                                                │  PDF filesystem   │
                                                │  (your machine)   │
                                                └───────────────────┘

The MCP server is a separate process that speaks the Model Context Protocol. Clients connect (e.g. via stdio or HTTP), call tools/list to discover tools, and tools/call to run them. The client then passes tool results to the LLM and continues the conversation.

MCP Server: PDF Tools

Plain Text
// pdf-mcp-server.js
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import { z } from "zod";
import { extractTextFromPdf, searchPdfs, listPdfFiles } from "./pdf-utils.js";

const server = new McpServer({ name: "pdf-server", version: "1.0.0" });

server.tool(
  "read_pdf",
  { path: z.string().describe("Path to the PDF file") },
  async ({ path }) => {
    const text = await extractTextFromPdf(path);
    return { content: [{ type: "text", text }] };
  }
);

server.tool(
  "search_pdf",
  {
    directory: z.string().describe("Directory containing PDFs"),
    keyword: z.string().describe("Search term")
  },
  async ({ directory, keyword }) => {
    const matches = await searchPdfs(directory, keyword);
    return { content: [{ type: "text", text: matches }] };
  }
);

server.tool(
  "list_pdfs",
  { path: z.string().describe("Directory path") },
  async ({ path }) => {
    const files = await listPdfFiles(path);
    return { content: [{ type: "text", text: files.join("\n") }] };
  }
);

await server.connect(new StdioServerTransport());

The client never defines these tools. It discovers them:

Plain Text
// chatbot.js — connects to MCP, discovers tools
const mcp = new Client({ name: "chatbot", version: "1.0.0" });
await mcp.connect(new StdioClientTransport({
  command: "node",
  args: ["pdf-mcp-server.js"]
}));

const { tools } = await mcp.listTools(); // No hardcoding

// When the LLM returns tool_use:
const result = await mcp.callTool({ name: block.name, arguments: block.input });

What MCP Adds

Concept | LLM + API | MCP
Tool definition | Hardcoded in your app | Declared in the server, discovered by the client
Tool execution | Your executeTool() | Server runs it; client sends tools/call
Who can use the tools? | Only your app | Any MCP client (Cursor, Claude Desktop, your chatbot)
Protocol | Ad hoc (your loop) | JSON-RPC (tools/list, tools/call)
Boundary | Everything in one process | Clear split: client = chat, server = tools

MCP introduces a protocol between the AI client and the tool provider. The client doesn't need to know how PDFs are read; it just calls tools/call with a name and arguments. The server implements the logic. Add a new tool — e.g. summarize_pdf — and all connected clients see it without code changes.

Using the PDF Example to Explain MCP Further

1. Separation of Concerns

LLM + API: Your app does everything — chat, tool dispatch, PDF handling. One codebase, one deployable.
MCP: The PDF server is a standalone service. It can be developed, tested, and versioned independently. The chatbot (or any client) only needs to know how to speak MCP.

2. Discovery Instead of Configuration

LLM + API: You manually add each tool to your tools array and to executeTool. New tool = update client code.
MCP: The client calls tools/list and gets the current set of tools. Add summarize_pdf to the server — clients automatically have it. No client changes.

3.
Reuse Across Clients

LLM + API: If you want Cursor or Claude Desktop to use your PDF tools, you must integrate with each client separately (if they support it at all).
MCP: The same PDF server works with Cursor, Claude Desktop, VS Code Copilot, and your own chatbot. One server, many consumers.

4. Transport Flexibility

MCP supports stdio (subprocess) and HTTP. Your PDF server can run locally as a subprocess or be deployed as an HTTP service. The protocol stays the same; only the transport changes.

When to Use Which

Use LLM + API when:

You have a single app (internal tool, custom chatbot)
You want minimal setup — one process, one deploy
Only this app needs PDF (or whatever) capabilities
You prefer to own the entire flow

Use MCP when:

Multiple clients should use the same tools
You want a reusable "PDF assistant" others can plug into
You value a clear boundary between tool provider and chat client
You're building toward an ecosystem of composable AI tools

How External LLM Tools Call MCP — and Why It Helps

When you expose tools via MCP, any LLM-powered app that speaks the protocol can connect to your server and use them — without you writing a single line of integration code.

The Connection Flow: From Natural Language to the MCP Server

Trace the path from your typed question to the MCP server that runs the tool:

User asks in natural language — e.g. "What's in report.pdf?"
Client sends the question to its LLM — Cursor, Claude Desktop, or Copilot forwards your message to Claude, GPT, etc.
LLM decides it needs a tool — It infers you want to read a PDF and chooses read_pdf(path: "report.pdf").
Client sends tools/call to the MCP server — The client does not run the tool itself.
It sends a JSON-RPC request to your MCP server.
MCP server receives the call — Your pdf-mcp-server.js gets the request, runs read_pdf, and extracts the text.
Server returns the result — The MCP server sends the extracted text back to the client.
Client passes the result to the LLM — The LLM receives the PDF content and formats a natural-language answer.
User sees the answer — "The report covers Q3 revenue, product updates, and forecasts..."

So: natural language → LLM intent → tool choice → tools/call → MCP server executes → result flows back → LLM formats → user sees the answer. The external tool never implements PDF logic. It only needs to speak MCP: receive tool calls, forward them to your server, and return results.

What Must Be in Place First (Setup)

Before that flow can happen, the client must know which tools exist and where to send calls:

The user configures the MCP server in their client — pointing to node pdf-mcp-server.js (stdio) or https://your-pdf-mcp.example.com (HTTP).
The client starts or connects to the server and sends tools/list.
The server returns the tool schemas — read_pdf, search_pdf, list_pdfs, etc. The client stores these so the LLM knows what tools are available when it interprets the user's question.

Once that setup is done, every natural-language question follows the path above: NL → LLM → tool choice → tools/call → MCP server.

Benefits for External LLM Tools

Benefit | What it means
Zero integration | Cursor, Claude Desktop, Copilot, etc. already support MCP. They don't need a custom plug-in for your PDF server — they use the same protocol for every MCP server.
Vendor neutrality | Your PDF MCP server works with any MCP client. You're not tied to one vendor's SDK or approval process.
Install and use | Users add your server to their MCP config (e.g. ~/.cursor/mcp.json or .vscode/mcp.json) and get your tools immediately. No forking, no wrapping.
Same tools, many UIs | One PDF server powers chat in Cursor, in Claude Desktop, and in your own chatbot. Build once, reuse everywhere.
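For reference, the client-side registration described above commonly takes a shape like the following (a sketch: the server name and path are placeholders, and the exact schema depends on the client's MCP config format):

```json
{
  "mcpServers": {
    "pdf-server": {
      "command": "node",
      "args": ["pdf-mcp-server.js"]
    }
  }
}
```

With this entry in place, the client launches the server as a subprocess over stdio and runs tools/list against it at startup.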
What You Provide vs. What the Client Does

- You provide the MCP server binary or URL; the external LLM tool connects and discovers tools.
- You provide the tool implementations (read, search, etc.); the external LLM tool translates user intent into tool calls.
- You provide the schema (parameters, descriptions); the external LLM tool passes tool results back to its LLM.
- The external LLM tool renders the conversation to the user.

You focus on making your PDF tools correct and useful. The external tool focuses on conversation and UX. MCP is the contract between the two.

The Bottom Line

LLM + API is the direct path: you call the model, define tools, run them yourself. Simple and sufficient for one app.

MCP is the protocol path: you expose tools in a server, clients discover and call them. More moving parts, but you gain discovery, reuse, and a standard interface.

The PDF example shows the same capabilities — read, search, list — implemented two ways. Use it to decide where your next project belongs: tight and self-contained (LLM + API) or open and reusable (MCP).

Further Reading

- Model Context Protocol — specification and concepts
- MCP TypeScript SDK — server and client implementation
- Anthropic Tool Use — function calling in the Messages API

By Sanjay Mishra
How to Log HTTP Incoming Requests in Spring Boot

When developing REST APIs, you often need to log incoming HTTP requests. You want to see exactly what data your application receives and how it is processed, and a detailed view of the passed data eases troubleshooting and development. CommonsRequestLoggingFilter is a class provided by the Spring Framework that lets you log requests with a few simple configuration steps. In this article, you'll see how to configure request logging in Spring Boot and inspect request payloads and parameters.

Why Log Incoming HTTP Requests?

Sometimes you need a thorough look at the data that comes into your application. A typical scenario is a subtle bug for which you don't have enough information to understand what's going on. Logging HTTP requests gives you better visibility into your APIs during development so that you can find issues in the shortest time possible. You will be able to inspect request payloads and query parameters and verify the correct integration between services. It is also a viable way to monitor APIs and catch unexpected behavior in time. Without dedicated tooling, the best you can do is write log statements throughout the application. This is far from ideal, because log statements scattered across your code are hard to maintain, and they can themselves contain errors. With CommonsRequestLoggingFilter, you have a more centralized way of handling this.

Request Logging Configuration

To configure logging of incoming requests, you should follow some simple steps.
First of all, you should create a request logging filter by defining a configuration bean of type CommonsRequestLoggingFilter:

Java import org.springframework.context.annotation.Bean; import org.springframework.context.annotation.Configuration; import org.springframework.web.filter.CommonsRequestLoggingFilter; @Configuration public class ReqLoggingConfig { @Bean public CommonsRequestLoggingFilter requestLoggingFilter() { CommonsRequestLoggingFilter loggingFilter = new CommonsRequestLoggingFilter(); loggingFilter.setIncludeQueryString(true); loggingFilter.setIncludePayload(true); // Enables request body logging loggingFilter.setMaxPayloadLength(10000); // Limits payload size loggingFilter.setIncludeHeaders(false); // Avoids logging headers loggingFilter.setAfterMessagePrefix("REQUEST DATA: "); return loggingFilter; } }

The above filter allows you to log:

- The query parameters
- The request body
- Client information
- The request URI

An important setting in the above example is the setMaxPayloadLength() instruction. It prevents excessive memory consumption by limiting the payload size. The above is not enough, though. To make logging work, you should take a few additional steps. By default, the filter will not produce output unless the logging level is enabled. Add the following configuration to your application.properties:

Plain Text logging.level.org.springframework.web.filter.CommonsRequestLoggingFilter=DEBUG

This enables debug logging for the filter so that requests will appear in the application logs. If you also want to log request parameters, add this property:

Plain Text spring.mvc.publish-request-params=true

This allows Spring to expose them when the request is processed inside the controller. Here is a summary of some important points to remember:

- Use CommonsRequestLoggingFilter to log request payloads and parameters.
- Enable debug logging for the filter.
- Limit payload size to prevent memory issues.
- Avoid logging sensitive information in production.
Inspect How Spring Processes HTTP Requests If you want to perform a more detailed inspection, in terms of request parameter resolution, request body conversion, and HTTP message processing, you have to add some extra configuration: Plain Text logging.level.org.springframework.web=DEBUG logging.level.org.springframework.web.servlet.mvc.method.annotation.HttpEntityMethodProcessor=DEBUG Request parameter resolution works by taking the query parameters, path variables, and headers and mapping them to the controller's method parameters. For example, if you have a controller like this: Java @PostMapping("/test") public String test(@RequestParam boolean active) { return "ok"; } Spring will map "active=true" to the boolean active parameter. In the log, you will find something like: Plain Text Resolved argument [0] [type=boolean] = true. The request body will also be converted from raw JSON into Java objects. Consider this controller service: Java @PostMapping("/test") public String test(@RequestBody User user) { return "ok"; } Spring will convert an incoming JSON body, like in the following example, into the User object argument: Plain Text {"name":"Mary","age":30} In the log, you will see: Plain Text Reading [application/json] into [com.example.User] You can also see how Spring processes the request into Java objects and, from Java objects, returns the response as JSON. Spring uses HTTP message converter objects to do this. 
In the log, you will see something like:

Writing [application/json] with MappingJackson2HttpMessageConverter

Example With a Simple REST Controller

As an example, consider a simple REST service:

Java import org.springframework.web.bind.annotation.PostMapping; import org.springframework.web.bind.annotation.RequestBody; import org.springframework.web.bind.annotation.RequestMapping; import org.springframework.web.bind.annotation.RestController; import java.util.Map; @RestController @RequestMapping("/api") public class DemoController { @PostMapping("/test") public Map<String, Object> test(@RequestBody Map<String, Object> body) { return Map.of( "message", "Request received", "data", body ); } }

The above service accepts a JSON payload and returns it in the response. To test this API endpoint, you can use a curl command:

Plain Text curl -X POST http://localhost:8080/api/test \ -H "Content-Type: application/json" \ -d '{"name":"Miriam","age":32}'

When the request reaches your application and is processed, something like this will be logged:

Plain Text REQUEST DATA: uri=/api/test;payload={"name":"Miriam","age":32}

Further Considerations

CommonsRequestLoggingFilter has some limitations:

- The request body is only logged if it is read by the application.
- You have to limit large payloads, as they may impact performance.
- You shouldn't log sensitive data in production.

If you want to log both requests and responses, you should implement a custom filter using ContentCachingRequestWrapper.

Conclusion

Spring Boot provides an easy way to log HTTP requests using CommonsRequestLoggingFilter. You need just a few configuration settings. It's an essential tool for diagnosing problems and maintaining REST APIs. In the context of microservice architectures, this improves the whole observability stack. You can find an example on GitHub.
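As a sketch of that custom-filter approach (assuming Spring Boot 3 with Jakarta Servlet packages; on older stacks substitute javax.servlet), you wrap both request and response so their cached bodies can be read once the filter chain has completed:

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;

import jakarta.servlet.FilterChain;
import jakarta.servlet.ServletException;
import jakarta.servlet.http.HttpServletRequest;
import jakarta.servlet.http.HttpServletResponse;

import org.springframework.stereotype.Component;
import org.springframework.web.filter.OncePerRequestFilter;
import org.springframework.web.util.ContentCachingRequestWrapper;
import org.springframework.web.util.ContentCachingResponseWrapper;

@Component
public class RequestResponseLoggingFilter extends OncePerRequestFilter {

    @Override
    protected void doFilterInternal(HttpServletRequest request,
                                    HttpServletResponse response,
                                    FilterChain filterChain) throws ServletException, IOException {
        ContentCachingRequestWrapper wrappedRequest = new ContentCachingRequestWrapper(request);
        ContentCachingResponseWrapper wrappedResponse = new ContentCachingResponseWrapper(response);
        try {
            filterChain.doFilter(wrappedRequest, wrappedResponse);
        } finally {
            // The request body is available here only if the handler actually read it.
            String requestBody = new String(wrappedRequest.getContentAsByteArray(), StandardCharsets.UTF_8);
            String responseBody = new String(wrappedResponse.getContentAsByteArray(), StandardCharsets.UTF_8);
            logger.debug("REQUEST DATA: uri=" + request.getRequestURI() + ";payload=" + requestBody);
            logger.debug("RESPONSE DATA: status=" + wrappedResponse.getStatus() + ";payload=" + responseBody);
            // Mandatory: copy the cached body back into the real response.
            wrappedResponse.copyBodyToResponse();
        }
    }
}
```

The same caveats apply here: cap what you log and keep sensitive fields out of production logs.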

By Mario Casari
End-to-End Event Streaming With Kafka, Spring Boot and AWS SQS/SNS (Production-Ready Code Guide)

Event-driven applications often demand high throughput, reliable delivery, and flexible fan-out messaging. Each platform in our stack plays a distinct role: Apache Kafka provides a distributed, high-volume event log; Amazon SQS offers durable point-to-point queues; and Amazon SNS enables pub/sub broadcasting to multiple subscribers. Using them together yields a robust pipeline: teams commonly use Kafka for streaming, SQS for decoupled processing, and SNS for multicasting events. This synergy leverages the strengths of each platform to build scalable, loosely coupled systems.

Architecture Overview

The pipeline involves multiple components working together in sequence. Below is the event flow:

1. Producer Service (Spring Boot & Kafka) – A microservice publishes an event message (in JSON format) to an Apache Kafka topic.
2. Kafka Broker – The Kafka cluster durably persists the event and makes it available to consumers. Multiple services can consume from the topic in parallel if needed.
3. Bridge Service (Kafka to SNS) – A Spring Boot service consumes the Kafka topic and forwards selected events to an AWS SNS topic.
4. AWS SNS (Topic) – The Simple Notification Service fans out the event to all its subscribers. In our setup, an SQS queue is subscribed to this SNS topic.
5. Consumer Service (SQS) – Another Spring Boot service listens on the AWS SQS queue and processes the incoming event.

This hybrid design uses Kafka's high-throughput stream as the backbone, while AWS SNS/SQS handle distribution and decoupling at the edges. In practice, Kafka consumers (or connectors) often push critical events to SQS for ordered, independent processing or to SNS for real-time fan-out. By leveraging SNS's fan-out and SQS's queuing, we gain additional durability and failure isolation: the Kafka-to-SNS/SQS pattern enhances system reliability through AWS-managed persistence and simplified failure handling.
The result is a resilient, maintainable architecture that combines on-premises or cloud-based streaming with AWS’s managed messaging services. Kafka Producer Service First, we build a Spring Boot service to produce events to Kafka. Include the Spring for Apache Kafka library in the project and configure the Kafka broker address. For JSON data, you can send text strings or use a JSON serializer. Below is a REST controller that publishes incoming JSON payloads to a Kafka topic using Spring’s KafkaTemplate: Java @RestController public class EventProducerController { @Autowired private KafkaTemplate<String, String> kafkaTemplate; @PostMapping("/publish") public String publishEvent(@RequestBody String eventJson) { // send to Kafka topic kafkaTemplate.send("events-topic", eventJson); return "Event published"; } } In a real application, the producer might validate or transform the payload before sending. Here we directly send the raw JSON string to Kafka for simplicity. Once this endpoint is called (via an HTTP POST), the event message is written to the events topic on the Kafka cluster. Kafka-to-SNS Bridge Service Next, we create a service to bridge Kafka and SNS. Add the Spring Cloud AWS SNS integration (e.g. spring-cloud-aws-starter-sns) and configure the target SNS topic’s ARN in application properties (so we can inject it via @Value). The bridge service uses a @KafkaListener to consume messages from the Kafka topic and then publishes them to the SNS topic: Java @Component public class KafkaToSNSBridge { @Autowired private NotificationMessagingTemplate snsTemplate; @Value("${aws.sns.topic-arn}") private String topicArn; @KafkaListener(topics = "events-topic", groupId = "bridge-group") public void forwardEvent(String eventJson) { // forward Kafka event to SNS snsTemplate.convertAndSend(topicArn, eventJson); } } With Spring Cloud AWS, the NotificationMessagingTemplate (or SnsTemplate) simplifies publishing to SNS. 
The bridge listens on the events-topic Kafka topic and sends each message to the configured SNS topic ARN. We assume AWS credentials and region are set (via Spring Cloud AWS properties), so this code will authenticate and publish to SNS. In practice, you might filter or transform events here, only forwarding certain types to SNS. This Kafka consumer acts as a bridge that pushes important events into AWS services for external notifications.

SQS Consumer Service

Finally, a consumer service will receive the SNS-forwarded events from an SQS queue. Add the Spring Cloud AWS SQS integration (spring-cloud-aws-starter-sqs), and ensure an SQS queue is subscribed to the SNS topic (with raw message delivery enabled so the queue receives the JSON payload directly). Here's a component that listens for messages on the queue:

Java @Component public class SqsEventListener { @SqsListener("${aws.sqs.queue}") public void handleEvent(String eventJson) { // process event (currently just log it) System.out.println("Processing event: " + eventJson); // ... perform business logic ... } }

When a message arrives in the queue, Spring Cloud AWS automatically invokes this listener with the payload. The JSON can be deserialized into a POJO if the method signature uses a custom type and Jackson is configured. In this example, we simply log the event. Note that the event flowed from the original Kafka producer through SNS into this SQS consumer, without the producer or final consumer needing direct knowledge of each other. This decoupling allows each component to scale and evolve independently.

Production Considerations

To make this integration production-ready, consider these best practices:

- Error Handling & Retries: Implement retry logic in Kafka consumers to handle transient failures. Leverage Kafka dead-letter topics or SQS dead-letter queues for messages that repeatedly fail processing.
- Message Idempotency: Events might be delivered more than once (e.g. Kafka at-least-once semantics or SQS redelivery). Design consumers to handle duplicates safely (using unique IDs or de-duplication).
- Monitoring & Tracing: Combine CloudWatch metrics with Kafka logs in one dashboard for unified monitoring of throughput and errors; include correlation IDs in messages to trace events end-to-end.
- Security: Enforce secure access in production. Use IAM roles for AWS credentials (instead of static keys), and restrict Kafka topic access to authorized services.
- Managed Services: Consider managed solutions to reduce ops overhead – e.g. run Kafka on Amazon MSK, or use AWS Lambda / Kafka Connect to bridge Kafka with SQS/SNS without custom code.
- Ordering Guarantees: If message order is critical, use FIFO SNS topics and SQS queues with message group IDs to preserve ordering. Standard SQS queues do not guarantee order.

By following these practices, you can build a resilient, production-ready event pipeline that integrates Kafka with AWS's messaging ecosystem. In summary, the combined Kafka-SNS-SQS stack forms a powerful backbone for scalable, event-driven architectures, uniting Kafka's streaming capabilities with the reliability of SNS and SQS. Thanks to Spring Boot's integration support, much of this wiring is handled for you, requiring minimal boilerplate and allowing you to focus on business logic while the system reliably delivers events end-to-end.
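To make the idempotency practice concrete, here is a minimal, hypothetical de-duplicating handler (not part of the pipeline above); in production the seen-ID set would live in a shared store such as Redis or DynamoDB rather than in memory:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch: skip events whose unique ID was already processed. */
public class IdempotentHandler {
    // In-memory for illustration; use a shared store (Redis, DynamoDB) in production.
    private final Set<String> seenMessageIds = ConcurrentHashMap.newKeySet();
    private final List<String> processedPayloads = new ArrayList<>();

    /** Returns true if the event was processed, false if skipped as a duplicate. */
    public boolean handle(String messageId, String payload) {
        if (!seenMessageIds.add(messageId)) {
            return false; // duplicate delivery (at-least-once semantics)
        }
        processedPayloads.add(payload); // stand-in for real business logic
        return true;
    }

    public int processedCount() {
        return processedPayloads.size();
    }

    public static void main(String[] args) {
        IdempotentHandler handler = new IdempotentHandler();
        System.out.println(handler.handle("evt-1", "{\"order\":42}")); // true
        System.out.println(handler.handle("evt-1", "{\"order\":42}")); // false (duplicate)
    }
}
```

The unique ID would typically be a message attribute set by the producer, so it survives the Kafka → SNS → SQS hops unchanged.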

By Mallikharjuna Manepalli
Performance Optimization Techniques in Flutter 3.41 for Mobile App Development

Even in 2026, Flutter continues to be the top framework for building high-performance, visually rich, cross-platform mobile apps (iOS, Android, and Web) from a single codebase. The framework already provides strong performance thanks to its custom rendering engine and widget-based architecture. Flutter 3.41 continues improving the framework's efficiency, rendering pipeline, and developer tooling. But even with these improvements, developers still need to follow certain best practices to ensure that their applications remain responsive and efficient on real devices. In this article, I summarize several practical performance optimization techniques that mobile app developers can use as a reference.

Understand How Flutter Renders UI

Before diving into optimizations, let's understand how Flutter builds the UI. Flutter uses a widget–element–render object architecture. Every time the state changes, Flutter rebuilds the relevant widgets and updates the rendering tree. The framework is designed to rebuild widgets frequently, but unnecessary rebuilds can still affect performance when large widget trees are involved. The key idea is simple: rebuild only what is necessary.

1. Use Const Constructors Wherever Possible

One of the easiest performance wins in Flutter is using const constructors for widgets that do not change. When a widget is declared as const, Flutter can reuse the existing instance instead of rebuilding it during UI updates.

Example

Without const: Plain Text class MyHomePage extends StatelessWidget { @override Widget build(BuildContext context) { return Column( children: [ Text("Welcome"), Icon(Icons.home), ], ); } } Optimized version: Plain Text class MyHomePage extends StatelessWidget { @override Widget build(BuildContext context) { return const Column( children: [ Text("Welcome"), Icon(Icons.home), ], ); } } In larger UI trees, this small change can significantly reduce unnecessary widget rebuilds. 2.
Avoid Rebuilding Entire Widgets Sometimes developers accidentally rebuild an entire screen when only a small part of the UI needs to change. A better approach is to isolate the changing portion into smaller widgets. Example Instead of rebuilding the entire widget: Plain Text setState(() { counter++; }); Break the UI into smaller components: Plain Text class CounterWidget extends StatelessWidget { final int counter; const CounterWidget({required this.counter}); @override Widget build(BuildContext context) { return Text("Counter: $counter"); } } Now only the counter widget rebuilds, not the entire page. This technique becomes very important when building complex mobile UI layouts. 3. Use ListView.builder for Large Lists Displaying large datasets is common in mobile applications like chat apps, product lists, or feeds. Using a regular ListView loads every item at once, which increases memory usage and slows down rendering. Instead, use lazy loading with ListView.builder. Example Plain Text ListView.builder( itemCount: 1000, itemBuilder: (context, index) { return ListTile( title: Text("Item $index"), ); }, ); ListView.builder creates widgets only when they are visible on screen. This drastically improves scrolling performance in large lists. 4. Use RepaintBoundary for Complex Widgets Sometimes a portion of the UI contains expensive drawing operations, such as animations or charts. When Flutter rebuilds the UI, the entire screen may repaint unnecessarily. Wrapping expensive widgets with RepaintBoundary prevents unnecessary redraws. Example Plain Text RepaintBoundary( child: CustomPaint( painter: ChartPainter(), ), ) This tells Flutter to isolate the rendering of that widget so it doesn’t trigger repaints across the entire screen. 5. Optimize Image Loading Images are one of the most common sources of performance issues in mobile applications. Large images consume memory and slow down rendering. 
Best practices include compressing images, using appropriate resolutions, and caching network images. Example using cached_network_image: Plain Text CachedNetworkImage( imageUrl: "https://example.com/image.jpg", placeholder: (context, url) => CircularProgressIndicator(), errorWidget: (context, url, error) => Icon(Icons.error), ) Caching prevents repeated downloads and improves scrolling performance in image-heavy applications. 6. Avoid Heavy Work on the Main Thread Flutter runs UI rendering on the main thread. If you perform expensive operations such as JSON parsing or large computations, the UI can freeze. Use isolates to move heavy work off the UI thread. Example Plain Text Future<int> heavyCalculation(int value) async { return await compute(calculate, value); } int calculate(int value) { return value * value; } This ensures that expensive computations do not block UI rendering. 7. Use Flutter DevTools for Performance Profiling Flutter provides powerful tools for analyzing performance issues. Flutter DevTools helps developers identify slow rendering frames, excessive widget rebuilds, memory leaks, and layout issues. To launch DevTools: Plain Text flutter run --profile Then open DevTools from the browser to inspect performance metrics. Profiling your application regularly helps detect performance problems early. 8. Minimize Overuse of State Management Updates State management solutions like Provider, Riverpod, Bloc, or GetX are commonly used in Flutter apps. However, poorly structured state updates can trigger unnecessary rebuilds. For example, updating global state too frequently may cause large portions of the UI to rebuild. Instead, keep the state localized, update only the required widgets, and use selectors or granular listeners. This improves rendering efficiency and keeps UI updates predictable.
Final Thoughts

Flutter already provides excellent performance out of the box, but building a high-performance mobile application still requires careful attention to how the UI is structured and updated. In Flutter 3.41, improvements in the rendering engine and developer tooling make it easier to diagnose performance issues, but the fundamentals remain the same: minimize unnecessary rebuilds, reduce heavy work on the UI thread, and structure widgets efficiently. Small optimizations like using const widgets, lazy loading lists, isolating expensive repaints, and profiling with DevTools can make a significant difference in real-world mobile applications. Ultimately, performance optimization is not about premature tuning. It's about understanding how the framework works and making thoughtful design choices that keep your application efficient as it evolves.

By Muhammed Harris Kodavath
Designing a Production-Grade Multi-Agent LLM Architecture for Structured Data Extraction

Problem statement: Many enterprise systems rely on large volumes of documents that are similar in purpose but inconsistent in structure. For example, in the field of Medicare insurance, different carriers, vendors, or partners publish documents describing comparable offerings, but each uses its own format, terminology, layouts, unstructured conditional clauses, etc. Many of these documents also contain tables, often with different structures, sometimes within the same page. Another problem to call out is year-over-year and location-to-location variation, such as by state, county, and ZIP code. As a result, critical data is trapped in these documents, which require extensive manual review. Traditional rule-based parsers break whenever formatting shifts and require extensive code deployments, tests, and releases. Regex-based approaches fail under real-world conditions and need constant maintenance. In fact, at scale, even a single-prompt (per document) LLM extraction fails; it works only for proofs of concept or demos. This is where a multi-agent LLM architecture becomes necessary.

Why Single-Agent LLM Extraction Fails

A single agent would probably work for 5 documents, but it won't work for 5,000. Some of the common failures I've observed:

- Hallucinated values for missing fields
- Context window limits causing incorrect outcomes
- No reliable confidence signal
- Automatic assumptions in outcomes from previous memory

In high-impact production environments, silent errors can be worse than explicit failures. Because the outcomes are probabilistic, there is also a need for deterministic guardrails. The solution is not just refined prompt engineering, but profound architectural decomposition: core software engineering tightly coupled with AI engineering.

System Architecture Overview

The system decomposes responsibilities into independent agents.
PDF → Preprocessing → Extraction Agent → Validation Layer → Judge Agent → Structured Output

Design Principles

- Separation of concerns
- Strict schema contracts
- Deterministic QA before acceptance
- Confidence scoring and judging
- Human-in-the-loop feedback for low-confidence judgments
- Observability metrics

Each agent has a detailed and defined scope, thus increasing reliability.

Deployment Context and Infrastructure

The architecture is deployed as a set of lightweight services rather than a single monolithic script. Each stage in the pipeline runs independently and communicates through structured JSON messages. This allows the system to scale horizontally when document volume increases. In a production environment, the pipeline would run on:

- A containerized backend (Docker-based deployment)
- A queue-based processing system (for asynchronous processing)
- A storage layer for processing and versioning
- A structured output store — database

This setup allows multiple documents to be processed simultaneously without blocking the system. If the extraction agent fails on one document, it does not interrupt the processing of others.

Extraction Agent

The extraction agent's sole responsibility is to convert document chunks into structured JSON that adheres to the predefined schema. Key design decisions include:

- Low temperature
- Explicit JSON schema enforcement
- Chunk-level semantic segmentation
- Carrier/partner-agnostic prompt design

Chunking is important here, as a fixed-length token system breaks logical sections. Instead, variable chunking via semantic segmentation improves accuracy. The final RAG-based system is designed to be dynamic, allowing the extractor to look at the top few chunks as needed.
Python class ExtractionAgent: def __init__(self, llm_client, prompt_template, schema: dict): self.llm = llm_client self.prompts = prompt_template self.schema = schema def run(self, chunks: list[str]) -> dict: prompt = self.prompts.build_extraction_prompt(chunks) response = self.llm.generate( prompt=prompt, temperature=0.0, response_schema=self.schema ) return response The output contract is strict and is sent for further validation. No validation is performed by the Extraction Agent. Validation Agent We divided validation into two parts, a hybrid approach. Deterministic validation: This enforces JSON Schema integrity, required vs. optional fields, basic QA checks such as data types, range checks, NULLs, etc. All of these are to ensure structural correctness, which is most often tightly coupled with end-user UI.Contextual LLM validation: A second LLM pass compares the extracted output with the original documented text. Its role is primarily to detect mismatches between extracted values and the source. It would identify, flag, and correct hallucinated and missing entries. 
Python class ValidationAgent: def __init__(self, llm_client, prompt_template): self.llm = llm_client self.prompts = prompt_template def validate(self, extracted: dict, source_chunks: list[str]) -> dict: deterministic = self._deterministic_checks(extracted) contextual_prompt = self.prompts.build_validation_prompt( extracted, source_chunks ) contextual = self.llm.generate( prompt=contextual_prompt, temperature=0.0 ) return { "deterministic": deterministic, "contextual": contextual } def _deterministic_checks(self, extracted: dict) -> dict: errors = [] if "plan_name" not in extracted: errors.append("Missing required field: plan_name") for item in extracted.get("benefits", []): if isinstance(item.get("copay"), (int, float)) and item["copay"] < 0: errors.append("Invalid negative copay detected") return { "valid": len(errors) == 0, "errors": errors } Judge Agent Even after validation, the system needs a decision layer. Here comes the Judge Agent, which receives the extracted output, validation results, and findings, and produces a confidence score, classifies the error, and performs a final decision. During judging, confidence thresholds are bucketed and calibrated against historical datasets that are already processed. This helps transform the output into outcomes that can be tracked and improved operationally. An additional context here — it is important to build the extraction agent and validation agent with two different state-of-the-art LLMs. The judging LLM would also be a different model from the ones used for Extraction and Validation. 
Python class JudgeAgent: def __init__(self, llm_client, prompt_template, threshold: float = 0.85): self.llm = llm_client self.prompts = prompt_template self.threshold = threshold def evaluate(self, validation_result: dict) -> dict: prompt = self.prompts.build_judge_prompt(validation_result) judgment = self.llm.generate( prompt=prompt, temperature=0.0 ) confidence = judgment.get("confidence_score", 0.0) if confidence >= self.threshold: status = "PASS" elif confidence >= 0.6: status = "REVIEW" else: status = "FAIL" return { "confidence": confidence, "status": status, "details": judgment } Prompt Engineering and Variability Handling In a production-level GenAI system, prompt engineering must be treated like software development, with prompts version-controlled, reusable, and benchmarked against golden historical datasets. As document formats evolve, there can be a degradation in the accuracy of prompts unless they are continuously evaluated. Build a strong, generic base prompt and include custom prompts for specific carriers derived from historical datasets. Python class PromptTemplate: def __init__(self, version: str): self.version = version def build_extraction_prompt(self, chunks: list[str]) -> str: return f""" You are an expert structured data extraction system. TASK: Extract all relevant fields according to the JSON schema. Do NOT infer missing values. Preserve numeric fidelity. Include conditional clauses exactly as written. DOCUMENT CONTENT: {self._format_chunks(chunks)} Return ONLY valid JSON. Prompt-Version: {self.version} """ def build_validation_prompt(self, extracted: dict, source: list[str]) -> str: return f""" Compare the extracted structured output with the source document. Identify: - Hallucinated values - Missing fields - Numeric mismatches - Logical inconsistencies Extracted: {extracted} Source: {self._format_chunks(source)} Return validation findings in structured JSON. 
""" def build_judge_prompt(self, validation_result: dict) -> str: return f""" Based on the validation findings, assign: - Confidence score (0.0 - 1.0) - Error category (none/minor/major) - Final decision (PASS/REVIEW/FAIL) Validation Result: {validation_result} Return structured JSON only. """ def _format_chunks(self, chunks: list[str]) -> str: return "\n\n".join(chunks) Because document variability is unavoidable, architectures must assume entropy rather than optimizing for ideal inputs. Adaptive chunking, partner-aware prompt conditioning, and nested logic all play a profound role. Plan Benefit Example: Inpatient Services Input excerpt from plan document: A section of an insurance document stating: "For Inpatient Hospital Services, a 20% Coinsurance applies after the annual deductible of $500 is met. This Coinsurance is capped at a maximum of $5,000 out-of-pocket per calendar year." Extraction Agent Output (Excerpt): JSON { "deductible_in_network": 500, "inpatient_coinsurance_rate": 0.20, "inpatient_service_type": "Hospital Services", "coinsurance_condition": "annual deductible of $500 is met", "max_out_of_pocket_inpatient": 5000 } Validation Agent result: Deterministic check: PASS (All required fields present. inpatient_coinsurance_rate is a float between 0.0 and 1.0. max_out_of_pocket_inpatient is a positive integer).Contextual check: PASS (The LLM confirms the $5,000 cap is correctly associated with the inpatient coinsurance, and the conditional trigger for the coinsurance is accurately captured). Judge Agent decision: Confidence: 0.95, Status: PASS. 
Observability and Cost in Production

Key metrics for the system include, but are not limited to:
• Extraction success rate
• Validation failure rate
• Average confidence score (over time)
• Token usage per document and document type

Monitoring the confidence distribution with any industry-standard open-source Python module is the secret sauce that surfaces prompt drift or regression, completely new and unexpected document structures, and errors. Human-in-the-loop feedback also needs to be accounted for; a simple UI with actions such as "ignore" or "fix" goes a long way toward usability.

For a cost optimization strategy in production, some foundational practices must be implemented:
• Batch processing with token observability
• Single-pass LLM calls for consistently defined structures
• Parallelization (easily achievable with reusable prompts and LLM REST APIs)

Overall Architecture

Conclusion

LLMs are powerful extraction tools, but without structure, they can produce unstable or unexpected outcomes. By decomposing responsibilities into extraction, validation, and judgment agents, and combining them with traditional techniques such as schema contracts and confidence scoring, it becomes possible to transform "similar but varying" semi-structured documents from multiple inconsistent sources into reliable, structured data at scale. The difference between a proof of concept and a production AI-based system is not the model but the architecture around it.
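As a minimal illustration of confidence-distribution monitoring, one could compare a recent window of judge scores against a baseline window. The window contents and the 0.1 drop threshold are invented for this sketch, not values from the system described here.

```python
from statistics import mean

# Illustrative drift check: compare recent confidence scores against a
# baseline window. The 0.1 threshold is an arbitrary choice for this
# sketch; production systems would alert on full-distribution shifts.

def confidence_drift(baseline: list[float], recent: list[float],
                     max_drop: float = 0.1) -> dict:
    baseline_mean = mean(baseline)
    recent_mean = mean(recent)
    drop = baseline_mean - recent_mean
    return {
        "baseline_mean": round(baseline_mean, 3),
        "recent_mean": round(recent_mean, 3),
        "drift_detected": drop > max_drop,
    }
```

A sudden drop in mean confidence is often the first visible symptom of a new document layout or a prompt regression.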

By Haricharan Shivram Suresh Chandra Kumar
Bucket4j + Infinispan: A Deep Dive Into Implementation

In distributed systems, the biggest challenge for rate limiting is state. How do you ensure that two parallel requests hitting different cluster nodes don't "double-spend" the same token? In this article, we dive into the implementation details of the integration between the Bucket4j rate-limiting framework and Embedded Infinispan (not HotRod). This setup creates a data grid across different pods of a single application, allowing for seamless, distributed token management.

Note: This guide is based on Bucket4j 8.16.1, Infinispan 16.1.0, and infinispan-protostream 6.0.4. While the logic should hold for earlier versions, behavior in Infinispan < 10 may require additional verification. To keep this guide focused and readable, I have omitted some of the more granular implementation details.

Main Actors

The Embedded Infinispan Layer: Functional Map API

Infinispan is a high-performance key-value store designed for low latency. For Bucket4j, the most critical feature is the Functional Map API.

Java

@Experimental
interface ReadWriteMap<K, V> extends FunctionalMap<K, V> {
    ...
}

Unlike a standard cache.put(), the Functional Map allows us to execute a lambda (an Entry Processor) directly on the node that owns the data, with a CAS guarantee. This approach offers three major advantages via ReadWriteMap.eval(key, entryProcessor):
• Atomicity: Locks are acquired before the lambda executes.
• Data locality: The function travels to the data, minimizing network traffic.
• Non-blocking: It returns a CompletableFuture, fitting perfectly into modern asynchronous architectures.

A detailed walkthrough of the Infinispan cache internals would be too verbose, so we skip it for clarity of the flow.

The Bucket4j Layer: Abstraction and Proxies

Bucket

A Bucket is a stateful object that maintains a current balance of tokens and a set of rules (bandwidths) for how those tokens are consumed and refilled over time.
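The bucket semantics just described (a capacity-bound balance of tokens, greedily refilled as time passes) can be sketched language-agnostically. The following minimal Python model is only an illustration of the algorithm, not Bucket4j's implementation:

```python
# Minimal token-bucket model: capacity-bound balance, greedy refill
# proportional to elapsed time. Illustrative only; Bucket4j's real
# implementation handles multiple bandwidths, overflow, and nanotime.

class TokenBucket:
    def __init__(self, capacity: int, refill_per_sec: float):
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self.tokens = float(capacity)
        self.last_refill = 0.0  # seconds; passed in explicitly for determinism

    def try_consume(self, num_tokens: int, now: float) -> bool:
        # Greedy refill: add tokens for the elapsed interval, capped at capacity.
        elapsed = now - self.last_refill
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_sec)
        self.last_refill = now
        if self.tokens >= num_tokens:
            self.tokens -= num_tokens
            return True
        return False
```

Bucket4j configures these same semantics (capacity, greedy refill rate) through its Builder API, which the article turns to next.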
It uses a Builder pattern to configure not just the bucket's behavior, but also its execution model (how and where the logic is processed; we look at that in more detail later). It has a sibling, AsyncBucketProxy, which provides methods returning CompletableFuture objects. For the remainder of this article, when I refer to a "bucket," I am specifically referring to the AsyncBucketProxy. For simplicity, we could pretend it has the following method:

Java

public interface AsyncBucketProxy {
    CompletableFuture<Boolean> tryConsume(long numTokens);
}

Of course, you could use the abstract implementation and initialize the object directly through a constructor. However, you have a better option: use the built-in builders, and your flow would look like this:

Java

Bucket.builder()
    .addLimit(limit -> limit.capacity(50).refillGreedy(10, Duration.ofSeconds(1)))
    .build();

// or

BucketConfiguration configuration = BucketConfiguration.builder()
    .addLimit(limit -> limit.capacity(50).refillGreedy(10, Duration.ofSeconds(1)))
    .build();

proxyManager
    .builder()
    .build("SYSTEM", () -> CompletableFuture.completedFuture(configuration));

Pretty neat, right? While Bucket.builder() is responsible for building local buckets, the ProxyManager handles distributed buckets, and that is where things get interesting.

Proxy Manager

The ProxyManager interface (and its base implementation AbstractProxyManager) is the backbone of Bucket4j's distributed logic. It unifies the flow of building bucket behavior and delegates the execution to the specific implementations. To make this work, Bucket4j internally uses the RemoteCommand and Request interfaces.

Java

public interface RemoteCommand<T> {
    CommandResult<T> execute(MutableBucketEntry mutableEntry, long currentTimeNanos);
}

Java

public class Request<T> implements ComparableByContent<Request<T>> {
    // ... omitted for clarity
    private final RemoteCommand<T> command;

    public Request(RemoteCommand<T> command /* ... omitted for clarity */) {
        this.command = command;
    }
    // ... omitted for clarity
}

With the RemoteCommand interface, we can wrap and execute any operation on data on the remote server. But we still need an actor that executes this command: CommandExecutor (AsyncCommandExecutor for the async bucket). AbstractProxyManager creates this object internally and enriches the bucket with the implementation. Look at the example below.

Java

@Override
public AsyncBucketProxy build(K key, Supplier<CompletableFuture<BucketConfiguration>> configurationSupplier) {
    if (configurationSupplier == null) {
        throw BucketExceptions.nullConfigurationSupplier();
    }
    AsyncCommandExecutor commandExecutor = new AsyncCommandExecutor() {
        @Override
        public <T> CompletableFuture<CommandResult<T>> executeAsync(RemoteCommand<T> command) {
            ExpirationAfterWriteStrategy expirationStrategy =
                clientSideConfig.getExpirationAfterWriteStrategy().orElse(null);
            Request<T> request = new Request<>(command, getBackwardCompatibilityVersion(),
                getClientSideTime(), expirationStrategy);
            // Pay attention!
            Supplier<CompletableFuture<CommandResult<T>>> futureSupplier =
                () -> AbstractProxyManager.this.executeAsync(key, request);
            return clientSideConfig.getExecutionStrategy().executeAsync(futureSupplier);
        }
    };
    commandExecutor = asyncRequestOptimizer.apply(commandExecutor);
    return new DefaultAsyncBucketProxy(commandExecutor, recoveryStrategy, configurationSupplier,
        implicitConfigurationReplacement, listener);
}

The secret sauce: by delegating executeAsync to the ProxyManager, Bucket4j separates the rate-limiting logic from the underlying storage technology. This is why the same library can support Redis, Postgres, or Infinispan just by switching the manager.

Code:

Supplier<CompletableFuture<CommandResult<T>>> futureSupplier =
    () -> AbstractProxyManager.this.executeAsync(key, request);

With this knowledge, we can jump into the InfinispanProxyManager to look at the details.
Infinispan Proxy Manager

The first thing to note in the documentation is that Bucket4j requires specific serialization for Infinispan. This is crucial because Infinispan operates on byte streams. (Below is part of the documentation.)

Java

import io.github.bucket4j.grid.infinispan.serialization.Bucket4jProtobufContextInitializer;
import org.infinispan.configuration.global.GlobalConfigurationBuilder;
...
GlobalConfigurationBuilder builder = new GlobalConfigurationBuilder();
builder.serialization().addContextInitializer(new Bucket4jProtobufContextInitializer());

However, our focus should be on the implementation of execution. Let's have a look:

Java

@Override
public <T> CompletableFuture<CommandResult<T>> executeAsync(K key, Request<T> request) {
    try {
        InfinispanProcessor<K, T> entryProcessor = new InfinispanProcessor<>(request);
        CompletableFuture<byte[]> resultFuture = readWriteMap.eval(key, entryProcessor);
        return resultFuture.thenApply(resultBytes ->
            deserializeResult(resultBytes, request.getBackwardCompatibilityVersion()));
    } catch (Throwable t) {
        CompletableFuture<CommandResult<T>> fail = new CompletableFuture<>();
        fail.completeExceptionally(t);
        return fail;
    }
}

@Override
public <T> CommandResult<T> execute(K key, Request<T> request) {
    // sync copy of executeAsync
}

And here we see a special InfinispanProcessor<K, T>. What secrets could we find inside?
Java

import io.github.bucket4j.distributed.remote.AbstractBinaryTransaction;
import io.github.bucket4j.distributed.remote.RemoteBucketState;
import io.github.bucket4j.distributed.remote.Request;
import io.github.bucket4j.distributed.serialization.InternalSerializationHelper;
import io.github.bucket4j.util.ComparableByContent;
import org.infinispan.functional.EntryView;
import org.infinispan.functional.MetaParam;
import org.infinispan.util.function.SerializableFunction;

public class InfinispanProcessor<K, R> implements
        SerializableFunction<EntryView.ReadWriteEntryView<K, byte[]>, byte[]>,
        ComparableByContent<InfinispanProcessor> {

    public InfinispanProcessor(Request<R> request) {
        this.requestBytes = InternalSerializationHelper.serializeRequest(request);
    }

    // ... omitted for clarity

    public byte[] apply(EntryView.ReadWriteEntryView<K, byte[]> entry) {
        if (requestBytes.length == 0) { // it is the marker to remove bucket state
            if (entry.find().isPresent()) {
                entry.remove();
                return new byte[0];
            }
        }
        return new AbstractBinaryTransaction(requestBytes) {
            // ... omitted for clarity
            @Override
            protected void setRawState(byte[] newStateBytes, RemoteBucketState newState) {
                ExpirationAfterWriteStrategy expirationStrategy = getExpirationStrategy();
                long ttlMillis = expirationStrategy == null
                    ? -1
                    : expirationStrategy.calculateTimeToLiveMillis(newState, getCurrentTimeNanos());
                if (ttlMillis > 0) {
                    entry.set(newStateBytes, new MetaParam.MetaLifespan(ttlMillis));
                } else {
                    entry.set(newStateBytes);
                }
            }
        }.execute();
    }
}

And here is the trick that integrates Infinispan and Bucket4j: SerializableFunction<EntryView.ReadWriteEntryView<K, byte[]>, byte[]>. This is a special interface that allows the system to ship a function to a different node and execute arbitrary code there. Infinispan accepts this serializable function and executes it on the remote node where the data actually resides.

Crucial requirement: The serialized bytecode must be present on both the sender and the receiver nodes.
If your pods are running different versions of the application, the InfinispanProcessor will fail to deserialize on the owner node.

There is still uncertainty about what happens inside AbstractBinaryTransaction.execute(). Let's dive into the code.

Java

public byte[] execute() {
    // ... logic to deserialize request ...
    try {
        RemoteBucketState currentState = null;
        if (exists()) {
            byte[] stateBytes = getRawState(); // get state on the node
            currentState = deserializeState(stateBytes);
        }
        MutableBucketEntry entryWrapper = new MutableBucketEntry(currentState);
        currentTimeNanos = request.getClientSideTime() != null
            ? request.getClientSideTime()
            : System.currentTimeMillis() * 1_000_000;
        RemoteCommand<?> command = request.getCommand();
        CommandResult<?> result = command.execute(entryWrapper, currentTimeNanos);
        if (entryWrapper.isStateModified()) {
            RemoteBucketState newState = entryWrapper.get();
            setRawState(serializeState(newState, backwardCompatibilityVersion), newState);
        }
        return serializeResult(result, request.getBackwardCompatibilityVersion());
    }
    // omitted for clarity
}

This is the execution logic that runs on the remote server and computes the answer: does the bucket hold enough tokens for the request to proceed, or should the request be rejected and retried?

Execution Flow

The main parts of the algorithm have now been introduced, and we are ready to combine them into a final view of how we answer the question: can we consume tokens or not? As discussed, a lot of the magic hides in the building phase, where the ProxyManager wraps the CommandExecutor and Requests; unwrapping this envelope shows what happens in the integration.

*Key: Infinispan uses the Consistent Hashing technique internally.

Summary

Integrating Bucket4j with Embedded Infinispan offers a sophisticated solution for distributed rate limiting by moving logic to the data.
• Data locality: By using readWriteMap.eval(), the rate-limiting decision is executed directly on the node that owns the bucket's state, minimizing network hops.
• Atomic consistency: Infinispan ensures that the InfinispanProcessor runs with strict atomicity and CAS guarantees, solving the "double-spend" problem without heavy distributed locks.
• Performance: Operating at the byte[] level ensures that state transitions are extremely fast and the memory footprint remains small.

For production, ensure cluster homogeneity: your Bucket4j versions and application bytecode must be identical across all pods to avoid serialization errors. This setup allows you to build a self-contained, high-performance rate-limiting grid that scales horizontally with your application.
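To make the entry-processor idea concrete outside of Java, here is a hypothetical sketch of the pattern: a function is applied atomically against the stored state, and the new state is written back only if the command modified it. All names here are invented for illustration; this is neither Bucket4j nor Infinispan code, and a real grid ships the function across nodes rather than holding one process-local lock.

```python
# Hypothetical entry-processor pattern: the store applies a function to
# the value under a lock, mirroring the shape of ReadWriteMap.eval().
import threading

class EntryProcessorStore:
    def __init__(self):
        self._data = {}
        self._lock = threading.Lock()  # stands in for per-key ownership locks

    def eval(self, key, processor):
        """Atomically apply `processor(old_state) -> (result, new_state)`."""
        with self._lock:
            result, new_state = processor(self._data.get(key))
            if new_state is not None:
                self._data[key] = new_state
            return result

def try_consume(num_tokens):
    # The "command" executed on the node that owns the data.
    def processor(state):
        tokens = state if state is not None else 50  # initial capacity
        if tokens >= num_tokens:
            return True, tokens - num_tokens  # consumed: write back new state
        return False, state                   # rejected: state unchanged
    return processor
```

Because the check and the write happen inside one atomic `eval`, two concurrent callers can never both spend the same tokens, which is the essence of the "double-spend" guarantee described above.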

By Arkadii Osheev
Generate Random Test Data in PostgreSQL

When developing and testing applications that use a PostgreSQL database, it's often helpful to populate your tables with random data. Whether you're testing queries, performance, or database functionality, having a set of test data can help ensure your application performs as expected. In this guide, we'll walk through how to create an anonymous PL/pgSQL block that generates random data and inserts it into a PostgreSQL table. The data will include various types such as integers, strings, dates, booleans, and UUIDs.

Why Use Random Data?

Random data is crucial in testing because it helps simulate real-world scenarios. For example:
• Stress testing: Populate your tables with a large amount of data to see how your system performs under load.
• Edge case testing: Generate random values that might help uncover issues with validation or boundaries.
• Non-deterministic testing: Ensure your application works correctly regardless of the specific data used.

The PostgreSQL Code: Generating Random Data

The following steps outline how to write a PL/pgSQL block that generates and inserts random data into a PostgreSQL table:

1. Set Up Your PostgreSQL Table

First, make sure you have a table that you want to populate with random data. Here's an example of a simple table:

SQL

CREATE TABLE IF NOT EXISTS test_schema.test_tab2 (
    id BIGINT NOT NULL,
    fname VARCHAR(50),
    lname VARCHAR(50),
    create_date DATE,
    status BOOLEAN,
    CONSTRAINT test_tab2_pkey PRIMARY KEY (id)
);

This table includes:
• An id (bigint)
• An fname (string)
• An lname (string)
• A create_date (date)
• A status (boolean)

2. Generate Random Data With PL/pgSQL

Now, we can write a PL/pgSQL anonymous block that generates random data and inserts it into the table. This script will:
• Randomly generate values for each column based on the data type.
• Insert a specified number of rows (in this case, 10).
• Print the generated SQL statements for debugging and visibility.
Here’s the code:

SQL

DO $$
DECLARE
    rec_count INTEGER := 10; -- Limit to 10 records for testing
    col RECORD;
    col_list TEXT := '';
    val_list TEXT := '';
    sql_stmt TEXT;
    i INTEGER;
    tbl_schema TEXT := 'test_schema';
    tbl_name TEXT := 'test_tab2';
    random_date DATE;
    random_status BOOLEAN;
BEGIN
    -- Construct column names for insert statement
    FOR col IN
        SELECT column_name, data_type
        FROM information_schema.columns
        WHERE table_schema = tbl_schema
          AND table_name = tbl_name
        ORDER BY ordinal_position
    LOOP
        col_list := col_list || col.column_name || ', ';
    END LOOP;

    -- Trim trailing comma from column list
    col_list := left(col_list, length(col_list) - 2);

    -- Loop to insert rows
    FOR i IN 1..rec_count LOOP
        -- Initialize val_list for each row
        val_list := '';

        -- Loop through each column to generate a value matching its data type
        FOR col IN
            SELECT column_name, data_type
            FROM information_schema.columns
            WHERE table_schema = tbl_schema
              AND table_name = tbl_name
            ORDER BY ordinal_position
        LOOP
            CASE col.data_type
                WHEN 'bigint' THEN
                    val_list := val_list || i || ', ';
                WHEN 'character varying' THEN
                    val_list := val_list || quote_literal(col.column_name || '_' || i) || ', ';
                WHEN 'text' THEN
                    val_list := val_list || quote_literal(col.column_name || '_' || i) || ', ';
                WHEN 'date' THEN
                    -- Generate a random date between 2000-01-01 and 2009-12-31
                    -- (3,653 days in that range, including leap days)
                    random_date := '2000-01-01'::date + trunc(random() * 3653)::int;
                    val_list := val_list || quote_literal(random_date) || ', ';
                WHEN 'boolean' THEN
                    -- Generate a boolean value (TRUE if even row, FALSE if odd)
                    random_status := (i % 2 = 0);
                    val_list := val_list || random_status || ', ';
                WHEN 'uuid' THEN
                    val_list := val_list || 'gen_random_uuid(), ';
                ELSE
                    val_list := val_list || 'NULL, ';
            END CASE;
        END LOOP;

        -- Trim trailing comma from val_list
        val_list := left(val_list, length(val_list) - 2);

        -- Prepare the SQL statement with dynamically generated values
        sql_stmt := format(
            'INSERT INTO %I.%I (%s) VALUES (%s);',
            tbl_schema, tbl_name, col_list, val_list
        );

        -- Print the SQL statement to the console
        RAISE NOTICE 'Executing: %', sql_stmt;

        -- Execute the SQL statement
        EXECUTE sql_stmt;

        -- Print confirmation of each inserted row
        RAISE NOTICE 'Inserted row % into %.%', i, tbl_schema, tbl_name;
    END LOOP;
END $$;

How This Code Works

• col_list: Dynamically collects the column names from the table schema.
• val_list: For each row, dynamically generates the values for each column based on its data type (e.g., integers, strings, dates, booleans).
• Random data generation:
  • Bigint: We use the row number (i) as a simple value for bigint columns.
  • Strings (fname, lname): We concatenate the column name with the row number (e.g., fname_1, lname_1).
  • Date: We generate a random date between 2000-01-01 and 2009-12-31 using the expression '2000-01-01'::date + trunc(random() * 3653)::int, where 3,653 is the number of days in that ten-year range.
  • Boolean: The status column is set to TRUE for even rows and FALSE for odd rows.
  • UUID: A random UUID is generated using gen_random_uuid().
• SQL statement execution: The script dynamically constructs an INSERT INTO statement and executes it for each row, inserting the data into the table.

3. Executing the Code

After writing the code, you can run it in your PostgreSQL environment. The script will print the SQL INSERT statements as it executes, so you can verify what is being inserted.

4. Verifying the Results

You can use a simple SELECT query to verify that the random data was inserted:

SQL

SELECT * FROM test_schema.test_tab2;

This will display all the records that were inserted with the random data.
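The same per-type generation rules can be mirrored in a quick Python sketch, which is handy for unit-testing the logic (especially the date-range arithmetic) outside the database. The helper below is hypothetical, not part of the article's script; it uses 3,653 days to cover 2000-01-01 through 2009-12-31 inclusive.

```python
# Pure-Python mirror of the per-type generation rules used in the
# PL/pgSQL block: row-number ids, name_N strings, a bounded random
# date, parity-based booleans, and a random UUID.
import random
import uuid
from datetime import date, timedelta

RANGE_DAYS = 3653  # days from 2000-01-01 through 2009-12-31 inclusive

def random_row(i: int) -> dict:
    return {
        "id": i,                                   # bigint: the row number
        "fname": f"fname_{i}",                     # varchar: column name + row number
        "lname": f"lname_{i}",
        "create_date": date(2000, 1, 1) + timedelta(days=random.randrange(RANGE_DAYS)),
        "status": i % 2 == 0,                      # boolean: even rows are TRUE
        "token": uuid.uuid4(),                     # uuid column, if present
    }
```

Checking the bounds here confirms that random.randrange(3653) can never produce a date past 2009-12-31, the same invariant the SQL expression must satisfy.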
Benefits of Using This Method

• Flexibility: The script can easily be modified to generate more rows or handle additional columns and data types.
• Dynamic data generation: The data is generated based on the schema of the table, so no manual input is needed.
• Realistic testing: By generating random values, you simulate a variety of real-world scenarios, making your tests more robust and reliable.

Conclusion

Generating random test data in PostgreSQL can be a powerful tool for developers and testers. Whether you’re building new features, performing load testing, or ensuring data integrity, using dynamic PL/pgSQL scripts to generate test data allows you to automate the process and focus on the logic of your application. By following this guide, you can easily populate any PostgreSQL table with random data and streamline your testing and development process.

By arvind toorpu DZone Core CORE
Unlocking Smart Meter Insights with Smart Datastream

The rollout of smart meters across the UK has fundamentally changed how energy data is generated and used. Millions of devices now capture consumption data at fine-grained intervals, offering a much clearer picture of how energy is used across households and businesses. This shift creates a real opportunity. With the right tools, organizations can move beyond basic reporting and start making informed decisions around efficiency, cost optimization, and sustainability. However, while the potential is clear, working with this data in practice is far from simple. This brings us to one of the core challenges organizations face today.

The Challenge of Smart Meter Data

Smart meters generate highly granular data, typically at half-hour intervals. At scale, this results in extremely large and continuously growing datasets. Although this data is valuable, organizations often encounter a familiar set of challenges:
• Integrating with complex smart meter infrastructure
• Meeting strict regulatory and security requirements
• Managing large-scale data ingestion and storage
• Handling both real-time and historical data streams
• Making the data usable within business applications

These challenges are widely recognized. Research into smart grid systems highlights how data volume, velocity, and interoperability remain major barriers to effective adoption and analytics. As a result, many organizations find themselves collecting large amounts of data without being able to fully utilize it. This is exactly the gap Smart Datastream is designed to address.

What Is Smart Datastream?

Smart Datastream is a platform designed to simplify how organizations access and use smart meter data. Instead of dealing with fragmented systems and raw infrastructure, teams can access structured, ready-to-use energy data through APIs and integrate it directly into their applications.
The platform provides:
• Up to 13 months of historical consumption data
• Half-hourly smart meter readings
• Near real-time energy data streams
• Portfolio-level insights across multiple sites

By exposing this data through APIs, Smart Datastream allows organizations to focus less on data collection and more on building meaningful solutions. To make this possible at scale, the platform relies on a modern and robust architecture.

Platform Architecture

Smart Datastream is built using a cloud-native architecture designed to handle continuous, high-volume data streams. At its core, the platform uses a microservices approach, where independent services are responsible for ingesting, processing, and exposing energy data. This ensures flexibility, scalability, and resilience as the system evolves. One of the key design choices is the use of event-driven processing.

Event-Driven Processing

Energy data flows through an event-driven pipeline, allowing the system to process updates in real time while maintaining reliability. This approach is widely used in modern data platforms because it enables systems to handle high throughput while keeping services loosely coupled.

Scalable Data Infrastructure

To support millions of data points, the platform relies on distributed storage and caching technologies. This ensures that large volumes of data can be processed efficiently without compromising performance or availability. As smart meter deployments continue to grow, scalability becomes not just an advantage, but a necessity. This naturally leads to another important aspect of the platform: secure and controlled access.

Secure API Access

Smart Datastream exposes its capabilities through secure APIs, allowing organizations to retrieve and analyze energy data in a controlled way. This is particularly important in the energy sector, where data privacy and regulatory compliance are critical.

Domain-Driven Design

To manage complexity, the platform follows domain-driven design principles.
This helps structure the system around real-world energy workflows, making it easier to maintain and extend over time. Together, these architectural decisions form the foundation of the platform. Building on this, the choice of technology stack ensures that the system remains performant and scalable.

Technology Stack

Smart Datastream is built using modern cloud-native technologies designed for reliability and performance. Core components include:
• .NET Core and C# for backend services
• Redis for caching and performance optimization
• Cloud messaging systems for event-driven communication
• Distributed databases for large-scale data storage
• Microservices architecture for independent service scaling

This combination allows the platform to process large volumes of energy data efficiently while maintaining low latency. With this technical foundation in place, the real value of the platform becomes clearer when looking at how it is used in practice.

Use Cases

Smart Datastream supports a wide range of practical applications, depending on how organizations choose to use their energy data.

Energy Consumption Monitoring

Organizations can gain a clear view of how energy is being used across their operations. By analyzing consumption patterns over time, it becomes easier to identify inefficiencies, reduce waste, and optimize overall energy usage.

Portfolio Energy Management

For organizations managing multiple sites or properties, Smart Datastream enables a consolidated view of energy consumption. This makes it possible to compare performance across locations, identify outliers, and establish benchmarks for improvement.

Sustainability and Carbon Reporting

Access to accurate consumption data is essential for tracking emissions and supporting sustainability initiatives. As organizations increasingly align with ESG targets and regulatory requirements, having reliable energy data becomes a key enabler for reporting and compliance.
Anomaly Detection

With the right analytics in place, unusual consumption patterns can be detected early. This can help identify issues such as equipment faults, energy leaks, or unexpected spikes in usage before they become larger problems. These use cases highlight how raw energy data can be transformed into actionable insights. This, in turn, leads to tangible benefits for organizations.

Benefits for Organizations

Smart Datastream is designed to make working with energy data more accessible and practical at scale.

Simplified Data Access

Instead of dealing with complex integrations and infrastructure, organizations can access structured energy data through a consistent set of APIs. This significantly reduces the effort required to get started.

Scalable Infrastructure

The platform is built to handle large volumes of data from millions of devices, making it suitable for enterprise-level deployments without requiring additional custom infrastructure.

Faster Innovation

With data readily available and easy to integrate, teams can focus on building solutions rather than managing data pipelines. This shortens development cycles and accelerates the delivery of new features and services.

Improved Decision Making

Having access to detailed, near real-time energy data allows organizations to make more informed decisions. Whether at an operational or strategic level, better visibility leads to better outcomes. Taken together, these benefits demonstrate how Smart Datastream moves organizations from simply collecting data to actually using it effectively.

Conclusion

Smart meters are generating more data than ever before, but data alone does not create value. The real challenge lies in making that data accessible, usable, and actionable within real-world systems. Smart Datastream addresses this by providing a scalable and secure platform that bridges the gap between raw energy data and practical applications.
By combining modern architecture, event-driven processing, and API-first design, it enables organizations to unlock insights and build smarter energy solutions. As the energy landscape continues to evolve, platforms like Smart Datastream will play a critical role in helping organizations move toward more efficient, data-driven, and sustainable operations.
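To ground the anomaly detection use case mentioned earlier, here is a simple z-score check over half-hourly consumption readings. This is an illustrative sketch only; the threshold value and data shape are assumptions, and Smart Datastream's actual analytics are not described in this article.

```python
from statistics import mean, stdev

# Illustrative spike detection for half-hourly consumption readings
# using a z-score threshold. The threshold (3.0) is an assumption for
# this sketch, not a detail of the Smart Datastream platform.

def flag_anomalies(readings_kwh: list[float], z_threshold: float = 3.0) -> list[int]:
    """Return indices of readings whose z-score exceeds the threshold."""
    mu = mean(readings_kwh)
    sigma = stdev(readings_kwh)
    if sigma == 0:
        return []
    return [i for i, r in enumerate(readings_kwh)
            if abs(r - mu) / sigma > z_threshold]
```

Even a check this simple can surface the equipment faults and unexpected usage spikes described in the anomaly detection use case, provided readings arrive at consistent half-hourly intervals.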

By Muhammad Rizwan
