DZone
Microservices

A microservices architecture is a development method for designing applications as modular services that seamlessly adapt to a highly scalable and dynamic environment. Microservices help solve complex issues such as speed and scalability, while also supporting continuous testing and delivery. This Zone will take you through breaking down the monolith step by step and designing a microservices architecture from scratch. Stay up to date on the industry's changes with topics such as container deployment, architectural design patterns, event-driven architecture, service meshes, and more.

Latest Premium Content
  • Trend Report: Cloud Native
  • Trend Report: Modern API Management
  • Refcard #379: Getting Started With Serverless Application Architecture

DZone's Featured Microservices Resources

Who Owns the Data Stack?: How AI Is Reshaping Ownership, Architecture, and Accountability Across Teams

By Miguel Garcia, DZone Core
Editor’s Note: The following is an article written for and published in DZone’s 2026 Trend Report, Cognitive Databases, Intelligent Data: Unified Infrastructure for Vector Search, AI-Optimized Queries, and Hybrid Workloads.

For years, some of us have argued that the data stack is part of the product and should be engineered like the application layer: as code and as a service. The market matured toward it, and the data mesh has been the clearest recent expression. AI has eclipsed those debates and settled the matter. The data stack is now product-facing, shaping what users see, what AI answers, and which automated decisions and workflows fire. That makes one question unavoidable: When an answer depends on data across many systems and teams, who is accountable for accuracy? An AI answer is assembled at request time from corporate data. The data stack is inside the response.

AI Turns Data Infrastructure Into Product Behavior

AI makes the data stack part of product behavior, but raw infrastructure should not leak into the product. The goal is to abstract the stack behind durable, governed interfaces. An AI feature should consume meaning, relationships, permissions, and context. Following data mesh and data contracts, the API layer has to evolve from returning data to exposing capabilities. A consumer, including an AI model, should depend on a contract that carries:

  • Metadata – origin, lineage, meaning
  • Quality – freshness, completeness, confidence
  • Relationships – how entities compose and traverse
  • Security – authorization applied consistently across operational, analytical, and vector stores

When meaning lives in the contract, infrastructure becomes interchangeable, and a misbehaving AI feature is no longer an opaque failure; it’s a question with an owner.

Where Ownership Breaks First

Ownership does not break at the edges of systems but much earlier, in how the organization is designed.
Most technology organizations still distribute teams around components and technical specialization: applications, databases, pipelines, governance, indexing, and analytics. Each team owns its layer, though no one owns the end-to-end meaning of the data. That worked when data only fed analytics. It fails in AI-native products, where data is product behavior, and the two lifecycles are inseparable. AI composes its behavior across every layer at once, inheriting each inconsistency in semantics, freshness, permissions, and relationships. So this is not a handoff problem; it is a Conway’s Law problem. Architecture mirrors the organization, and AI makes the organizational seams visible to the user.

Platform teams remain essential for shared abstractions, governance primitives, and standards. But product teams need to own their features, their APIs, and their data end to end: its lifecycle, meaning, quality, and governance. Splitting teams by technical layer scatters one business entity across many disconnected owners, and AI inherits that fragmentation. AI-native organizations give product teams end-to-end ownership of the data, with platform teams providing shared standards.

Accountability Follows Product Behavior

When data only fed dashboards, accountability could stay narrow: Did the pipeline run, and did the report match? AI moves that boundary. Once retrieval, copilots, and agents start making decisions and generating answers from data, a correct pipeline, a healthy index, and a valid access policy still don’t guarantee a correct user-facing result. Accountability can’t be pinned to technical layers. It has to follow the behavior the user experiences. The product team that owns an AI capability is responsible for the end-to-end correctness, freshness, explainability, and safety of the data behind it. Its job is to own the contract that defines what the AI may know and retrieve.
Platform teams provide the standardized primitives that make this accountability structure possible: semantic contracts, lineage, quality signals, access enforcement, observability, and governance-aware retrieval. The question shifts from “which team owns this layer?” to “which product team owns this behavior, and which platform capabilities guarantee it?” In AI-native systems, accountability rests with the team that owns the behavior, not the system that happened to fail.

Table: Accountability Differences Between the Layer-by-Layer and AI-Native Models

| Area | Layer-by-layer | AI-native |
|:-----|:---------------|:----------|
| Source of truth | Each system decides locally | The product team owns the authoritative semantic contract |
| Quality | The data team checks pipelines | The product team owns user-facing correctness; the platform provides quality signals |
| Retrieval | The platform team owns indexes as infrastructure | A governed product capability with explicit SLOs |
| Access | The security team owns policies separately | Enforced consistently across product, data, and AI layers |
| Incidents | Routed to whichever layer failed | The product team leads; the platform, data, and security teams support as capability owners |

Architecture Choices Are Also Operating Model Choices

Architecture decisions also decide how an organization governs and evolves meaning. AI-native systems raise the stakes here because copilots and agents consume meaning (entities, relationships, metrics, and permissions) rather than tables. Semantic consistency becomes part of how the product behaves. No central team can own the meaning of every domain, so meaning has to live close to the domain that owns the capability. But decentralization alone backfires: Without platform-enforced standards, the old central bottleneck just turns into semantic fragmentation, with every domain exposing its own definitions and contracts.
The fix is to split ownership cleanly:

  • Domains own the meaning
  • Platform teams own the contracts that keep it consistent

Underneath, storage and processing keep churning. What actually lasts is whether stable abstractions (e.g., “employee,” “payroll,” “entitlement”) survive above them. The principle is simple: Infrastructure should be replaceable, and meaning should not. So the real operating-model choice comes down to who owns meaning, and who keeps it consistent.

Shared Data Contracts Make Accountability Concrete

If organizational fragmentation is the root problem, contracts make ownership explicit. A classic data contract is necessary but insufficient. Schema validation catches a renamed column, but it misses semantic drift, stale meaning, or a changed business definition. Those failures don’t break a build. They break behavior. The contract has to grow from schema into semantics, carrying meaning, lineage, quality, and authorization. Crucially, it abstracts the capability and meaning a domain exposes, not the storage format underneath, so it behaves the same whether the source is a table, a document, an event, or an embedding. That makes the data contract both a producer-to-consumer check and a runtime semantic interface that retrieval, copilots, and agents all consume. Its real value is relocating accountability to the source so drift surfaces in the producing domain while context stays local, which accelerates interoperability rather than centralizing control.

Governance Has to Travel With the Data

Traditional governance sat beside the data in the form of periodic reviews, approvals, and access checks. AI breaks that model. Data now moves continuously through pipelines, caches, embeddings, indexes, and agents, recomposed at runtime faster than any review can observe. Governance must be part of the execution model itself. Governance travels with meaning, not storage.
An embedding holds no raw rows yet reveals sensitive meaning, so policy must follow the semantic classification. The gap is sharpest in authorization. Identity systems stop at the API boundary, and AI doesn’t preserve security boundaries on its own, which turns every embedding, cache, and retrieval step into a new boundary to defend. Governance therefore becomes a runtime capability that decides what AI may retrieve, infer, expose, and act on. Solving that calls for composable, declarative governance primitives embedded in the platform so auditability becomes a property of the system rather than the outcome of a project.

Accountability Gaps That Slow AI Data Work

The real cost of fragmented accountability is the constant drag on every data-powered capability. Friction is never neutral, so when teams can’t trust the platform’s freshness, semantics, or governance, they route around it and build their own, resulting in shadow pipelines, local indexes, and duplicated transformations. Each workaround makes sense locally even as it corrodes the whole, fragmenting governance and eroding trust in the very platform it was meant to replace. And piling on more central control only hides the problem: the fragmentation just migrates into those shadow systems. So the deeper gap was missing platform contracts.

What Clear Ownership Looks Like

Instead of adding more teams on more layers, clear ownership means aligning accountability with the single product experience the user meets. What you’re really investing in is the stable semantic abstractions that outlast whatever infrastructure comes and goes. And the hardest problem is how to make the organization understandable to its own AI systems.
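The contract described in this article (metadata, quality, relationships, and security, enforced at retrieval time) can be made more concrete with a small sketch. The following Python is illustrative only: `SemanticContract`, `may_retrieve`, and every field name and sample value are hypothetical and are not taken from the Open Data Contract Standard or any other specification.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class SemanticContract:
    """Hypothetical contract a domain publishes for one business entity."""
    entity: str                # stable abstraction, e.g. "employee"
    owner_team: str            # product team accountable for the behavior
    meaning: str               # business definition (metadata)
    lineage: tuple             # upstream sources the entity is built from
    max_staleness: timedelta   # freshness a consumer may rely on (quality)
    relationships: dict = field(default_factory=dict)
    allowed_roles: frozenset = frozenset()  # who may retrieve it (security)


def may_retrieve(contract, role, last_refreshed, now=None):
    """Return (allowed, reason) for a retrieval request made at runtime."""
    now = now or datetime.now(timezone.utc)
    if role not in contract.allowed_roles:
        return False, f"role '{role}' is not authorized for '{contract.entity}'"
    if now - last_refreshed > contract.max_staleness:
        return False, f"'{contract.entity}' is stale; escalate to {contract.owner_team}"
    return True, "ok"


# Illustrative values only.
employee = SemanticContract(
    entity="employee",
    owner_team="people-product",
    meaning="A person with an active employment agreement",
    lineage=("hr_core.employees",),
    max_staleness=timedelta(hours=24),
    relationships={"payroll": "employee.id -> payroll.employee_id"},
    allowed_roles=frozenset({"hr_copilot"}),
)
```

Because the check reads only the contract, the same rule applies whether the entity is served from a table, a cache, or a vector index, and a refusal names the owning team rather than a failing layer.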
Additional resources:

  • DAMA-DMBOK: Data Management Body of Knowledge, DAMA International – foundational guidance on data ownership, stewardship, and governance roles
  • Open Data Contract Standard (ODCS) – an open spec for declaring schema, semantics, quality, and service levels between data producers and consumers
  • OpenLineage – an open standard for collecting data lineage across pipelines and services, useful for tracing what AI features consume
  • NIST AI Risk Management Framework (AI RMF) – a vendor-neutral framework for accountability and governance of AI systems
  • Coral – exposes diverse data sources to agents through one declared SQL and semantic layer; an example of meaning being owned per source rather than centrally
  • Getting Started With Data Quality, DZone Refcard by Miguel García Lorenzo
  • Data Pipeline Essentials, DZone Refcard by Sudip Sengupta
  • Open-Source Data Management Practices and Patterns, DZone Refcard by Abhishek Gupta
  • Real-Time Data Architecture Patterns, DZone Refcard by Miguel García Lorenzo
  • “Building Trusted, Performant, and Scalable Databases: A Practitioner’s Checklist” by Saurabh Dashora

This is an excerpt from DZone’s 2026 Trend Report, Cognitive Databases, Intelligent Data: Unified Infrastructure for Vector Search, AI-Optimized Queries, and Hybrid Workloads.
The Rise of Microservices Architecture in Scalable Applications

By Mitchell Jhonson
The way modern applications are built has changed in recent years. Systems were traditionally developed as a single, large block of code (a monolithic design). That approach works reasonably well for smaller applications, but as systems grew larger and more complex and had to serve more users at higher speed, it became a hindrance. Companies now need applications that can grow quickly, adapt to change, and support millions of users without degrading performance, which is where microservices architecture becomes relevant. It has become a common way to design scalable applications because an application can be broken into smaller services that work independently of each other. The trend toward microservices signals a shift in value toward flexibility, speed, and resilience in today's highly competitive digital environment.

What Is Microservices Architecture?

Microservices architecture is a method of designing an application as a set of distinct parts that operate independently and each perform a specific task. Each microservice communicates with the others via APIs. Unlike a traditional monolithic system, where all of the application's components depend on one another, a microservices architecture lets developers modify, update, deploy, and scale a single microservice without having to change or redeploy the others. In an e-commerce application, for example, user authentication, the product catalog, payment processing, and order processing can each exist as a separate microservice.
Why Microservices Are Gaining Popularity

Microservices are more than just a trend; they are a response to growing demand for scalable, flexible, and high-performing applications. As digital-first business models grow, traditional architectures struggle to keep up, which drives the preference for microservices.

1. Scalability Requirements

Modern applications often deal with unpredictable user traffic, especially during peaks such as high-volume sales, new product launches, or viral surges. In a monolithic architecture, scaling means replicating the entire application, often on expensive resources, even when only one part of it is under load. That is inefficient.

2. Quick Development Cycles

Markets change rapidly, and speed is key to success in competitive industries. A microservices architecture lets development teams build different services in parallel without blocking one another's progress.

3. Technology Flexibility

Technology flexibility is one of the greatest benefits of a microservices architecture. Unlike monolithic systems, which typically use a single tech stack, each microservice can be built with the programming language, framework, or database that suits it best. For example, a data-intensive microservice can use a high-performance programming language, while a UI microservice can use a more flexible front-end framework.

4. Enhanced Fault Containment

Failure is a fact of life for large systems, and how the system reacts makes the difference. In a monolithic program, a single bug or failure can bring down the entire application. Microservices provide better fault containment by isolating faults to independent services. When an individual service fails, the failure does not automatically affect the rest of the system. This results in higher overall availability and a better user experience.

5. Alignment With DevOps

Microservices architecture aligns well with DevOps practices, which focus on automation, collaboration, and continuous delivery. With microservices, teams can build a CI/CD pipeline for each service so that they can deploy frequently and reliably. Automated testing, monitoring, and deployment allow them to release updates efficiently with minimal risk.

The Benefits of Microservice Architecture

The rise of microservices aligns with current trends in the enterprise landscape, and many organizations are beginning to realize significant value from it in application development and performance. With microservices, an organization can break large, complex systems into smaller components and use them to build highly scalable, resilient, and efficient systems.

1. Services Can Be Deployed Independently

In a microservices-based architecture, one or more services can be deployed independently. In traditional applications, even a small change requires deploying the entire application, which can take a long time and add significant risk.

2. Improved Scalability

Because scalability is a key design feature of microservices, teams can scale only the parts of an application that need more resources rather than the whole application, as monoliths require.

3. Greater Agility

Agility is extremely important in a digital market that changes rapidly. Microservices allow multiple cross-functional teams to develop their own services independently, which increases both development speed and decision-making speed.

4. Easier to Manage Codebase

Large codebases commonly become challenging to manage over time. One of the advantages of a microservices architecture is that each service has a smaller codebase that is easier to manage.

5. Increased Reliability

Reliability is one of the most important aspects of any system, especially one with a large number of users. Microservices can help improve reliability by isolating faults between services.

Conclusion

The growing use of microservice architectures in scalable applications has shifted the focus toward designing systems that are flexible, durable, and able to grow with an organization's business needs. By breaking large, complex applications into smaller independent services, organizations can gain development speed, scalability, and reliability. Implementing microservices brings challenges of its own, but the long-term benefits can justify the upfront investment required to adopt this architectural style. Businesses that have a plan, the right tools, and the right people are well positioned to realize those benefits while providing high-quality digital experiences to their customers.
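The fault-containment idea described above can be sketched in a few lines. This is a minimal illustration, not a production pattern: the plain Python functions stand in for network calls to separate services, and `call_with_fallback`, `product_catalog`, `recommendations`, and `render_home_page` are hypothetical names introduced only for this example.

```python
def call_with_fallback(service, fallback):
    """Call one service; contain its failure instead of failing the caller."""
    try:
        return service()
    except Exception as exc:  # real code would catch narrower error types
        return fallback(exc)


def product_catalog():
    # Stand-in for a healthy service.
    return ["laptop", "phone"]


def recommendations():
    # Stand-in for a service that is currently down.
    raise ConnectionError("recommendation service unavailable")


def render_home_page():
    # Each dependency is isolated, so one outage degrades one section only.
    return {
        "products": call_with_fallback(product_catalog, lambda exc: []),
        "recommended": call_with_fallback(recommendations, lambda exc: []),
    }


page = render_home_page()
```

In this sketch the page still lists its products while the recommendation section is empty. In a monolith, the equivalent unhandled error could take down the whole request.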
Implementing Asynchronous Communication Between Microservices Using Kafka and Spring Boot
By Mallikharjuna Manepalli
Your AI Coding Agent Can't Steal What It Never Had: The Docker Sandbox Isolation Story
By Shamsher Khan, DZone Core
Parallel Kafka Batch Processing With Kotlin Coroutines in Spring Boot
By Erkin Karanlık
Runtime Formula Evaluation With MVEL Library in Spring Boot

In our software development processes, business units constantly want to update discount rates, loyalty points, or salary calculation logic. If this logic lives in the code, inside when or if-else blocks, every change means new unit tests, code analysis, a CI/CD pipeline run, and ultimately a deployment. In this article, we will separate the business logic from the code, making it manageable in the database and reliably interpretable at runtime. This added flexibility lets rules change while the system keeps running without interruption. To do this, we will look at how to use the MVEL (MVFLEX Expression Language) library.

The Cost of Static Code: Why Should We Avoid It?

Typically, point calculations look like this:

Kotlin
    fun calculatePoints(pointType: String, factor: Int): Long {
        return when (pointType) {
            "INITIAL" -> 100L
            "BIRTHDAY" -> 50L
            "TENURE_5_10" -> factor * 10L
            "TENURE_10_20" -> factor * 20L
            "TENURE_20_PLUS" -> factor * 30L
            else -> 0L
        }
    }

At first glance this is a simple function, but in practice it is a maintenance burden. If a factor changes or a new rule is added, the whole test, build, and deploy cycle starts again. However, these values are actually data, not code.

Architectural Approach

Here is how it works once we add a formula engine:

Kotlin
    import org.mvel2.MVEL

    val formula = "factor * 20"
    val vars = mapOf("factor" to 5)
    val result = MVEL.eval(formula, vars)

In this architecture, the code does not know how to calculate; it only knows how to call the formula engine.

Database Design: Converting Rules to Data

We can store business rules in a flexible table. This keeps them manageable.
PLSQL
    CREATE TABLE t_point_type (
        point_type_id   NUMBER PRIMARY KEY,
        point_type_name VARCHAR2(100),
        point_formula   VARCHAR2(500),
        description     VARCHAR2(1000)
    );

Sample data:

Plain Text
    | point_type_id | point_type_name | point_formula           |
    |:-------------:|:---------------:|:-----------------------:|
    | 1             | INITIAL         | `100`                   |
    | 2             | BIRTHDAY        | `50`                    |
    | 3             | TENURE_5_10     | `factor * 10`           |
    | 4             | TENURE_10_20    | `factor * 20`           |
    | 5             | TENURE_20_PLUS  | `factor * 30`           |
    | 6             | PROMOTIONAL     | `factor * multiplier`   |

Application Layer

The most critical points to consider in MVEL integration are performance and error management.

1. Entity Definition

Kotlin
    @Entity
    @Table(name = "t_point_type")
    data class PointTypeEntity(
        @Id
        @Column(name = "point_type_id")
        val pointTypeId: Long? = null,

        @Column(name = "point_type_name")
        val pointTypeName: String? = null,

        @Column(name = "point_formula")
        val pointFormula: String? = null
    )

2. MvelUtil: Performance-Oriented Helper Class

Parsing the formula string on every request costs CPU, so the helper compiles each distinct formula once with MVEL.compileExpression() and caches the compiled form.

Kotlin
    import org.mvel2.MVEL
    import org.springframework.stereotype.Component
    import java.io.Serializable
    import java.util.concurrent.ConcurrentHashMap

    @Component
    class MvelUtil {

        // Compiled expressions keyed by formula text. The cache is unbounded,
        // which is acceptable while the set of formulas is small.
        private val compiledCache = ConcurrentHashMap<String, Serializable>()

        private fun evaluate(formula: String, factor: Int): Any? {
            val compiled = compiledCache.computeIfAbsent(formula) { MVEL.compileExpression(it) }
            return MVEL.executeExpression(compiled, mapOf("factor" to factor))
        }

        fun evaluateFormula(formula: String, factor: Int): Long {
            return try {
                when (val result = evaluate(formula, factor)) {
                    is Number -> result.toLong()
                    else -> 0L  // non-numeric results count as zero points
                }
            } catch (e: Exception) {
                throw BusinessException(
                    errorCode = ErrorCodes.MVEL_FORMULA_EVALUATION_FAILED,
                    errorDesc = "Formula evaluation failed: $formula, factor: $factor - ${e.message}"
                )
            }
        }

        fun evaluateFormulaAsString(formula: String, factor: Int): String {
            return try {
                evaluate(formula, factor).toString()
            } catch (e: Exception) {
                throw BusinessException(
                    errorCode = ErrorCodes.MVEL_FORMULA_EVALUATION_FAILED,
                    errorDesc = "Formula evaluation failed: $formula - ${e.message}"
                )
            }
        }
    }

3. Service Layer and Business Logic

With the helper in place, our service layer simply loads the rule and triggers the formula engine.
Kotlin
    import org.slf4j.LoggerFactory
    import org.springframework.stereotype.Service

    @Service
    class PointCalculationService(
        private val pointTypeRepository: PointTypeRepository,
        private val mvelUtil: MvelUtil
    ) {
        private val log = LoggerFactory.getLogger(PointCalculationService::class.java)

        fun calculatePoints(pointTypeId: Long, factor: Int): Long {
            val pointType = pointTypeRepository.findById(pointTypeId)
                .orElseThrow { BusinessException(ErrorCodes.POINT_TYPE_NOT_FOUND) }

            val formula = pointType.pointFormula
                ?: throw BusinessException(ErrorCodes.POINT_FORMULA_NOT_DEFINED)

            val points = mvelUtil.evaluateFormula(formula, factor)

            // Zero or negative results are treated as "no points".
            if (points <= 0) {
                log.info("The formula gave a score of 0 or negative: type=$pointTypeId, factor=$factor")
                return 0L
            }
            return points
        }
    }

Call service:

Kotlin
    val factor = inputData.factorSpecificForPoint ?: 1
    val points = calculatePoints(inputData.pointTypeId, factor)
    if (points > 0) {
        savePointDetail(points, subscriptionId, inputData.pointTypeId, inputData.operationId)
    }

Advanced Usage: Multivariable and Conditional Formulas

MVEL can evaluate expressions with several variables and conditions, and this is where its real power lies. For example, the formula in the database might look like this:

SQL
    UPDATE t_point_type
    SET point_formula = 'factor * multiplier + bonus'
    WHERE point_type_id = 6;

Kotlin
    fun evaluateWithMultipleVars(formula: String, vars: Map<String, Any>): Long {
        return try {
            val result = MVEL.eval(formula, vars)
            (result as? Number)?.toLong() ?: 0L
        } catch (e: Exception) {
            throw BusinessException(ErrorCodes.MVEL_FORMULA_EVALUATION_FAILED)
        }
    }

    val vars = mapOf("factor" to 5, "multiplier" to 3, "bonus" to 10)
    evaluateWithMultipleVars("factor * multiplier + bonus", vars)

Conditional Statements

MVEL supports ternary expressions and Boolean logic:

Plain Text
    factor > 10 ? factor * 20 : factor * 10
    (factor >= 5 && factor < 10) ? 50 : (factor >= 10 ? 100 : 25)

This provides truly dynamic rules without any code changes. Three rules must not be ignored, because all of them are necessary:

  • Strict validation: The formula must be validated with MVEL.compileExpression() before being saved to the database. Otherwise, a syntax error can disrupt the entire flow at runtime.
  • Sandbox and security: MVEL is powerful; it can access Java classes. Therefore, formula entry should only be possible from authorized (admin) panels, and if necessary, MVEL's secure mode should be configured.
  • Default value: There should always be a fallback mechanism. Decide how the system will behave if the formula raises an error or returns null (e.g., 0 points).

Conclusion

MVEL makes it easy to implement business rules dynamically in Spring Boot projects. It reduces code complexity while allowing you to respond to business unit requests within minutes, without a deployment.

Dependency (Maven):

XML
    <dependency>
        <groupId>org.mvel</groupId>
        <artifactId>mvel2</artifactId>
        <version>2.5.0.Final</version>
    </dependency>
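The three rules above (validate before saving, restrict what a formula may do, and fall back to a default) do not depend on the engine. The sketch below illustrates them in Python with the standard `ast` module. It is an analogy and not MVEL: formulas use Python expression syntax (`a if cond else b` instead of `cond ? a : b`), and `validate_formula` and `evaluate_points` are hypothetical helper names introduced for this example.

```python
import ast
import operator

# Allow-listed operators; any other syntax is rejected at validation time.
_BIN_OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
            ast.Mult: operator.mul, ast.Div: operator.truediv}
_CMP_OPS = {ast.Gt: operator.gt, ast.GtE: operator.ge,
            ast.Lt: operator.lt, ast.LtE: operator.le, ast.Eq: operator.eq}
_ALLOWED = (ast.Expression, ast.BinOp, ast.Compare, ast.IfExp, ast.Name,
            ast.Load, ast.Constant, *_BIN_OPS, *_CMP_OPS)


def validate_formula(formula):
    """Parse a formula and reject calls, attribute access, and other syntax."""
    tree = ast.parse(formula, mode="eval")
    for node in ast.walk(tree):
        if not isinstance(node, _ALLOWED):
            raise ValueError(f"unsupported syntax: {type(node).__name__}")
    return tree


def _eval(node, variables):
    if isinstance(node, ast.Expression):
        return _eval(node.body, variables)
    if isinstance(node, ast.Constant):
        return node.value
    if isinstance(node, ast.Name):
        return variables[node.id]
    if isinstance(node, ast.BinOp):
        return _BIN_OPS[type(node.op)](_eval(node.left, variables),
                                       _eval(node.right, variables))
    if isinstance(node, ast.Compare):
        left = _eval(node.left, variables)
        for op, comparator in zip(node.ops, node.comparators):
            right = _eval(comparator, variables)
            if not _CMP_OPS[type(op)](left, right):
                return False
            left = right
        return True
    # Remaining allowed node type: ast.IfExp, Python's ternary expression.
    branch = node.body if _eval(node.test, variables) else node.orelse
    return _eval(branch, variables)


def evaluate_points(formula, variables, default=0):
    """Evaluate a stored formula; return `default` on any failure."""
    try:
        result = _eval(validate_formula(formula), variables)
    except (SyntaxError, ValueError, KeyError, TypeError, ZeroDivisionError):
        return default
    if isinstance(result, bool) or not isinstance(result, (int, float)):
        return default
    return int(result) if result > 0 else default
```

Calling `validate_formula` when an administrator saves a rule surfaces syntax errors before they reach runtime. The allow-list means an expression such as `__import__('os')` is rejected and never executed.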

By Erkin Karanlık
Building a Multi-Agent Orchestration Capability: Architecture and Code Walkthrough

Artificial intelligence (AI) is quickly changing from simple conversation models to systems that can tackle complex problems through teamwork. As products become smarter, one approach that is gaining traction is multi-agent orchestration. A single AI model can handle straightforward tasks like answering questions or generating content. Yet modern product features increasingly need:

  • Multi-step reasoning
  • Specialized expertise
  • Tool integrations
  • Dynamic decision making
  • Execution of actions
  • Continuous feedback

Trying to manage all of these with one model often leads to complexity, decreased accuracy, and limited growth potential. Multi-agent orchestration addresses these issues by establishing a system where multiple specialized agents work together within a coordinated framework. This article explains how to create a general multi-agent orchestration capability and shows a practical example using code.

One example where multi-agent orchestration is useful is an intelligent travel assistant.

Intelligent Travel Assistant

A user says, "Plan my trip to New York for three days under $1500." An intent agent understands the needs. A weather agent checks for the best time to visit. A search agent finds flights and hotels. A planning agent creates the itinerary while an execution agent makes the reservations. All of this happens in the backend, without the user being aware of it.

Understanding the Architecture

A multi-agent system generally consists of four major components:

Agents

Agents are specialized AI units designed for particular responsibilities. Examples: intent agent, planning agent, search agent, recommendation agent, execution agent, and validation agent.

Tools

Agents require access to external systems.
Examples: APIs, databases, search engines, knowledge repositories, and workflow systems.

Shared Context

Agents need access to common information:

Python
    {
        "user": "User",
        "goal": "Recommend an action",
        "history": [],
        "constraints": []
    }

This prevents agents from working in isolation, unaware of what the others have done.

Orchestration Layer

The orchestrator acts as the central coordinator, the "brain" of the system. Its responsibilities include:

  • Task decomposition
  • Agent selection
  • Context management
  • Workflow execution
  • Result aggregation

Example: Suppose users interact with a product capability using: "Help me find and recommend the best option based on my needs." The workflow might involve "Understand user intent," "Retrieve information," "Analyze findings," "Generate recommendations," and "Execute actions."

Step 1: Define Base Agent Structure

Create a generic agent abstraction.

Python
    from abc import ABC, abstractmethod


    class Agent(ABC):
        @abstractmethod
        def execute(self, context):
            pass

All agents inherit from this class.

Step 2: Create Specialized Agents

Intent Agent

Responsible for understanding user objectives.

Python
    class IntentAgent(Agent):
        def execute(self, context):
            query = context["query"]
            print("Intent Agent running...")
            context["intent"] = f"Intent identified from: {query}"
            return context

Search Agent

Responsible for retrieving information.

Python
    class SearchAgent(Agent):
        def execute(self, context):
            print("Search Agent running...")
            context["results"] = ["Option A", "Option B", "Option C"]
            return context

Recommendation Agent

Generates recommendations.

Python
    class RecommendationAgent(Agent):
        def execute(self, context):
            print("Recommendation Agent running...")
            # Keep the first two results as the recommendation.
            recommendations = context["results"][:2]
            context["recommendations"] = recommendations
            return context

Step 3: Create Tool Integrations

Tools provide external capabilities. Example:

Python
    class SearchTool:
        def search(self, query):
            # Placeholder data; a real tool would call an API or database.
            return ["Data 1", "Data 2", "Data 3"]

Modify the search agent to use tools.
Python
    class SearchAgent(Agent):
        def __init__(self):
            self.tool = SearchTool()

        def execute(self, context):
            print("Search Agent running...")
            query = context["query"]
            data = self.tool.search(query)
            context["results"] = data
            return context

Agents can now interact with external systems.

Step 4: Build the Orchestrator

The orchestrator coordinates the execution flow.

Python
    class Orchestrator:
        def __init__(self):
            # Agents run in this order; each one enriches the shared context.
            self.agents = [
                IntentAgent(),
                SearchAgent(),
                RecommendationAgent(),
            ]

        def run(self, query):
            context = {"query": query}
            for agent in self.agents:
                context = agent.execute(context)
            return context

Step 5: Execute the Workflow

Run the orchestration system.

Python
    orchestrator = Orchestrator()
    response = orchestrator.run("Recommend something useful")
    print(response)

Output (the dictionary is wrapped here for readability):

Python
    Intent Agent running...
    Search Agent running...
    Recommendation Agent running...
    {
        'query': 'Recommend something useful',
        'intent': 'Intent identified from: Recommend something useful',
        'results': ['Data 1', 'Data 2', 'Data 3'],
        'recommendations': ['Data 1', 'Data 2']
    }

The user sees a single interaction, while multiple agents collaborate behind the scenes.

Adding Dynamic Agent Selection

Real systems should not execute every agent for every request. The orchestrator can dynamically decide which agents participate. Example:

Python
    class DynamicOrchestrator:
        def get_agents(self, query):
            # Simple keyword routing; matching is case-insensitive.
            text = query.lower()
            agents = [IntentAgent()]
            if "search" in text:
                agents.append(SearchAgent())
            if "recommend" in text:
                agents.append(RecommendationAgent())
            return agents

        def run(self, query):
            context = {"query": query}
            for agent in self.get_agents(query):
                context = agent.execute(context)
            return context

Now execution becomes adaptive. Note that the recommendation agent reads the search results, so a query that selects it without the search agent would fail; a real router also has to respect such dependencies.

Parallel Execution

Many tasks can run simultaneously.
Agents that do not depend on each other's output can run concurrently, for example with a thread pool:

Python
    from concurrent.futures import ThreadPoolExecutor

    # Only independent agents are run in parallel. The recommendation agent
    # needs the search results, so it must run after this step.
    independent_agents = [IntentAgent(), SearchAgent()]
    context = {"query": "Recommend something useful"}

    with ThreadPoolExecutor() as executor:
        # Each agent gets its own copy so threads never mutate shared state.
        futures = [
            executor.submit(agent.execute, dict(context))
            for agent in independent_agents
        ]
        results = [f.result() for f in futures]

    # Merge the partial results back into a single context.
    for partial in results:
        context.update(partial)

For I/O-bound agents, such as those waiting on APIs or model calls, running them concurrently can reduce latency significantly.

All in all, multi-agent orchestration marks a significant shift in how intelligent systems are designed and operated. As product capabilities evolve from separate interactions to complex, goal-driven workflows, depending on a single AI component becomes harder to scale and maintain. Sharing responsibilities among specialized agents leads to systems that are more modular, flexible, and able to handle complicated reasoning and execution patterns.

From an engineering viewpoint, the real benefit goes beyond just connecting multiple models. Success relies on creating a strong orchestration layer that can manage context, route tasks wisely, integrate with tools, coordinate workflows, and monitor the entire execution process. Production-grade systems must also tackle important issues like state management, fault tolerance, security boundaries, minimizing latency, and controlling costs.

The future of AI-powered products will probably look more like distributed systems than traditional applications. Just as microservices changed software architecture by breaking down monolithic systems into specialized services, multi-agent orchestration is bringing a similar change for intelligent systems by separating generalized intelligence into collaborative, specialized abilities. Organizations that focus on building strong orchestration capabilities now are not just adding AI features; they are laying the groundwork for adaptable systems that can understand goals, coordinate actions, and consistently deliver valuable results at scale.
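The production concerns mentioned above, fault tolerance and monitoring of the execution process, can be sketched on top of the same agent abstraction. The example below is one possible approach under stated assumptions, not the article's implementation: `ResilientOrchestrator`, `FlakyAgent`, the `trace` key, and the retry count are all hypothetical, and the agents are redefined here only so the block is self-contained.

```python
import time
from abc import ABC, abstractmethod


class Agent(ABC):
    @abstractmethod
    def execute(self, context: dict) -> dict: ...


class SearchAgent(Agent):
    def execute(self, context):
        return {**context, "results": ["Data 1", "Data 2", "Data 3"]}


class FlakyAgent(Agent):
    """Hypothetical agent that always fails, to exercise the error path."""

    def execute(self, context):
        raise RuntimeError("upstream tool timed out")


class ResilientOrchestrator:
    def __init__(self, agents, max_attempts=2):
        self.agents = agents
        self.max_attempts = max_attempts

    def run(self, query):
        context = {"query": query, "trace": []}
        for agent in self.agents:
            name = type(agent).__name__
            for attempt in range(1, self.max_attempts + 1):
                started = time.perf_counter()
                try:
                    # Pass a shallow copy so a failing agent cannot leave
                    # half-written top-level keys in the shared context.
                    context = agent.execute(dict(context))
                    status = "ok"
                except Exception as exc:
                    status = f"error: {exc}"
                elapsed_ms = round((time.perf_counter() - started) * 1000, 2)
                context["trace"].append(
                    {"agent": name, "attempt": attempt,
                     "status": status, "ms": elapsed_ms}
                )
                if status == "ok":
                    break  # no retry needed; move on to the next agent
        return context


result = ResilientOrchestrator([FlakyAgent(), SearchAgent()]).run("find options")
```

Here a failing agent is retried and then skipped, and the `trace` list records what ran, how often, and how long it took. Whether skipping is acceptable depends on the workflow; an orchestrator could equally abort when a required agent fails.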

By Narendra Lakshmana gowda
Operationalizing Enterprise AI at Scale: Architecture, Governance, and Adoption

Most enterprise AI initiatives stall after the proof of concept because the operational foundation around them is not ready. That failure rarely comes from a single problem. It comes from a combination of fragmented data ecosystems, compliance gaps, poor observability, and governance structures that were never built to handle production-scale AI in the first place. To close this gap, we need the kind of operational discipline that only comes when engineering and platform teams drive the AI transformation.

Building the Enterprise AI Foundation

Organizations often discover that AI deployment challenges stem less from model quality and more from inconsistent data pipelines, weak governance controls, and limited operational visibility. Building a scalable enterprise AI platform requires several foundational capabilities working together.

Data Readiness for Enterprise AI

Data readiness sets the ceiling on what a project can do before it ever runs in production. If the data is poorly governed, even a state-of-the-art LLM will produce unreliable outputs, while a simpler model trained on clean, well-structured data will often outperform it. Enterprise data usually comes in two primary forms, structured and unstructured, and AI and GenAI workloads need both. A consistent data pipeline is also required to prepare data for enterprise AI and to remove duplicates. It is essential to establish data contracts and keep clear data lineage from source to model. For teams building retrieval-augmented generation (RAG) architectures, which ground LLM outputs in enterprise data, a RAG-ready data layer is essential.
Data readiness typically involves:

  • Using lakehouse architectures, such as Delta Lake, to unify batch and streaming data.
  • Using vector databases to enable semantic search over unstructured content.
  • Building feature engineering pipelines to prepare structured data for ML models.
  • Using data catalogs and metadata management to make data trustworthy.
  • Enforcing schema agreements through data contracts between data producers and consumers.

Governance as an Engineering Problem

Many AI projects lose momentum at the governance stage. Handled as a manual checklist, governance slows the deployment process considerably. The solution is to embed governance directly into AI development workflows and automate it. Automated governance in CI/CD means policy checks run at build time, not at the end of the deployment. Key technical patterns for governance automation include:

  • RBAC models for role-based access to AI services
  • Audit logging for model execution and configuration changes
  • PII masking and tokenization in data pipelines before model training
  • Secure API gateways to monitor all external and internal AI service calls
  • Policy enforcement engines that validate AI workflows against enterprise rules

Centralized vs. Federated AI Platforms

Enterprises have to make a structural choice. They can either manage AI from a central platform or let individual business domains build their own. A centralized approach offers standard governance and cost efficiency, while a federated platform allows domain teams to iterate faster. Most successful organizations adopt a hybrid strategy, drawing a clear line between shared infrastructure and localized services. The centralized platform engineering team handles core AI needs by offering managed GPU quotas, Kubernetes-based compute clusters, and reusable inference services. Meanwhile, federated domain teams handle application engineering to build localized workflows.
The hybrid approach reduces engineering redundancy across teams and preserves the autonomy needed to accelerate enterprise-wide AI adoption.

Layer | Function | Key components
Shared (Central) AI platform | Foundational infrastructure | Tenant isolation, Kubernetes clusters, GPU quotas, shared model registries, and reusable inference services
Domain (Federated) AI platform | Specialized application engineering | Localized workflows, fine-tuned models, domain-specific logic

AI/MLOps and AI Lifecycle Management

Traditional DevOps is insufficient for AI systems. Code deployment is deterministic, whereas model behavior changes over time as data shifts. AI/MLOps addresses this added complexity. To build reliable and repeatable AI deployment pipelines, enterprises need to manage models, datasets, and configurations with the same rigor as application code. Typical AI/MLOps capabilities include:

  • CI/CD for machine learning: Automated pipelines that retrain, evaluate, and deploy models on triggers
  • Feature stores: Centralize feature engineering and ensure consistency between training and serving
  • Canary deployments and shadow mode: Gradually route production traffic to new models before full promotion
  • Model versioning: Track every model artifact with the dataset and code that produced it
  • Experiment tracking: Compare parameters and outputs across training runs
  • Drift detection: Continuously monitor for statistical shifts in input distributions and model predictions
  • Rollback strategies: Automated triggers that revert to a previous model version if performance degrades

Observability and Reliability for AI Workloads

AI observability doesn’t work like traditional application monitoring. A model that is up and responding can still produce harmful or inaccurate outputs. Production AI also faces operational risks, including model drift, token overruns, and limited visibility into prompts. You need real-time behavioral tracking to manage these risks.
The solutions include logging prompts for quality checks, monitoring token usage for cost governance, monitoring GPU utilization, and tracking latency percentiles against AI service SLAs. Many platforms now also use automated hallucination detection, often through LLM-as-judge methods, to improve system reliability.

Enabling Enterprise Adoption

Once organizations have a scalable platform in place and have aligned it with their metrics, the engineering focus naturally shifts toward adoption strategies.

Building Internal AI Enablement Platforms

One of the least visible bottlenecks in enterprise AI adoption is developer friction. Many developers struggle to use AI platforms, even when a central one exists. Internal AI enablement platforms help make AI accessible to engineering teams through the following:

  • Internal AI developer portals: Provide model catalogs and API references for AI services
  • Reusable AI APIs: Give teams preconfigured endpoints for repeatable tasks
  • Prompt libraries: Tried and tested collections of prompts
  • Internal copilots: AI assistants combined with internal tools to speed up workflows
  • Shared inference endpoints: Let teams use the shared AI infrastructure instead of creating their own

Aligning AI Systems With Business Outcomes

Successful enterprise AI initiatives are designed around measurable operational outcomes from the start. Operational telemetry helps AI scale efficiently while delivering business value. Organizations can measure usage patterns by embedding event tracking directly into AI-assisted workflows. Feedback loops can flag unhelpful or incorrect outputs and send those signals back to retraining pipelines. Dashboards such as AI usage analytics track which models different teams use.

Measuring AI Impact in Production

The accuracy of an AI model is not a direct determinant of business impact.
For instance, a model that is 95% accurate on a specific task can still have minimal impact on operations if it only addresses low-frequency edge cases. The following metrics help measure real-world AI effectiveness:

  • Adoption metrics: The percentage of target users actively using AI-powered features
  • Cost-per-request analysis: The cost of each AI interaction, including tokens, computation, and engineering overhead
  • AI reliability metrics: SLA compliance rates, availability, and time to recover after incidents
  • Performance degradation tracking: Model quality metrics monitored in production over weeks and months
  • Operational efficiency dashboards: Business-level KPIs attributed to AI projects

Responsible and Future-Ready AI Engineering

Sustaining high adoption requires more than just accessible AI platforms; it demands engineering integrity and long-term system responsibility.

Responsible AI in Production Environments

Responsible AI is an engineering discipline, not just a list of rules and guidelines to remember. It consists of principles covering design, development, and deployment that must be integrated into the system's core architecture and treated like any other engineering requirement. Key features include:

  • Bias detection pipelines (automated statistical tests)
  • Human-in-the-loop validation (route disputed data to human reviewers)
  • Prompt filtering (sanitize user input and block complex prompts)
  • Output moderation (scan final responses to block inappropriate or harmful content)
  • Compliance logging (store records to support audit trails)
  • Secure model endpoints (authentication and authorization for all inference APIs)

Preparing for Agentic and Autonomous AI Systems

Over the past few years, AI has moved from making suggestions to taking action. The next phase of enterprise AI will not only assist humans; it will take multi-step actions within enterprise systems.
Agentic AI systems will be able to browse the web, call APIs, and execute approved actions across enterprise systems. Engineering teams will need a tool orchestration framework to coordinate those actions, and Model Context Protocol (MCP) patterns to standardize external connections.

Summing Up

The organizations that succeed with enterprise AI are not necessarily those with the most advanced models. They are the ones that build reliable data foundations, automate governance, operationalize observability, and create platforms that allow teams to scale innovation safely and repeatedly.
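To make the drift detection item from the AI/MLOps list more concrete, here is a minimal sketch of one common statistic, the Population Stability Index (PSI), which compares a baseline feature distribution with what the model sees in production. The article does not name a specific method, so PSI is a swapped-in example; the function name, the bin count, and the 0.25 alert threshold (a widely used rule of thumb, not a standard) are assumptions.

```python
import math
from typing import List, Sequence


def population_stability_index(
    expected: Sequence[float],
    actual: Sequence[float],
    bins: int = 10,
    eps: float = 1e-6,
) -> float:
    """PSI between a baseline sample and a production sample."""
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0  # avoid zero width for constant data

    def fractions(values: Sequence[float]) -> List[float]:
        counts = [0] * bins
        for value in values:
            # Clamp out-of-range production values into the edge bins.
            index = min(max(int((value - lo) / width), 0), bins - 1)
            counts[index] += 1
        # eps keeps the logarithm defined for empty bins.
        return [max(count / len(values), eps) for count in counts]

    e, a = fractions(expected), fractions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


baseline = [i / 100 for i in range(100)]        # training-time feature values
shifted = [0.5 + i / 200 for i in range(100)]   # production values, skewed high

DRIFT_THRESHOLD = 0.25  # assumed rule-of-thumb alert level
print(population_stability_index(baseline, baseline))
print(population_stability_index(baseline, shifted) > DRIFT_THRESHOLD)
```

In a pipeline, a check like this would run on a schedule per feature, and a value above the chosen threshold could raise an alert or trigger the rollback strategy described earlier.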

By Aravind Nuthalapati DZone Core CORE
A Spring Boot App With Half the Startup Time

The MovieManager project has been updated to use JDK 25 and the AOT cache from Project Leyden. Project Leyden is part of the OpenJDK project and provides cached linking and cached performance statistics. That means the time spent linking at startup is moved to build time, and the statistics are created during a training run at build time as well. Because of that, the JVM loads the needed classes already linked and starts compiling the hot code paths immediately. The MovieManager application starts in less than half the time with these optimizations without any code changes.

All these advantages come with preconditions:

  • Exactly the same JVM version at build time, training time, and run time
  • The same OS (Linux is used here) and libc at all steps (no Alpine-based Docker images)
  • The same CPU architecture, for example, AMD64 or ARM64

The steps to use Project Leyden:

  1. Build the Spring Boot application
  2. Extract the Spring Boot application
  3. Do a training run with the extracted application to create the AOT cache
  4. Create the Docker image with the extracted application and the AOT cache

Building and Training the Application

The first step is to build the Spring Boot JAR. The MovieManager project has an integrated build that builds the Angular frontend and the Spring Boot backend with this Maven command:

Shell

./mvnw clean install -Ddocker=true -Dnpm.test.script=test-chromium

Project Leyden does not support Spring Boot Jars. The Jar has to be extracted to help Project Leyden find the library jars used by the project. To do that, this command needs to be used:

Shell

java -Djarmode=tools -jar backend/target/moviemanager-backend-0.0.1-SNAPSHOT.jar extract --destination extracted

The result is the directory ‘extracted’ with the application jar and a sub-directory ‘lib’ that contains the used libraries.

The second step is to create the AOT cache. To do that, the application has to run in production conditions. That means using a real PostgreSQL database with the database driver.
That enables the JDK to record all the needed classes of the project and to create realistic performance statistics for the code compilation. To do this, a PostgreSQL database has to be started (done here in a Docker container), and the application has to do the full startup. These commands are needed:

Shell

docker pull postgres:13
docker run --name local-postgres -e POSTGRES_PASSWORD=sven1 -e POSTGRES_USER=sven1 -e POSTGRES_DB=movies -p 5432:5432 -d postgres:13

java -XX:+UseG1GC -XX:MaxGCPauseMillis=50 -XX:+UseCompressedOops \
  -XX:+UseCompactObjectHeaders -XX:+ExitOnOutOfMemoryError \
  -XX:MaxDirectMemorySize=64m -XX:+UseStringDeduplication \
  -Xlog:aot -XX:AOTCacheOutput=app.aot \
  -Dspring.context.exit=onRefresh \
  -Djava.security.egd=file:/dev/./urandom \
  -jar extracted/moviemanager-backend-0.0.1-SNAPSHOT.jar --spring.profiles.active=prod

The Java command runs the application with the parameter ‘-Dspring.context.exit=onRefresh’, which makes Spring Boot perform the full startup and then exit. The parameters ‘-Xlog:aot -XX:AOTCacheOutput=app.aot’ enable the logging of the AOT process and the creation of ‘app.aot’, which is the AOT cache. The AOT cache contains everything that is needed for a fast startup of the application. If the AOT cache should also contain information to improve production performance, the application would have to start up and process realistic production requests during the training run. That is beyond the scope of this article.

The third step is to test the new application setup:

Shell

java -XX:+UseG1GC -XX:MaxGCPauseMillis=50 -XX:+UseCompressedOops \
  -XX:+UseCompactObjectHeaders -XX:+ExitOnOutOfMemoryError \
  -XX:MaxDirectMemorySize=64m -XX:+UseStringDeduplication \
  -Xlog:class+path=info -XX:AOTCache=app.aot -Xlog:aot \
  -Djava.security.egd=file:/dev/./urandom \
  -jar extracted/moviemanager-backend-0.0.1-SNAPSHOT.jar --spring.profiles.active=prod

The start-up time of the new setup with the AOT cache can be compared to the start-up time of the Spring Boot jar.
On a medium-powered laptop, the times are:

  • 9 seconds for the Spring Boot Jar
  • 3.5 seconds for the new setup with the AOT cache

Creating a Docker Image

To use the application in production, it needs to be packaged into a Docker image. The Docker image needs to contain the extracted application setup and the AOT cache. The base image needs to have the exact same JDK version, OS, and the same libc. That means small base images like Alpine cannot be used. The created image cannot be small because it contains 180 MB of AOT cache and a larger base image. This can be done with this Dockerfile:

Dockerfile

FROM eclipse-temurin:25.0.3_9-jdk-jammy
WORKDIR /application
ARG JAR_FILE=extracted/*.jar
COPY ${JAR_FILE} moviemanager-backend-0.0.1-SNAPSHOT.jar
COPY extracted/ ./
COPY app.aot app.aot
ENV JAVA_OPTS="-XX:+UseG1GC \
  -XX:MaxGCPauseMillis=50 \
  -XX:+UseCompressedOops \
  -XX:+UseCompactObjectHeaders \
  -XX:+ExitOnOutOfMemoryError \
  -XX:MaxDirectMemorySize=64m \
  -XX:+UseStringDeduplication"
ENTRYPOINT exec java $JAVA_OPTS -XX:+AOTClassLinking \
  -XX:AOTCache=app.aot \
  -Xlog:class+path=info \
  -Djava.security.egd=file:/dev/./urandom \
  -jar moviemanager-backend-0.0.1-SNAPSHOT.jar

It copies the new application setup into the image and adds the AOT cache. The name of the application jar is stored in the AOT cache and has to be exactly the same as during the creation of the AOT cache. The ‘JAVA_OPTS’ also have to be the same. If the JDK version in the build environment changes, the version of the base image has to be adjusted accordingly. The parameter ‘-Xlog:class+path=info’ makes analyzing AOT problems much easier. The Docker image size is 705 MB. That is about double the size of an image with a Spring Boot Jar and an Alpine-based JDK base image.

Creating a Build Pipeline

Creating Docker images for an application by hand is unsustainable in a production environment. A build pipeline is needed.
The MovieManager project is hosted on GitHub; because of that, the project uses a GitHub Workflow as a build pipeline. The complete code for the build pipeline is in the script. The steps of the GitHub pipeline can be recreated in other environments too.

The first step is to set up the PostgreSQL database service to be used in this build:

YAML

jobs:
  analyze:
    name: Analyze
    runs-on: ubuntu-latest
    env:
      POSTGRES_URL: jdbc:postgresql://localhost:5432/movies
    services:
      postgres:
        image: postgres:latest
        env:
          POSTGRES_USER: sven1
          POSTGRES_PASSWORD: sven1
          POSTGRES_DB: movies
        ports:
          - 5432:5432
        options: >-
          --health-cmd="pg_isready -U sven1 -d movies"
          --health-interval=10s
          --health-timeout=5s
          --health-retries=5

The commands set up the PostgreSQL service in the build pipeline with user, password, dbname, and dbport. The ‘POSTGRES_URL’ is set to access the database later.

The second step is to check out the project:

YAML

    steps:
      - name: Checkout repository
        uses: actions/checkout@v3

It checks out the contents of the master branch.

The third step is to provide the JDK:

YAML

      - name: Setup Java JDK
        uses: actions/setup-java@v3
        with:
          distribution: 'temurin'
          java-version: 25

JDK version 25 is the minimum version for using Project Leyden with linking and performance statistics.

The fourth step builds the Spring Boot Jar:

YAML

      - name: Build with Maven
        if: matrix.language == 'java'
        run: |
          ./mvnw clean install -Ddocker=true

That is the Maven command to build the project.

The fifth step is to find the Spring Boot jar:

YAML

      - name: Find fat jar
        if: matrix.language == 'java'
        id: jar
        run: |
          JAR_PATH=$(find ./backend/target -type f -name "*SNAPSHOT.jar" | head -n 1)
          echo "Found JAR: $JAR_PATH"
          echo "jar=$JAR_PATH" >> $GITHUB_OUTPUT

The sixth step is to extract the Spring Boot jar:

YAML

      - name: Unpack fat jar
        if: matrix.language == 'java'
        id: UNPACK
        run: |
          java -Djarmode=tools -jar ${{ steps.jar.outputs.jar }} extract --destination extracted
          EXTRACTED_PATH=$(find . -type d -name "extracted" | head -n 1)
          echo "Found directory: $EXTRACTED_PATH"
          echo "extracted=$EXTRACTED_PATH" >> $GITHUB_OUTPUT

The seventh step is to get the name of the extracted application jar:

YAML

      - name: find extracted jar
        if: matrix.language == 'java'
        id: EXTRACT
        run: |
          EXTRACTED_JAR=$(find "${{ steps.UNPACK.outputs.extracted }}" -type f -name "*.jar" | head -n 1)
          EXTRACTED_JAR=${EXTRACTED_JAR#./}
          echo "Found extracted JAR: $EXTRACTED_JAR"
          echo "extracted=$EXTRACTED_JAR" >> $GITHUB_OUTPUT

The eighth step is to create the AOT cache:

YAML

      - name: Create AOT cache
        if: matrix.language == 'java'
        id: AOT
        env:
          JAVA_TOOL_OPTIONS: ""
          _JAVA_OPTIONS: ""
          JDK_JAVA_OPTIONS: ""
        run: |
          EXTRACTED_JAR="${{ steps.EXTRACT.outputs.extracted }}"
          echo "jar=$EXTRACTED_JAR"
          echo "JAVA_TOOL_OPTIONS=$JAVA_TOOL_OPTIONS"
          echo "_JAVA_OPTIONS=$_JAVA_OPTIONS"
          echo "JDK_JAVA_OPTIONS=$JDK_JAVA_OPTIONS"
          JAVA_OPTS="-XX:+UseG1GC -XX:MaxGCPauseMillis=50 -XX:+UseCompressedOops -XX:+UseCompactObjectHeaders -XX:+ExitOnOutOfMemoryError -XX:MaxDirectMemorySize=64m -XX:+UseStringDeduplication"
          java $JAVA_OPTS \
            -XX:+AOTClassLinking \
            -XX:AOTCacheOutput=app.aot \
            -Xlog:aot \
            -Dspring.context.exit=onRefresh \
            -Dspring.datasource.url="${{ env.POSTGRES_URL }}" \
            -Dspring.profiles.active=prod \
            -jar "$EXTRACTED_JAR" || echo "AOT Training finished with exit code $?"

This runs the application startup with the PostgreSQL database to create the AOT cache.

The ninth step shows the exact JDK version used in the AOT cache generation:

YAML

      - name: Show Jdk version
        if: matrix.language == 'java'
        id: JDK
        run: |
          JDK_VERSION=$(java -version 2>&1)
          VERSION=$(echo "$JDK_VERSION" | sed -n 's/.*build \([^[:space:]]*\)-LTS.*/\1/p')
          echo "JDK_VERSION=$JDK_VERSION"
          echo "VERSION=$VERSION"
          MY_VERSION="jdk=$VERSION"

In case of problems with using the AOT cache, the first check is to compare the version shown here with the JDK version in the Docker base image.
The tenth step creates the Docker image:

YAML

      - name: Build and push
        uses: docker/build-push-action@v6
        if: matrix.language == 'java'
        with:
          context: .
          file: ./Dockerfile
          build-args: |
            JAR_PATH=${{ steps.EXTRACT.outputs.extracted }}
            LIB_PATH=${{ steps.aot.outputs.extracted }}
          push: false
          tags: angular2guy/moviemanager:latest

With ‘push: false’, this step only builds the Docker image; setting it to true pushes the image to an image repository.

Conclusion

The results of using the AOT cache of Project Leyden are impressive. Cutting the startup time in half without any code change is amazing. The effort to create the AOT cache and set up the new application is a one-time investment. The impact of the larger Docker images is low. The faster startup makes scaling application instances in Kubernetes clusters up and down much more flexible because the time until a new application instance is available is much shorter. In Kubernetes environments with scaling of application instances, the AOT cache is a significant step forward and should be used.

For serverless applications, a 3.5-second startup time is too slow. There, Project CRaC or Native Image would be needed. Project CRaC needs code changes and testing. Native Image has the closed-world assumption, which makes it hard to prove that larger applications work correctly. Alternatives are Node.js with Nest.js and TypeScript, or Go with its libraries.

Project Leyden is not finished in JDK 25. There are plans to add compiled code to the AOT cache in the future. The JVM is an impressive piece of technology that is still improving further.
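Because the AOT cache only works when the JDK at training time matches the JDK in the base image, the version check from the ninth step can be automated. The sketch below mirrors the pipeline's `sed` expression in Python and compares the result with an image tag. The helper names and the sample `java -version` fragment are hypothetical and written by hand for illustration; the only convention relied on is that Temurin image tags write the `+` of the build number as `_`, as in the Dockerfile above.

```python
import re
from typing import Optional


def jdk_build_version(java_version_output: str) -> Optional[str]:
    """Extract e.g. '25.0.3+9' from text containing '(build 25.0.3+9-LTS)'."""
    match = re.search(r"build (\S+?)-LTS", java_version_output)
    return match.group(1) if match else None


def matches_base_image(version: str, image: str) -> bool:
    """Check a JDK build version against a tag such as '25.0.3_9-jdk-jammy'."""
    _, _, tag = image.partition(":")
    # Image tags cannot contain '+', so the build separator becomes '_'.
    return tag.startswith(version.replace("+", "_") + "-")


# Hand-written sample shaped like the text the sed expression expects.
SAMPLE_OUTPUT = "OpenJDK Runtime Environment (build 25.0.3+9-LTS)"
BASE_IMAGE = "eclipse-temurin:25.0.3_9-jdk-jammy"

version = jdk_build_version(SAMPLE_OUTPUT)
print(version, matches_base_image(version, BASE_IMAGE))
```

A check like this could fail the build early instead of producing an image whose AOT cache is silently ignored at startup.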

By Sven Loesekann
Architecting Proactive IT: NinjaOne Remote Monitoring and Management

It's 3 PM on a Friday when the security advisory hits: a critical zero-day vulnerability in a widely used Windows service. You're managing 5,000 endpoints across 50 locations, each with different maintenance windows, backup schedules, and criticality levels. You need to patch everything, but only after verifying sufficient disk space, confirming recent backups, and respecting production schedules. With traditional tools, you're looking at a weekend of manual work and spreadsheet tracking. With a modern RMM platform, it's a policy configuration problem.

This is the reality of modern IT operations: the shift from reactive firefighting to proactive, policy-driven infrastructure management. For system administrators, architects, and DevOps engineers, this demands an RMM platform built on modern architectural principles that enable automation, intelligent alerting, and seamless integration. This article explores the technical foundations of NinjaOne's Remote Monitoring and Management solution, examining how its cloud-native architecture, policy engine, and scripting capabilities address the challenges of managing infrastructure at scale.

Cloud-Native Architecture: Built for Scale

NinjaOne is built on a fully cloud-native SaaS architecture, a fundamental departure from legacy RMM platforms that evolved from on-premises software. This matters because traditional RMM tools often carry technical debt from decades of feature additions. Bloated codebases, inefficient database schemas, and scaling bottlenecks that require constant infrastructure investment are just a few examples. The architecture follows a hub-and-spoke model:

  • Agent layer: A lightweight agent (typical footprint: 50-100MB RAM, under 1% CPU at idle) deploys to each endpoint. The agent operates asynchronously, accumulating health metrics, system state, and event logs locally before transmitting to the control plane. This lets the agent continue monitoring even during network disruptions.
  • Control plane: The centralized SaaS platform provides multi-tenant management across Windows, macOS, and Linux systems. The console delivers real-time visibility into CPU, memory, disk I/O, network throughput, and service states across your entire fleet.
  • API layer: A RESTful API (v2.0) enables programmatic access to nearly every console function, facilitating integration with PSA systems, ITSM platforms, and custom tooling.

The practical impact of this architecture is deployment velocity. Unlike legacy platforms that require weeks of server provisioning, database tuning, and infrastructure setup, cloud-native RMM deployments typically reach production in 2-3 weeks. Most of that time is spent on policy design rather than infrastructure provisioning.

The Policy Engine: Configuration as Code for IT Operations

At the operational core of NinjaOne lies a hierarchical policy management system. If you're familiar with Infrastructure as Code concepts, think of policies as the Terraform modules of endpoint management: reusable, inheritable configurations that serve as the single source of truth for your fleet.

Policy Types and Inheritance

Policies are scoped by asset type:

  • Agent policies: Windows, macOS, and Linux endpoints
  • NMS policies: Network devices (switches, routers, firewalls)
  • VM policies: Virtual machine-specific configurations

The inheritance model allows you to define organization-wide defaults while permitting location-specific or role-specific overrides. For example:

Plain Text

Global Policy (Base)
├── North America Policy (inherits + adds region-specific monitoring)
│   └── Production Servers (inherits + adds strict alerting)
└── Europe Policy (inherits + adds GDPR compliance checks)

Each child policy inherits parent settings but can override specific parameters, similar to CSS cascade rules or OOP inheritance patterns.
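The inheritance idea can be pictured as a root-to-leaf merge in which child values win. The sketch below is a conceptual illustration only: the setting names (`cpu_alert_pct`, `patch_window`, and so on) and the `resolve_policy` helper are hypothetical and do not reflect NinjaOne's actual policy schema or API, and a shallow merge is assumed for simplicity.

```python
from typing import Any, Dict, List


def resolve_policy(chain: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Merge policies from root to leaf; settings in child policies win."""
    effective: Dict[str, Any] = {}
    for policy in chain:
        effective.update(policy)
    return effective


# Hypothetical settings for the hierarchy shown above.
global_policy = {"cpu_alert_pct": 90, "patch_window": "Sat 02:00", "gdpr_checks": False}
north_america = {"region_monitoring": True}   # adds a setting
production_servers = {"cpu_alert_pct": 75}    # overrides a parent value

effective = resolve_policy([global_policy, north_america, production_servers])
print(effective)
```

The production servers end up with the stricter CPU threshold while still inheriting the global patch window, which is the behavior the policy tree describes.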
Policy Conditions: From Monitoring to Action

Within each policy, monitoring operates through Policy Conditions: defined thresholds or states that trigger automated responses. This is where simple monitoring evolves into intelligent orchestration. Each condition configuration includes:

Parameter | Function | Why It Matters
Severity | Defines operational impact (Critical, Moderate, Low) | Routes alerts to appropriate teams and determines escalation paths
Priority | Sets response urgency (High, Medium, Low) | Integrates with ticketing systems to set SLA timers
Auto-reset | Automatically clears condition after specified time | Prevents alert noise from transient issues (network blips, momentary CPU spikes)
Ticketing Rule | Defines if/how service tickets are created | Enables automated incident creation with pre-populated context
Automation Trigger | Launches script execution on condition match | Turns monitoring into self-healing infrastructure

The power lies in chaining these configurations. A disk space condition doesn't just alert; it can automatically trigger a cleanup script. It can create a ticket with disk usage analytics attached and suppress alerts for 24 hours while the remediation runs.

Advanced Alert Logic: Compound Conditions

Simple threshold monitoring generates alert fatigue. A CPU spike could be a crypto miner or just a scheduled backup. A stopped service might be critical on a production server but irrelevant on a developer workstation. This is where Compound Conditions become essential.

Compound Conditions allow you to stack multiple criteria that must all be true before triggering an alert or action. This is Boolean logic applied to infrastructure monitoring. This works through a condition evaluation engine that processes device state changes in near-real-time. When any monitored metric changes, the engine evaluates all applicable policy conditions against that device's current state and custom field values.
Only when the complete condition set evaluates to true does the system trigger actions. This approach dramatically reduces false positives.

Automation and Scripting: Infrastructure as Executable Code

Monitoring identifies problems; automation solves them. NinjaOne supports five scripting languages: PowerShell, JavaScript, Batch, Bash/Shell (macOS/Linux), and VBScript. Scripts are centrally managed in the Automation Library and deployed through four execution models:

  • Policy-scheduled: Run on fixed intervals for all devices assigned to a policy
  • Condition-triggered: Execute automatically when policy conditions match
  • Scheduled tasks: Run against filtered device groups (e.g., "all production servers in EU region")
  • Ad-hoc execution: Manual on-demand execution for troubleshooting

API Integration: Programmatic Control

For integration specialists and DevOps engineers, the Public API (v2.0) provides comprehensive programmatic access. The API essentially replicates any action available in the console, enabling integration with ticketing systems, asset management databases, and custom automation workflows.

Key API Endpoints

HTTP

GET   /v2/devices                      # List all managed devices
GET   /v2/devices/{id}                 # Get device details
GET   /v2/alerts                       # List active alerts
POST  /v2/devices/{id}/scripting/run   # Execute script on device
GET   /v2/automation/scripts           # List available scripts
GET   /v2/queries/software             # Query installed software
GET   /v2/policies                     # List all policies
PATCH /v2/devices/{id}                 # Update device properties

Unified Security Management

Modern security requires the convergence of IT operations and security operations. NinjaOne provides the foundation for this by unifying several critical security functions:

Automated Patch Management

The patch management engine operates on a policy-driven model. You define approval rules, testing groups, and deployment schedules within policies.
The system then:

  • Continuously scans for available OS and third-party application patches
  • Applies approval rules (auto-approve security patches, hold feature updates)
  • Deploys to test groups first and monitors for issues
  • Rolls out to production groups based on success criteria
  • Reports compliance status across the fleet

Patches can be deployed with flexible scheduling: immediate for critical zero-days, phased rollout for feature updates, or maintenance-window-only for production systems.

EDR/AV Integration

Rather than treating security tools as separate silos, NinjaOne integrates endpoint detection and response (EDR) and antivirus solutions directly into the management console. Supported integrations include WatchGuard, SentinelOne, Windows Defender, and others. This integration enables:

  • Unified agent deployment: Push EDR agents via NinjaOne automation
  • Policy-based enforcement: Automatically install EDR on devices matching criteria
  • Consolidated alerting: Security alerts appear alongside IT alerts in a single dashboard
  • Automated response: Trigger isolation scripts when EDR detects threats

Device Hardening and Compliance

The platform supports mass configuration management for security hardening:

  • Registry modification: Deploy security settings via PowerShell scripts across thousands of devices
  • Encryption monitoring: Track BitLocker/FileVault status and automatically enable on non-compliant devices
  • Baseline enforcement: Define configuration baselines in policies and receive alerts on drift
  • Audit reporting: Generate compliance reports for frameworks like CIS, NIST, SOC 2

When to Use NinjaOne (And When Not To)

Ideal Use Cases

  • Managed Service Providers (MSPs): Multi-tenant architecture is purpose-built for MSPs managing multiple client environments from a single console
  • Windows-heavy environments: Best-in-class support for Windows Server and Desktop management, though macOS and Linux support continues improving
  • Organizations requiring compliance reporting: Built-in audit trails and reporting for SOC 2, ISO 27001, HIPAA
  • Teams needing unified IT/Security operations: Integration of patching, monitoring, EDR, and automation in a single platform

Less Ideal For

  • Pure DevOps/container environments: If your infrastructure is primarily Kubernetes and Docker, tools like Prometheus/Grafana or Datadog may be better fits
  • Organizations standardized on Ansible/Puppet/Chef: If you've already invested heavily in configuration management tools, NinjaOne may be redundant
  • Very small teams (under 50 endpoints): The platform's power comes from scale; very small deployments may be over-engineered
  • Linux-first environments: While Linux support exists, the platform's heritage is Windows-centric

Integration Considerations

NinjaOne works best when integrated with:

  • PSA systems: ConnectWise, Autotask, Kaseya BMS
  • Documentation platforms: IT Glue, Hudu
  • SIEM tools: Splunk, Elastic Security (via API/webhook integration)
  • Collaboration platforms: Slack, Microsoft Teams (for alert notifications)

Conclusion: Policy-Driven Infrastructure at Scale

Transforming from reactive IT support to proactive infrastructure management requires platforms built on modern architectural principles. NinjaOne's cloud-native foundation, policy-driven configuration model, intelligent alerting logic, and extensive automation capabilities provide the technical foundation for this transformation.
For system architects, administrators, and developers managing infrastructure — the platform offers several key technical advantages: Declarative infrastructure: Policies define desired state; the platform handles implementationProgrammable operations: Comprehensive API access enables integration with existing toolchainsContext-aware automation: Compound conditions ensure actions execute only when all prerequisites are metLanguage flexibility: Native support for PowerShell, Bash, JavaScript enables leveraging existing scripting expertiseUnified visibility: Single pane of glass for monitoring, security, and compliance across heterogeneous environments The platform supports sophisticated workflows — from simple automated disk cleanup to complex, multi-phase patching orchestrations with validation gates and automated rollback. Whether managing a single large environment or multiple client infrastructures, the combination of policy-driven configuration, intelligent automation, and programmatic control positions NinjaOne as a platform for organizations serious about operational efficiency and proactive infrastructure management.

By Stelios Manioudakis DZone Core CORE
Beyond REST: Architecting High-Density Agentic Microservices With MCP and WASI-NN

The bill for the generative AI integration rush has arrived, and it is denominated in egress costs, token bloat, and idle container memory. For the past two years, engineering teams integrated LLMs via the path of least resistance: layering models on top of existing architectures. For human-facing use cases, this works. Humans provide implicit context, tolerate minor latency, and intuitively course-correct errors.

Agents behave differently. They execute tightly coupled orchestration loops where step $N$ strictly depends on the evaluated context of step $N-1$. When an agent triggers a chain of API calls, interprets the JSON responses, and feeds those results back into its reasoning engine, the system stops behaving like a traditional request-response architecture. It becomes a distributed, fragile reasoning engine. The underlying infrastructure was never designed for this. Maintaining Run The Engine (RTE) metrics becomes impossible when your orchestrator times out waiting for 15 sequential REST calls to resolve over a network.

Where REST Breaks Under Agent Workloads

REST architectures assume a deterministic client that parses data efficiently. Agents violate this assumption. Consider a supply chain endpoint returning a raw inventory array. An agent receiving this must compute available stock, estimate depletion rates, and evaluate business constraints. While these tasks are trivial, executing them inside an LLM inference cycle introduces three structural failures:
  • Latency amplification: There is no caching at the reasoning level. The LLM re-evaluates the same arithmetic on every invocation.
  • The token tax: The model must ingest massive, unrefined data structures rather than a concise summary, burning context windows and budget.
  • Probabilistic drift: Arithmetic and threshold evaluations become non-deterministic. A slight prompt change might cause the agent to miscalculate a threshold that a compiled binary would hit with 100% accuracy.
When this pattern repeats, system latency is no longer a function of API performance; it is bottlenecked by the entire reasoning chain.

The Shift: From Data Endpoints to Capability Execution

To break this bottleneck, we must move from data retrieval to capability execution. Instead of returning raw arrays, microservices must return deterministic decisions. This requires pushing computation to the edge. In a capability-driven model, the agent does not fetch inventory and calculate risk; it invokes a localized capability that already encapsulates that math.

The Execution Engine: MCP Paired With WASI-NN

The Model Context Protocol (MCP) provides the discovery layer. Unlike Swagger, which requires an agent to guess routing patterns, MCP enforces a consistent interaction contract that aligns with how agents operate.

WebAssembly (Wasm) provides the runtime. Instead of 500MB Docker containers, logic is compiled into lightweight modules that execute in-process on the same node as the orchestrator. This eliminates the network boundary entirely. By utilizing WASI-NN (WebAssembly System Interface for Neural Networks), these modules can run localized, small-parameter ML models (e.g., Phi-4-Mini) using the host’s native hardware. This enables sophisticated inference without hitting external model APIs.

The Evidence: Wasm vs. Docker Unit Economics

Transitioning from containerized services to Wasm modules fundamentally changes execution characteristics.

Operational Metric  | Legacy Pattern (Python/REST) | Capability Pattern (Wasm/MCP)
Cold Start Latency  | 350ms - 800ms                | < 6ms
Memory Footprint    | 300MB - 500MB                | ~5MB
Network Hops        | 1 per tool call              | 0 (Local execution)
Contextual Overhead | ~600 tokens                  | ~40 tokens

The difference comes from eliminating layers:
  • No guest OS boot
  • No interpreter startup
  • No network boundary

Wasm modules are precompiled bytecode. The runtime simply instantiates them. Model weights are loaded once and reused, allowing thousands of executions to share the same memory.
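To make the shift from data endpoints to capability execution concrete, here is a minimal Python sketch (Python only for brevity; it is not MCP or Wasm code). It contrasts a raw inventory payload with a deterministic, decision-ready summary. The record fields (`on_hand`, `reserved`, `daily_usage`), the `assess_stock` function, the SKU, and the action labels are all hypothetical and chosen purely for illustration.

```python
import json
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class StockAssessment:
    sku: str
    available: int
    days_of_cover: float
    action: str


def assess_stock(sku, records, buffer_days):
    """Collapse raw inventory rows into one deterministic decision."""
    available = sum(r["on_hand"] - r["reserved"] for r in records)
    daily_usage = sum(r["daily_usage"] for r in records)
    days_of_cover = available / daily_usage if daily_usage > 0 else float("inf")
    # The threshold comparison runs in ordinary code, not inside an LLM.
    action = "Expedite" if days_of_cover < buffer_days else "Hold"
    return StockAssessment(sku, available, round(days_of_cover, 1), action)


# Hypothetical raw payload a data endpoint might return: one row per warehouse.
raw_records = [
    {"warehouse": f"WH-{i:02d}", "on_hand": 120, "reserved": 20, "daily_usage": 25}
    for i in range(40)
]

summary = assess_stock("SKU-1042", raw_records, buffer_days=7)
raw_size = len(json.dumps(raw_records))
summary_size = len(json.dumps(asdict(summary)))

print(summary)
print(f"raw payload: {raw_size} chars, summary: {summary_size} chars")
```

The agent would receive only the summary, so the arithmetic is computed once in deterministic code and the context it has to ingest is a small fraction of the raw payload.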
Implementation: A Context-Aware Capability

The difference here is the boundary of responsibility. The Rust example below demonstrates a capability that retrieves data, executes a localized model, and returns a decision-ready assessment.

Rust

// Dependencies: mcp-sdk = "1.x", wasi-nn = "0.x"
use mcp_sdk::server::{McpServer, Tool};
use wasi_nn::{self, GraphEncoding, ExecutionTarget, TensorType};

#[mcp_tool]
async fn evaluate_supply_risk(sku: String, buffer_days: u32) -> Result<String, anyhow::Error> {
    // 1. Native data retrieval (bypassing HTTP overhead)
    let stock_level: u32 = host_bindings::kv_store::get(&sku).await?;

    // 2. Localized reasoning via WASI-NN
    let graph = wasi_nn::load(
        &[include_bytes!("../models/supply_risk_q4.tflite")],
        GraphEncoding::TensorflowLite,
        ExecutionTarget::CPU
    )?;
    let mut context = wasi_nn::init_execution_context(graph)?;
    let input_tensor = [stock_level as f32, buffer_days as f32];
    wasi_nn::set_input(context, 0, TensorType::F32, &[1, 2], &input_tensor)?;
    wasi_nn::compute(context)?;
    let mut output = [0f32; 1];
    wasi_nn::get_output(context, 0, &mut output)?;

    // 3. Return Semantic Context, avoiding raw data dumps
    Ok(format!(
        "SKU {} stock: {}. Analysis: {:.1}% risk of stockout within {} days. Action: Route to secondary.",
        sku, stock_level, output[0] * 100.0, buffer_days
    ))
}

fn main() {
    let server = McpServer::new("supply-chain-node")
        .add_tool(evaluate_supply_risk)
        .build();
    server.start_stdio();
}

The Architectural Hazard: Semantic Drift

When multiple Wasm capabilities independently encode similar logic, definitions diverge. If a Fraud_Service defines "High Risk" as $>0.8$ while a Payment_Gateway defines it as $>0.6$, the agent will experience logic oscillation, repeatedly looping as it receives contradictory context.

Enforcing Consistency via TypeSpec

We mitigate this by enforcing data invariants at compile-time using TypeSpec. This acts as a central ontology for the system.
Plain Text

@service({ title: "Logistics Context Ontology" })
namespace LogisticsDomain {
  @doc("Normalized probability of supply chain failure.")
  scalar RiskScore extends float32;

  model ContextualRiskAssessment {
    sku: string;
    @minValue(0) current_stock: int32;
    @minValue(0.0) @maxValue(1.0) stockout_probability: RiskScore;
    recommended_action: "RouteSecondary" | "Hold" | "Expedite";
  }
}

This acts as a compile-time guardrail. Any deviation fails during build, ensuring all capabilities operate within the same semantic model.

Where This Architecture Fits

This model works best for:
  • high-frequency decision loops
  • stateless computations
  • bounded inference tasks

It is not suited for:
  • large model hosting
  • long-running workflows
  • complex orchestration logic

Trying to force those into Wasm introduces more complexity than benefit.

Final Thoughts: Evolving the Control Plane

This shift is not about replacing REST entirely. It is about recognizing that agents are not traditional consumers. They do not need access to raw systems. They need bounded, deterministic outcomes. As agent workloads scale, pushing reasoning closer to the data becomes less of an optimization and more of an operational requirement. When comparing a 5MB Wasm module executing in milliseconds to a 500MB container spinning up over the network, the trade-offs become difficult to ignore, especially in high-frequency agent workflows. The next phase of backend evolution is not building better APIs. It is building systems that expose executable intent.

By Nabin Debnath
Combining Temporal and Kafka for Resilient Distributed Systems

Kafka and Temporal address different failure boundaries, and resilient distributed systems often need both rather than one as a substitute for the other. Kafka is built to move ordered, replayable event streams across many consumers and machines, while Temporal is built to keep long-running application logic alive as durable Workflow Executions that recover from crashes, outages, and worker restarts by replaying persisted Event History. The combination becomes compelling when Kafka is used to carry facts and Temporal is used to remember intent, timers, retries, and compensations across the lifetime of a business process.

Kafka as the Event Backbone and Temporal as the Control Plane

Kafka’s model is centered on totally ordered partitions, consumer groups, and offsets. A partition is consumed by exactly one consumer in a subscribing consumer group at a time, and Kafka keeps consumer state compact by treating progress as an offset that can be checkpointed, committed manually, or even rewound for reprocessing. That model is excellent for integration boundaries, stream processing, and decoupling producers from downstream services. What it does not provide by itself is durable orchestration for business logic that must wait for hours, react to multiple messages over time, and recover mid-process without rebuilding state externally. Temporal fills that gap by treating a Workflow Execution as a durable, reliable, scalable function that owns local state, receives messages through Signals or Updates, and advances by replaying persisted history instead of starting over from scratch after failure.

Keep Kafka at the Boundary of Workflow Replay

The most important design rule is simple: Kafka client calls do not belong inside Workflow code. Temporal requires deterministic workflow logic on replay, and its documentation explicitly places non-deterministic work, such as API calls and database queries, inside Activities. A Workflow should behave like a compact state machine that decides what should happen next, while Activities perform the side effects that may fail or need retries. That separation is what allows Kafka to remain an external event fabric without corrupting Temporal replay semantics.

Java

private boolean paymentReceived;

private final OrderActivities activities =
    Workflow.newActivityStub(
        OrderActivities.class,
        ActivityOptions.newBuilder()
            .setStartToCloseTimeout(Duration.ofSeconds(30))
            .setRetryOptions(
                RetryOptions.newBuilder()
                    .setInitialInterval(Duration.ofSeconds(1))
                    .setMaximumInterval(Duration.ofSeconds(30))
                    .build())
            .build());

@WorkflowMethod
public void process(String orderId) {
    activities.reserveInventory(orderId);
    boolean paid = Workflow.await(Duration.ofHours(2), () -> paymentReceived);
    if (!paid) {
        activities.releaseInventory(orderId);
        activities.publishTimedOut(orderId);
        return;
    }
    activities.publishConfirmed(orderId);
}

@SignalMethod
public void paymentCaptured(String paymentId) {
    paymentReceived = true;
}

This workflow is intentionally boring, which is precisely why it is robust. Inventory reservation and event publication are pushed into Activities, while the workflow itself only keeps state and waits. The two-hour wait is not a sleeping thread in application memory; Temporal persists timers so the execution resumes even after worker or service interruptions. Kafka, in this pattern, supplies the external payment event, but Temporal owns the long-lived timeout and the recovery semantics.

A thin Kafka bridge can then translate an incoming record into a Temporal message instead of embedding orchestration logic in the consumer loop. Signal-With-Start is especially useful because it either signals an existing workflow or starts a new one with the same Workflow ID and immediately applies the signal, which removes a large class of race conditions between creation and update.

Java

public void onMessage(ConsumerRecord<String, PaymentEvent> record) {
    WorkflowStub workflow = client.newUntypedWorkflowStub(
        "OrderWorkflow",
        WorkflowOptions.newBuilder()
            .setWorkflowId("order-" + record.key())
            .setTaskQueue("order-workflows")
            .build());
    workflow.signalWithStart(
        "paymentCaptured",
        new Object[] { record.value().paymentId() },
        new Object[] { record.key() });
    consumer.commitSync();
}

That handoff should be designed as duplicate-tolerant rather than duplicate-impossible. Kafka allows manual control over when a record is considered consumed, but a crash after Temporal accepts the signal and before the offset is committed can still trigger redelivery. A practical way to make that safe is to keep the Workflow ID stable for the business entity and to make Activities idempotent, because Temporal may retry Activity executions as part of normal failure handling.

Failure Semantics Matter More Than Labels

The most common architectural mistake in Kafka and Temporal systems is to over-claim exactly-once semantics. Kafka’s idempotent producer ensures that retries do not create duplicate writes in the stream, and Kafka transactions allow atomic writes across partitions and topics. Kafka Streams goes further by defining end-to-end exactly-once around a very specific boundary: input topic offsets, state stores, and output topics are committed atomically because they are all inside Kafka’s storage model. Temporal, meanwhile, gives an effectively once-scheduled experience for Activities, but still expects Activity implementations to be idempotent because retries can occur after partial execution or worker failure. The combined system, therefore, does not become end-to-end exactly-once by default; that only happens when idempotency keys or transactional guarantees explicitly cover every external side effect that matters.
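The role of idempotency keys can be sketched in a few lines of Python. This is a plain in-memory simulation, not Temporal or Kafka client code: `IdempotentLedger`, `apply_payment`, and the `(topic, partition, offset)` key are hypothetical names chosen for illustration, and a real system would keep the set of seen keys in durable storage.

```python
from dataclasses import dataclass, field


@dataclass
class IdempotentLedger:
    """Applies each side effect at most once per idempotency key."""
    seen: set = field(default_factory=set)
    balance: int = 0

    def apply_payment(self, key, amount):
        # A redelivered record carries the same key, so it is ignored.
        if key in self.seen:
            return False
        self.seen.add(key)
        self.balance += amount
        return True


ledger = IdempotentLedger()

# Simulated at-least-once delivery: the record at offset 7 arrives twice,
# as it would if the consumer crashed before committing its offset.
deliveries = [
    (("payments", 0, 6), 50),
    (("payments", 0, 7), 30),
    (("payments", 0, 7), 30),  # redelivery
]

applied = [ledger.apply_payment(key, amount) for key, amount in deliveries]
print(applied, ledger.balance)
```

The same idea carries over to Activities: if the downstream system accepts a caller-supplied key, a retried Activity can pass the same key and let the receiver discard the duplicate.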
Java

public void publishConfirmed(String orderId) {
    producer.beginTransaction();
    try {
        producer.send(new ProducerRecord<>("order-confirmed", orderId, orderId)).get();
        producer.commitTransaction();
    } catch (Exception ex) {
        producer.abortTransaction();
        // Wrap the checked exception so the method compiles without a throws clause;
        // the Activity still fails, and Temporal can retry it.
        throw new RuntimeException(ex);
    }
}

This kind of publishing Activity is useful when Workflow progress must result in one or more Kafka records that either all appear or all fail together. The producer should be configured for idempotence, durable acknowledgments, and a transactional.id, but the design should still assume that non-Kafka side effects may need compensation. Temporal’s error-handling guidance recommends rollback logic with the Saga pattern for multi-step processes, which maps naturally to workflows that can reserve inventory, attempt payment, publish status, and then compensate in reverse order if one boundary fails after another has already succeeded.

Long-Running Streams Need Long-Running Discipline

Once Kafka is feeding entity-centric workflows for days or weeks, operational details start to matter as much as API design. Reusing the same business key as the Kafka record key and the Temporal Workflow ID creates a clean ownership model: Kafka uses keys to select partitions, partitions remain totally ordered, and Temporal guarantees that only one Workflow Execution with a given ID is open at a time. That alignment naturally serializes updates for a customer, order, or account across both systems. At the same time, the Kafka side of the bridge should stay thin enough to keep polling regularly, because consumers that stop polling can be considered dead and rebalanced out of the group.

Temporal workflows that receive large numbers of Signals or perform many Activity calls also need history management. Event History is the mechanism that makes recovery possible, but it has performance limits and hard ceilings; Temporal warns as history grows and recommends Continue-As-New for long-running executions or workloads that process thousands of events. That becomes especially important in Kafka-driven entity workflows, where a single logical process can become a permanent mailbox unless it periodically rolls forward into a fresh run. Code evolution must also be handled deliberately because workflow logic is replayed; Temporal’s versioning guidance requires patching or worker versioning when changes would otherwise introduce non-determinism for in-flight executions.

Conclusion

Temporal and Kafka work best together when each is allowed to solve the problem it was built for. Kafka should distribute ordered, replayable events across the system boundary, and Temporal should hold the durable state machine that decides what those events mean over time. With that separation, retries stop leaking into application code, timers stop depending on process uptime, and compensations stop turning into chains of callbacks and ad hoc status flags. The result is not merely a system that survives failures, but a system whose failure semantics remain understandable under load, redelivery, redeployments, and long-running business latency.

By Akhil Madineni
Frame Buffer Hashing for Visual Regression on Embedded Devices

I run test automation for a graphics team that ships software to streaming devices. About a year ago, we changed how our visual regression suite stores and compares its references. The old approach kept around 18 GB of PNG golden images in the test repo and ran a pixel-by-pixel diff on every comparison. The new approach stores around 19 KB of MD5 hashes in a JSON file and compares hash strings. Storage dropped by roughly six orders of magnitude. Comparisons became effectively free. A category of flaky tests stopped being flaky.

This article is about how that works, when it makes sense, and when it doesn't. It also covers the parts that surprised me, because the approach has real downsides and I want to be honest about them up front.

How It Works

The idea is simple once the constraints are right. On the embedded devices we test, we have access to the raw GPU frame buffer through the graphics stack. The test harness reads it as a bytes object, computes an MD5 hash of those bytes, and compares the hash against a stored reference. If the hashes match, the test passes. If they don't match, the test captures the actual frame and saves it as a failure artifact for a human to look at. The stored reference is a 32-character hex string per screen, kept in a JSON file checked into the test repo alongside the test code.

The full implementation is short:

Python

import hashlib
import json
from pathlib import Path

REFERENCE_FILE = Path("references/visual_hashes.json")


def frame_hash(frame_bytes: bytes) -> str:
    """MD5 of the raw GPU frame buffer."""
    return hashlib.md5(frame_bytes).hexdigest()


def load_references() -> dict:
    if REFERENCE_FILE.exists():
        return json.loads(REFERENCE_FILE.read_text())
    return {}


def check_frame(test_id: str, frame_bytes: bytes, references: dict) -> tuple[bool, str]:
    """Returns (passed, actual_hash)."""
    actual = frame_hash(frame_bytes)
    expected = references.get(test_id)
    if expected is None:
        return False, actual  # no reference yet
    return actual == expected, actual


def on_failure(test_id: str, frame_bytes: bytes, actual: str):
    """Only called when hashes diverge. Save the frame for review."""
    artifact_dir = Path(f"artifacts/{test_id}")
    artifact_dir.mkdir(parents=True, exist_ok=True)
    (artifact_dir / f"{actual}.raw").write_bytes(frame_bytes)

That's essentially the whole system. Because the references are text, intentional UI changes show up as normal source-control diffs in code review instead of opaque binary blob swaps. Because the comparison is string equality on a hex digest, it's effectively instant regardless of frame size.

Why MD5 Specifically

MD5 is cryptographically broken. You can construct collisions on demand, and using it for password storage or signature verification is malpractice. None of that matters here. Visual regression testing is not a cryptographic problem. The two inputs being compared are the rendered output of our own GPU yesterday and the rendered output of our own GPU today. There is no adversary trying to construct a frame buffer that hashes to a specific value.

What you actually need from a hash function in this context is fast computation, low accidental collision rate on real-world inputs, and stable output across runs and platforms. MD5 covers all three.
The accidental collision probability between two different rendered frames at typical buffer sizes is small enough that we have not encountered one. SHA-256 covers the same three properties at slightly higher CPU cost. If the cryptographic concern is going to come up in code review every quarter, just use SHA-256.

The Conditions That Have to Hold

This approach only works when three things are true about your environment.

The first is access to the raw frame buffer before any encoding step. Browser-based testing, mobile UI testing through the standard automation frameworks, and most desktop application testing give you a captured screenshot, which has been through some encoding step before you see it. PNG encoders can vary across versions, and two systems can render the same pixels and produce different PNG files. If your only access point is a captured screenshot, you are comparing post-encoding output, and encoder noise will sink hashing. On embedded devices with a graphics stack you control, you usually do have raw frame buffer access, which is why this worked for us.

The second condition is that the rendering pipeline has to be deterministic. Same input, same GPU state, same output bytes. If antialiasing produces different pixels for the same logical input from one run to the next, or if time-based animations get sampled at slightly different moments, or if the GPU driver rounds inconsistently, the hashes will diverge for reasons that aren't real bugs. In our case, the pipeline is deterministic, so this isn't a problem. In a lot of environments, it isn't, and you would need pixel-diff with a tolerance threshold or perceptual hashing to handle the noise.

The third condition is that capture points have to be stable. The test harness has to call the capture function at the same logical point in the pipeline every run, after the same set of operations. This is usually the easiest of the three to engineer. Frame buffer access either exists or it doesn't, and determinism is sometimes a property you can't change. Capture point stability is just a discipline about where you instrument your tests.

If any of these three conditions fail, frame buffer hashing is the wrong tool. Pixel-diff with a tolerance threshold is the right default for most setups, and perceptual hashing covers the middle ground where you have raw access but some non-determinism. The narrow case this article is about is the one where all three hold.

What You Give Up

The biggest tradeoff is failure diagnosis. With golden images, when a test fails, you have a stored reference and a new screenshot, and you can render a side-by-side diff or an overlay highlighting the changed pixels. With hash comparison, you have two strings that don't match. The failure handler captures the actual frame on the spot, but the reference image (which doesn't exist anymore in storage) has to be reconstructed by running the same test against a known-good build whenever you want to do a side-by-side comparison.

That extra step is annoying when failures are common. In our case, they aren't, so the cost is manageable. If your suite has a high baseline failure rate, the math changes, and you may want to keep both the hashes and the reference images, using the hash for fast pass/fail detection and the image only for diagnosis.

The other thing you give up is fuzzy matching, but that's the same point as the determinism condition. Fuzzy matching exists to compensate for non-determinism in the rendering pipeline. If your pipeline is deterministic, you don't need it. If it isn't, you do, and hashing won't work.

What It Changed for Us

Storage going from 18 GB to 19 KB is the change people notice first, but the second-order effects matter more in day-to-day work. Repository operations got faster because the test repo no longer carries gigabytes of binary history. Cloning a fresh checkout takes a fraction of the time it used to. PR reviews got cleaner because UI changes show up as readable JSON diffs instead of opaque PNG swaps.

The flaky-test rate from encoder noise dropped to zero, which was the change that got the most attention from people on the team. Some of the old goldens had been re-saved at some point with slightly different encoder settings, and tests would fail mysteriously even though the rendered pixels were identical to the human eye. The only fix had been to regenerate the golden, which nobody really trusted. Removing the encoder from the comparison loop removed the entire class of failure. CI runs got faster, too, because hash comparison is essentially free compared to image diffing.

None of these wins is novel; Skia, PDFium, and the apitrace project have used hash-based comparison of rendered output for years. What was new for us was committing to it as the primary mechanism for an entire UI test suite on embedded hardware, and accepting the implication that the stored reference is text rather than a binary asset.

If you're working in an environment where the three conditions hold, the implementation is small enough that a prototype takes a day. If even one of them is missing, this isn't the right tool, and the alternatives are well understood. The interesting part is recognizing which environment you're actually in.
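The first condition, raw access before any encoding step, can be illustrated without an image library. In the sketch below, `zlib` stands in for a PNG encoder (PNG uses DEFLATE compression internally), and the synthetic `frame` is a hypothetical stand-in for a raw frame buffer. The same pixels compressed with two different settings produce different bytes, so hashes of the encoded output disagree while hashes of the raw buffer match.

```python
import hashlib
import zlib


def md5_hex(data: bytes) -> str:
    return hashlib.md5(data).hexdigest()


# Synthetic 64x64 RGBA "frame": a repeating gradient, not real GPU output.
frame = bytes((x * 4) % 256 for x in range(64 * 64 * 4))

# Two "encoders" that differ only in compression level.
encoded_a = zlib.compress(frame, 0)  # stored, no compression
encoded_b = zlib.compress(frame, 9)  # maximum compression

# Decoding recovers identical pixels from both encodings.
decoded_a = zlib.decompress(encoded_a)
decoded_b = zlib.decompress(encoded_b)

same_pixels = decoded_a == decoded_b
raw_match = md5_hex(decoded_a) == md5_hex(decoded_b)
encoded_match = md5_hex(encoded_a) == md5_hex(encoded_b)

print(f"same pixels: {same_pixels}")
print(f"raw-buffer hashes match: {raw_match}")
print(f"encoded-output hashes match: {encoded_match}")
```

This is the same effect as goldens re-saved with different encoder settings: nothing visible changed, but a byte-level hash of the encoded file would still report a difference.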

By Rajasekhar sunkara
How to Interpret the Number of Spring ApplicationContexts in Integration Tests

When optimizing Spring Boot integration tests, developers often focus on obvious metrics: total build time, test execution time, CPU usage, memory consumption, or the number of failed tests. These metrics are useful, but they do not always explain why an integration test suite is slow. One of the most important hidden metrics in Spring Boot integration testing is the number of distinct ApplicationContext instances created during the test run, check out my other article. Spring’s TestContext framework can cache and reuse ApplicationContext between test classes, but only if the effective test configuration is the same. If the configuration differs, Spring has to create another context. In large enterprise applications, this can become expensive very quickly. How can the number of contexts correctly interpreted?If a test suite creates two contexts, is that good?If it creates six contexts, is that acceptable?If it creates twenty contexts, is that already a design smell?And most importantly: where should such a judgment come from? Spring itself does not define a universal threshold for a “good” or “bad” number of cached ApplicationContext instances. However, the official documentation explicitly points out that a large number of loaded contexts can make a test suite unnecessarily slow. This means the number of contexts is not just an implementation detail. It is a relevant diagnostic signal. This article explains how I derived a practical interpretation table for a real-world Spring Boot integration test suite and why such a table should be understood as a case-study heuristic, not as a universal Spring Framework rule. Test Grouping Is a Valid Concept General testing research supports that tests can be grouped by similarity, cost, coverage, or runtime behavior. This is highly relevant for Spring Boot integration tests. 
In Spring Boot integration testing, MergedContextConfiguration may be interpreted as one practical grouping dimension: tests with the same effective Spring configuration belong to the same context group. In this case, similarity means shared Spring test configuration. That does not mean all tests should use the same context. It means that tests should not accidentally create different contexts when they are actually testing under the same architectural conditions. Spring’s Context Cache as a Framework-Specific Grouping Mechanism Spring Boot integration tests are not plain unit tests. They often require infrastructure such as dependency injection, database configuration, security configuration, web layer configuration, mock infrastructure, external API clients, messaging components, or tenant-specific setup. Spring’s TestContext framework handles this through the ApplicationContext. The framework can reuse a context if the effective configuration is the same. The cache key is based on configuration parameters such as configuration classes, active profiles, property sources, context customizers, initializers, and other test context settings. Spring’s documentation describes this context caching mechanism and explains that contexts can be reused when the same unique context configuration is encountered again. Let me explain. Two tests may look similar to a developer but still produce different contexts if they use different profiles, properties, mocks, or imported configuration classes. They should normally produce separate context groups. For example, a database-focused test and a test involving an external OData destination may have different infrastructure requirements. In that case, a separate context is not a problem. It reflects a real test configuration group. When every test class introduces a slightly different property, mock, or configuration import without a strong technical reason. 
Then the number of contexts grows not because the architecture requires it, but because the test suite has configuration drift. Why Multiple Contexts Can Be Legitimate in Enterprise Applications Spring Boot itself supports different testing styles. The documentation describes @SpringBootTest for loading the application context through SpringApplication, and it also provides more focused test annotations for specific slices of an application. Spring Boot’s test slices include annotations such as @WebMvcTest, @DataJpaTest, @JsonTest, and others. These annotations intentionally load only selected parts of the application and import different auto-configurations depending on the target slice. Besides the Spring documentation, many community blogs report that different enterprise systems may have separate integration test groups, such as database-focused tests, web/controller tests, security-related tests, and so on. So, the goal should be to minimize unnecessary context fragmentation while preserving justified test configuration groups, instead of forcing the entire integration test suite into one ApplicationContext. From Test Grouping to a Context-Count Heuristic Based on this reasoning, I used the following interpretation in a case study: 1-3 application contexts show excellent context reuse,4-8 are acceptable if justified,10+ should be investigated, and a signal of a fragmented test configuration. Let's discuss the numbers. 1-3: The most integration tests share the same effective configuration. For example: Plain Text Context 1: default integration test context Context 2: database-specific context Context 3: external-system-specific context Such a structure is usually easy to understand. It suggests that the team has standardized its test profiles, properties, and infrastructure setup. 4-8: This is consistent with broader software-testing research, where test suites are not treated as one homogeneous block. 
They are often optimized, selected, prioritized, or clustered according to meaningful technical criteria such as coverage, execution cost, change relevance, or runtime behavior. For example:

Plain Text
Context 1: default SpringBootTest context
Context 2: database-heavy context
Context 3: external API integration context
Context 4: security-specific context
Context 5: multi-tenant context
Context 6: messaging context
Context 7: no-external-destination context
Context 8: migration-specific context

10+: Once the number of contexts reaches double digits, investigation becomes worthwhile. This does not automatically mean the test suite is badly designed. Community articles on Spring test optimization show that a very large enterprise platform with many modules, tenant variants, data stores, messaging systems, and external integrations may legitimately require more contexts. So, the number 10+ is not a firm threshold, but it suggests that the risk of accidental fragmentation becomes higher.

Conclusion

Test grouping is a recognized concept in software-testing research. Large test suites are often optimized through minimization, selection, prioritization, and clustering. These techniques are based on the idea that tests have different costs, purposes, coverage, runtime behavior, and relevance. For Spring Boot integration tests, context reuse is a framework-specific grouping criterion.

(Use the method of test grouping to create Spring application contexts)

Tests with the same effective MergedContextConfiguration belong to the same context group and can share the same cached ApplicationContext. Tests with genuinely different infrastructure needs may require different contexts. Therefore, the goal is not to reduce every enterprise test suite to a single context. The goal is to distinguish between justified test configuration groups and accidental configuration fragmentation. The numbers shown are a practical case-study heuristic, not a universal rule.
But the underlying principle is robust: A small number of well-defined context groups is healthy, but a growing number of slightly different contexts is a performance smell. That principle connects Spring’s TestContext cache mechanism with a broader idea from software-testing research: large test suites should be structured intentionally, not allowed to fragment accidentally.
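The grouping idea can be illustrated with a small language-neutral model. The sketch below is written in Python and does not use Spring; `ContextKey`, `group_by_context`, `interpret`, and the test class names are all hypothetical stand-ins. It assumes that a context is shared only when every part of the key is equal, and it treats a count of 9 as "borderline" because the heuristic above does not cover that value.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ContextKey:
    """Hypothetical stand-in for a test's effective configuration."""
    config_classes: frozenset
    active_profiles: frozenset
    properties: frozenset


def group_by_context(tests):
    """Map each distinct key to the test classes that would share one context."""
    groups = defaultdict(list)
    for name, key in tests.items():
        groups[key].append(name)
    return dict(groups)


def interpret(context_count):
    """Apply the case-study heuristic; the label for 9 is an assumption."""
    if context_count <= 3:
        return "excellent reuse"
    if context_count <= 8:
        return "acceptable if justified"
    if context_count >= 10:
        return "investigate"
    return "borderline (assumption)"


base = ContextKey(frozenset({"AppConfig"}), frozenset({"test"}), frozenset())
# One extra inline property is enough to produce a different key.
drifted = ContextKey(
    frozenset({"AppConfig"}), frozenset({"test"}), frozenset({"feature.x=true"})
)

tests = {
    "OrderServiceIT": base,
    "InvoiceServiceIT": base,
    "CustomerServiceIT": drifted,
}
groups = group_by_context(tests)
print(len(groups), interpret(len(groups)))
```

In this toy suite, two tests share one group and the third forms its own because of a single extra property, which is the kind of drift the heuristic is meant to surface.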

By Constantin Kwiatkowski
The Middleware Gap in AI Agent Frameworks

Most AI agent frameworks treat the model as a black box: you register tools, the model picks one, the tool runs, and the cycle repeats. This pattern is fine for demos, but a production system needs more. We need to manage context windows, cache API calls, filter sensitive tools by role, and compact the message history to stay within token limits.

I landed on middleware while reviewing issues for deepagents and studying its codebase. That is when I started to wonder what middleware really is in the context of AI agents and why it matters. This got me thinking: how do other frameworks handle this problem? So I went ahead and installed Pydantic AI, read the CrewAI source, and checked LangChain and AutoGen.

This article compares two frameworks that implement middleware as a primitive, Deep Agents (from LangChain) and Pydantic AI, distinguishes middleware from callbacks, and explains why this difference matters when running agents at scale.

What You Will Learn

By the end of this article, you will be able to:

  • Distinguish middleware from tool callbacks and event callbacks, and explain why this matters
  • Read working code for deepagents' AgentMiddleware and Pydantic AI's AbstractCapability
  • Understand the differences between the two frameworks: cross-turn AgentState access, production middleware, and config-driven profiles via HarnessProfile
  • Understand why frameworks built on callbacks cannot support patterns that middleware enables

What Is Middleware?

The term "middleware" is often overloaded. In the context of AI agents, it means code that runs before or after every model call, with the ability to read and rewrite the request or response.
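To make that definition concrete, here is a minimal framework-agnostic sketch. It does not use the API of any library discussed in this article: `Request`, `fake_model`, `add_policy`, `tag_response`, and `compose` are hypothetical names, and the "model" is a stub that echoes its input. It assumes a middleware is simply a function that receives the request plus a handler for the next layer.

```python
from dataclasses import dataclass, replace
from typing import Callable


@dataclass(frozen=True)
class Request:
    system_prompt: str
    messages: tuple


Handler = Callable[[Request], str]


def fake_model(request: Request) -> str:
    """Stand-in for a real LLM call; echoes what it was given."""
    return f"{request.system_prompt}|{len(request.messages)}"


def add_policy(request: Request, handler: Handler) -> str:
    # Rewrite the request before the model sees it.
    updated = replace(request, system_prompt=request.system_prompt + " Be concise.")
    return handler(updated)


def tag_response(request: Request, handler: Handler) -> str:
    # Rewrite the response after the model returns.
    return handler(request) + " [checked]"


def _bind(middleware, next_handler: Handler) -> Handler:
    def wrapped(request: Request) -> str:
        return middleware(request, next_handler)
    return wrapped


def compose(middlewares: list, model: Handler) -> Handler:
    """Wrap the model so the first middleware in the list is outermost."""
    handler = model
    for middleware in reversed(middlewares):
        handler = _bind(middleware, handler)
    return handler


call = compose([add_policy, tag_response], fake_model)
result = call(Request(system_prompt="You are helpful.", messages=("hi",)))
print(result)
```

The point of the sketch is that both the outgoing request and the incoming response pass through code you control on every call.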
What Differentiates Middleware From the Rest

Middleware is different from:

  • Tool callbacks – fired when a tool is called, not when the model is called.
  • Event callbacks – fire-and-forget notifications that let you observe the call but not change it.
  • Post-processing – wrapping the final output after the agent loop ends.

Middleware sits inside the request/response cycle of every LLM call, which gives it unique capabilities.

Where the Middleware Sits in the Agent Loop

It's the only layer with access to the request before it reaches the model and the response before it reaches the tool executor.

Capability                    | Middleware | Tool callback | Event callback
Modify system prompt per call | ✓          | ✗             | ✗
Filter tool list dynamically  | ✓          | ✗             | ✗
Transform message history     | ✓          | ✗             | ✗
Cancel the model call         | ✓          | ✗             | ✗
Track state across turns      | ✓          | Partial       | ✗
Observe output                | ✓          | ✓             | ✓

Deep Agents: Middleware as a Composable Hook

Installation:

Shell
pip install deepagents
# Requires Python >=3.10
# Docs: https://docs.langchain.com/oss/python/deepagents/overview

deepagents ships AgentMiddleware as a base class from langchain.agents.middleware.types. Every middleware subclass can override these key hooks (each has an async variant):

Python
class AgentMiddleware:
    def wrap_model_call(
        self,
        request: ModelRequest,
        handler: Callable[[ModelRequest], ModelResponse],
    ) -> ModelCallResult:
        # Intercept before AND after the model call. Call handler() to execute it.
        return handler(request)

    def before_model(self, state: AgentState, runtime: Runtime) -> dict | None:
        # Runs before the model is called. Can update agent state.
        return None

    def after_model(
        self, state: AgentState, runtime: Runtime
    ) -> dict | None:
        # Runs after the model responds. Can inject new messages into state.
        return None

    def wrap_tool_call(
        self,
        request: ToolCallRequest,
        handler: Callable[[ToolCallRequest], ToolMessage],
    ) -> ToolMessage:
        # Intercept individual tool calls for retry logic, monitoring, or modification.
        return handler(request)

    # async def awrap_model_call(...): ...
    # async versions of each hook also available

The key insight: wrap_model_call receives the full request (messages, tools, settings) and can change what happens next, including passing a modified request to the next middleware in the stack. Multiple middleware compose like nested functions:

Request  -> Middleware A -> Middleware B -> Model
Response <- Middleware A <- Middleware B <- Model

Deep Agents middleware composition (innermost = closest to model)

Built-In Middleware Deep Agents Ships

Deep Agents includes several production-grade middleware out of the box:

Python
from deepagents.middleware import (
    FilesystemMiddleware,         # Filesystem read/write tools + permission enforcement
    MemoryMiddleware,             # Injects relevant memories into system prompt each turn
    SkillsMiddleware,             # Injects SKILL.md definitions into system prompt
    SubAgentMiddleware,           # Spawns synchronous subagents as tools
    AsyncSubAgentMiddleware,      # Spawns async background subagents
    SummarizationMiddleware,      # Auto-compacts history when token budget fills
    SummarizationToolMiddleware,  # Exposes compact_conversation as an explicit tool
)

Writing a Custom Middleware

Here is a practical example: a rate-limiting middleware that counts tool calls per turn and injects a warning into the system message when the agent is being "chatty":

Python
from collections.abc import Callable

from langchain.agents.middleware.types import (
    AgentMiddleware,
    ModelRequest,
    ModelResponse,
    ModelCallResult,
)
from langchain_core.messages import SystemMessage


class ToolBudgetMiddleware(AgentMiddleware):
    """Warn the model when it has used many tools in a single turn."""

    def __init__(self, budget: int = 5) -> None:
        self.budget = budget
        self._call_count = 0

    def wrap_model_call(
        self,
        request: ModelRequest,
        handler: Callable[[ModelRequest], ModelResponse],
    ) -> ModelCallResult:
        # Count tool messages in the conversation (each = one tool call made)
        tool_calls_this_turn = sum(
            1 for m in request.messages if hasattr(m, "tool_call_id")
        )
        if tool_calls_this_turn >= self.budget:
            warning = (
                f"\n\n[Budget notice: you have called {tool_calls_this_turn} tools "
                f"this turn. Prefer to synthesize results rather than calling more tools.]"
            )
            system = request.system_message
            if system:
                new_content = str(system.content) + warning
                request = request.override(
                    system_message=SystemMessage(content=new_content)
                )
        return handler(request)

You can wire this custom middleware alongside built-ins:

Python
from deepagents import create_deep_agent
from deepagents.middleware import FilesystemMiddleware, SummarizationMiddleware
from deepagents.backends import FilesystemBackend

backend = FilesystemBackend(root_dir="/workspace")

summarizer = SummarizationMiddleware(
    model="anthropic:claude-haiku-4-5",
    backend=backend,
    trigger=("fraction", 0.85),
    keep=("fraction", 0.10),
)

agent = create_deep_agent(
    model="anthropic:claude-sonnet-4-6",
    middleware=[
        FilesystemMiddleware(backend=backend),
        summarizer,
        ToolBudgetMiddleware(budget=5),  # custom
    ],
)

Middleware runs in list order: FilesystemMiddleware wraps first, then SummarizationMiddleware, then your custom one. Innermost is the closest to the model.

The Profiles API: Middleware Configuration Without Code

deepagents v0.5.4 added HarnessProfile, which lets you declare middleware changes declaratively: add extra middleware, exclude middleware, and override tool descriptions without touching create_deep_agent call sites.

HarnessProfile merge semantics (additive, model-specific overrides, provider-level):

Python
from deepagents.profiles import HarnessProfile, register_harness_profile

register_harness_profile(
    "anthropic:claude-haiku-4-5",
    HarnessProfile(
        system_prompt_suffix="Be concise. Prefer short answers.",
        excluded_middleware={SummarizationMiddleware},  # Haiku has small context, skip
        extra_middleware=[ToolBudgetMiddleware(budget=3)],
    ),
)

# Now any agent using claude-haiku-4-5 automatically gets this profile applied
agent = create_deep_agent(model="anthropic:claude-haiku-4-5")

You can also load the profile from a YAML file for a config-file-driven deployment:

YAML
# haiku-profile.yaml
system_prompt_suffix: "Be concise. Prefer short answers."
excluded_middleware:
  - SummarizationMiddleware

Python
import yaml
from deepagents.profiles import HarnessProfileConfig, register_harness_profile

with open("haiku-profile.yaml") as f:
    register_harness_profile(
        "anthropic:claude-haiku-4-5",
        HarnessProfileConfig.from_dict(yaml.safe_load(f)),
    )

Pydantic AI: Capabilities as the Closest Parallel

Installation:

Shell
pip install pydantic-ai
# Docs: https://ai.pydantic.dev

Pydantic AI's AbstractCapability is the closest architectural equivalent to LangChain's deepagents middleware. Subclass it from pydantic_ai.capabilities and override any of these lifecycle hooks:

Python
from pydantic_ai.capabilities import AbstractCapability


class MyCapability(AbstractCapability):
    # Run-level hooks
    async def before_run(self, ctx, ...): ...               # Before run starts
    async def after_run(self, ctx, *, result): ...          # Observe/modify result
    async def wrap_run(self, ctx, *, handler): ...          # Full wrap: intercept + resume
    async def on_run_error(self, ctx, *, error): ...        # Handle run-level errors

    # Graph-node hooks
    async def before_node_run(self, ctx, *, node): ...      # Before each graph node
    async def wrap_node_run(self, ctx, *, node, handler): ...
    async def on_node_run_error(self, ctx, *, node, error): ...

    # Model-request hooks: intercept the raw LLM call
    async def before_model_request(self, ctx, request_context): ...  # Modify messages/tools
    async def wrap_model_request(self, ctx, *, request_context, handler): ...
    async def after_model_request(self, ctx, *, request_context, response): ...
    async def on_model_request_error(self, ctx, *, request_context, error): ...

Note on granularity: Pydantic AI's before_model_request hook receives a ModelRequestContext containing messages, model_settings, and model_request_parameters (which includes the tool list). You can return a modified ModelRequestContext to rewrite what gets sent to the model, which is similar to deepagents' wrap_model_call. The key remaining difference is state persistence: these hooks operate within a single run's context, not across agent turns via a shared graph state.

A practical example, wrapping a run to add timing and error context:

Python
import time

from pydantic_ai import Agent
from pydantic_ai.capabilities import AbstractCapability


class TimingCapability(AbstractCapability):
    async def wrap_run(self, ctx, *, handler):
        start = time.monotonic()
        try:
            result = await handler()
            elapsed = time.monotonic() - start
            print(f"Run completed in {elapsed:.2f}s")
            return result
        except Exception as e:
            elapsed = time.monotonic() - start
            print(f"Run failed after {elapsed:.2f}s: {e}")
            raise


agent = Agent(
    "anthropic:claude-sonnet-4-6",
    capabilities=[TimingCapability()],
)

For injecting dynamic content into system prompts, you can use before_model_request to return a modified ModelRequestContext with updated instruction_parts, or use the instructions field and callable system_prompt at agent construction time.

Pydantic AI vs. Deep Agents Middleware: The Key Differences

Dimension                      | deepagents                              | Pydantic AI
Hook class                     | AgentMiddleware                         | AbstractCapability
Hook granularity               | Per LLM request, tool call, node, run   | Per LLM request, node, and run
System prompt injection        | via ModelRequest in wrap_model_call     | via ModelRequestContext in before_model_request
Error hooks                    | No dedicated hook                       | on_run_error, on_node_run_error, on_model_request_error
State persistence across turns | AgentState dict shared with LangGraph   | Per-run context only
Tool list access & filtering   | ModelRequest.tools in wrap_model_call   | via ModelRequestContext.model_request_parameters
Cross-framework portability    | deepagents / LangGraph only             | Pydantic AI only
Config-driven (no code)        | Yes - HarnessProfile + YAML             | No
Built-ins included             | 7 production middleware                 | None - user-defined

The biggest practical difference is that Deep Agents' middleware has access to AgentState (the full LangGraph graph state across turns) through after_model, which means middleware can read message history, inject summary nodes, and write back to the state. Pydantic AI capabilities are scoped to a single run's context. This means that there is no shared graph state across agent turns.

What Other Frameworks Do Instead

LangChain Callbacks (v0.1 Style)

Python
from langchain_core.callbacks.base import BaseCallbackHandler


class MyCallback(BaseCallbackHandler):
    def on_llm_start(self, serialized, prompts, **kwargs): ...
    def on_llm_end(self, response, **kwargs): ...

You cannot modify or cancel the request, and the handlers do not compose. This is useful for logging, but not for request transformation.

CrewAI Step Callbacks

Python
from crewai import Crew


def my_step_callback(output):
    print(f"Step completed: {output}")


crew = Crew(agents=[...], tasks=[...], step_callback=my_step_callback)

Step callbacks are called after each task step completes. They have no access to the request, and you cannot modify the list of tools or even the system prompt. They have similar limitations to LangChain callbacks.
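The limitation shared by both callback styles can be shown without any framework. In this hedged sketch, `model`, `run_with_callback`, `redact`, and `make_cache_middleware` are hypothetical names, and the "model" is a stub that upper-cases its prompt. It assumes a callback whose return value the runner ignores, which is the fire-and-forget behavior described above, and contrasts it with a middleware that can skip the model call entirely by answering from a cache.

```python
calls = []  # records every prompt that actually reaches the stub model


def model(prompt: str) -> str:
    """Stand-in for a real LLM call."""
    calls.append(prompt)
    return prompt.upper()


def run_with_callback(prompt: str, on_start) -> str:
    # Observer style: the callback is notified, but its return value is ignored.
    on_start(prompt)
    return model(prompt)


def redact(prompt: str) -> str:
    # A callback that tries to change the prompt; the runner never uses this.
    return prompt.replace("secret", "***")


def make_cache_middleware():
    cache = {}

    def middleware(prompt: str, handler) -> str:
        # Middleware style: decide whether the model is called at all.
        if prompt not in cache:
            cache[prompt] = handler(prompt)
        return cache[prompt]

    return middleware


observed = run_with_callback("secret plan", redact)

cached_call = make_cache_middleware()
first = cached_call("hello", model)
second = cached_call("hello", model)
print(observed, first, second, calls)
```

The redaction never reaches the model in the callback version, while the cached second call never reaches the model at all in the middleware version.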
AutoGen v0.4 Message Middleware

AutoGen's message-passing model means you can inject agents into the conversation (e.g., a logging proxy agent), but there's no formal pre- or post-hook around model calls. The closest equivalent is a UserProxy agent that intercepts messages, but it's a peer agent and not a transparent middleware layer.

What the Middleware Gap Can Actually Cost You

  • Token budget. When a conversation is approaching the model limit, you want to summarize old tool outputs before the model call, not after. A callback fires too late to help, and you might run out of tokens or overshoot your token budget.
  • Per-user tool filtering. In any given organization, different users have different roles and access permissions. Without middleware, it's hard to filter out tools that certain users cannot run. Consider a scenario where you have no middleware to filter, and you just call the LLM, which in turn calls the tools, only to find out that the tool call failed because of access permissions. That's wasted resources, wasted tokens, and unnecessary LLM calls, all of which could be easily avoided.
  • Prompt caching across providers. Anthropic's prompt caching requires cache_control in the request. AnthropicPromptCachingMiddleware rewrites the message and tool definitions of every model call to apply cache breakpoints in the right places. Without middleware, this would have required changes to every call site.

Conclusion

The middleware gap is why some production agents are trivially simple in Deep Agents and Pydantic AI, but not possible in other frameworks. Summarizing message history before the model call, filtering tools based on roles, and injecting cache-control blocks in the right position are all possible with middleware, but not with a callback that fires after the call completes.
For teams choosing a framework today: if you need to transform what the model sees on every call rather than just observe it, the choice narrows to Deep Agents or Pydantic AI. If you want that transformation to reference or rewrite history spanning multiple turns, deepagents with LangGraph is the only framework that supports this today. Middleware is not the most visible feature of an agent framework, but it is a primitive that sets the ceiling for everything else.
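As a closing illustration of the per-user tool filtering case discussed earlier, here is a minimal framework-agnostic sketch. It does not use the deepagents or Pydantic AI APIs: `ToolRequest`, `ROLE_PERMISSIONS`, `filter_tools`, and `fake_model` are hypothetical, and the role table is invented for the example. It assumes the request carries the caller's role and that unknown roles should see no tools.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ToolRequest:
    user_role: str
    tools: tuple


# Hypothetical permission table; a real system would load this from policy.
ROLE_PERMISSIONS = {
    "admin": {"search", "read_file", "delete_file"},
    "viewer": {"search", "read_file"},
}


def filter_tools(request: ToolRequest, handler):
    """Remove tools the caller may not run before the model sees the list."""
    allowed = ROLE_PERMISSIONS.get(request.user_role, set())
    visible = tuple(tool for tool in request.tools if tool in allowed)
    return handler(replace(request, tools=visible))


def fake_model(request: ToolRequest) -> list:
    """Stand-in for a model call; reports which tools it was offered."""
    return list(request.tools)


all_tools = ("search", "read_file", "delete_file")
viewer_tools = filter_tools(ToolRequest("viewer", all_tools), fake_model)
unknown_tools = filter_tools(ToolRequest("guest", all_tools), fake_model)
print(viewer_tools, unknown_tools)
```

Because the filter runs before the model call, the model never proposes a tool that would later fail on permissions, which is where the wasted calls in that scenario come from.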

By Ninaad Rao

Top Microservices Experts


Jubin Abhishek Soni

Senior Software Engineer,
Yahoo

Jubin Soni is a Senior Software Engineer with 14+ years of experience building scalable systems, real-time data pipelines, and AI-driven platforms for industry leaders in technology and media. With deep expertise spanning cloud-native architectures, distributed systems, and applied machine learning, Jubin brings a rare combination of engineering depth and research breadth to every problem he tackles. He is a published researcher with work appearing in IEEE and other peer-reviewed venues, and a Manning Publications author. Jubin holds IEEE Senior Member status and has spoken at technical conferences including P99 CONF, ACM and APIdays, sharing his expertise in distributed systems, serverless architectures, and AI with engineering communities globally. He is passionate about pushing the boundaries of what scalable software can do — and sharing those insights with fellow engineers through writing, research, and open source.

Satrajit Basu

Chief Architect,
TCG Digital

Satrajit, a visionary Chief Architect and an AWS Ambassador, brings unparalleled expertise in architecting and directing mission-critical projects for industry leaders across various sectors. From banking to aviation, Global Distribution Systems (GDS) to restaurant and travel e-commerce, Satrajit has mastered the art of migrating and modernizing workloads on AWS. With an unwavering passion for technology, Satrajit ensures that applications on AWS are not just well-architected but also leverage the latest cutting-edge technologies. An architect par excellence, Satrajit's dedication extends beyond project delivery. He generously shares his vast knowledge through insightful technical blogs, enlightening aspiring architects and developers worldwide

The Latest Microservices Topics

Architecting Trustworthy AI: Engineering Patterns for High-Stakes Environments
This post presents three domain-agnostic engineering patterns for building AI systems that remain safe even when the model is wrong.
June 29, 2026
by Sujay Puvvadi
· 541 Views
Building Production-Safe Agentic Remediation With Docker MCP Gateway: Lessons From 43% to 100% Accuracy
We built an AI Docker remediation system on MCP Gateway. First version: 43% correct. After 9 engineering fixes: 100%. Here's what changed.
June 29, 2026
by Mohammad-Ali Arabi
· 638 Views
Implementing Asynchronous Communication Between Microservices Using Kafka and Spring Boot
Kafka decouples services, buffers spikes, and routes failures to a DLT. Schemas are contracts; consumers must be idempotent.
June 24, 2026
by Mallikharjuna Manepalli
· 1,611 Views
Who Owns the Data Stack?: How AI Is Reshaping Ownership, Architecture, and Accountability Across Teams
Build AI-native data systems with clear ownership, semantic contracts, and governance. Learn how accountability, retrieval, and data quality shape AI behavior.
June 24, 2026
by Miguel Garcia DZone Core CORE
· 999 Views · 1 Like
Your AI Coding Agent Can't Steal What It Never Had: The Docker Sandbox Isolation Story
Docker Sandbox runs AI agents in microVMs. The API key never enters the sandbox — the host proxy authenticates on the agent's behalf.
June 19, 2026
by Shamsher Khan DZone Core CORE
· 1,388 Views
The Rise of Microservices Architecture in Scalable Applications
Microservices architecture enables scalable applications by breaking systems into independent services, improving flexibility and scalability.
June 17, 2026
by Mitchell Jhonson
· 1,808 Views
Parallel Kafka Batch Processing With Kotlin Coroutines in Spring Boot
Learn how Kotlin Coroutines improve Spring Boot Kafka batch processing with parallel execution, resource throttling, and faster database operations.
June 16, 2026
by Erkin Karanlık
· 2,172 Views · 1 Like
Runtime Formula Evaluation With MVEL Library in Spring Boot
Learn how to use MVEL in Spring Boot to move business rules into the database, reduce deployments, and support dynamic runtime calculations.
June 16, 2026
by Erkin Karanlık
· 2,091 Views
Building a Multi-Agent Orchestration Capability: Architecture and Code Walkthrough
An architectural pattern where multiple specialized AI agents collaborate through a central orchestrator and leverage tools to solve complex user objectives.
June 16, 2026
by Narendra Lakshmana gowda
· 1,110 Views
Operationalizing Enterprise AI at Scale: Architecture, Governance, and Adoption
Enterprise AI success depends on scalable architecture, governance automation, AI operations, observability, and developer-first enablement strategies.
June 12, 2026
by Aravind Nuthalapati DZone Core CORE
· 2,009 Views · 3 Likes
A Spring Boot App With Half the Startup Time
Learn how Project Leyden and AOT caching can cut Spring Boot startup time in half, improving Kubernetes scaling and application responsiveness.
June 12, 2026
by Sven Loesekann
· 2,575 Views · 3 Likes
Architecting Proactive IT: NinjaOne Remote Monitoring and Management
Transforming from reactive IT support to proactive infrastructure management requires platforms built on modern architectural principles.
June 12, 2026
by Stelios Manioudakis DZone Core CORE
· 2,667 Views · 2 Likes
Beyond REST: Architecting High-Density Agentic Microservices With MCP and WASI-NN
REST APIs waste tokens. UMA uses MCP to bridge agents to local Wasm/WASI-NN, slashing costs and latency by replacing raw data with deterministic, executable intent.
June 12, 2026
by Nabin Debnath
· 1,317 Views
Combining Temporal and Kafka for Resilient Distributed Systems
Kafka handles durable event streaming while Temporal manages long-running workflow state, retries, and recovery to build resilient distributed systems.
June 9, 2026
by Akhil Madineni
· 1,734 Views · 1 Like
Frame Buffer Hashing for Visual Regression on Embedded Devices
Learn how frame buffer hashing reduced visual regression storage from 18GB to 19KB while speeding up CI and eliminating flaky image diffs.
June 9, 2026
by Rajasekhar sunkara
· 739 Views
How to Interpret the Number of Spring ApplicationContexts in Integration Tests
When optimizing Spring Boot integration tests, developers often focus on obvious metrics, but they do not always explain why an integration test suite is slow.
June 8, 2026
by Constantin Kwiatkowski
· 1,423 Views
The Middleware Gap in AI Agent Frameworks
Most agent frameworks observe model calls and allow rewriting them only after they reach the model, making an understanding of callbacks and middleware essential.
June 8, 2026
by Ninaad Rao
· 1,477 Views · 4 Likes
Is the Data Warehouse Dead? 3 Patterns From Enterprise Architecture That Answer This Question
No, but its role has fundamentally changed. Here is what I have seen work, after building data platforms at enterprise scale across multiple industries.
June 5, 2026
by Nabarun Bandyopadhyay
· 3,845 Views · 1 Like
Why Your Test Automation Is Always Behind the Code And the Architecture That Fixes It
Most QA teams are stuck in a manual scripting loop. Here's the requirement-driven architecture that eliminates the coverage gap permanently.
June 5, 2026
by Waqar Hashmi
· 2,240 Views
Multi-Scale Feature Learning in CNN and U-Net Architectures
Multi-scale feature learning helps CNNs and U-Net models combine global context with fine details, improving accuracy in tasks like image segmentation.
June 3, 2026
by Akhil Madineni
· 1,110 Views
