DZone

Tools

Development and programming tools are used to build frameworks, and they can be used for creating, debugging, and maintaining programs — and much more. The resources in this Zone cover topics such as compilers, database management systems, code editors, and other software tools and can help ensure engineers are writing clean code.

Latest Premium Content
Trend Report
Kubernetes in the Enterprise
Refcard #366
Advanced Jenkins
Refcard #378
Apache Kafka Patterns and Anti-Patterns

DZone's Featured Tools Resources

11 Agentic Testing Tools to Know in 2026


By Alvin Lee (DZone Core)
Agentic testing tools help teams plan, generate, adapt, and run tests with far less manual effort. They’re quickly becoming part of how modern QA scales without slowing delivery.

One thing to get right from the start is scope. Not all agentic testing tools operate at the same level of scope or strategic impact. They vary significantly in what they do and where they fit. Some are point solutions that help you author or run tests faster. Others sit inside broader AI-driven quality platforms that prioritize risk, optimize test portfolios, and enforce quality gates across the pipeline.

This post covers 11 agentic testing tools to know about in 2026. They’re grouped so you can compare them based on scope, strengths, and fit for your organization.

What Is an Agentic Testing Tool?

An agentic testing tool is software that uses AI agents to autonomously plan, generate, maintain, and execute tests. It often makes decisions based on context, such as requirements, code changes, risk signals, or past results. It goes beyond AI-assisted automation by adding initiative and workflow-level decision-making. Instead of only suggesting what to do next, it takes action within defined boundaries.

Here are 11 agentic testing tools grouped by scope. Each includes a summary and key strengths and considerations. Let’s go!

Enterprise AI-Driven Quality Platforms

These platforms extend beyond test creation to orchestrate automation, intelligence, and governance at scale. They are suited for organizations that require stability, risk prioritization, and release confidence across complex environments.

1. Tricentis Tosca

Tricentis Tosca is designed for enterprise test automation where stability, scale, and governance matter. In an agentic context, the shift is moving from “write and maintain scripts” to “orchestrate outcomes,” especially across complex apps and high-change environments. Tricentis enables AI-driven testing and agentic quality engineering across your delivery pipeline.
It also positions MCP as a way to bridge AI and testing tools through a universal integration approach, which matters if you’re thinking about agentic workflows that span multiple systems.

Strengths
  • Suitable for large regression suites and complex end-to-end workflows.
  • AI-assisted resilience helps reduce long-term maintenance costs.

Considerations
  • The highest value shows up when teams commit to governance and standardization (not “ad hoc scripts”).
  • Adoption typically requires alignment across QA, engineering, and release stakeholders.

2. SmartBear

SmartBear is best viewed as a broad testing portfolio vendor that has been positioning around AI across testing workflows.

Strengths
  • Covers multiple testing disciplines.
  • Suitable for consolidated vendor strategies.

Considerations
  • AI depth varies across products.
  • Portfolio integration matters.

3. UiPath Test Suite

UiPath Test Suite extends testing into broader automation ecosystems. In an agentic context, it is relevant for teams that want testing integrated into AI-driven business process automation and orchestration environments.

Strengths
  • Aligns testing with broader automation initiatives.
  • Fits organizations standardizing around enterprise automation platforms.

Considerations
  • Strongest value when already invested in the UiPath ecosystem.
  • Organizations must evaluate how deeply autonomous testing workflows integrate with CI/CD.

AI-Native Testing Platforms

AI-native testing platforms are built with AI at the core of test creation and execution workflows. They aim to reduce friction from requirements to automation and help teams maintain speed and stability as systems evolve.

4. ACCELQ

ACCELQ positions itself around AI-powered automation and end-to-end testing acceleration. For agentic buyers, the key question is whether the platform reduces friction from requirements to automation to execution and whether it can keep pace as systems change.

Strengths
  • Faster ramp-up for automation.
  • Structured automation workflows.
Considerations
  • Like any platform, success depends on fit with your stack and operating model.
  • Ensure governance and explainability are strong enough for enterprise release standards.

5. mabl

mabl is an AI-native testing vendor geared toward continuous testing and reducing maintenance overhead. For agentic tool evaluation, focus on whether AI helps you run reliably at speed, not just generate tests during setup.

Strengths
  • CI/CD integration.
  • Automation resilience focus.

Considerations
  • Primarily web-centric workflows.
  • Enterprise governance depth varies.

6. Functionize

Functionize is commonly positioned as AI-forward test automation focused on reducing manual work across authoring, execution, and maintenance. In a practical agentic sense, tools like this aim to do more of the work for you, especially around test upkeep as systems evolve.

Strengths
  • Lifecycle focus: value isn’t only authoring, but also keeping tests healthy over time.
  • AI-forward orientation fits teams pushing toward higher autonomy.

Considerations
  • Scope depends on team maturity.
  • Organizations may need to evaluate governance needs more deeply.

Point-Solution Agentic Tools

Point-solution agentic tools focus on solving a specific testing bottleneck rather than managing the full quality lifecycle. They are often used to accelerate test authoring, execution, or UI interaction without requiring a broader platform shift.

7. testRigor

testRigor is typically associated with natural-language-driven test creation and reducing scripting complexity. For agentic buyers, it often lands in the “make authoring easier” category.

Strengths
  • Lower barrier to authoring.
  • Rapid initial automation.

Considerations
  • Primarily focused on UI regression.
  • Potential trade-off between depth and creation speed.

8. QA Wolf

QA Wolf is often positioned around fast test creation and managed execution models for teams that want results without building everything in-house.
In an agentic tooling conversation, this fits as a way to compress time-to-value, especially when internal bandwidth is limited.

Strengths
  • Fast time to coverage.
  • Managed execution support.

Considerations
  • The operational model differs from in-house-only tools.
  • Evaluate long-term scaling fit.

9. Virtuoso QA

Virtuoso is frequently grouped with AI-led UI testing approaches that aim to reduce manual scripting and increase resilience. Its relevance depends on whether it meaningfully adapts and maintains tests as the app changes, not just how quickly it creates them.

Strengths
  • Faster UI automation creation.
  • Reduced scripting complexity.

Considerations
  • Validate the reality of flake handling and maintenance in your environment (dynamic UIs expose gaps quickly).
  • Ensure pipeline integration and evidence output meet enterprise needs.

10. AskUI

AskUI approaches automation through UI perception and interaction. That can matter when you test across varied front ends, remote desktops, or environments where DOM-level automation is not always feasible.

Strengths
  • Useful for UI-driven automation challenges.
  • Works across heterogeneous UI surfaces.

Considerations
  • Typically narrower in scope than end-to-end platforms.
  • Validate stability and evidence outputs for long-running regression usage.

11. CoTester by TestGrid

CoTester lands in the agentic assistant space for testing workflows. Tools in this category typically let you offload specific tasks, helping your team by generating tests, suggesting validations, or scaling coverage with less effort.

Strengths
  • Assistant-style support for testing tasks.
  • Accelerates defined QA activities.

Considerations
  • Not a full end-to-end platform.
  • Best as a complementary capability.

How Agentic Technology Applies to Modern Testing

Agentic testing brings the agent loop into quality workflows. It decides what to test, executes the work, evaluates results, and adjusts based on context.
Here’s what that looks like in real delivery pipelines:
  • Planning: Interpreting requirements, code changes, and risk signals to select the right tests.
  • Execution: Running tests and collecting evidence.
  • Adaptation: Repairing brittle selectors and managing flakiness as systems change.
  • Governance: Enforcing quality gates based on measurable signals such as coverage and change impact.

Agentic testing is not AI that writes tests. It is AI that runs a quality workflow.

How to Choose the Right Agentic Testing Tool

Buying decisions usually fail for one of two reasons: teams choose a point tool when they actually need a platform, or they buy a platform when they need quick, targeted relief. Use this checklist to avoid both mistakes.

1. Start With Scope: Assistant, Point Solution, or Platform?

Ask one blunt question: Do you need help authoring tests, or do you need help governing release confidence?

2. Demand Measurable Outcomes, Not Demos

Demos can look impressive, but real value shows up in production metrics. Look for clear improvements in regression time, maintenance effort, flake rate, defect escapes, and coverage visibility. If success cannot be measured, ROI will be hard to prove.

3. Validate Governance: Explainability, Auditability, Control

Agentic systems take action, so your team must understand why. You should be able to explain test selection, recent changes, and the evidence behind a release decision, especially in regulated and enterprise environments.

If you want agentic testing that scales beyond a single team or application, you need more than a test generator. You need an AI-driven approach that connects automation, intelligence, and governance.

FAQ: Agentic Testing Tools in 2026

What Makes a Testing Tool Truly Agentic?

A testing tool is truly agentic if it can independently plan and execute testing actions based on context, such as code changes, requirements, or risk signals. It does not just suggest next steps.
It selects tests after a pull request, generates tests from requirements, repairs broken locators, and enforces quality gates with minimal human input.

Are Agentic Testing Tools the Same as AI Test Automation?

No. AI test automation typically assists with parts of automation, such as smarter locators or faster script creation. Agentic testing tools go further by automating decision-making across workflows. They can decide which tests to run for a build, identify untested code changes, and prioritize high-risk areas without manual triage.

What Results Should I Expect From Agentic Testing?

Most teams see measurable improvements in regression cycle time and maintenance effort when agentic workflows are implemented correctly. A realistic benchmark is reducing regression runtime by 30–70% through change-based test selection and cutting maintenance effort by 30–50% through self-healing automation and flake reduction.
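To make "change-based test selection" and "quality gates" concrete, here is a minimal sketch in Python. It is not taken from any of the tools listed above: the coverage map, the `SuiteEntry` type, the file names, and the failure budget are all hypothetical, and a real platform would derive coverage from instrumentation or historical runs rather than a hand-written table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SuiteEntry:
    """One test plus the source files it exercises (hypothetical coverage map)."""
    name: str
    covers: frozenset


def select_tests(changed_files, tests):
    """Return the tests whose coverage overlaps the changed files, in input order."""
    changed = set(changed_files)
    return [t for t in tests if t.covers & changed]


def untested_changes(changed_files, tests):
    """Return changed files that no known test covers, sorted for stable output."""
    covered = set().union(*(t.covers for t in tests))
    return sorted(set(changed_files) - covered)


def quality_gate(results, max_failures=0):
    """Pass only if the number of failed tests stays within the allowed budget."""
    failures = [name for name, passed in results.items() if not passed]
    return len(failures) <= max_failures


# Illustrative suite; names and file paths are made up for this sketch.
SUITE = [
    SuiteEntry("test_checkout", frozenset({"cart.py", "payment.py"})),
    SuiteEntry("test_login", frozenset({"auth.py"})),
    SuiteEntry("test_search", frozenset({"search.py"})),
]

if __name__ == "__main__":
    changed = ["payment.py", "README.md"]
    print([t.name for t in select_tests(changed, SUITE)])
    print(untested_changes(changed, SUITE))
    print(quality_gate({"test_checkout": True}))
```

Even in this toy form, the three functions map onto the planning (select), visibility (untested changes), and governance (gate) steps described earlier; the "agentic" part is a system that maintains the coverage map and acts on the gate result without manual triage.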
Building a Skill-Based Agentic Reviewer with Claude Code: A Practical Guide Using Skills.MD, MCP Servers, Tools, and Tasks


By Bhaskar Reddy Kollu
In the evolving landscape of agentic AI development in 2026, combining Anthropic’s open Agent Skills standard with the Model Context Protocol (MCP) enables the creation of highly efficient, portable, and context-aware code reviewers. This article presents a practical, production-ready implementation of a skill-based agentic reviewer tailored for code, pull requests, and technical articles. Leveraging a lightweight SKILL.md file for declarative workflows (with progressive context loading to minimize token usage), parallel sub-agents for specialized checks (security, performance, style, and documentation), and a companion local MCP server exposing deterministic tools (linting, GitHub PR fetching, and vulnerability scanning), the system achieves consistent, auditable, and scalable reviews with minimal manual intervention. The provided architecture and copy-paste code snippets (tested patterns compatible with Claude Code, Cursor, Gemini CLI, and other adopting platforms) demonstrate how to install, customize, and extend the reviewer. Real-world benefits include 3–5× faster review cycles, reduced oversight of AI-generated code, and seamless team sharing via GitHub-hosted skills. This pattern exemplifies the complementary power of Skills (domain expertise and repeatable procedures) and MCP (live tool integration), offering developers a blueprint for building future-proof agentic assistants in modern software engineering workflows.

By leveraging Anthropic’s open Agent Skills standard and the Model Context Protocol (MCP), developers can create powerful, context-efficient AI reviewers that automatically trigger structured workflows for code, pull requests, or technical articles. This article provides a complete, production-ready implementation with copy-paste, executable code snippets that you can deploy today in Claude Code, Cursor, or any compatible agent tool.
Why Skill-Based Agentic Reviewers Matter in 2026

Traditional LLM prompts for code review are brittle: they bloat the context window and lack consistency. Agent Skills address this through progressive disclosure: only the skill’s name and description reside in the system prompt (~100 tokens). The full workflow loads only when relevant.

MCP servers add the “agentic” component (real tools for GitHub API calls, linting, security scanning, or database queries) without custom function-calling glue.

Combine them, and you get a reviewer that:
  • Detects review requests automatically
  • Runs parallel sub-tasks (security, performance, style)
  • Calls external tools via MCP
  • Produces consistent, auditable reports

This pattern powers production teams using Claude Code today and works across Claude Code, Cursor, Gemini CLI, and OpenAI Codex CLI thanks to the open Agent Skills specification.

Architecture Overview

Plain Text

    Claude Code (LLM)
    ├── SKILL.md (workflow + checklists) → loaded on demand
    ├── MCP Server (tools: lint, github_fetch, security_scan)
    └── Sub-agents / Tasks (parallel reviewer instances)

  • Skills = recipes (how to review)
  • MCP = kitchen tools (what to review with)
  • Tasks/Sub-agents = parallel execution (agentic scaling)

Step 1: Create the Core Skill – agentic-reviewer

Create the directory structure (works in ~/.claude/skills/, ~/.cursor/skills/, or any supported tool):

Plain Text

    mkdir -p ~/.claude/skills/agentic-reviewer
    cd ~/.claude/skills/agentic-reviewer

Now create the only required file, SKILL.md. It starts with YAML frontmatter:

YAML

    ---
    name: agentic-reviewer
    description: >
      Performs comprehensive agentic reviews of code, PRs, or technical articles.
      Use when the user says "review", "audit", "check quality", "PR review",
      "code review", "article review", or uploads files for feedback.
      Automatically runs security, performance, style, and best-practice checks.
      Can spawn sub-agents and call MCP tools.
    version: 1.2
    ---

The Markdown body follows the frontmatter in the same file:

Markdown

    # Agentic Reviewer

    ## When to Activate
    - Code files or PRs
    - Markdown/technical articles
    - Any request containing "review this", "what's wrong with", or "improve"

    ## Core Review Workflow (always follow in order)
    1. **Understand Context**
       Identify language/framework, purpose, and user goals.
    2. **Static Analysis**
       Use MCP lint tools if available.
    3. **Security & Compliance**
       Use MCP security scanners.
    4. **Performance & Scalability**
    5. **Style & Maintainability**
       Follow team conventions from `references/`.
    6. **Suggestions & Refactoring**
       Provide before/after code.
    7. **Summary Report**
       Include severity levels (Critical/High/Medium/Low).

    ## Sub-Agent Tasks (spawn when complex)
    - `security-reviewer`: OWASP Top 10 + secrets scanning
    - `perf-reviewer`: Big-O, resource usage, caching
    - `docs-reviewer`: Clarity, examples, diagrams

    ## Output Format
    ```markdown
    ## Agentic Review Report
    **Overall Score**: XX/100
    **Critical Issues**: N
    **High Issues**: N

    ### Findings
    - [ ] Category: Description + evidence + fix

    ### Recommendations
    - Code changes (diff format)
    - MCP tool calls used

    **Final Verdict**: Approved / Needs Work / Blocked
    ```

Best Practices
  • Be constructive and specific
  • Reference industry standards (e.g., OWASP, Google Java Style)
  • Prioritize issues by business impact

How to Activate

Plain Text

    # Restart the agent (Claude Code / Cursor)
    # Or use:

Plain Text

    /agentic-reviewer review this PR

The skill auto-triggers on natural language. Test it by pasting any code snippet into Claude Code.

Step 2: Add Deterministic Scripts (Optional but Powerful)

Create a simple validator script inside the skill:

Plain Text

    # Create the scripts/ directory first; the redirect below fails without it
    mkdir -p scripts
    cat > scripts/validate_review.sh << 'EOF'
    #!/bin/bash
    # Executable script called from SKILL.md
    echo "Running automated lint + security baseline..."
    # Add your own tools here (e.g., eslint, trivy, etc.)
    EOF
    chmod +x scripts/validate_review.sh

Update SKILL.md:

Markdown

    ## Workflow (updated)
    2. **Static Analysis**
       Run `scripts/validate_review.sh` on provided files.

Step 3: Make It Truly Agentic with an MCP Server

Skills provide knowledge. MCP provides live tools.

Install:

Plain Text

    pip install fastmcp

Create reviewer-mcp.py:

Python

    import subprocess

    from fastmcp import FastMCP

    mcp = FastMCP("agentic-reviewer-tools")


    @mcp.tool
    def run_linter(file_path: str, language: str = "python") -> str:
        """Run flake8 on a Python file and return its output."""
        if language == "python":
            # Requires flake8 to be installed and on PATH
            result = subprocess.run(["flake8", file_path], capture_output=True, text=True)
            return f"Linting results:\n{result.stdout or 'No issues'}"
        return "Unsupported language"


    @mcp.tool
    def github_pr_fetch(pr_url: str) -> str:
        """Placeholder: does not call GitHub yet, only echoes the URL."""
        return f"PR fetched from {pr_url}; diff available for review"


    @mcp.tool
    def security_scan(file_path: str) -> str:
        """Placeholder: always reports success; replace with a real scanner."""
        return "✅ No critical secrets found"


    if __name__ == "__main__":
        print("Starting MCP server on http://localhost:8080")
        # FastMCP defaults to the stdio transport, which does not take a port,
        # so request HTTP explicitly (assumes a FastMCP 2.x release).
        mcp.run(transport="http", port=8080)

Run:

Plain Text

    python reviewer-mcp.py

Step 4: Parallel Tasks with Sub-Agents

Markdown

    ## Parallel Sub-Agent Tasks
    When the review is large:
    - Spawn security-reviewer
    - Spawn perf-reviewer
    - Synthesize results in the main agent

Testing and Production Tips
  • Test locally: Paste a PR diff and say “run agentic review”
  • Share with team:

Plain Text

    git clone your-repo ~/.claude/skills/agentic-reviewer

  • Distribution: ZIP or publish via marketplace
  • Token efficiency: ~100 tokens until triggered
  • Versioning: Bump YAML version for updates

Real-World Use Cases
  • PR Reviews: Auto-fetches diff via MCP + runs full checklist
  • Technical Article Review (InfoQ/DZone style): Checks clarity, code accuracy, SEO, and technical depth
  • Legacy Code Audit: Spawns 5 sub-agents in parallel
  • On-call Incident Review: Pulls logs via MCP and applies security skill

Teams report 3–5× faster reviews with consistent quality and fewer missed issues.
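The `security_scan` tool in Step 3 always reports that no secrets were found, so it is worth seeing what a deterministic check behind it might look like. The following is a minimal sketch, not the article's implementation: the rule names and regular expressions are hypothetical examples chosen for illustration, and a real deployment would more likely wrap an established scanner than maintain its own patterns.

```python
import re

# Hypothetical rules for illustration only; real scanners ship far larger rule sets.
SECRET_PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private_key_block": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
    "hardcoded_password": re.compile(r"(?i)\bpassword\s*=\s*['\"][^'\"]{4,}['\"]"),
}


def scan_text(text):
    """Return (line_number, rule_name) pairs for every line that matches a rule."""
    findings = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for rule, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                findings.append((line_no, rule))
    return findings


def format_report(findings):
    """Render findings as plain text suitable for returning from a tool call."""
    if not findings:
        return "No secrets matched the configured patterns"
    return "\n".join(f"line {n}: {rule}" for n, rule in findings)


if __name__ == "__main__":
    sample = 'x = 1\naws = "AKIAIOSFODNN7EXAMPLE"\npassword = "hunter22"\n'
    print(format_report(scan_text(sample)))
```

To use it from the MCP server, the body of `security_scan` could read the file at `file_path` and return `format_report(scan_text(contents))`. Note the wording of the empty result: it says only that the configured patterns did not match, which is a weaker and more honest claim than "no critical secrets found".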
Conclusion and Next Steps

The combination of SKILL.md (declarative workflows) + MCP servers (executable tools) + tasks/sub-agents (parallelism) turns Claude Code from a helpful assistant into a production-grade reviewer.

Start today:
  • Copy the SKILL.md above
  • Run the Python MCP server
  • Watch Claude automatically become your expert reviewer

The Agent Skills ecosystem is growing quickly, and the agentic-reviewer skill you just built is portable across tools that support the specification.

Resources
  • Official Agent Skills Spec: agentskills.io
  • FastMCP & MCP servers: mcpservers.org
  • Claude Code Skills Marketplace (built-in)

Happy reviewing! Your code (and articles) will thank you.
AWS Managed Database Observability: Monitoring DynamoDB, ElastiCache, and Redshift Beyond CloudWatch
By Damaso Sanoja
Architecting Petabyte-Scale Hyperspectral Pipelines on AWS
By Anil Bodepudi
Mocking Kafka for Local Spring Development
By Roman Dubinin
Smart Deployment Strategies for Modern Applications

Modern application development has moved toward distributed, cloud-based, and often microservices-based applications that must stay scalable, reliable, and performant under varying conditions. As a result, deployment has become part of application development rather than a final activity.

Intelligent deployment patterns and practices are about building applications that are not just easy to deploy, but also reliable, scalable, and efficient in production. This means moving away from traditional, manual deployment and toward automated, container-based deployment.

Docker and Kubernetes are two prominent technologies in this shift. Docker helps developers package applications together with their dependencies in lightweight, portable containers, which addresses environment consistency problems. Kubernetes deploys, scales, and self-heals those containers.

However, without an appropriate strategy, it is possible to introduce unnecessary complexity and even performance issues. Not every application needs Kubernetes, nor does every deployment issue call for a distributed solution. Knowing when to use Docker on its own, when to use Kubernetes, and how to balance performance, cost, and complexity is vital to delivering effective modern applications.

This article outlines smart deployment strategies using Docker and Kubernetes and covers the advantages, disadvantages, and performance considerations of each.

What Docker Does

Docker packages your application, all its dependencies, and the runtime into a small container.
Issues Before Docker
  • “It works on my machine”: behavior is inconsistent across environments such as development, test, staging, and production
  • Dependency conflicts: language version differences, missing library versions, configuration mismatches

Docker Benefits
  • Same behavior everywhere: local development, staging, production, etc.
  • Isolation between apps: each app runs in its own container.
  • Fast startup: lightweight compared with a virtual machine
  • Easy deployment: just run the container

Plain Text

    docker start <containername>

How Docker Works

Plain Text

    Application Code → Dockerfile → Docker Image → Docker Container → Run application

A container image can run on a developer laptop, on virtual machines, in a data center, or in cloud environments with the same packaged runtime and dependencies.

So Docker resolves our packaging issues. But what if the machine has 100 containers? What if one crashes? How do we scale during high traffic? How do we manage deployments? Docker by itself does not solve these problems. This is where a deployment strategy is needed, and where Kubernetes comes in.

What Kubernetes Does

Kubernetes addresses the operational problem of managing containers once the image has been created. It automates the deployment, scaling, and management of containerized applications, and it maintains the desired state of the application by replacing failed containers and rescheduling workloads as needed.

Kubernetes Benefits
  • Auto scaling: More containers (pods) if traffic increases, and fewer containers if traffic decreases.
  • Self-healing: Starts the container again if it crashes.
  • Load balancing: Spreads the load across the containers.
  • Zero downtime deployment: Updates the system without stopping it.
  • Service management: Manages multiple microservices easily.

Docker builds and runs the container. Kubernetes runs the container reliably at scale.
For example, in a real-world scenario:
  • Docker = packing lunch boxes
  • Kubernetes = managing a large cafeteria serving thousands

Plain Text

    build app → Docker container
            ↓
    Deploy many containers → Kubernetes manages them

What a Kubernetes Deployment Actually Does

A Kubernetes Deployment is a resource in a cluster that manages a group of Pods and ReplicaSets for a workload, typically a stateless application. You define the desired state, and Kubernetes moves the actual state in the cluster toward it. Kubernetes also supports rolling updates, where new Pods are created and marked as ready before the old ones are terminated.

The typical process for deploying a Spring Boot application to a Kubernetes cluster:
  1. Develop a Spring Boot application.
  2. Build and package the application as a Docker image.
  3. Push the Docker image to a registry.
  4. Reference the image in a Kubernetes Deployment.
  5. Kubernetes creates Pods and exposes them via a Service.

Advantages
  • Consistent deployments: Docker provides a standard unit for bundling the application and its runtime dependencies. This minimizes environment drift between development, testing, and production environments, which is one of the biggest advantages of using containers for Java-based Spring Boot applications.
  • Declarative operations: Kubernetes uses a declarative model to manage its deployments. Because the desired state is written down rather than performed by hand, deployments are straightforward to automate.
  • Self-healing: Kubernetes can automatically replace failing containers and reschedule the application in case of unavailability, without manual intervention.
  • Built-in scaling options: Kubernetes provides built-in autoscaling features for the application.
This makes it easy for organizations to implement elastic and efficient scaling for the application.
  • Improved service abstraction and traffic routing: A Kubernetes Service is an API object that defines a single service and provides a consistent endpoint. The system can then distribute traffic to matching Pods. If access to the service from outside the cluster is required, Ingress or Gateway-based routing is an option.
  • Safer upgrades: New versions can be rolled out gradually using rolling updates, which reduces deployment risk.

Disadvantages

1. More Operational Complexity

While Docker on its own is simple for small applications, Kubernetes introduces additional concepts such as Pods, Deployments, Services, Ingress, ConfigMaps, Secrets, autoscaling, and network policies. These features can be justified for production environments, but they add a learning curve that should be planned for. The Kubernetes documentation is divided into so many sections because the platform is multi-functional by design, covering orchestration, networking, scaling, storage, and more.

2. Higher Resource Overhead

Kubernetes runs additional cluster components of its own, which plain Docker does not need. For very small applications, this overhead may outweigh the advantages. This is an assumption based on the complexity of the Kubernetes model compared to the Docker model.

3. Harder Debugging

Debugging a Docker application is relatively simple because the application runs on a single host. Debugging a distributed application is far more complex because multiple hosts, Pods, and Services are involved. This is an assumption based on the complexity of the Kubernetes model compared to the Docker model.

4. Misconfiguration Risk

Kubernetes is a powerful technology, but misconfiguration can lead to application failures.
Network Policies, for example, are complex by design and need careful, production-grade configuration.

Performance Considerations

Kubernetes doesn’t make your application run faster on its own. Performance still relies on many factors such as application design, JVM tuning, container image quality, database performance, network latency, and resource allocation. However, Kubernetes provides operational tools, including autoscaling and rollout features, that help maintain performance under varying loads.

In general terms, performance considerations can be divided into four categories:
  • Startup performance. A Spring Boot container can be slow to start, depending on factors such as application size. Because a rollout waits for new Pods to become available, slow startup slows the rollout.
  • Runtime efficiency. Containers are much more efficient than traditional deployment models that use many virtual machines, which is one reason Docker is so popular. However, inefficient Docker images or large JVMs can still cause inefficiencies. Docker's documentation discusses image choices that affect this, such as glibc-based versus musl-based images.
  • Scaling behavior. Horizontal pod autoscaling responds to increased load by adding more Pods rather than giving existing Pods more resources. For this to help, the application must be able to scale horizontally and must not have bottlenecks at the single-node level.
  • Networking overhead. Kubernetes Services add a layer of abstraction to the network. This is helpful for manageability and load balancing, but latency-sensitive applications need every layer designed carefully: the abstraction is useful operationally, but it is not free.

Limitations

One limitation to be aware of is the fact that Kubernetes Deployments are designed for stateless workloads.
This means if the application has state tightly coupled with the identity of the instance or has ordered storage, the application may not be the best candidate for a Kubernetes Deployment. The Kubernetes documentation itself describes Deployments as typically being used for workloads that “do not maintain state.”

Other practical limitations are:
  • Small teams may find Kubernetes too heavy for a simple internal app.
  • Stateful systems still require careful storage, backup, and failover planning.
  • Local development experience can become more complex than plain Docker Compose.
  • Security and networking require active design, not default trust.

When/What to Use

    Scenario              Need Docker   Need Kubernetes
    Run single app        Yes           No
    Microservices         Yes           Yes
    Production scale      Yes           Yes (Mandatory)
    Auto scaling needed   No            Yes
    High Availability     No            Yes

Conclusion

The modern deployment model is not just about shipping code; it’s about shipping it reliably and at scale. Docker helps in providing consistency across environments, while Kubernetes helps in providing scale, resilience, and automation.

The smart approach in deployment strategy is about selecting the appropriate tool for the job. Docker might be enough for a simple application, but for a complex application with high availability requirements, Kubernetes becomes a must-have. By understanding the strengths and weaknesses of both tools, we can develop efficient, scalable, and sustainable deployment strategies.
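As a footnote to the scaling behavior discussed under Performance Considerations, the arithmetic behind horizontal pod autoscaling is simple enough to sketch. The ratio below follows the formula given in the Kubernetes documentation (desired replicas = ceil(current replicas × current metric ÷ target metric)); the function name, the default bounds, and the simplified tolerance handling are assumptions made for this example, and the real controller also accounts for pod readiness, missing metrics, and stabilization windows.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=10, tolerance=0.1):
    """Estimate the replica count an HPA would aim for (simplified sketch)."""
    ratio = current_metric / target_metric
    # Inside the tolerance band the autoscaler leaves the replica count alone.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    wanted = math.ceil(current_replicas * ratio)
    # Clamp to the configured bounds (illustrative defaults, not Kubernetes defaults).
    return max(min_replicas, min(max_replicas, wanted))


if __name__ == "__main__":
    # 4 Pods averaging 90% CPU against a 60% target
    print(desired_replicas(4, 90, 60))
    # 4 Pods averaging 30% CPU against a 60% target
    print(desired_replicas(4, 30, 60))
```

The formula also shows why horizontal scaling cannot fix a single-node bottleneck: it only changes the number of Pods, so a workload whose metric does not drop when Pods are added will simply be scaled to the maximum.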

By Manju George
Genkit Middleware: Intercept, Extend, and Harden your Gen AI Pipelines

If you have been building anything non-trivial with Genkit, you have probably bumped into the same set of cross-cutting concerns over and over again: retrying transient model errors, falling back to a cheaper model when quota explodes, gating tool execution behind human approval, injecting filesystem access for coding agents, logging every request and response for observability... Until now, you ended up either wrapping ai.generate() calls by hand or writing ad-hoc helpers that ended up duplicated across flows.

The new Genkit Middleware changes that. It introduces a first-class, composable middleware layer for the generate() pipeline, with hooks for the model, the tool execution, and the high-level generation loop, plus a small but very useful set of official middlewares published in the brand new @genkit-ai/middleware package.

This article is a practical tour of what the new middleware system gives you, the built-in middlewares you can drop in today, and how to write your own with generateMiddleware. The official documentation lives at Genkit Middleware. All examples below assume the JavaScript/TypeScript SDK.

A quick reminder: although this article focuses on the JS/TS middleware API, Genkit is a multi-language framework. The official SDKs cover JavaScript/TypeScript (primary, stable), Go, Python (preview) and Dart/Flutter (preview), and there is a community-maintained Java SDK used in production. The middleware concepts described here are JS/TS-specific today, but the underlying generate() pipeline exists across all SDKs and the same patterns are landing on the other runtimes.

What Is Middleware in Genkit?
Conceptually, Genkit middleware behaves like the middleware you already know from Express or Koa, only applied to the LLM lifecycle instead of HTTP requests:

  • A generate() call is intercepted before it reaches the model.
  • Each middleware can inspect or modify the request, decide whether to call next(), and inspect or modify the response on the way back.
  • Multiple middlewares can be chained. They run in the order they are declared and unwind in reverse order, exactly like an onion.

What makes Genkit's design interesting is that it does not give you a single chokepoint; it gives you three orthogonal interception phases:

  • model: wraps the call to the underlying model. Perfect for retries, fallbacks, request/response logging, or response transformations.
  • tool: wraps tool execution. Ideal for approvals, sandboxing, audit logs, or input/output validation.
  • generate: wraps the whole high-level generation loop (prompting, tool calling, output parsing). Best for things like injecting tools or system instructions before the loop starts.

You opt in per call via a use: array, which keeps things explicit and avoids global side effects:

JavaScript

    const response = await ai.generate({
      model: googleAI.model('gemini-flash-latest'),
      prompt: 'Hello',
      use: [retry({ maxRetries: 3 }), loggerMiddleware({ verbose: true })],
    });

Installation

The official middlewares ship in their own package, decoupled from the Genkit core:

Shell

    npm install @genkit-ai/middleware
    # or
    pnpm add @genkit-ai/middleware

You still need genkit itself and a model provider plugin (for example @genkit-ai/google-genai).

The Built-In Middleware Catalog

Let's go through the five middlewares the Genkit team ships out of the box.

filesystem: Give the Model a Sandboxed File System

filesystem injects a standard set of file manipulation tools (list_files, read_file, write_file, search_and_replace) into the generation loop, restricted to a root directory of your choice.
JavaScript

    import { genkit } from 'genkit';
    import { googleAI } from '@genkit-ai/google-genai';
    import { filesystem } from '@genkit-ai/middleware';

    const ai = genkit({ plugins: [googleAI()] });

    const response = await ai.generate({
      model: googleAI.model('gemini-flash-latest'),
      prompt: 'Create a hello world Node app in the workspace',
      use: [
        filesystem({
          rootDirectory: './workspace',
          allowWriteAccess: true,
        }),
      ],
    });

Useful options:

  • rootDirectory (required): sandbox root, all paths are confined to it.
  • allowWriteAccess: defaults to false. Read-only by default is a sane choice for safety.
  • toolNamePrefix: namespace the injected tools to avoid collisions with your own.

This is essentially the building block for a "coding agent" pattern, without you having to write tool definitions or path validation logic.

skills: Auto-Load Markdown Skills as System Context

skills scans a directory for SKILL.md files (plus their YAML frontmatter), injects relevant ones into the system prompt, and exposes a use_skill tool the model can call when it needs more specific guidance.

JavaScript

    import { skills } from '@genkit-ai/middleware';

    const response = await ai.generate({
      prompt: 'How do I run tests in this repo?',
      use: [skills({ skillPaths: ['./skills'] })],
    });

Think of it as a lightweight, file-based knowledge layer: every skill is a self-contained Markdown file with metadata, and the middleware decides when to surface them. It is a really clean alternative to ad-hoc system prompt soup.

toolApproval: Human-in-the-loop for Tool Calls

toolApproval enforces an allowlist of tools the model is allowed to execute autonomously. Anything outside the list raises a ToolInterruptError, so you can pause execution, ask the user, and resume.
JavaScript

    import { genkit, restartTool } from 'genkit';
    import { toolApproval } from '@genkit-ai/middleware';

    const response = await ai.generate({
      prompt: 'write a file',
      tools: [writeFileTool],
      use: [toolApproval({ approved: [] })], // empty list -> always interrupt
    });

    if (response.finishReason === 'interrupted') {
      const interrupt = response.interrupts[0];
      // ... ask the user, then mark the tool call as approved
      const approvedPart = restartTool(interrupt, { toolApproved: true });
      const resumedResponse = await ai.generate({
        messages: response.messages,
        resume: { restart: [approvedPart] },
        use: [toolApproval({ approved: [] })],
      });
    }

This is exactly the pattern you want for any agent that touches the real world (filesystem writes, payments, sending emails). No more home-grown approval flags scattered across the codebase.

retry: Exponential Backoff With Jitter for Transient Errors

The retry middleware retries failed model calls on transient status codes (UNAVAILABLE, DEADLINE_EXCEEDED, RESOURCE_EXHAUSTED, ABORTED, INTERNAL) using exponential backoff with jitter.

JavaScript

    import { retry } from '@genkit-ai/middleware';

    const response = await ai.generate({
      model: googleAI.model('gemini-pro-latest'),
      prompt: 'Heavy reasoning task...',
      use: [
        retry({
          maxRetries: 3,
          initialDelayMs: 1000,
          backoffFactor: 2,
        }),
      ],
    });

Knobs you actually care about:

  • maxRetries (default 3)
  • statuses: which status codes to retry on
  • initialDelayMs / maxDelayMs / backoffFactor
  • noJitter: if you really want deterministic delays

This is one of those things every team writes once, badly. Having it in the framework is a very welcome change.

fallback: Gracefully Degrade to a Different Model

fallback switches to an alternate model when the primary one fails on configurable status codes.
The classic use case is "try Pro first, fall back to Flash when quota is exhausted":

JavaScript

    import { fallback } from '@genkit-ai/middleware';

    const response = await ai.generate({
      model: googleAI.model('gemini-pro-latest'),
      prompt: 'Try the pro model first...',
      use: [
        fallback({
          models: [googleAI.model('gemini-flash-latest')],
          statuses: ['RESOURCE_EXHAUSTED'],
        }),
      ],
    });

You can chain multiple fallback models, and isolateConfig lets you decide whether the fallback inherits the original request configuration or starts clean (handy when the fallback model does not support the same options as the primary).

Building Your Own Middleware With generateMiddleware

The same primitive that powers all the built-ins is exposed for you. The generateMiddleware helper gives you typed config schemas (via Zod) and access to the ai instance. Here is the canonical "logger" example, straight from the docs but lightly annotated:

JavaScript

    import { generateMiddleware, z } from 'genkit';

    export const loggerMiddleware = generateMiddleware(
      {
        name: 'loggerMiddleware',
        description: 'Logs requests and responses',
        configSchema: z.object({
          verbose: z.boolean().optional(),
        }),
      },
      ({ config, ai }) => {
        return {
          // Phase 1: intercept the model call
          model: async (req, ctx, next) => {
            if (config?.verbose) {
              console.log('Request:', JSON.stringify(req));
            }
            const resp = await next(req, ctx);
            if (config?.verbose) {
              console.log('Response:', JSON.stringify(resp));
            }
            return resp;
          },
          // You could also add `tool: ...` and `generate: ...` hooks here.
        };
      }
    );

Using it is identical to the official ones:

JavaScript

    const response = await ai.generate({
      model: googleAI.model('gemini-flash-latest'),
      prompt: 'Hello',
      use: [loggerMiddleware({ verbose: true })],
    });

A few patterns I have found very useful:

  • PII redaction: implement a model hook that scrubs the request prompt and the response text against a regex/dictionary, returning the cleaned version.
  • Cost accounting: wrap the model hook to read usage tokens from the response, and emit them to your metrics backend tagged by user/feature.
  • Per-tenant quotas: use the generate hook to check a counter (Redis, Firestore...) before calling next(); throw your own custom error if the tenant is over quota.
  • Caching: keyed on a hash of the model + request, return a cached response if hit, otherwise call next() and persist the result.

For more inspiration, the source of the official middlewares is open in the Genkit GitHub repository, and reading them is genuinely educational.

Composition: Stacking Middlewares

Middlewares compose in array order. A reasonable production stack might look like this:

JavaScript

    const response = await ai.generate({
      model: googleAI.model('gemini-pro-latest'),
      prompt: userPrompt,
      tools: myTools,
      use: [
        loggerMiddleware({ verbose: false }), // outermost: see everything
        retry({ maxRetries: 3 }), // recover from transient failures
        fallback({ // degrade if Pro is overloaded
          models: [googleAI.model('gemini-flash-latest')],
          statuses: ['RESOURCE_EXHAUSTED'],
        }),
        toolApproval({ approved: ['searchDocs'] }), // gate dangerous tools
      ],
    });

The order matters: outer middlewares see the result of the inner ones. Put logging on the outside if you want it to record the final state after retries and fallbacks; put it on the inside if you want to see every individual model attempt.

The Importance of Middleware for Production Agents

Genkit Middleware is one of those features that does not look flashy in a changelog but quietly fixes a lot of real-world friction.
It pushes Genkit closer to a "batteries-included" framework for production agents:

  • Cross-cutting concerns are no longer copied and pasted across flows.
  • Safety-critical behavior (approvals, sandboxes, fallbacks) is declarative.
  • The model / tool / generate split gives you precise control without forcing you to monkey-patch.
  • The middleware contract is small enough that the community can ship plugins that interoperate.

If you maintain any non-trivial Genkit application, the upgrade is a no-brainer. Drop in retry and fallback first; incidents caused by transient errors and exhausted quota should become noticeably rarer. Then start writing your own middlewares for the things that are unique to your domain.

Conclusion

Middleware turns Genkit's generate() from "a function you call" into "a pipeline you compose". The official @genkit-ai/middleware package covers the most common production needs (filesystem access, skills, tool approval, retries, fallbacks), and generateMiddleware makes writing your own a 20-line affair instead of a refactor.

For the next steps, take a look at:

  • Genkit Middleware documentation
  • Genkit middleware source on GitHub
  • Genkit flows: middleware composes especially well with typed flows
  • Tool calling and Interrupts: the foundation that toolApproval builds on

Happy hacking, and may your fallback models always be cheaper than your primary one.
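As a footnote to the composition section, the onion ordering can be reproduced in a few lines of plain TypeScript. This is a generic sketch, not Genkit's implementation: the `compose` helper, the `tracer` factory, and `fakeModel` are hypothetical names made up for illustration, and the middleware signature is simplified to `(req, next)`.

```typescript
// Generic onion-style composition sketch; NOT Genkit's actual internals.
type Next<Req, Res> = (req: Req) => Promise<Res>;
type Middleware<Req, Res> = (req: Req, next: Next<Req, Res>) => Promise<Res>;

// Wrap `handler` so that the first middleware in the array is the outermost layer.
function compose<Req, Res>(
  middlewares: Middleware<Req, Res>[],
  handler: Next<Req, Res>,
): Next<Req, Res> {
  return middlewares.reduceRight<Next<Req, Res>>(
    (next, mw) => (req) => mw(req, next),
    handler,
  );
}

const trace: string[] = [];

// Hypothetical middleware factory that records when it is entered and exited.
const tracer =
  (name: string): Middleware<string, string> =>
  async (req, next) => {
    trace.push(`${name}:before`);
    const res = await next(req);
    trace.push(`${name}:after`);
    return res;
  };

// Stand-in for the model call at the centre of the onion.
const fakeModel: Next<string, string> = async (prompt) => {
  trace.push("model");
  return `echo: ${prompt}`;
};

// Declared order: logger first (outermost), retry second (inner).
const pipeline = compose([tracer("logger"), tracer("retry")], fakeModel);

async function demo(): Promise<string[]> {
  trace.length = 0;
  await pipeline("Hello");
  return [...trace];
}
```

Calling `demo()` shows the layers being entered in declaration order and exited in reverse, which is why a logger placed first in `use` observes the final result after the inner layers have done their work.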

By Xavier Portilla Edo DZone Core CORE
Spring CRUD Generator v1.1.0 Updates

I’ve just released Spring CRUD Generator v1.1.0 — an open-source generator that helps you bootstrap a Spring Boot CRUD backend from a single YAML specification. If you’ve built more than a couple of CRUD-heavy services, you’ve probably experienced the same pain points: repeating the same layers (entity, repository, service, controller), keeping consistent naming and structure across modules, and constantly adjusting boilerplate when requirements change. Spring CRUD Generator aims to reduce that overhead by letting you define your data model and project options once (in YAML) and generate a consistent project structure around it. This release adds field-level validation, improves Redis caching, and fixes compatibility issues so the generator works reliably with Spring Boot 3 and Spring Boot 4. It also improves behavior when Open Session In View (OSIV) is disabled by adding EntityGraph support in generated resources. What’s New in v1.1.0 1. Field Validation (fields.validation) The headline feature in v1.1.0 is a new optional “validation” section inside each entity field definition. Instead of sprinkling validation rules manually throughout your DTOs and controllers after generation, you can now describe typical constraints directly in the YAML config and have the generator produce the appropriate validation-aware output. This is useful for teams that want a single source of truth for model constraints. A YAML-driven validation model also makes it easier to review and evolve constraints alongside schema changes. For example, you can express “this field must be required” or “string length must be within a range” directly where the field is defined. A notable addition in this release is support for regex-based validation via “pattern.” That’s a practical constraint for fields like passwords, identifiers, or custom-formatted strings. 
It’s worth mentioning that the validation section is optional: if you don’t need it, you don’t have to add it, and your existing specs remain valid.

2. Cache and Redis Improvements

Caching is a common performance layer in CRUD systems, but it becomes tricky when you combine Redis serialization and Hibernate’s lazy-loaded associations. This release includes two important caching-related improvements:

  • The generator previously produced incorrect values for @Cacheable(value=...) in certain cases. That has been fixed, ensuring that cache names/values are generated consistently and correctly.
  • Cache configuration has been updated to include HibernateLazyNullModule. This improves Redis caching behavior when Hibernate lazy-loaded entities are involved. In practice, it reduces the likelihood of serialization issues (or unexpected failures) when caching objects that contain lazily-loaded properties.

If your generator output is used in services that rely on Redis for caching, this update should make caching more stable and predictable.

3. Spring Boot 3 and 4 Compatibility

Another big part of this release is compatibility. The generator is now fully compatible with both Spring Boot 3 and Spring Boot 4. In ecosystems like Spring, small changes between major versions can break builds, plugins, or code generation assumptions. Ensuring compatibility across versions is essential for teams that want to upgrade gradually (or maintain multiple services on different baselines). v1.1.0 addresses compatibility issues so you can use the generator reliably on either Spring Boot line.

4. OSIV Control + EntityGraph Support

Open Session In View (OSIV) is often considered an anti-pattern because it can hide N+1 query problems and produce unpredictable lazy-loading behavior in higher layers. Many teams disable it intentionally.
In v1.1.0, the generator introduces a new configuration entry:

    additionalProperties.spring.jpa.open-in-view (default: false)

With OSIV disabled by default, the generated resources now include EntityGraph support to handle lazy relations more safely. The goal here is practical: avoid surprising LazyInitializationException scenarios without relying on OSIV. This approach also nudges generated projects toward better data-fetching discipline. If your project still relies on OSIV, you can explicitly set it to true in the configuration. But the default aligns with a more defensive and production-friendly setup.

5. Improvements and Stability

Beyond features, v1.1.0 includes a set of improvements that make the project easier to maintain and more robust:

  • Documentation has been updated to reflect the new fields.validation structure.
  • Tests have been updated and fixed so they work correctly with the new validation model.
  • The generator internals were refactored to improve readability and maintainability.
  • A validation edge case was fixed: when min/max are not provided, the generator no longer throws a NullPointerException (NPE). This prevents configuration mistakes from turning into runtime failures during generation.

There is also an updated “full CRUD spec YAML” reference example in the repository. It existed before, but it has now been refreshed to include the new fields.validation configuration and the additionalProperties.spring.jpa.open-in-view setting. If you want to see the complete configuration surface area (not just a short snippet), that reference YAML is the best place to start.

Updated CRUD Spec YAML (Short Example)

The repository includes an updated full CRUD spec YAML reference example.
Below is a shortened snippet that highlights the new and most important parts (validation + OSIV):

YAML

    configuration:
      database: postgresql
      javaVersion: 21
      springBootVersion: 4
      cache:
        enabled: true
        type: REDIS
        expiration: 5
      openApi:
        apiSpec: true
      additionalProperties:
        rest.basePath: /api/v1
        spring.jpa.open-in-view: false
    entities:
      - name: ProductModel
        storageName: product_table
        fields:
          - name: id
            type: Long
            id:
              strategy: IDENTITY
          - name: name
            type: String
            column:
              nullable: false
              unique: true
              length: 10000
            validation:
              required: true
              notBlank: true
              minLength: 10
              maxLength: 10000
          - name: price
            type: Integer
            column:
              nullable: false
            validation:
              required: true
              min: 1
              max: 100
          - name: users
            type: UserEntity
            relation:
              type: OneToMany
              fetch: LAZY
              joinColumn: product_id
            validation:
              required: true
              minItems: 1
              maxItems: 10
      - name: UserEntity
        storageName: user_table
        fields:
          - name: id
            type: Long
            id:
              strategy: IDENTITY
          - name: email
            type: String
            validation:
              required: true
              email: true
          - name: password
            type: String
            validation:
              required: true
              pattern: "^(?=.*[A-Za-z])(?=.*\\d)[A-Za-z\\d]{8,}$"

Tip: Check the repository’s full CRUD spec YAML example to see the complete supported configuration surface.

Upgrade Notes

  • fields.validation is optional; add it only where needed.
  • spring.jpa.open-in-view defaults to false. If your project relies on OSIV behavior, explicitly set it to true.

If you build Spring Boot services frequently and want to reduce repetitive CRUD boilerplate, feel free to try the generator and share feedback. If you find the project useful, I’d really appreciate a star on the repository; it helps a lot and keeps the momentum going.

Repository: https://github.com/mzivkovicdev/spring-crud-generator.

Development continues, and more improvements are on the way. Thanks for the support!

By Marko Zivkovic
We Went Multi-Cloud and Almost Drowned: Lessons From Running Across AWS, GCP, and Azure

It started, as most bad architectural decisions do, with a PowerPoint slide from a VP who had just returned from a conference. “We need to avoid vendor lock-in,” he declared, and suddenly our platform engineering team had a mandate to distribute workloads across three public clouds. Eighteen months later, we had something that technically ran on all three (AWS, GCP, and Azure). We also had a Terraform codebase that made people cry and an on-call rotation nobody wanted. This is what I learned about multi-cloud strategy: not the vendor pitch, but the messy reality of keeping production alive across cloud boundaries.

The Most Common Reason vs. the Reason That Actually Matters

Vendor lock-in avoidance is the stated justification for most of the multi-cloud initiatives I have seen. It sounds sensible on paper: spread your bets, negotiate better pricing, avoid being held hostage. In practice, the switching-cost argument is largely theoretical. Nobody is ripping a mature workload off AWS and replanting it on GCP over a pricing dispute. For us, the expense of moving everything would have exceeded whatever we stood to gain.

The real reasons multi-cloud makes sense are less glamorous. You acquire a company running on a different cloud. A specific managed service, say BigQuery for analytics or Azure AD for identity, is genuinely better than what your primary cloud offers. Sometimes the decision is made for you: a customer's regulatory or compliance obligations may demand data residency in a region that only one specific provider serves. In our case, the situation required all three providers simultaneously. Those are valid reasons to proceed. If you cannot articulate a concrete scenario where you would actually migrate between clouds, then the operational complexity, data egress costs, and management overhead are a premium you pay for insurance you will never claim.

Where It Truly Became Painful

Initially, things seemed under control.
We chose Kubernetes, spun up clusters on EKS, GKE, and AKS, and told ourselves the hard part was over. It wasn't.

Networking was the first wall we hit. Each cloud provider handles virtual networks, routing, and DNS completely differently. We spent three frustrating weeks chasing random connection failures between services split across GCP and AWS, eventually discovering our Transit Gateway was silently discarding packets in an obscure NAT scenario. Nobody warns you that there's no such thing as a common networking standard across clouds, and anyone claiming otherwise is really just pitching you their middleware.

Identity and access management became our second major headache. AWS, Google Cloud, and Azure each handle permissions through completely different systems. We attempted to maintain matching role definitions across all three to ensure consistency. That approach collapsed within two months as the configurations slowly diverged into chaos. We eventually consolidated around a single identity provider and built a translation layer between them. It was messy, but at least we could audit it properly.

Observability turned out to be our most stubborn ongoing problem. The moment a request travels across cloud boundaries, distributed tracing falls apart. Your metrics end up scattered across three separate platforms, each speaking its own query language, making unified monitoring feel nearly impossible. We consolidated into Datadog, which helped, but the bill was eye-watering, and we still had blind spots at the edges.

Here's what the day-to-day work actually felt like. Writing a single Terraform module to spin up a managed Postgres database meant wrestling with three completely different provider APIs simultaneously. That module was 800 lines long for what is theoretically a single resource type. Multiply that by every resource type in the infrastructure you run, and you can see the maintenance burden.
What Actually Worked

After twelve months of struggle, one decision turned everything around. We stopped chasing cloud-agnosticism and started being cloud-intentional instead. That distinction is everything. Cloud-agnostic thinking pushes you toward lowest-common-denominator solutions, wrapping every service in abstraction layers and pretending all clouds are essentially identical. It's a trap that leaves you using none of them effectively. Cloud-intentional means deliberately matching specific workloads to the right provider, embracing each platform's native strengths, and only building abstractions at the actual boundaries where systems talk to each other.

Machine learning training lived on GCP because Vertex AI and TPU access justified it. Transactional workloads stayed on AWS, where our team had years of accumulated expertise. Azure handled enterprise identity because that's simply where our customers' Active Directory already lived.

The crucial mindset shift was accepting that each environment would naturally look different. We stopped forcing our GCP infrastructure to mirror our AWS setup. Instead, we focused on standardizing the connections between them (API contracts, event schemas, and a shared service mesh), and only where cross-cloud communication was genuinely necessary.

The Cost Nobody Mentions

Multi-cloud isn't purely a technology challenge. It's fundamentally a people problem. Finding engineers who truly master even one cloud platform is already difficult. Expecting a single team to maintain deep production-level expertise across three simultaneously is simply unrealistic. We naturally drifted toward informal specialization: two people owned GCP, three focused on AWS, and one managed Azure, and those knowledge silos made being on-call genuinely miserable. Training costs are very real, too.
The mental overhead of constantly switching between three different consoles, three separate command-line tools, and three completely different mental models of networking and storage is a hidden tax that never appears in any architecture review document.

When You Shouldn't Attempt This

If you are a startup or a team of under fifty engineers, multi-cloud is almost certainly a mistake. Pick the cloud your team knows best, use its managed services aggressively, and ship. The theoretical risk of vendor lock-in is less dangerous than the very real risk of moving slowly because your infrastructure is too complex to operate across several clouds. Even at scale, if your workloads lack a concrete reason to span cloud boundaries, resist the pressure. A well-architected single-cloud deployment with proper DR will serve you better than a fragmented multi-cloud setup held together by duct tape and YAML.

Lessons Learned

  • Treat multi-cloud as a practical response to specific constraints, never as a default stance.
  • Go cloud-intentional rather than cloud-agnostic; use each provider’s strengths natively and standardize only at cloud integration boundaries.
  • Invest generously in monitoring and networking; that's where multi-cloud complexity bites hardest.
  • Be honest about your team’s capacity before committing to overhead that scales with the number of providers you support.

One Final Reflection

In hindsight, the most valuable thing about our multi-cloud journey was not the architecture; it was the discipline it forced on our service boundaries. When crossing a cloud boundary is expensive and painful, you think harder about what actually needs to cross it. That forced intentionality made our system design better overall, even in parts that never left a single cloud. Could we have achieved the same architectural improvements simply by designing cleaner interfaces, without literally spanning multiple clouds? Probably yes. But nobody approves a budget for "Let's build better boundaries."
They do approve a budget for a "multi-cloud strategy." Sometimes solid engineering quietly sneaks in through a dubious PowerPoint presentation.

By Pruthvi Raj Seknametla
The Agent Protocol Stack: MCP vs. A2A vs. AG-UI

If you're building AI agents in 2026, you've probably bumped into at least one of these acronyms: MCP, A2A, AG-UI. Maybe all three. And if you're anything like me, your first reaction was: "Are these competing standards? Do I need all of them? Which one do I actually use?"

Here's the short answer: They're not competing; they're complementary. Each one solves a different problem at a different layer of the agent architecture. Think of them like TCP, HTTP, and HTML: different protocols at different layers that work together to make the web function. The long answer is the rest of this article.

The One-Sentence Version

Protocol | Created By | What It Connects | One-Liner
MCP | Anthropic | Agent ↔ Tools & Data | "How does my agent use tools?"
A2A | Google (Linux Foundation) | Agent ↔ Agent | "How do agents talk to each other?"
AG-UI | CopilotKit | Agent ↔ User Interface | "How does my agent talk to the user?"

That's the mental model. Now let's go deeper.

MCP: The Tool Layer

What It Solves

Your agent needs to do things: query a database, call an API, read a file, search the web. Before MCP, every integration was bespoke. You'd write custom function-calling code for each tool, each framework, each model. MCP standardizes this into a single protocol.

How It Works

MCP uses a client-server architecture over JSON-RPC 2.0. The MCP server exposes tools (functions with typed inputs/outputs), resources (data the agent can read), and prompts (reusable templates). The MCP client, typically embedded in your agent framework, discovers these capabilities and invokes them on behalf of the model.

Key Concepts

Tools are the core primitive: functions the model can call. Each tool has a name, a description (the LLM reads this to decide when to use it), and a typed input schema. The model sees the tool list and decides which ones to call, and the MCP client executes them.

Resources let the server expose read-only data (files, database schemas, configuration) that provides context without requiring a tool call.
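To make the tool primitive concrete, here is a minimal TypeScript sketch of the JSON-RPC 2.0 request a client sends to invoke a tool. The `tools/call` method and the `name` / `description` / `inputSchema` fields follow the MCP specification; the `query_orders` tool, the `buildToolCall` helper, and the naive required-argument check are hypothetical and only for illustration. A real client would normally use an MCP SDK rather than build messages by hand.

```typescript
// Shape of a tool as advertised by an MCP server in a `tools/list` result.
interface McpTool {
  name: string;
  description?: string;
  inputSchema: {
    type: "object";
    properties?: Record<string, unknown>;
    required?: string[];
  };
}

// Minimal JSON-RPC 2.0 request envelope.
interface JsonRpcRequest {
  jsonrpc: "2.0";
  id: number;
  method: string;
  params?: Record<string, unknown>;
}

// Hypothetical tool, for illustration only.
const queryOrders: McpTool = {
  name: "query_orders",
  description: "Look up orders for a customer ID",
  inputSchema: {
    type: "object",
    properties: { customerId: { type: "string" } },
    required: ["customerId"],
  },
};

let nextId = 1;

// Build the `tools/call` request, rejecting calls that omit required arguments.
function buildToolCall(tool: McpTool, args: Record<string, unknown>): JsonRpcRequest {
  const missing = (tool.inputSchema.required ?? []).filter((key) => !(key in args));
  if (missing.length > 0) {
    throw new Error(`Missing required arguments: ${missing.join(", ")}`);
  }
  return {
    jsonrpc: "2.0",
    id: nextId++,
    method: "tools/call",
    params: { name: tool.name, arguments: args },
  };
}

const request = buildToolCall(queryOrders, { customerId: "c-42" });
console.log(JSON.stringify(request));
```

The description field matters more than it looks: it is the only signal the model has for deciding when the tool applies, so it deserves the same care as a public API docstring.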
Transports are flexible. Local tools can use stdio (spawning a subprocess). Remote tools use Streamable HTTP, which is what you'd use for production deployments. AWS Bedrock AgentCore Runtime expects this transport. When to Use MCP Use MCP when your agent needs to interact with external systems: databases, APIs, monitoring tools, file systems, and cloud services. If you're wrapping an existing API for agent consumption, MCP is the protocol. AWS provides a growing library of open-source MCP servers for services like S3, DynamoDB, CloudWatch, and Cost Explorer. You can also build custom MCP servers for your own internal APIs and deploy them to AgentCore Runtime. When Not to Use MCP MCP is not for agent-to-agent communication. If you have a research agent that needs to delegate a sub-task to a coding agent, MCP isn't the right fit — that's A2A territory. MCP is also not designed for frontend communication — it doesn't have event streaming primitives for UI updates. A2A: The Agent Collaboration Layer What It Solves You've built multiple specialized agents. One handles research, another handles code generation, and a third manages deployments. Now you need them to work together on a complex task without sharing their internal state, tools, or prompts. A2A standardizes how agents discover each other, delegate tasks, and exchange results. How It Works A2A follows a client-server model where agents communicate over HTTP using JSON-RPC 2.0 (and optionally gRPC as of v0.3). The key differentiator from MCP is opacity — agents don't expose their internals. They advertise what they can do, not how they do it. Key Concepts Agent cards are JSON metadata documents hosted at /.well-known/agent.json. They describe the agent's name, capabilities (called "skills"), supported input/output types, and authentication requirements. Think of them as a machine-readable business card — any A2A client can discover what a remote agent does without prior knowledge. Tasks are the unit of work. 
A client sends a message to a remote agent, which creates a task with a lifecycle: submitted → working → completed (or failed, canceled). Tasks can produce artifacts — the actual outputs like generated text, images, or structured data. Interaction patterns are flexible. Simple tasks complete synchronously. Long-running tasks use Server-Sent Events (SSE) for streaming updates. Truly async workflows use push notifications via webhooks. When to Use A2A Use A2A when you have multiple agents that need to collaborate but shouldn't share internal state. Common patterns include a supervisor agent delegating to specialists, cross-organization agent collaboration (your agent talking to a vendor's agent), and multi-framework setups (a LangGraph agent coordinating with a CrewAI agent). A2A is especially valuable when agents are built by different teams or companies. The opacity principle means Agent A doesn't need to know that Agent B uses LangGraph internally — it just sends a task and gets results back. AWS Bedrock AgentCore Runtime supports deploying A2A servers alongside MCP servers, with the same IAM auth, session isolation, and auto-scaling. A2A containers expose their endpoint on port 9000 with an Agent Card at /.well-known/agent-card.json. When NOT to Use A2A A2A adds overhead that isn't necessary for simple single-agent setups. If your agent just needs to call tools, use MCP. If you need tight coupling between agent components (shared memory, shared context), A2A's opacity model will work against you — consider an agent framework's native multi-agent patterns instead. AG-UI: The User Interface Layer What It Solves Your agent is running, calling tools, maybe coordinating with other agents. But the user is staring at a loading spinner. They don't know what's happening, can't intervene when things go wrong, and can't see intermediate results. AG-UI standardizes how agents communicate with user-facing applications in real time. 
How It Works

AG-UI is an event-based protocol where the agent backend emits a stream of typed events that the frontend consumes. Unlike REST (request → response) or WebSocket (unstructured bidirectional), AG-UI defines ~16 specific event types that cover the full range of agent-user interactions.

Key Concepts

Event types are the core of AG-UI. The main ones:

  • Lifecycle events (RUN_STARTED, RUN_FINISHED, RUN_ERROR): let the frontend show loading states and handle errors
  • Text message events (TEXT_MESSAGE_START, TEXT_MESSAGE_CONTENT, TEXT_MESSAGE_END): stream generated text token by token for the "typing" effect
  • Tool events (TOOL_CALL_START, TOOL_CALL_END): show the user what tools the agent is using and their results
  • State deltas (STATE_DELTA): send incremental UI state changes (progress bars, form updates) without resending everything
  • Interrupts (INTERRUPT): pause execution to ask the user for approval before a sensitive action (like deleting a resource)

Shared state enables bidirectional synchronization between the agent and the application. The agent can read application state (what page the user is on, what document is open) and push state changes back (update a chart, fill a form).

Frontend tools are an interesting inversion: the agent can call functions that execute in the browser, like updating a collaborative document or rendering a visualization.

When to Use AG-UI

Use AG-UI when your agent needs to communicate with a user-facing application in real time. This includes chat interfaces that show tool execution progress, collaborative editing where the agent modifies a shared document, dashboards that update as the agent discovers information, and any workflow that requires human-in-the-loop approval.

AG-UI was born from CopilotKit's production experience and has integrations with LangGraph, CrewAI, Strands Agents, Pydantic AI, and more. AWS Bedrock AgentCore Runtime added AG-UI support in March 2026, handling auth and scaling just like MCP and A2A workloads.
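On the frontend, consuming such a stream usually comes down to folding events into UI state. The TypeScript sketch below uses the event type names listed under Key Concepts; the payload field names (`messageId`, `delta`, `toolCallId`, `toolCallName`, `message`) and the `UiState` shape are simplifying assumptions for illustration, not the full AG-UI schema.

```typescript
// Simplified event shapes; payload field names are illustrative assumptions.
type AgentEvent =
  | { type: "RUN_STARTED" }
  | { type: "TEXT_MESSAGE_START"; messageId: string }
  | { type: "TEXT_MESSAGE_CONTENT"; messageId: string; delta: string }
  | { type: "TEXT_MESSAGE_END"; messageId: string }
  | { type: "TOOL_CALL_START"; toolCallId: string; toolCallName: string }
  | { type: "TOOL_CALL_END"; toolCallId: string }
  | { type: "RUN_FINISHED" }
  | { type: "RUN_ERROR"; message: string };

// Hypothetical UI state that a chat view could render from.
interface UiState {
  status: "idle" | "running" | "done" | "error";
  messages: Record<string, string>; // messageId -> text streamed so far
  activeTools: Record<string, string>; // toolCallId -> tool name
  error?: string;
}

const initialState: UiState = { status: "idle", messages: {}, activeTools: {} };

// Pure reducer: applies one event and returns the next state.
function reduce(state: UiState, event: AgentEvent): UiState {
  switch (event.type) {
    case "RUN_STARTED":
      return { ...state, status: "running" };
    case "TEXT_MESSAGE_START":
      return { ...state, messages: { ...state.messages, [event.messageId]: "" } };
    case "TEXT_MESSAGE_CONTENT": {
      const soFar = state.messages[event.messageId] ?? "";
      return { ...state, messages: { ...state.messages, [event.messageId]: soFar + event.delta } };
    }
    case "TEXT_MESSAGE_END":
      return state;
    case "TOOL_CALL_START":
      return { ...state, activeTools: { ...state.activeTools, [event.toolCallId]: event.toolCallName } };
    case "TOOL_CALL_END": {
      const activeTools = { ...state.activeTools };
      delete activeTools[event.toolCallId];
      return { ...state, activeTools };
    }
    case "RUN_FINISHED":
      return { ...state, status: "done" };
    case "RUN_ERROR":
      return { ...state, status: "error", error: event.message };
  }
}

// A made-up event sequence standing in for a real stream.
const sampleEvents: AgentEvent[] = [
  { type: "RUN_STARTED" },
  { type: "TEXT_MESSAGE_START", messageId: "m1" },
  { type: "TEXT_MESSAGE_CONTENT", messageId: "m1", delta: "Hel" },
  { type: "TOOL_CALL_START", toolCallId: "t1", toolCallName: "search" },
  { type: "TEXT_MESSAGE_CONTENT", messageId: "m1", delta: "lo" },
  { type: "TOOL_CALL_END", toolCallId: "t1" },
  { type: "TEXT_MESSAGE_END", messageId: "m1" },
  { type: "RUN_FINISHED" },
];

const finalState = sampleEvents.reduce(reduce, initialState);
```

Keeping the reducer pure makes the stream easy to replay in tests, and the spinner, the streamed text, and the "tool in progress" indicators all render from the same state object.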
When NOT to Use AG-UI

If your agent is a background job with no user interaction (batch processing, scheduled tasks), AG-UI adds unnecessary complexity. Stick with simple API responses or logging. Also, AG-UI is about communication, not UI rendering. If you need the agent to generate actual UI components, look at A2UI (a separate spec from Google for declarative UI generation that can be transported over AG-UI events).

How They Fit Together

Here's where it gets interesting. In a real production system, you're likely using all three.

The flow:

1. The user asks a question in the frontend
2. AG-UI streams the request to the supervisor agent and carries back real-time updates
3. The supervisor uses MCP to call tools directly (databases, APIs, cloud services)
4. For complex sub-tasks, the supervisor uses A2A to delegate to specialist agents
5. Those specialist agents may themselves use MCP for their own tools
6. Results flow back up through A2A → supervisor → AG-UI → user

Each protocol handles its layer. No overlap. No conflict.

The Decision Framework

When you're designing an agent system, ask these three questions:

1. "Does my agent need to use external tools or data?" → Yes: Use MCP

Wrap your APIs, databases, and services as MCP servers. Use existing open-source MCP servers for common services (AWS, GitHub, Slack, etc.).

2. "Does my agent need to collaborate with other agents?" → Yes: Use A2A

Especially when agents are built by different teams, use different frameworks, or need to maintain the privacy of their internal logic. Publish Agent Cards for discovery.

3. "Does my agent need to communicate with a user in real time?" → Yes: Use AG-UI

Stream progress, show tool execution, synchronize state, and handle human-in-the-loop approvals. Use AG-UI events to keep the user informed and in control.

Most production agent systems will answer "yes" to at least two of these. And that's fine: the protocols are designed to compose.
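The three questions above can be sketched as a tiny helper. The function name `choose_protocols` and its parameters are hypothetical and exist only to illustrate that the answers are independent and compose.

```python
from typing import List


def choose_protocols(
    needs_tools: bool,
    needs_other_agents: bool,
    needs_realtime_ui: bool,
) -> List[str]:
    """Map the three design questions to the protocols they point to."""
    protocols: List[str] = []
    if needs_tools:          # Question 1: external tools or data
        protocols.append("MCP")
    if needs_other_agents:   # Question 2: collaboration with other agents
        protocols.append("A2A")
    if needs_realtime_ui:    # Question 3: real-time communication with a user
        protocols.append("AG-UI")
    return protocols


if __name__ == "__main__":
    # A chat assistant that calls tools and streams progress, but delegates nothing.
    print(choose_protocols(True, False, True))
```

Each answer adds a layer without affecting the others, which is why a system can start with one protocol and add the rest later.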
Quick Comparison Table

| | MCP | A2A | AG-UI |
|---|---|---|---|
| Layer | Tool access | Agent collaboration | User interaction |
| Created by | Anthropic | Google / Linux Foundation | CopilotKit |
| Wire protocol | JSON-RPC 2.0 | JSON-RPC 2.0 + gRPC | Event stream (SSE) |
| Discovery | Tool listing via tools/list | Agent Card at /.well-known/agent-card.json | N/A (direct connection) |
| Key primitive | Tool (function call) | Task (lifecycle-managed work unit) | Event (~16 standard types) |
| Transport | stdio, Streamable HTTP | HTTP, SSE, gRPC, webhooks | SSE, WebSockets |
| Auth model | OAuth 2.0, IAM | OAuth 2.0, API keys, mTLS | Application-defined |
| Opacity | Transparent (tools are exposed) | Opaque (internals hidden) | N/A |
| Streaming | Yes (SSE for resources) | Yes (SSE for task updates) | Yes (core design principle) |
| AWS support | AgentCore Runtime + Gateway | AgentCore Runtime (port 9000) | AgentCore Runtime (March 2026) |
| Spec version | 2025-03-26 | v0.3 | ~16 event types, active development |

Running All Three on AWS

AWS Bedrock AgentCore Runtime is one of the few platforms that supports all three protocols natively. Here's how they deploy:

| Protocol | AgentCore Runtime Port | Container Path | Auth |
|---|---|---|---|
| MCP | 8000 | /mcp | IAM SigV4 or OAuth 2.0 |
| A2A | 9000 | / (root) | IAM SigV4 or OAuth 2.0 |
| AG-UI | Configurable | Configurable | IAM SigV4 or OAuth 2.0 |

Each protocol gets the same enterprise infrastructure: session isolation in microVMs, automatic scaling, IAM auth, and observability through AgentCore. You write the server, AgentCore handles everything else.

The AgentCore Gateway can sit in front of MCP servers to provide centralized tool discovery, routing, and policy enforcement via Cedar. For A2A, agents advertise their capabilities through Agent Cards. For AG-UI, the frontend connects directly to the AgentCore Runtime endpoint and receives streamed events.

What About A2UI?

You might have also heard about A2UI (Agent-to-UI), a separate specification from Google.
It's easy to confuse with AG-UI given the similar names, but they solve different problems:

  • A2UI defines what UI to render. It's a declarative spec for describing UI components (buttons, charts, forms) that agents can generate safely without executing arbitrary code
  • AG-UI defines how agents and UIs communicate at runtime: the event stream, state synchronization, and interaction lifecycle

They're complementary. An agent can use AG-UI to stream events to the frontend, and one of those events can carry an A2UI payload that describes a UI component to render. AG-UI is the transport; A2UI is the content format.

Getting Started

If you're building your first agent system, here's the practical sequence:

1. Start with MCP. Most agents need tools first. Build an MCP server for your primary data source or API. Deploy it to AgentCore Runtime or run it locally during development.
2. Add AG-UI when you build the frontend. Once your agent works, connect it to a user-facing app using AG-UI events. CopilotKit provides React components that handle the event stream out of the box.
3. Introduce A2A when you need specialization. When a single agent can't handle everything, split into specialists and use A2A for delegation. This typically happens when you reach multi-team or multi-framework agent development.

You don't need all three on day one. But understanding what each one does, and where it fits, saves you from building custom plumbing that a protocol already handles.

References

  • MCP Specification (2025-03-26)
  • One Year of MCP: Spec Anniversary Blog
  • Open Source MCP Servers for AWS
  • Deploy MCP Servers in AgentCore Runtime
  • A2A v0.3 Upgrade Announcement
  • A2A on AWS AgentCore Runtime
  • AG-UI Overview (DataCamp Tutorial)
  • Pydantic AI AG-UI Integration
  • A2UI Official Site
  • Developer's Guide to AI Agent Protocols (Google Developers Blog)

By Jubin Abhishek Soni, DZone Core
Designing Effective Meetings in Tech: From Time Wasters to Strategic Tools

If you’ve been in software engineering long enough, especially as a senior or staff engineer, architect, or tech executive, you’ve felt it: meetings that drain energy, fragment your focus, and somehow still fail to move anything forward.

The irony is almost historical. The word meeting comes from the Old English mētan: “to encounter” or “to come together with purpose.” Yet in modern organizations, especially in IT, meetings often represent the opposite: diffusion instead of direction.

Why should you care? Because meetings are one of the most expensive recurring operations in a company. Not in infrastructure, not in cloud bills, but in human cognitive bandwidth. And unlike compute, you cannot scale or autoscale attention. Used properly, meetings are one of the most powerful coordination tools we have. Used poorly, they become a systemic tax on productivity.

Why We Hate Meetings (and Why That’s a Problem)

Let’s start with an uncomfortable truth: most engineers hate meetings. Not because engineers are anti-social, nor because collaboration is bad, but because, in practice, meetings have become synonymous with interruption, inefficiency, and wasted time.

You’ve probably experienced this:

  • A meeting with no clear goal
  • A discussion that goes in circles
  • A full hour spent without a single decision
  • A calendar so fragmented that real work becomes impossible

And here lies the paradox. Meetings were supposed to be an ally: a tool to align people, make decisions, and move systems forward. Historically, the very idea of a “meeting” implies purposeful convergence. Yet in modern IT organizations, especially those dealing with complex systems, meetings often become the exact opposite: a source of entropy.

Instead of reducing uncertainty, they amplify it. Instead of accelerating decisions, they delay them. Instead of enabling focus, they destroy it.

From a systems-thinking perspective, this is not a minor inefficiency. It’s a structural failure.
You’re taking your most expensive resource (highly skilled engineers) and allocating their peak cognitive time to low-value coordination. So the real problem is not that meetings exist. The problem is that we’ve stopped designing them.

Why Meetings Exist (and Why They Still Matter)

Before optimizing meetings, we need to challenge a common but flawed assumption: “Meetings are inherently bad.” They are not. They are a necessity born from complexity.

In any non-trivial system, whether software or organization, you need mechanisms to:

  • Share context
  • Align decisions
  • Resolve ambiguity
  • Coordinate execution

In distributed systems, we rely on protocols, consensus, and synchronization. In organizations, we rely on… meetings. So meetings are not optional. They are a coordination primitive.

They become critical when:

  • A decision requires multiple perspectives
  • There is uncertainty that async communication cannot resolve
  • Teams must commit to a shared direction

Without them, organizations drift into misalignment. With poorly designed ones, they collapse into inefficiency. And that inefficiency is not just perception; it is measurable.

According to Atlassian:

  • Around 70% of meetings are considered unnecessary or could be replaced
  • Employees report attending too many low-value meetings
  • Employees spend ~31 hours per month in unproductive meetings
  • Teams spend more time in unnecessary meetings than doing high-priority work

Research discussed by Harvard Business Review shows:

  • Professionals spend 50–70% of their time in meetings
  • Many of these meetings are perceived as ineffective or redundant

Meanwhile, McKinsey & Company estimates:

  • Improving meeting productivity can increase overall efficiency by 20–30%

Now, let’s interpret this with an engineering mindset. If 70% of your system calls were unnecessary, you would redesign the architecture. If half of your CPU cycles produced no value, you would optimize immediately. Yet in organizations, this level of inefficiency is often normalized.
That’s the real issue. Meetings are essential, but only when treated as intentional, well-designed coordination mechanisms. Otherwise, what should be a high-leverage tool becomes one of the most expensive and invisible bottlenecks in modern software engineering.

Making Meetings an Ally (Not an Enemy)

If meetings are a coordination primitive, then the real question becomes architectural: How do we design them to scale instead of degrade?

Because that’s the core issue: most meetings don’t scale. They require everyone to be present, at the same time, with the same context. That’s the most expensive synchronization model you can choose. To turn meetings into an ally, we need to rethink them as part of a broader system of communication, decision-making, and knowledge sharing.

1. Start With a Clear Scope and Goal

A meeting without a goal is equivalent to an API without a contract. Before scheduling anything, define:

  • What is the purpose of this meeting?
  • What decision or outcome is expected?
  • What does success look like?

If you cannot answer these questions, you’re not ready to create a meeting; you’re creating noise.

A strong title is already half the solution:

  • ❌ “Architecture discussion.”
  • ✅ “Decide event-driven vs synchronous integration for payments.”

Clarity upfront reduces ambiguity during execution.

2. Respect Time (and Understand Parkinson’s Law)

There is a well-known principle called Parkinson's Law:

Work expands to fill the time available for its completion.

Meetings follow this law perfectly. If you schedule 1 hour, people will unconsciously stretch the discussion to fit that time, even if the real value was delivered in 20 minutes.

As a technical leader:

  • Default to shorter meetings (15–25 minutes)
  • Use longer slots only when strictly necessary
  • Force prioritization through constraints

Timeboxing is not about speed; it’s about forcing intentionality.

3. Drive the Conversation (Avoid Bikeshedding)

Another classic principle is the Law of Triviality:

Teams spend more time on trivial issues than on complex ones.

You’ve seen it:

  • 2 minutes on architecture
  • 20 minutes debating naming or minor implementation details

This is not accidental; it’s human nature. Your role as a senior/staff engineer or architect is not just to contribute technically, but to facilitate focus:

  • Redirect when discussions drift
  • Park irrelevant topics
  • Bring the group back to the objective

Think of it as managing thread execution: prevent starvation of critical topics.

4. Never Surprise People (Preparation Is Mandatory)

One of the most common anti-patterns: “Let’s review this document together in the meeting.” That’s not a meeting. That’s synchronous reading, which is one of the least efficient things you can do.

If the meeting depends on prior knowledge (ADR, PR, design proposal):

  • Share it at least 24 hours before
  • Use the meeting description to:
      • Link documents
      • Highlight key questions
      • Define expected input

This shifts the meeting from “Let’s understand the problem” to “Let’s decide based on shared understanding.” That’s where real value happens.

5. Documentation: The Enemy of the Enemy Is Your Ally

Here’s an uncomfortable truth in engineering culture: People dislike writing documentation. But they dislike unnecessary meetings even more. This creates an interesting dynamic, the classic: “The enemy of my enemy is my ally.”

Documentation, when used properly, becomes a scaling mechanism for human knowledge.
Instead of requiring 5, 10, or 15 people to synchronize at the same time, you:

  • Write once
  • Share asynchronously
  • Let people consume at their own pace

This transforms communication from synchronous and blocking to asynchronous and scalable.

Think about it from a systems perspective:

  • A meeting is a runtime dependency
  • Documentation is a cached, reusable artifact

With good documentation:

  • People arrive at meetings already informed
  • Discussions become deeper and more focused
  • Decisions happen faster

Without it:

  • Meetings become onboarding sessions
  • Context must be rebuilt every time
  • The same conversation repeats endlessly

In other words, documentation is not bureaucracy. It’s latency reduction for human systems.

6. Prefer Async First, Sync Second

Following from documentation, a strong principle emerges: Meetings should not be the starting point; they should be the convergence point.

Before scheduling a meeting:

  • Share a PR, ADR, or document
  • Encourage comments and discussion
  • Let people raise concerns early

By the time the meeting happens:

  • Context is already distributed
  • Opinions are already formed
  • The conversation is of higher quality

If the goal is simply to communicate:

  • Record it
  • Share it
  • Allow asynchronous questions

Not everything needs real-time coordination.

7. Close With Decisions and Next Steps

A meeting without an outcome is just an expensive conversation. Always end with:

  • What was decided
  • What actions are required
  • Who owns each action
  • What are the next steps

Otherwise, the same topic will return, generating another meeting. And that’s how meeting debt accumulates.

At this point, a pattern should be clear: effective meetings are not about better conversation; they are about better system design. And like any well-designed system, they rely on clear inputs, controlled execution, and meaningful outputs.

Conclusion

Meetings are not the problem. Poorly designed meetings are.
What should be a mechanism for alignment and decision-making often ends up as fragmentation, context switching, and wasted effort. When approached intentionally, however, meetings become a powerful coordination tool: one with clear goals, proper time constraints, strong facilitation, and a foundation of asynchronous preparation and documentation. In this model, meetings are no longer the place where work happens — they are where decisions converge after the work has already been explored. This is where modern tools, especially AI, can amplify the system. From automatic transcription to summarizing decisions and extracting action items, tools from ecosystems like Google Workspace and others help transform meetings into persistent knowledge artifacts. But the principle remains: tools don’t fix bad meetings — they scale good ones. When combined with strong practices, AI turns meetings from ephemeral conversations into structured, reusable assets, enabling teams to move faster with less friction and more clarity.

By Otavio Santana, DZone Core
Working With Cowork: Don’t Be Confused

TL;DR: Understand the Claude Desktop Architecture and Save Time

You configured Claude in Claude Desktop, wrote instructions, uploaded reference files, and set your preferences. Then you clicked the Cowork tab. Unfortunately, Claude had no memory of what you just did. Your instructions were gone, as were your files and preferences. You assumed this was a bug, but it is a feature: You switched applications.

The Claude Desktop App Hosts 3 Separate Applications

The tabs at the top of Claude Desktop (Chat, Cowork, Code) appear to be views of the same product. They are not. For example, Anthropic’s own documentation describes Cowork as using “the same agentic architecture that powers Claude Code”. However, in practice, each tab runs on a different execution layer with its own sandbox, memory system, and instruction hierarchy.

The architectural split matters: Cowork and Code share an engine. Chat is a separate system entirely. A useful functional shorthand is as follows:

  • Chat is for thinking: It runs in the cloud on Anthropic’s servers. It cannot access files on your machine; you have to provide those. In Chat, you converse, you reason, you get answers.
  • Cowork is for doing: It runs inside a sandboxed Linux virtual machine (VM for short) on your local computer. It reads and writes files in folders you mount, works autonomously in the background, and wipes the VM after every session. (This is also the reason that Cowork does not remember previous sessions: The previously used VM is gone.)
  • Code is for building: It runs natively in your terminal with full system access and no sandbox. It is made for engineers.

So, there is an architectural reason why the instructions you just spent 20 minutes writing do not follow you when you move between tabs. Let’s see what crosses the tab boundary and what does not:

The Word “Project” Means 3 Different Things

This is the collision that wastes the most time. Which of these three did you configure last week?
The Cowork Projects documentation confirms that Cowork projects live locally on your desktop, separate from Chat Projects. Your Chat Project knowledge base is invisible to Cowork and Code.

When Cowork says “choose a project,” it offers three options: start from scratch (a new folder), import from a Chat Project (a one-way snapshot, not a live link, with no future synchronization between the two), or use an existing folder on your hard drive. The word “Project” appears three times on that screen, referring to different things.

Memory, Artifacts, and Instructions Collide, Too

Given the current architecture of three different Claude apps posing as one, this pattern repeats across every shared term.

  • Memory: Chat auto-summarizes your conversations in the cloud. Cowork has project-scoped memory only. (Note: this refers to the “projects” listed in the sidebar.) Standalone Cowork sessions without a project remember nothing, because the VM that ran the session is wiped when it ends. Code uses CLAUDE.md files, plus an auto-memory system.
  • Artifacts: In Chat, an artifact is a rendered preview in a side panel (HTML, React, SVG). In Cowork, the same word means a real file on your disk (.docx, .xlsx, .pdf) or a Live Artifact (a persistent interactive dashboard that survives session restarts).
  • Instructions: Chat has two instruction locations (Profile Preferences and Project Instructions) plus a Styles selector for writing tone. Cowork has three different locations (Global, Folder, Project). Code has a five-tier hierarchy: managed policies, CLI flags, .claude/settings.local.json, .claude/settings.json, and ~/.claude/settings.json, plus CLAUDE.md files at user, project, and local levels. None of the instructions syncs across tabs.

Count the instruction locations you have configured. Now count the ones you assumed were active in a different tab. That is the gap.
Watch Out When Working With Claude Desktop: Back Up Your Folder Before Your First Cowork Session

Cowork’s sandbox prevents access to files outside your mounted folder. Inside that folder, Cowork has full read and write access. It does not archive. It does not move files to a trash folder. When it deletes, the files are gone.

On the day Cowork launched in January 2026, a user recorded their first session on video. They asked Cowork to “clean up” a folder. Cowork ran an rm -rf command inside the autonomous Linux VM and permanently deleted 11 GB of files. The video went viral on Hacker News.

Anthropic has since added a deletion confirmation prompt that requires explicit permission before Cowork permanently deletes any files. The underlying access model has not changed: inside your mounted folder, Cowork can do anything. As of May 2026, these actions leave no audit trail. Anthropic states this directly: “Do not use Cowork for regulated workloads.” If you work in a regulated industry, that sentence applies to you.

If it is gone, it is gone. Back up every folder you mount to Cowork; that’s non-negotiable.

Obviously, Anthropic Knows About the Tab Isolation

Dispatch, available as a research preview for Pro and Max plans, lets you send tasks from your phone to a Cowork session running on your desktop. It is a mobile-to-desktop bridge. The isolation between Chat, Cowork, and Code remains. Dispatch signals where the product is heading.

2 Documents So You Do Not Have to Discover This the Hard Way

I put together three companion documents for the introductory module of my upcoming Claude Cowork Online Course. They cover the architecture, the terminology collisions, and the practical setup steps.
I am sharing two of them here because the confusion they address is real and widespread, and nobody should have to discover these things by losing work:

  • The Quick Reference Card maps Chat, Cowork, and Code across nine dimensions: environment, file access, sandbox, execution model, project type, memory, output type, extensions, and instruction locations. Pin it to your wall or keep it open during your first week with Cowork: Working with Cowork: Quick Reference Card.
  • The Terminology Collision Glossary maps eight terms (Project, Memory, Artifacts, Instructions, Workspace, Session, Tool, Agent) across four surfaces (Chat, Cowork, Code, API). The “Project” row alone will save you thirty minutes of confusion: Working with Cowork: Terminology Glossary.

Conclusion: Before You Start With Claude Desktop and Cowork, Take 4 Steps in 5 Minutes

If you are about to use Cowork for the first time, do these four things:

1. Create a dedicated folder for Cowork. Not your Documents folder. Not your Desktop. A purpose-built folder with a clear name within your existing local file system.
2. Set up backups for that folder before you mount it. Time Machine on macOS. File History on Windows. Git if you prefer. Do this before you give Cowork access.
3. Open Cowork, create a project by choosing Project from the sidebar and clicking “New Project”, and point it at that folder. Write one sentence of instructions describing what you use this workspace for. (You can iterate on the instructions later.)
4. Switch between all three tabs. Verify for yourself that your Project, your instructions, and your memory do not follow you.

Invest five minutes of your time, and these four steps prevent the mistakes that cost people hours. Once you stop fighting Claude Desktop’s architecture and start working with it, Cowork becomes a different tool entirely. That is what the rest of the course is about.
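For the backup step above, a minimal sketch of a one-off snapshot is shown here as an alternative to Time Machine, File History, or Git. The function name `backup_folder`, the timestamped naming scheme, and the example paths are assumptions for illustration, not part of Cowork or Claude Desktop.

```python
import shutil
import time
from pathlib import Path


def backup_folder(source: Path, backup_root: Path) -> Path:
    """Copy `source` into `backup_root` under a timestamped name and return the copy."""
    stamp = time.strftime("%Y%m%d-%H%M%S")
    target = backup_root / f"{source.name}-{stamp}"
    # copytree creates missing parent directories and refuses to overwrite an existing target.
    shutil.copytree(source, target)
    return target


if __name__ == "__main__":
    # Hypothetical paths: adjust to the dedicated folder you created in step 1.
    workspace = Path.home() / "cowork-workspace"
    if workspace.is_dir():
        print(backup_folder(workspace, Path.home() / "cowork-backups"))
```

Keep the backup destination outside the folder you mount, so that anything running inside the mounted folder cannot reach the copies.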

By Stefan Wolpers, DZone Core
AWS Kiro: The Agentic IDE That Makes Specs the Unit of Work

The agentic IDE space has gotten crowded fast. Cursor, Claude Code, Copilot, Windsurf: they all share the same core model. You type a prompt, the AI writes some code, you iterate. It works well for prototyping. It breaks down when you're building production systems on a large codebase with a team of more than one.

AWS Kiro takes a different bet. Instead of chat-first, it's spec-first. The unit of work isn't a prompt. It's a structured specification that the agent uses to plan, implement, verify, and document your feature end to end. That's a meaningful philosophical difference, and in practice it changes what the tool is useful for.

Here's what Kiro actually is, how its core concepts fit together, and an honest take on when it makes sense over the alternatives.

What Kiro Is

Kiro launched from AWS in mid-2025 and is built on top of Amazon Bedrock, routing between Claude Sonnet for reasoning-heavy work and Amazon Nova for high-throughput code generation. It ships in three forms:

  • Kiro IDE: a VS Code-compatible editor (built on Code OSS, so you can import your existing themes, keybindings, and Open VSX plugins)
  • Kiro CLI: the same agent in your terminal, useful for SSH sessions or scripted workflows
  • Kiro Autonomous Agent: a background agent that picks up tasks, implements them, and opens PRs without you sitting in the loop

You don't need an AWS account to get started; you can sign in with GitHub or Google. The IDE feels immediately familiar if you've used VS Code, which removes one of the usual adoption barriers for new tooling.

In January 2026, AWS also announced the end of Amazon Q Developer for new signups (effective May 15, 2026), explicitly directing users to Kiro as its successor for IDE-based AI assistance. That's a significant signal about where AWS is placing its bets.

The Three Concepts That Make Kiro Different

1. Specs

When you start a new feature in Kiro, you don't jump straight to code.
You describe what you want to build, and Kiro generates three structured files:

  • requirements.md: user stories and acceptance criteria
  • design.md: system design, component breakdown, data flow
  • tasks.md: a numbered implementation checklist the agent works through

These become the source of truth. Code is a build artifact of the spec. When you come back to the feature a month later, or hand it to a new team member, the reasoning behind every decision is documented. Not in a Confluence page nobody reads, but in the repo next to the code it describes.

This is the thing chat-first tools can't replicate. Cursor or Claude Code can generate excellent code from a good prompt. What they can't do is maintain a structured paper trail of why the code looks the way it does.

2. Hooks

Hooks are event-driven automations that fire when things happen in your workspace: file save, new file created, commit opened. You define what Kiro should do in response, and it runs those actions in the background without you having to think about them.

Common hooks teams set up:

  • Run the linter and auto-fix on every file save
  • Regenerate unit tests when implementation files change
  • Update the relevant section of design.md when a module is modified
  • Run a security scan before any commit

The practical effect is that a junior developer's output passes the same automated quality bar as a senior's, because the standards are enforced by the environment rather than by code review heroics.

3. Steering Files

Steering files are Markdown files that give Kiro persistent context about your project: your conventions, the libraries you've standardized on, your architecture decisions, your security requirements. You create them once, and Kiro reads them on every interaction without you having to re-explain your stack in every prompt.
They live in two places:

  • ~/.kiro/steering/: global rules that apply across all your projects
  • .kiro/steering/: project-specific overrides checked into the repo

A typical global steering file might say things like "always use TypeScript strict mode," "prefer AWS CDK over raw CloudFormation," or "all Lambda functions must have structured logging with a correlation ID." Project steering files add things like "this service is a multi-tenant SaaS, tenant ID is always passed in the request context."

The result is that Kiro's context isn't reset between sessions and doesn't depend on whoever wrote the last prompt being thorough.

The Hooks + Specs Flywheel

The real power emerges when hooks and specs work together. Here's what that looks like in practice:

1. You describe a new feature. Kiro generates requirements.md, design.md, and tasks.md.
2. You review and refine the spec. Add an edge case to the requirements, adjust the component breakdown in the design.
3. Kiro implements the task list, following your steering files for conventions.
4. On each file save, hooks run: linter, tests, security scan. Issues surface immediately.
5. When you're done, a hook generates the commit message from the spec diff.
6. The PR description writes itself from requirements.md.

The spec doesn't go stale because hooks keep it in sync with the code. The code doesn't drift from the design because the design was written before the code. This is what "engineering rigor" means in the context of agentic development: not slower, but structured.

AWS-Native Advantages (and the Honest Tradeoff)

Kiro has deep integration with the AWS ecosystem: CodeCatalyst for repositories and CI/CD, Bedrock for model access, IAM Identity Center for enterprise auth, and "Kiro Powers," pre-packaged MCP servers for AWS-specific domains like CDK, CloudFormation, pricing, and (recently) HealthOmics workflows.

If your team is already AWS-first, this is a genuine multiplier.
Your Kiro agent can query your actual AWS account context, reference live Bedrock documentation, and generate CDK constructs that match your organization's guardrails.

The honest tradeoff: if your team isn't AWS-first, some of this integration feels like overhead rather than lift. Kiro works perfectly well as a general-purpose agentic IDE (the spec/hooks/steering system has value regardless of your cloud provider), but the ecosystem integrations are clearly designed for AWS shops.

Most teams running mixed infrastructure (some AWS, some not) find it practical to use Kiro for the AWS-native services and keep their existing editor for everything else. The two coexist fine.

How It Compares to the Alternatives

| | Kiro | Cursor | Claude Code |
|---|---|---|---|
| Primary paradigm | Spec-driven | Chat-driven | Task-driven (CLI) |
| Persistent context | Steering files | Rules / .cursorrules | CLAUDE.md |
| Automation | Hooks (event-driven) | Manual | Manual |
| AWS integration | Native | None | None |
| IDE | Standalone (VS Code-compatible) | Fork of VS Code | Terminal only |
| Background agent | Yes (autonomous agent) | Limited | Yes |
| Best for | Production features, team consistency | Fast prototyping, exploration | Complex refactors, agentic tasks |

Kiro and Claude Code aren't direct competitors in practice: Kiro is an IDE product, and Claude Code is a terminal agent. Many teams run both, using Kiro for structured feature work and Claude Code for open-ended refactors or one-off tasks.

Getting Started

Download the IDE from kiro.dev; no AWS account is required. Sign in with GitHub or Google, point it at an existing repo, and run through the onboarding to import your VS Code settings.

A good first experiment: take a feature you're planning to build anyway, describe it to Kiro, and look at the spec it generates before writing any code. The value of the approach becomes obvious when you see your vague "add user preferences" idea turn into a concrete requirements doc with six acceptance criteria and a data model.
From there:

  • Create one global steering file in ~/.kiro/steering/ with your language and framework defaults
  • Set up one hook that runs your linter on file save
  • Build the feature using the task list Kiro generated

That's the feedback loop that makes the tool click. The full power of the hooks and autonomous agent comes later, but even the basic spec workflow is a meaningful improvement over prompt-and-iterate for anything that takes more than a day to build.

Worth Watching

A few things that make Kiro worth keeping an eye on, even if you're not ready to switch:

The spec-as-artifact model is genuinely novel. As agents get better, spec-driven codebases will be better positioned to benefit, because the structured requirements and design docs give future agents a much richer context than a commit history and some comments.

Kiro Powers (the MCP server marketplace) is growing fast. The HealthOmics extension in February 2026 showed that domain-specific agent packs are a real product direction, not just a demo.

And with Amazon Q Developer sunsetting for new users, AWS is clearly consolidating its developer AI bet onto Kiro. Whatever the roadmap looks like from here, it's going to get resources.

Kiro isn't the right tool for every workflow. If you're prototyping solo or doing exploratory work, the spec-first overhead is friction you don't need. But for teams shipping production features that need to be documented, tested, and maintained, the bet that specs should be the unit of work is a compelling one.

Kiro vs. the Alternatives

| Feature | Kiro | Cursor | Claude Code | GitHub Copilot |
|---|---|---|---|---|
| Primary paradigm | Spec-driven | Chat-driven | Task-driven (CLI) | Inline completion |
| Persistent context | Steering files | .cursorrules | CLAUDE.md | None |
| Event automation | Hooks (file save, commit) | None | None | None |
| Structured specs | ✅ Native | ❌ | ❌ | ❌ |
| Background agent | ✅ Autonomous agent | Limited | ✅ | ❌ |
| AWS-native integration | ✅ Deep | ❌ | ❌ | ❌ |
| Dynamic MCP loading | ✅ Powers | Manual | Manual | ❌ |
| IDE base | Code OSS (VS Code compat.) | VS Code fork | Terminal only | Plugin |
| Free tier | ✅ | ✅ | ✅ | ✅ |

How Spec-Driven Development Works

```text
┌─────────────────────────────────────────────────────────┐
│  YOU: describe a feature                                │
└─────────────────────────┬───────────────────────────────┘
                          │
                          ▼
┌─────────────────────────────────────────────────────────┐
│  KIRO GENERATES SPECS                                   │
│                                                         │
│  .kiro/specs/my-feature/                                │
│  ├── requirements.md   ← user stories + EARS notation   │
│  ├── design.md         ← architecture, data flow, APIs  │
│  └── tasks.md          ← ordered implementation plan    │
└─────────────────────────┬───────────────────────────────┘
                          │
                          ▼
┌─────────────────────────────────────────────────────────┐
│  YOU: review + refine specs                             │
│  add edge cases, adjust design, approve task list       │
└─────────────────────────┬───────────────────────────────┘
                          │
                          ▼
┌─────────────────────────────────────────────────────────┐
│  KIRO IMPLEMENTS task by task                           │
│  guided by steering files + spec context                │
└─────────────────────────┬───────────────────────────────┘
                          │
                          ▼
┌─────────────────────────────────────────────────────────┐
│  HOOKS FIRE AUTOMATICALLY                               │
│  on every file save:                                    │
│    → linter + autofix                                   │
│    → test generation / update                           │
│    → security scan                                      │
│    → design.md sync                                     │
└─────────────────────────┬───────────────────────────────┘
                          │
                          ▼
┌─────────────────────────────────────────────────────────┐
│  PR OPENS: description from requirements.md             │
│  commit message generated from spec diff                │
└─────────────────────────────────────────────────────────┘
```

Steering File Layout

```text
~/.kiro/steering/            ← global, applies to every project
├── typescript.md            "always use strict mode, no any"
├── aws.md                   "prefer CDK over raw CloudFormation"
├── security.md              "IAM roles must follow least privilege"
├── git.md                   "use conventional commits"
└── testing.md               "80% coverage minimum, jest + RTL"

your-repo/
└── .kiro/
    └── steering/            ← project-specific overrides (checked in)
        ├── architecture.md  "multi-tenant SaaS, one DB schema per tenant"
        ├── api.md           "all endpoints versioned under /v1"
        └── data-model.md    "tenant ID always in request context, never inferred"
```

Hook Definition Example

```yaml
# .kiro/hooks/test-sync.yaml
name: Sync Tests on Component Save
trigger:
  event: onSave
  pattern: "src/**/*.tsx"
instructions: |
  When a React component file is saved:
  1. Check if a corresponding test file exists in __tests__/
  2. If not, create one with basic render and snapshot tests
  3. If it exists, update it to cover any new props or exported functions
  4. Run the test file and report failures inline
```

```yaml
# .kiro/hooks/security-scan.yaml
name: Pre-commit Security Scan
trigger:
  event: onCommit
instructions: |
  Before every commit:
  1. Scan staged files for hardcoded secrets, API keys, and credentials
  2. Check for any 0.0.0.0/0 ingress rules in IaC files
  3. Flag any new IAM policies that use wildcard actions (*)
  4. Block the commit and explain any findings; do not auto-fix
```

How Powers Solve Context Rot

Without Powers, connecting multiple MCP servers front-loads their tool definitions into your context window before you write a single line:

```text
Without Powers
──────────────────────────────────────────────────
Context window (200K tokens)
[Figma MCP tools]     ~12K tokens  ████
[Postman MCP tools]   ~18K tokens  ██████
[Stripe MCP tools]    ~10K tokens  ███
[Supabase MCP tools]  ~15K tokens  █████
[Datadog MCP tools]    ~9K tokens  ███
──────────────────
Total overhead        ~64K tokens  (32% gone before first prompt)

With Powers (dynamic loading)
──────────────────────────────────────────────────
You mention "payment"  → Stripe power activates
You mention "database" → Supabase activates, Stripe deactivates
```

Workspace Architecture for AWS Teams

```text
AWS Organization
└── Management Account
    ├── Client A Account
    │   ├── Kiro workspace (.kiro/ scoped here)
    │   ├── CodeCatalyst repo
    │   ├── Bedrock access (us-east-1)
    │   └── Secrets Manager (client A secrets only)
    │
    ├── Client B Account
    │   ├── Kiro workspace (.kiro/ scoped here)
    │   ├── CodeCatalyst repo
    │   ├── Bedrock access (us-east-1)
    │   └── Secrets Manager (client B secrets only)
    │
    └── Shared Services Account
        ├── IAM Identity Center (SSO for all Kiro logins)
```

This pattern keeps client IP, secrets, and Bedrock spend isolated by account boundary: IAM does the enforcement, not convention.
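To make the Powers token budget from the comparison above easy to recompute for your own set of MCP servers, here is a minimal Python sketch. The per-server token counts and the 200K window are the article's figures; the `dynamic_overhead` function is only a hypothetical model of "load a server when it is mentioned" and says nothing about how Kiro actually implements Powers.

```python
# Token estimates copied from the "Without Powers" chart in the article.
CONTEXT_WINDOW_TOKENS = 200_000
MCP_TOOL_TOKENS = {
    "Figma": 12_000,
    "Postman": 18_000,
    "Stripe": 10_000,
    "Supabase": 15_000,
    "Datadog": 9_000,
}


def static_overhead(tool_tokens, window):
    """Return (tokens, fraction of window) when every server is loaded up front."""
    total = sum(tool_tokens.values())
    return total, total / window


def dynamic_overhead(tool_tokens, active, window):
    """Same calculation when only the named servers are loaded.

    Hypothetical model of dynamic loading, used only for the arithmetic.
    """
    total = sum(tool_tokens[name] for name in active)
    return total, total / window


total, fraction = static_overhead(MCP_TOOL_TOKENS, CONTEXT_WINDOW_TOKENS)
print(f"all servers loaded: {total} tokens ({fraction:.0%} of the window)")

only_stripe, stripe_fraction = dynamic_overhead(
    MCP_TOOL_TOKENS, ["Stripe"], CONTEXT_WINDOW_TOKENS
)
print(f"Stripe only: {only_stripe} tokens ({stripe_fraction:.0%} of the window)")
```

Swapping in your own server list shows quickly whether up-front loading is a real cost for your setup or a rounding error.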
Resources

  • kiro.dev – download is free, no AWS account required
  • Introducing Kiro – the original launch post, good context on the design philosophy behind specs and hooks
  • Introducing Powers – explains why dynamic MCP loading matters and how Powers solve context rot
  • Teaching Kiro new tricks with steering and MCP – practical deep dive on using steering + MCP to handle custom libraries and DSLs
  • Specs documentation – full reference, including the Design-First and Bugfix spec workflows
  • Kiro Powers marketplace – browse Figma, Stripe, Supabase, Datadog, Terraform, and more
  • IDE Changelog – how fast the product is moving
  • Amazon Q Developer end-of-support announcement – official AWS post confirming Kiro as Q Developer's successor
  • github.com/kirodotdev/Kiro – issue tracker and feedback repo

By Jubin Abhishek Soni, DZone Core
Solving the Mystery: Why Java RSS Grows in Docker on M1 Macs

The Problem

You're running a Java application in a Docker container on your M1 Mac. Everything works fine, but you notice something strange: the resident set size (RSS) keeps growing, even though your heap usage is stable. After hours of investigation, you find mysterious rwxp memory regions, each exactly 128 MB, accumulating in your process memory map.

What's causing this? Is it a memory leak? A JVM bug? Something else entirely?

The Investigation

Our journey began with monitoring RSS growth in a Java 17 application deployed on Docker-backed Minikube. Despite stable heap usage and no obvious memory leaks, RSS continued to grow by hundreds of megabytes over time.

Initial Observations

  • RSS growth: ~500-700 MB over 11 hours
  • Heap usage: Stable and within limits
  • Thread count: Stable
  • Native memory tracking: No obvious leaks

Deep Dive Into Memory Maps

Using /proc/PID/maps and /proc/PID/smaps, we discovered the growth was coming from anonymous executable memory regions:

```shell
$ cat /proc/1/maps | grep rwxp
efffd1d7c000-efffd9d7c000 rwxp 00000000 00:00 0
efffdb185000-efffe3185000 rwxp 00000000 00:00 0
efffe3d85000-efffebd85000 rwxp 00000000 00:00 0
...
```

Each region was exactly 128 MB, in the 0xefff* address range, with read-write-execute permissions. But what was in them?

The Discovery

Reading the memory content revealed something unexpected: ARM64 machine code instructions. But wait, the Java binary was x86-64, and the process reported x86_64 architecture. What was ARM64 code doing there?

The "Aha!" Moment

The answer: the Rosetta 2 translation cache. When running x86-64 containers on ARM64 M1 Macs via Docker Desktop, Rosetta 2 translates x86-64 instructions to ARM64. The translated code is cached in executable memory regions: those mysterious RWXP regions we were seeing!
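The 128 MB figure can be checked directly from the address ranges in the maps output. The following Python sketch parses the three sample lines shown above and computes each region's size; the function name `rwxp_regions` is made up for this example, and the five-field test for "anonymous" is a simplification of the /proc maps layout (address, permissions, offset, device, inode, optional pathname).

```python
# Sample lines copied from the article's /proc/1/maps output.
SAMPLE_MAPS = """\
efffd1d7c000-efffd9d7c000 rwxp 00000000 00:00 0
efffdb185000-efffe3185000 rwxp 00000000 00:00 0
efffe3d85000-efffebd85000 rwxp 00000000 00:00 0
"""


def rwxp_regions(maps_text):
    """Yield (start_address, size_in_bytes) for each anonymous rwxp mapping."""
    for line in maps_text.splitlines():
        fields = line.split()
        if len(fields) < 5 or fields[1] != "rwxp":
            continue
        if len(fields) > 5:
            # A sixth field is a pathname, so the mapping is not anonymous.
            continue
        start_hex, end_hex = fields[0].split("-")
        start, end = int(start_hex, 16), int(end_hex, 16)
        yield start, end - start


regions = list(rwxp_regions(SAMPLE_MAPS))
sizes_mib = [size // (1024 * 1024) for _, size in regions]
print(len(regions), "regions,", sum(sizes_mib), "MiB total")
```

Inside a container you could pass the contents of /proc/1/maps (or another PID's maps file) to the same function instead of the sample string.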
The Root Cause

Here's what was happening:

  1. JIT compilation: Java's JIT compiler generates x86-64 native code for hot methods
  2. Rosetta 2 intercepts: When x86-64 code executes, Rosetta 2 translates it to ARM64
  3. Translation cache: Translated ARM64 code is stored in 128 MB RWXP memory regions
  4. Growth: More JIT-compiled methods = more translations = more RWXP regions

Evidence

| Observation | Explanation |
|---|---|
| RWXP regions contain ARM64 code | Rosetta 2's translated code |
| Exactly 128 MB per region | Rosetta 2 allocation granularity |
| Anonymous (no file backing) | Runtime translation cache |
| Growth correlates with JIT activity | More compiled methods = more translations |

The Proof

To definitively prove JIT was the trigger, we disabled JIT compilation using the -Xint flag:

```shell
-Xint  # Run in interpreter-only mode
```

Results

| Metric | Before (JIT Enabled) | After (JIT Disabled) |
|---|---|---|
| RWXP Regions | 5 -> 12 -> 15 (growing) | 1 (stable, no growth) |
| RWXP Memory | ~1.9 GB | ~128 MB |
| Growth Rate | Multiple regions/hour | 0 regions/hour |
| Compiled Methods | 25,606 nmethods | 0 nmethods |

Result: With JIT disabled, RWXP growth completely stopped. Monitoring over 1+ hour confirmed zero growth.

Why This Happens

The Perfect Storm

  1. ARM64 host: M1 Mac (Apple Silicon)
  2. x86-64 container: Docker image built for AMD64
  3. Rosetta 2 enabled: Docker Desktop uses Rosetta 2 for emulation
  4. Dynamic code generation: Java JIT compiler

When all four conditions are met, Rosetta 2 must translate every JIT-compiled method from x86-64 to ARM64, storing the translations in executable memory regions that count toward process RSS.

The Solution

Option 1: Use Native ARM64 Images (Recommended)

The best solution is to use ARM64-native Docker images:

```shell
# Build for ARM64
docker build --platform linux/arm64 ...

# Or use multi-arch images
docker pull --platform linux/arm64 your-image:tag
```

Benefits:

  • No Rosetta 2 translation needed
  • No RWXP growth
  • Better performance (native execution)
  • Lower memory usage

Option 2: Deploy to x86-64 Infrastructure

If ARM64 images aren't available, deploy to x86-64 servers or cloud instances where Rosetta 2 isn't needed.

Option 3: Accept and Monitor

If you must use x86-64 containers on M1 Macs:

  • Increase container memory limits
  • Monitor RWXP growth
  • Plan for periodic restarts if needed

Not Recommended

Don't disable JIT in production (-Xint). While it stops RWXP growth, it dramatically reduces performance. Use it only for testing/debugging.

Key Takeaways

  • Rosetta 2 translation cache causes RWXP memory growth in x86-64 containers on ARM64 Macs
  • JIT compilation is the primary trigger; each compiled method needs translation
  • Native ARM64 images eliminate the problem entirely
  • This is expected behavior, not a bug; it's the cost of emulation

Conclusion

What started as mysterious RSS growth turned out to be Rosetta 2's translation cache storing ARM64 translations of JIT-compiled Java code. By understanding the mechanism and testing with JIT disabled, we proved the root cause and identified the best solution: use native ARM64 images.

If you're experiencing similar RSS growth in Java applications on M1 Macs, check for RWXP regions in your process memory map. If you see them, Rosetta 2 translation is likely the culprit.

How to Check

```shell
# Check for RWXP regions
cat /proc/PID/maps | grep rwxp

# Count RWXP regions
cat /proc/PID/maps | grep rwxp | wc -l

# Check if Rosetta 2 is active
cat /proc/PID/maps | grep rosetta
```

Have you encountered similar issues? Share your experience in the comments below!
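If you go the "accept and monitor" route, the region counts from the grep-and-count command can be turned into a rough memory estimate and a growth trend. This Python sketch uses the 128 MB region size and the 5 -> 12 -> 15 sample series reported in the article; the helper names and the repeated "1" samples for the JIT-disabled run are assumptions made for illustration.

```python
REGION_SIZE_MIB = 128  # per-region size observed in the article


def rwxp_memory_mib(region_count):
    """Estimate RWXP memory from a region count, assuming 128 MiB per region."""
    return region_count * REGION_SIZE_MIB


def growth_between(samples):
    """Return the change in region count between consecutive samples."""
    return [later - earlier for earlier, later in zip(samples, samples[1:])]


jit_enabled_samples = [5, 12, 15]  # region counts from the Results table
jit_disabled_samples = [1, 1, 1]   # hypothetical repeated samples of the stable run

latest_mib = rwxp_memory_mib(jit_enabled_samples[-1])
print(f"JIT enabled: {latest_mib} MiB ({latest_mib / 1024:.2f} GiB)")
print("JIT enabled deltas:", growth_between(jit_enabled_samples))
print("JIT disabled deltas:", growth_between(jit_disabled_samples))
```

Fifteen regions at 128 MiB each come to 1920 MiB, which lines up with the roughly 1.9 GB reported in the Results table.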

By Sumeet Sharma
Monitoring Spring Boot Applications with Prometheus and Grafana

Spring Boot’s Actuator and Micrometer provide rich metrics that can be scraped by Prometheus and visualized in Grafana. This guide covers configuring a Spring Boot application to expose Prometheus-formatted metrics, writing custom metrics, and setting up Prometheus and Grafana for monitoring. We cover installing Prometheus, writing a configuration to scrape your application, importing Grafana dashboards, and crafting PromQL queries and alerting rules. We also discuss Prometheus best practices, including metric naming conventions, label cardinality, and retention settings. Security considerations, troubleshooting tips, and the performance impact of metrics collection are also included.

The diagram below illustrates a typical monitoring architecture.

Figure: Data flow for Spring Boot monitoring. Prometheus scrapes metrics from the Spring Boot /actuator/prometheus endpoint and stores time-series data. Grafana queries Prometheus for visualization, while Alertmanager handles notifications.

Spring Boot Actuator and Micrometer Setup

Spring Boot auto-configures Micrometer when the micrometer-registry-prometheus dependency is on the classpath. Include these dependencies in your build:

Maven (pom.xml)

```xml
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-actuator</artifactId>
</dependency>
<dependency>
    <groupId>io.micrometer</groupId>
    <artifactId>micrometer-registry-prometheus</artifactId>
</dependency>
```

Gradle (build.gradle)

```groovy
implementation 'org.springframework.boot:spring-boot-starter-actuator'
implementation 'io.micrometer:micrometer-registry-prometheus'
```

These enable Actuator and the Prometheus registry. Next, configure Actuator to expose the Prometheus endpoint. By default, only the health endpoint is exposed.
To enable others:

```properties
management.endpoint.prometheus.enabled=true
management.endpoints.web.exposure.include=prometheus
```

The first property enables the /actuator/prometheus endpoint; the second exposes it over HTTP. Be cautious: exposing all endpoints can leak sensitive information. For production:

```properties
management.endpoints.web.exposure.include=health,prometheus
management.endpoints.web.exposure.exclude=env,beans
```

Also secure /actuator via Spring Security or network ACLs. Verify metrics by visiting:

```text
http://<host>:<port>/actuator/prometheus
```

You should see Prometheus-formatted output like:

```text
jvm_memory_used_bytes{area="heap",id="PS Eden Space"} 1.2345E7
http_server_requests_seconds_count{exception="None",method="GET",status="200",uri="/hello"} 42.0
```

These include JVM and HTTP metrics.

Sample Spring Boot Metrics Code

Below is a simple Spring Boot application with a REST endpoint and a custom Micrometer counter. The controller assumes spring-boot-starter-web is also on the classpath.

```java
import io.micrometer.core.instrument.Counter;
import io.micrometer.core.instrument.MeterRegistry;
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;

@SpringBootApplication
public class DemoApplication {
    public static void main(String[] args) {
        SpringApplication.run(DemoApplication.class, args);
    }
}

// Package-private so both classes fit in DemoApplication.java;
// move it to its own file if you want it public.
@RestController
class HelloController {
    private final Counter helloCounter;

    HelloController(MeterRegistry registry) {
        // Define a custom counter with tags
        this.helloCounter = Counter.builder("custom_hello_requests_total")
                .description("Total number of /hello requests")
                .tags("env", "demo")
                .register(registry);
    }

    @GetMapping("/hello")
    public String hello() {
        // Increment counter on each request
        helloCounter.increment();
        return "Hello from Spring Boot!";
    }
}
```

Micrometer exposes this counter at /actuator/prometheus automatically, along with built-in metrics like jvm_memory_* and http_server_requests_*.
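When verifying the endpoint, it can help to check the output programmatically rather than by eye. The Python sketch below parses the two sample lines shown above into metric names and values. It is a deliberately small parser written for this example, not a full implementation of the Prometheus text format: it skips comment lines and assumes no trailing timestamp after the value. The `# TYPE` line in the sample is added here for illustration.

```python
# The two metric lines are copied from the article's sample output;
# the "# TYPE" comment line is added to show that comments are skipped.
SAMPLE_SCRAPE = '''\
# TYPE jvm_memory_used_bytes gauge
jvm_memory_used_bytes{area="heap",id="PS Eden Space"} 1.2345E7
http_server_requests_seconds_count{exception="None",method="GET",status="200",uri="/hello"} 42.0
'''


def parse_samples(text):
    """Map each metric name to a list of (series, value) pairs."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # The value is the last space-separated token; label values may contain spaces.
        series, value = line.rsplit(" ", 1)
        name = series.split("{", 1)[0]
        samples.setdefault(name, []).append((series, float(value)))
    return samples


parsed = parse_samples(SAMPLE_SCRAPE)
for name, series_list in parsed.items():
    for series, value in series_list:
        print(name, "=", value)
```

Against a running application you could feed it the response body from the /actuator/prometheus URL (for example, fetched with `urllib.request.urlopen`) and assert that the metric names you expect are present.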
Prometheus Setup

Installing Prometheus

  • Native: Download from the official site, extract, and run ./prometheus (default port: 9090).
  • Docker:

```bash
docker pull prom/prometheus
docker run --name prometheus -d -p 9090:9090 prom/prometheus
```

To use a custom configuration:

```bash
docker run --name prometheus -d -p 9090:9090 \
  -v /path/to/prometheus.yml:/etc/prometheus/prometheus.yml \
  prom/prometheus
```

prometheus.yml Configuration

```yaml
global:
  scrape_interval: 15s
scrape_configs:
  - job_name: 'springboot-app'
    metrics_path: '/actuator/prometheus'
    scrape_interval: 15s
    static_configs:
      - targets: ['localhost:8080']
```

In this example, Prometheus scrapes the /actuator/prometheus endpoint on localhost:8080 every 15 seconds. Adjust targets for your actual hostnames/IPs. Note that when Prometheus itself runs in Docker, localhost refers to the Prometheus container, so the target must be an address reachable from inside that container. For containerized or Kubernetes environments, Prometheus can use service discovery.

Grafana Setup

Grafana reads data from Prometheus and renders dashboards.

  1. Install Grafana: Download from grafana.com and run it.
  2. Add Prometheus Data Source: In the Grafana UI, go to Configuration > Data Sources > Add data source, choose Prometheus, and set the URL to http://localhost:9090. Click Save & Test to confirm the connection.
  3. Import a Dashboard: Grafana can import community dashboards. For Spring Boot, use the Spring Boot Statistics dashboard. Go to Dashboards > Import, then upload the dashboard JSON or paste its URL or ID. Choose your Prometheus data source when prompted.
  4. Create Custom Panels: You can also build your own dashboards. Common panels are:
     • JVM memory: jvm_memory_used_bytes{area="heap"} / jvm_memory_max_bytes{area="heap"}
     • Threads: jvm_threads_live_threads (jvm_threads_live in older Micrometer versions)
     • GC Activity: rate(jvm_gc_pause_seconds_count[5m])

Metric Naming and Label Best Practices

Prometheus and Micrometer follow a naming convention to ensure clarity. Metric names should:

  • Use snake_case, describe one thing, and include a suffix for units. For example, http_request_duration_seconds, process_cpu_seconds_total, data_pipeline_last_record_processed_timestamp_seconds.
  • Include an application or domain prefix when useful, though commonly used metrics (like JVM or process metrics) often lack a custom prefix.

Prometheus and Grafana Best Practices

  • Cardinality: Reduce label explosion. For example, avoid including full URLs with query strings or user-specific labels on metrics. Keep label sets small and enumerable.
  • Retention: Prometheus retains 15 days of data by default. You can change this with the --storage.tsdb.retention.time flag (for example, --storage.tsdb.retention.time=30d). Alternatively, use --storage.tsdb.retention.size to cap disk usage.

Security Considerations

  • Protect Actuator: Never expose /actuator/prometheus or other management endpoints to the public internet without security. Restrict them to a monitoring subnet or secure them with Spring Security.
  • HTTPS: Use TLS for the Grafana and Prometheus web UIs if they are reachable over the network. For example, run Grafana behind an HTTPS proxy or enable its built-in TLS support.
  • Authentication: Enable authentication on Grafana. For Prometheus, you may run it behind an authenticated proxy or use basic auth.
  • Network: Consider placing monitoring tools in a VPN or private network segment. Do not open ports 9090 or 3000 publicly.
  • Alerting secrets: In Alertmanager configs, protect credentials.

Performance Implications

Instrumenting with Micrometer has minimal overhead on the application side. Prometheus scraping and storage is where costs accrue.

Summary of Key Information

  • Dependencies and config: Add spring-boot-starter-actuator and micrometer-registry-prometheus, then enable the Prometheus management endpoint and expose it.
  • Prometheus setup: Run Prometheus and configure prometheus.yml to scrape your app at /actuator/prometheus.
  • Grafana dashboards: Add Prometheus as a data source. Import or create dashboards.
  • Metric naming: Use descriptive names with units, avoid redundant labels, and avoid high-cardinality labels.
  • Metric types: Counters, Gauges, Histograms/Summaries. Prefer histograms when you need quantiles aggregated across instances, since summary quantiles cannot be combined across a cluster.
  • Alerts: Define Prometheus alerting rules for conditions like instance down or high error rate. Example expression: rate(http_server_requests_seconds_count{status=~"5.."}[5m])

Conclusion

By combining Spring Boot Actuator and Micrometer with Prometheus and Grafana, you gain end-to-end monitoring of your Java applications. You expose rich application metrics via the /actuator/prometheus endpoint, configure Prometheus to scrape them, and build Grafana dashboards to visualize system health. This setup supports alerting, so you can be notified of issues like high error rates or downtime. Careful attention to metric naming, label cardinality, and data retention keeps monitoring scalable and performant. Following best practices and using community dashboards will help your monitoring solution be effective and maintainable.
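The cardinality advice above is easier to act on with a number attached. As a rough sketch, the upper bound on time series for a single metric is the product of the distinct values each label can take. The label names and values below are hypothetical, chosen only to show how one user-specific label changes the total.

```python
from math import prod


def series_count(label_values):
    """Upper bound on series for one metric: product of distinct values per label."""
    return prod(len(values) for values in label_values.values())


# Hypothetical label sets for an HTTP request counter.
bounded = {
    "method": ["GET", "POST"],
    "status": ["200", "404", "500"],
    "uri": ["/hello", "/orders/{id}"],  # templated path, not the raw URL
}

# The same metric with a user-specific label added.
unbounded = dict(bounded, user_id=[f"user-{n}" for n in range(10_000)])

bounded_series = series_count(bounded)
unbounded_series = series_count(unbounded)
print("bounded labels:", bounded_series)
print("with user_id label:", unbounded_series)
```

The real series count is usually lower than this bound because not every combination occurs, but the multiplication shows why a single unbounded label dominates storage and query cost.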

By Ramya vani Rayala

Top Tools Experts


Abhishek Gupta

Principal PM, Azure Cosmos DB,
Microsoft

I mostly work on open-source technologies including distributed data systems, Kubernetes and Go

Yitaek Hwang

Software Engineer,
NYDIG

The Latest Tools Topics

How to Build a Local LLM Agent to Automate Work List Generation from Monthly Reports (With Jira Integration)
Learn how a local LLM agent automates work list generation from reports, enriches tasks from Jira, detects duplicates, and keeps enterprise data secure.
June 11, 2026
by Sergey Laptick
· 123 Views
The Repo Tracker: Automating My Daily GitHub Catch-Up
Automate GitHub repo tracking with a local agent using Python, SQLite, and cron. Learn how to build a lightweight monitoring system for open-source projects.
June 11, 2026
by Alain Airom
· 213 Views
The Documentation Crisis Nobody Sees: Why AI Agents Are Breaking Faster Than Humans Can Document Them
Production AI failures often stem from undocumented behavior. Learn about AIDF, a framework for defining agent decisions, boundaries, and accountability.
June 10, 2026
by Igboanugo David Ugochukwu, DZone Core
· 574 Views · 1 Like
Combining Temporal and Kafka for Resilient Distributed Systems
Kafka handles durable event streaming while Temporal manages long-running workflow state, retries, and recovery to build resilient distributed systems.
June 9, 2026
by Akhil Madineni
· 739 Views · 1 Like
Managing, Updating, and Organizing Agent Skills
Organize and manage AI agent skills with symlinks, automated syncing, overlap detection, and version tracking using the skill-organizer CLI tool.
June 9, 2026
by Sergio Carracedo
· 723 Views · 2 Likes
Building a RAG-Powered Bug Triage Agent With AWS Bedrock and OpenSearch k-NN
Learn how a RAG-powered bug triage agent uses AWS Bedrock, OpenSearch, and dynamic scoring to automate crash analysis and routing.
June 9, 2026
by Rajasekhar sunkara
· 538 Views
Amazon Quick: AWS's Agentic Workspace, Explained for Engineers
A technical deep dive into Amazon Quick — how it works, how it connects to your tools via MCP, and where it sits in the AWS agent stack.
June 9, 2026
by Jubin Abhishek Soni, DZone Core
· 605 Views
Reproducible Development Environments, One Command Away: Introducing CodingBooth
We containerized production and CI, yet local development remains the least reproducible. CodingBooth fixes this by letting every project carry its own dev environment.
June 8, 2026
by Nawa Manusitthipol
· 799 Views · 1 Like
Mastering Fluent Bit: Beginners' Guide for Contributing to our CNCF Project Docs
This intro to mastering Fluent Bit covers the entry point for developers that want to contribute to a CNCF documentation project but are not sure how.
June 8, 2026
by Eric D. Schabell, DZone Core
· 780 Views · 1 Like
Mastering Fluent Bit: Beginners' Guide for Contributing to Our CNCF Project Website
This intro to mastering Fluent Bit covers the entry point for developers that want to contribute to a CNCF project website but are not sure how.
June 5, 2026
by Eric D. Schabell, DZone Core
· 2,149 Views · 1 Like
Observability for Agents and Workflows: Tracing Prompts, Tool Calls, and Business Outcomes End-to-End
Learn how to trace AI agents end to end, from prompts and tool calls to business outcomes, with observability practices for production workflows.
June 5, 2026
by Srinivas Chippagiri, DZone Core
· 2,063 Views · 1 Like
Build a GitHub Slack Bot With AWS Bedrock and MCP, Part 2
Build a Slack bot using AWS Bedrock and MCP to answer GitHub questions. Learn setup, architecture, and how to extend it with new tools and data sources.
June 4, 2026
by Sangharsh Agarwal
· 1,727 Views
Compliance Automated Standard Solution (COMPASS), Part 11: Compliance as Code, the OSCAL MCP Server Way
How AI-native tooling is finally closing the loop between compliance personas and OSCAL artifacts with an MCP-standardized, AI-agent-ready interface.
June 4, 2026
by Yuji Watanabe
· 1,877 Views
Build a GitHub Slack Bot With AWS Bedrock and MCP, Part 1
Building a Slack bot with traditional APIs led to 400 lines of code. Using MCP and AWS Bedrock reduced complexity, enabling scalable, tool-driven automation.
June 3, 2026
by Sangharsh Agarwal
· 2,017 Views · 1 Like
MuleSoft MCP and A2A in Production: What 17 Recipes Reveal
MuleSoft MCP and A2A shipped in 2025. Zero practitioner guides exist beyond basic setup. 17 recipes reveal the implementation ladder teams are missing.
June 3, 2026
by Balachandra Shakar Bisetty
· 923 Views
Migrate a Hardcoded LangGraph Agent to LaunchDarkly AI Configs in 20 Minutes
Moving a hardcoded LangGraph React agent into LaunchDarkly AI Configs so prompts, models, tools, tracking, and rollout testing can be changed without redeploying.
June 2, 2026
by Scarlett Attensil
· 1,426 Views
MuleSoft IDP: Enhancing Efficiency and Accuracy in Data Extraction
MuleSoft IDP uses AI to extract and structure data from documents like invoices and PDFs, helping automate workflows, reduce errors, and improve processing speed.
June 1, 2026
by Jitendra Bafna
· 1,221 Views
Event-Driven Pipelines With Apache Pulsar and Go
Build scalable, real-time pipelines with Apache Pulsar and Go using event-driven producers and consumers that communicate via Pulsar topics.
May 29, 2026
by Shivi Kashyap
· 2,654 Views
Zero-Downtime Deployments for Java Apps on Kubernetes
Achieve zero-downtime deployments for Java applications on Kubernetes using rolling updates, readiness/liveness probes, and graceful shutdown strategies.
May 29, 2026
by Ramya vani Rayala
· 3,521 Views
Pragmatica Aether: Let Java Be Java
A modern, distributed, fault-tolerant runtime environment for the language that was intentionally designed for managed environments.
May 29, 2026
by Sergiy Yevtushenko
· 3,709 Views · 1 Like