DZone
DZone Spotlight

Monday, June 22
Jakarta NoSQL: Why JPA Is Not Enough for the AI Era


By Otavio Santana, DZone Core
AI has transformed the persistence landscape, and architects are the first to feel it. Enterprise applications were once built almost exclusively on relational databases, making JPA a keystone of Jakarta EE. Today, modern systems use a mix of relational databases, document stores, caches, graph engines, and, increasingly, vector databases that support semantic search, retrieval-augmented generation (RAG), and AI-powered applications. Polyglot persistence is now the industry standard.

While Jakarta EE standardized relational persistence through JPA, it still lacks a vendor-neutral standard for non-relational persistence. This gap forces developers to rely on fragmented, proprietary solutions, creating barriers to portability, productivity, and innovation.

The rise of AI makes this gap critical. Vector databases are now essential to intelligent systems, supporting semantic search, embeddings, and contextual retrieval. For Jakarta EE to remain the leading enterprise Java platform in the AI era, it must offer a standardized approach to NoSQL persistence, as it did for relational databases.

Jakarta NoSQL is not just another specification; it is a strategic investment in the ecosystem's future. By offering a familiar programming model, reducing vendor lock-in, and integrating with AI workloads, Jakarta NoSQL helps Jakarta EE remain relevant and competitive for the next generation of enterprise applications.

NoSQL in the AI Era: Understanding the Modern Data Landscape

For years, enterprise data persistence focused on relational databases. Systems relied on tables, rows, foreign keys, and SQL, making relational technology the standard for business applications. While still essential, modern architectures now use polyglot persistence, where multiple database types coexist, each satisfying specific requirements.
Today, NoSQL refers to a family of database paradigms, each engineered for specific workloads and architectural needs, rather than just document databases.

  • Key-value databases store data as key-value pairs, enabling fast lookups and low latency. Typical uses include caching, user sessions, feature flags, and temporary application state.
  • Document databases store data as structured documents, such as JSON or BSON. They are effective for applications with hierarchical or evolving schemas, including web applications, e-commerce platforms, and content management systems.
  • Column-family databases organize data by columns instead of rows, supporting high write throughput and horizontal scalability. They are used for IoT telemetry, event logging, analytics, and large-scale distributed systems.
  • Graph databases model entities and relationships as nodes and edges. This structure is ideal for social networks, fraud detection, recommendation engines, dependency analysis, and knowledge graphs in which relationships are critical.
  • Vector databases store high-dimensional embeddings from machine learning models and large language models (LLMs). They enable semantic search, similarity matching, retrieval-augmented generation (RAG), recommendation platforms, and other AI-driven features by matching on meaning instead of exact text.
  • Time-series databases specialize in timestamped data that changes over time. They are used for observability, monitoring, financial markets, industrial sensors, and operational metrics where high-performance temporal data storage and analysis are essential.

These database types often coexist within the same architecture. Modern applications may use PostgreSQL for transactions, Redis for caching, MongoDB for documents, Neo4j for relationships, InfluxDB for telemetry, and a vector database like Milvus, Pinecone, or Weaviate for AI-powered search and retrieval. This approach, known as polyglot persistence, is now standard in enterprise systems.
The industry has embraced this shift. The Stack Overflow Developer Survey shows that while relational databases still dominate enterprise workloads, NoSQL technologies are now standard tools for developers. Technologies like Redis, MongoDB, and Elasticsearch are used alongside PostgreSQL and MySQL. Organizations no longer choose between SQL and NoSQL; instead, they combine multiple persistence technologies to leverage their strengths. Polyglot persistence is now the baseline for modern software systems.

Vector databases are especially important among NoSQL categories, as they are fundamental to modern artificial intelligence systems. In contrast to traditional databases that store explicit business data, vector databases store numerical representations called embeddings. Generated by machine learning models, these embeddings encode the semantic meaning of words, documents, images, or other content as mathematical vectors. This enables software to search and retrieve information based on meaning rather than exact text matches.

The distinction between lexical and semantic search illustrates the significance of vector databases. For example, a traditional SQL search for “Pet” returns records with that exact term, such as “Pet Shop,” but ignores related expressions like “Dog” or “Puppy.” Semantic search, by comparing embeddings, retrieves documents about dogs, puppies, or animal companions because it recognizes their semantic relationship. The search engine matches meaning, not just syntax.

This capability is vital for modern AI architectures. Large language models do not process relational tables directly; they use embeddings and contextual connections between concepts. Systems such as retrieval-augmented generation (RAG), enterprise knowledge search, recommendation engines, and intelligent assistants depend on similarity searches across millions of vectors.
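The lexical-versus-semantic distinction described above can be sketched with cosine similarity, one common way to compare embeddings. This is a minimal illustration, not how any particular vector database is implemented: the three-dimensional vectors are invented toy values rather than output from a real embedding model, and the class and method names (`SemanticSearchSketch`, `cosine`, `nearest`, `toyIndex`) are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class SemanticSearchSketch {

    // Cosine similarity: dot(a, b) / (|a| * |b|). Values near 1 mean "similar direction".
    // Note: an all-zero vector would yield NaN; real systems guard against that.
    public static double cosine(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("dimension mismatch");
        }
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // Brute-force nearest neighbor: returns the key whose vector is most similar to the query.
    public static String nearest(double[] query, Map<String, double[]> index) {
        String best = null;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (Map.Entry<String, double[]> entry : index.entrySet()) {
            double score = cosine(query, entry.getValue());
            if (score > bestScore) {
                bestScore = score;
                best = entry.getKey();
            }
        }
        return best;
    }

    // Toy "index": invented vectors standing in for real document embeddings.
    public static Map<String, double[]> toyIndex() {
        Map<String, double[]> index = new LinkedHashMap<>();
        index.put("Puppy training guide", new double[] {0.9, 0.1, 0.0});
        index.put("Quarterly tax report", new double[] {0.0, 0.2, 0.9});
        return index;
    }

    public static void main(String[] args) {
        double[] petQuery = {0.8, 0.2, 0.1}; // invented embedding for the query "Pet"
        System.out.println(nearest(petQuery, toyIndex()));
    }
}
```

Neither document title contains the word "Pet", yet the query lands on the puppy document because its vector points in nearly the same direction. Purpose-built vector databases replace the brute-force loop with approximate nearest-neighbor indexes so the same comparison scales to millions of vectors.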
While relational databases can support some vector operations through extensions, vector databases are purpose-built for these workloads, offering optimized indexing and similarity algorithms for large-scale semantic retrieval. As AI adoption grows, vector databases are becoming a strategic component of enterprise architecture.

Recognizing the importance of NoSQL, several Java ecosystems have developed their own solutions. Spring offers independent projects like Spring Data MongoDB, Spring Data Redis, and Spring Data Cassandra. These integrations provide a productive programming model but are tightly coupled to the Spring ecosystem. Quarkus supports NoSQL persistence through Panache and database-specific integrations, emphasizing developer productivity and cloud-native deployment. Micronaut Data supports several NoSQL engines, using compile-time code generation and ahead-of-time processing to improve performance and reduce runtime overhead.

While these solutions are effective, they remain framework-specific rather than platform standards. Developers switching frameworks encounter different APIs, abstractions, annotations, and operational models, even when solving similar persistence challenges. Jakarta EE addressed this for relational persistence with Jakarta Persistence (JPA), delivering a standardized, vendor-independent programming model. As NoSQL technologies expand and AI workloads increasingly depend on vector databases, the lack of a vendor-neutral NoSQL standard is a significant gap in the Jakarta ecosystem.

The Java Standardization Journey

The need for a standardized NoSQL solution in the Java ecosystem has been discussed for years. During the Java EE era, several proposals tried to integrate non-relational databases into the enterprise platform. As NoSQL technologies grew in popularity throughout the 2010s, developers at JavaOne conferences anticipated a dedicated specification to accompany traditional enterprise APIs.
Despite clear demand, no such initiative emerged within Java EE. The platform remained focused on relational persistence via JPA, leaving NoSQL adoption to rely on vendor-specific libraries and framework integrations.

The transition of Java EE to the Eclipse Foundation provided an opportunity to address this challenge. Instead of waiting for a platform-level solution, the community launched Eclipse JNoSQL, an open-source project providing a unified programming model for NoSQL databases. Drawing on JPA's success, Eclipse JNoSQL introduced mapping annotations, repositories, templates, and communication APIs that support document, key-value, column-family, and graph databases. The project showed that a consistent developer experience could be achieved without compromising each database model's unique features.

As Jakarta EE matured, Eclipse JNoSQL became the foundation for a new standardization effort: Jakarta NoSQL. Jakarta NoSQL was the first persistence specification created entirely within the Jakarta EE process. Unlike earlier specifications that migrated from Java EE, Jakarta NoSQL was conceived, developed, and released under the Eclipse Foundation governance model. It was among the first to complete the full Jakarta Specification Process from inception to release.

Jakarta NoSQL's impact extended beyond its initial scope. During development, the expert group identified a common challenge for both relational and non-relational databases: developers needed a consistent repository abstraction independent of the underlying persistence engine. This led to the creation of a separate specification, Jakarta Data. The need to standardize NoSQL access patterns directly influenced the development of Jakarta Data's repository-oriented programming model, which applies across multiple persistence technologies. The relationship between these specifications highlights Jakarta NoSQL's broader influence on the Jakarta EE ecosystem.
Jakarta NoSQL focuses on mapping and interacting with non-relational databases, while Jakarta Data delivers a unified repository abstraction for both relational and NoSQL implementations. Together, they significantly reduce fragmentation in enterprise persistence.

This evolution continued beyond Jakarta Data. The drive to standardize modern persistence requirements has inspired new specifications, such as Jakarta Query, which aims to deliver a portable, type-safe, and expressive query language for various persistence technologies. As the Jakarta ecosystem grows, Jakarta NoSQL acts as a key milestone. It addressed the long-standing absence of a NoSQL standard and helped lay the foundation for the next generation of persistence specifications within Jakarta EE.

Jakarta NoSQL: Built for NoSQL, Not Adapted to It

When architects consider standardizing NoSQL development in Jakarta EE, a common question arises: why not extend Jakarta Persistence (JPA) to support NoSQL databases? JPA has long provided a unified programming model for relational databases in the Java ecosystem. The answer is based on a core architectural principle: tools should be optimized for their intended purpose.

The first challenge is that JPA was designed specifically for relational databases, relying on concepts like tables, columns, joins, foreign keys, and transactional consistency. These are not simply implementation details but core elements of the specification. Forcing document, graph, key-value, or vector databases into this model creates friction and limits the use of each database’s native features.

The second challenge is that NoSQL systems behave fundamentally differently. Graph databases perform path traversals, document databases store nested structures without normalization, key-value databases focus on fast lookups, and vector databases handle similarity calculations. These systems also differ in consistency, transactions, query languages, indexing, and scalability capabilities.
Representing all these paradigms through a single relational abstraction leads to compromises.

The third challenge is the importance of specialization. As Abraham Maslow noted, “if the only tool you have is a hammer, it is tempting to treat everything as if it were a nail.” Relational databases are effective, but not ideal for every persistence need. Semantic search, graph traversal, and high-volume telemetry storage are not relational problems. Applying a relational abstraction to all database types risks losing the unique optimizations each technology provides.

Consider the analogy of transportation: cars, boats, submarines, and airplanes all address transportation but are specialized for different environments. Forcing them to use the same controls would result in mediocrity across all. Similarly, a single persistence abstraction may remove the features that make each database effective.

Therefore, Jakarta NoSQL does not extend JPA beyond its intended scope. Instead, it offers a dedicated persistence model for non-relational databases, while maintaining the familiar developer experience that contributed to JPA’s success.

A key design goal of Jakarta NoSQL is to reduce mental effort for enterprise Java developers. Teams experienced with JPA should find the specification immediately approachable, as Jakarta NoSQL intentionally uses familiar terminology and concepts from the Jakarta EE community. Developers will encounter annotations like @Entity, @Id, and @Column, enabling a smooth transition from relational to non-relational persistence.

Java

    @Entity
    public class Car {

        @Id
        private Long id;

        @Column
        private String name;

        @Column
        private CarType type;
    }

At first glance, this entity closely resembles a JPA entity, which is intentional. However, the underlying implementation is fundamentally different. Jakarta NoSQL is built to support schema flexibility, embedded structures, nested documents, and database-specific storage models.
This approach is reflected throughout the API. Instead of requiring developers to manage low-level driver details, Jakarta NoSQL offers a high-level programming model via the Template API.

Java

    @Inject
    Template template;

    Car ferrari = Car.builder()
            .id(1L)
            .name("Ferrari")
            .build();

    template.insert(ferrari);

    List<Car> sports = template.select(Car.class)
            .where("type").eq(CarType.SPORT)
            .orderBy("name")
            .result();

The objective mirrors JPA’s original mission: letting developers focus on domain models and business logic, rather than serialization, connection management, or vendor-specific APIs.

This foundation shaped Jakarta NoSQL 1.0. The initial release introduced the mapping layer, CDI integration, repository support, template operations, and standardized endpoints for four major NoSQL categories:

  • Document databases
  • Key-value databases
  • Column-family databases
  • Graph databases

Jakarta NoSQL 1.0 showed that a unified Java programming model can respect the particular characteristics of each database family.

Jakarta NoSQL 1.1 continued this evolution. While version 1.0 focused on mapping and persistence, version 1.1 expanded querying capabilities through integration with Jakarta Query. A key addition is support for parameterized queries, letting developers safely bind parameters instead of manually constructing query strings.

Java

    List<Car> cars = template.query(
            "FROM Car WHERE type = :type")
            .bind("type", CarType.SPORT)
            .result();

Version 1.1 also introduces projection support, allowing applications to retrieve lightweight views instead of entire entities.

Java

    @Projection
    public record TechCarView(
            String name,
            CarType type) {
    }

    List<TechCarView> views = template
            .typedQuery(
                    "FROM Car WHERE type = 'SPORT'",
                    TechCarView.class)
            .result();

These features improve performance, reduce data transfer, and align with modern Java features such as records.

An important aspect of Jakarta NoSQL is its long-term architectural vision.
While most developers use the mapping layer, the specification also defines a lower-level communication API for advanced scenarios.

Java

    DocumentManagerFactory factory = ...;
    DocumentManager manager = factory.get("users");

    DocumentRecord record = ...;
    manager.put(record);

    Optional<DocumentRecord> result = manager.findByKey("user:10");
    manager.deleteByKey("user:10");

This communication layer is optional. Application developers can build complete systems without it, but it is valuable for database vendors, framework authors, and advanced integrations needing direct access to database capabilities.

This design is fundamentally different from JDBC, which assumes communication through SQL statements and tabular result sets. That model works well because relational databases share a common language and interaction pattern. NoSQL databases do not. Document databases may use BSON, graph databases may offer traversal languages, and vector databases may provide similarity-search APIs. Others use REST endpoints, binary protocols, gRPC streams, or vendor-specific mechanisms. Forcing these models into a JDBC-style abstraction would limit their capabilities or demand ongoing vendor-specific extensions.

For this reason, Jakarta NoSQL uses a layered architecture. The mapping layer offers a portable, productive programming model for developers, while the communication layer remains flexible to support diverse NoSQL systems.

This architecture positions the specification for future growth. As new technologies like vector databases, time-series engines, and AI-native storage emerge, Jakarta NoSQL can evolve without imposing a relational mindset. Rather than treating every database as a nail for the JPA hammer, Jakarta NoSQL recognizes that different problems require different tools, while still presenting a consistent and familiar experience for enterprise Java developers.
Your AI Coding Agent Can't Steal What It Never Had: The Docker Sandbox Isolation Story


By Shamsher Khan, DZone Core
I ran an AI coding agent against a broken Kubernetes deployment for five minutes. The agent called Anthropic's API dozens of times: reasoning about manifests, running kubectl commands, redeploying workloads. It made fully authenticated requests throughout the entire session. The API key was never in its environment.

Shell

    env | grep -iE "anthropic|api_key|secret|token|password"
    # (empty)

That is Docker Sandbox's credential isolation model in action. This article is about what that actually means, and what else the isolation holds, breaks, and surprises you with when you probe it properly.

Key Takeaways

  • Docker Sandbox uses a host-side proxy to inject API credentials without the agent ever seeing them. The agent makes authenticated calls without possessing the key.
  • Seven live isolation probes confirmed the boundary held throughout real AI agent activity, not just at rest.
  • Network policy is hostname-scoped HTTP filtering, not a full network control plane, with three specific behaviors the documentation doesn't make clear.
  • DevOps agents can run docker build and kubectl inside the sandbox without any path to the host Docker daemon or cluster credentials.
  • The --branch parallel agent mode is Git-level isolation, not VM-level. This is an important distinction for threat models requiring separate credentials per agent.

The Setup

I manage eight AKS clusters for Fortune 500 clients. My laptop has Azure service principals, SSH keys, kubeconfig files with a dozen cluster contexts, and twenty-plus repos, some with .env files containing real API keys. Running an AI agent from this machine without guardrails means the agent inherits all of it.

Docker Sandbox changes that. Each sandbox is a microVM with its own Linux kernel, its own Docker daemon, and its own network stack. You mount one project directory. The agent sees one project directory. Everything else on the machine does not exist inside the sandbox.

I spent two weeks testing this claim. Here is what I found.
Test environment:

    What                       Detail
    sbx version                v0.31.1 · commit e658be1
    Host                       macOS Apple Silicon
    Network endpoints probed   13
    Isolation probes           7 targeted commands
    Kubernetes scenario        Real agent task, two bugs, timed

All findings are backed by real terminal output. Full repo: github.com/opscart/docker-sandbox-devops.

How the Credential Isolation Actually Works

The sandbox environment has no API keys. But the agent made authenticated API calls. Here is the mechanism:

Shell

    env | grep proxy
    # https_proxy=http://gateway.docker.internal:3128
    # http_proxy=http://gateway.docker.internal:3128
    # JAVA_TOOL_OPTIONS=-Dhttp.proxyHost=gateway.docker.internal -Dhttp.proxyPort=3128 ...

Every outbound request (HTTP, HTTPS, even Java tools) routes through a proxy at gateway.docker.internal:3128. That proxy runs on the Mac host, completely outside the microVM boundary.

When the agent sends a POST to api.anthropic.com, there is no Authorization header, because the agent does not have the key. The request reaches the host-side proxy. The proxy checks the allowlist: api.anthropic.com is in the default AI services group under the Balanced policy. Authentication is performed by the host-side proxy using credentials stored outside the sandbox boundary. The authenticated request is forwarded to Anthropic. The agent receives the response. It has no idea what key was used, where it came from, or how to find it again.

Think of it like an OAuth gateway. The proxy holds the credential and vouches for the agent's requests. The agent gets access without ever possessing the key. You cannot steal what you never had.

This is architecturally different from the standard setup where ANTHROPIC_API_KEY sits in the shell environment, one echo $ANTHROPIC_API_KEY away from being exfiltrated.

What the Four Isolation Layers Actually Do

Docker Sandbox stacks four layers:

Hypervisor isolation. Separate Linux kernel per sandbox. Host processes invisible. Other sandboxes invisible.
A compromised sandbox cannot escalate to the host kernel. This is the fundamental difference from a Docker container: a container shares the host kernel. The microVM does not.

Network isolation. All outbound HTTP/HTTPS routes through the host-side proxy. Raw TCP, UDP, and ICMP are blocked at the network layer. There are three policy tiers: allow-all, balanced (curated dev allowlist), and deny-all. Set the default before starting your first sandbox:

Shell

    sbx policy set-default balanced

Docker Engine isolation. Each sandbox runs a private Docker daemon with its own socket. There is no path to the host Docker daemon. An agent can run docker build and docker run without socket mounting, which is the tradeoff that breaks isolation in plain container-based approaches.

Credential isolation. Proxy-based injection as described above. The raw key never enters the microVM.

Figure: macOS host with sensitive assets and proxy on the left, Docker Sandbox microVM in the center, network policy zones on the right.

Seven Isolation Proofs: Run Live After a Real Agent Task

The agent exited after completing the debugging task. The sandbox remained alive, and I executed the following commands from the same shell session the agent had used, to show exactly what was accessible throughout the entire run.

1. Filesystem Boundary

Shell

    ls /Users/opscart/
    # Source
    ls /Users/opscart/.ssh/ 2>&1

One directory. The workspace mount. SSH keys, other repos, credential directories: none of them exist inside the sandbox. Parent directories above the workspace are read-only stubs with no siblings.

One critical implication: if your workspace is your home directory, your entire home is visible and writable. Always mount a project subdirectory, not your home.

2. No Credentials in Environment

Shell

    env | grep -iE "anthropic|api_key|aws|secret|token|password"
    # (empty)

Confirmed. The agent that just made dozens of API calls had no raw credentials anywhere in its environment.

3. Proxy Confirms the Injection Mechanism

Shell

    env | grep proxy
    # https_proxy=http://gateway.docker.internal:3128
    # no_proxy=localhost,127.0.0.1,::1,[::1],gateway.docker.internal

The proxy address is visible. The credentials it carries are not. This confirms the mechanism described above, live inside the running sandbox.

4. Process Namespace

Shell

    ps aux | wc -l
    # 13

A macOS host runs hundreds of processes. The sandbox shows 13, all internal. The stack includes dockerd, containerd, socat bridging SSH agent forwarding, and the coding agent. Host processes are completely invisible. There is no way to inspect or interact with anything running on the host.

5. Private Docker Engine

Shell

    docker info | grep -E "Server Version|Operating System|ID"
    # Server Version: 29.4.3
    # Operating System: Ubuntu 25.10 (containerized)
    # ID: e6934b23-368c-4259-a873-96f879f587e5

The daemon reports Ubuntu 25.10 and a unique daemon ID that differs from docker info on the host, confirming the sandbox runs a fully isolated daemon. The agent deployed a full Kubernetes cluster using this daemon. No path to the host Docker socket existed.

6. Host Services Unreachable

Shell

    curl -s --max-time 3 https://localhost:6443 2>&1 || echo "blocked"
    # curl: (7) Failed to connect to localhost port 6443: Connection refused

Port 6443 is my minikube cluster on the Mac host. From inside the sandbox, localhost is the sandbox's own loopback. Host clusters, host SSH, and host services are unreachable by default. There are eight AKS contexts on this machine. None is reachable from inside the sandbox without an explicit policy rule.

7. What the Agent Had vs. What It Didn't

During the entire debugging task, the agent had full access to one project directory, kubectl to the sandbox-internal Kubernetes cluster, and full Docker capabilities against the private daemon. It could not reach any other directory, cloud credentials, other kubeconfig contexts, the host Docker daemon, or any cluster not running inside the sandbox.
All seven proofs held throughout the session without exception.

Three Network Policy Findings That Change How You Think About It

Network policy is not a full network control plane. It is hostname-scoped HTTP filtering. Three findings define the actual scope:

Finding 1: Blocking returns HTTP 403, not TCP rejection.

Plain Text

    probe "example.com" "https://example.com"
    # example.com | exit=0 | http=403

Exit code 0. The curl command succeeded. The proxy returned 403 directly. An agent that retries on 403 will retry blocked requests indefinitely. It cannot distinguish a blocked domain from a legitimate server-side error by exit code. In DevOps workflows, an agent hitting a blocked container registry will keep retrying silently rather than failing fast.

Finding 2: HTTP CONNECT established a tunnel to port 22 on an allowed host.

Plain Text

    # Port 22 (SSH port)
    curl -s --max-time 5 telnet://github.com:22
    # Connected to github.com port 22

    # Port 9999 (non-standard port)
    curl -s --max-time 5 telnet://github.com:9999
    # Connected to github.com port 9999

github.com is on the Balanced allowlist. HTTP CONNECT established TCP tunnels to github.com on both port 22 and the non-standard port 9999; both succeeded. Port-based restrictions are not enforced at the proxy layer. The Balanced policy is hostname-scoped only. Any port on an allowed host is reachable via HTTP CONNECT.

Finding 3: DNS is not filtered.

A common assumption is that all outbound traffic routes through the HTTP proxy, including DNS. Lab results show DNS resolution occurs independently:

Plain Text

    dig example.com +short
    # 172.66.147.243

A blocked domain resolved. The microVM has an internal stub resolver that forwards DNS independently of the HTTP proxy. An agent can resolve any hostname regardless of the active policy. DNS cannot serve as a secondary enforcement layer.

These findings do not break the isolation model. They define its actual boundary. Network policy controls HTTP/HTTPS access by hostname.
It does not control DNS, TCP tunnels to allowed hosts on arbitrary ports, or how agents interpret 403 responses.

The Agent Scenario: Isolation Under Real Load

The real test of isolation is not seven probe commands. It is whether the boundary holds while an agent is actively working: making API calls, running kubectl, deploying containers.

I gave an AI agent a broken Kubernetes deployment: a payments-service with memory limits set to 64Mi on a service that needs ~150Mi at peak. The agent received a task file and a set of manifests. No other context.

The agent completed the task in under five minutes. It found two bugs: one planted, and one discovered independently by reading the manifest and noticing health check probes targeting port 8080 on an nginx container that only serves on port 80. The task said nothing about probes.

Result: both pods 1/1 Running, 0 restarts. The seven isolation proofs above were verified immediately after. Throughout the entire debugging session, the boundary held without exception.

Full article and complete repo at opscart.com/docker-sandbox-devops.

What This Means for DevOps Engineers Specifically

Most Docker Sandbox articles target software developers running Claude Code on a single codebase. The DevOps case is different and more demanding.

A DevOps engineer running an AI agent faces a broader attack surface: multiple cluster contexts, infrastructure credentials, IAM roles, service accounts, and kubeconfigs that grant production access. The blast radius of a compromised or manipulated agent is not one repo. It is potentially every system those credentials touch.

Docker Sandbox addresses this at the architecture level rather than the prompt level. You are not relying on the agent being well-behaved. You are relying on the microVM boundary, the proxy, and the private Docker daemon. The agent can be fully autonomous inside the sandbox because the guardrail is the environment, not the agent's behavior.
The private Docker Engine is particularly significant. DevOps agents need to build and test containers. Every other local isolation approach that allows container operations requires socket mounting — which gives the agent direct access to the host Docker daemon and every image and volume on the host. Docker Sandbox eliminates this tradeoff. What Is Still Rough The image iteration cycle is the primary friction point. Adding a tool requires editing a Dockerfile, rebuilding, pushing to a registry, and recreating the sandbox. For a stable toolchain, this is acceptable. For rapid experimentation, it is not. The --branch parallel agent mode is Git isolation, not VM isolation. Both agents run in one microVM with shared Docker and network. For separate credentials or separate network policies per agent, you need separate workspace directories. The network policy CLI has non-obvious syntax in several places — sbx policy deny does not remove an allow rule, and external cluster access requires two policy rules not one. Neither behavior is documented. The CLI changes between minor versions. v0.31.1 changed login flow, renamed policy tiers, and introduced --clone mode. Pin your version. When Not to Use Docker Sandbox Docker Sandbox is the right tool for a specific set of problems. It is not the right tool when: You need raw UDP or ICMP. Network tracing tools (traceroute, mtr), some mTLS configurations, and anything relying on ICMP will not work — the sandbox proxy only handles HTTP/HTTPS. Your toolchain requires host-device access. USB devices, GPU passthrough beyond basic forwarding, and hardware security keys are not accessible from inside the microVM. You are on a memory-constrained machine. Each sandbox runs a full microVM plus its own Docker daemon. On a machine with 8GB RAM, running multiple sandboxes simultaneously alongside Docker Desktop and a browser will cause pressure. You need production-grade audit logging. Docker Sandbox is Experimental. 
Audit trails, compliance logging, and enterprise controls are not mature yet. For regulated environments, evaluate accordingly. Your agent needs to coordinate across multiple repositories simultaneously. The one-sandbox-per-workspace model means cross-repo agent work requires careful orchestration. The --clone mode helps but adds git workflow overhead. Conclusion The credential isolation model is the headline: the agent made authenticated API calls throughout the session without the API key ever entering the sandbox. Authentication was performed by the host-side proxy using credentials stored outside the sandbox boundary. The agent could use the credential, but it could never see, copy, or exfiltrate it. Seven isolation proofs confirmed the boundary held under real active load. One directory visible. No credentials. No host processes. No host clusters. No host Docker daemon. The network policy findings add important nuance. The --branch mode reality is different from what the documentation implies. Docker Sandbox is Experimental, and the CLI is moving. Use it knowing what it is, and what it is not.

Trend Report

Cognitive Databases, Intelligent Data

No longer passive storage and query engines, databases are becoming active, intelligent participants in how modern systems interpret, connect, and act on data. As AI moves deeper into production and enterprises adopt generative and agentic architectures, the database layer is being reshaped to support semantic search, contextual retrieval, and real-time decision-making. Vector databases, semantic indexing, and AI-driven optimization are changing how developers work with both structured and unstructured data, while the line between transactional and analytical systems continues to fade under hybrid workload demands. This report examines these industry shifts in practical terms, exploring how relational, NoSQL, vector, and multi-model systems are coming together to support AI-native applications. Our research, guest thought leadership, and practitioner insights look at how teams are bringing vector search into production, updating architectures for AI workloads, and redesigning data pipelines around semantic and contextual intelligence.


Refcard #403

Shipping Production-Grade AI Agents

By Vidyasagar (Sarath Chandra) Machupalli FBCS DZone Core

Refcard #388

Threat Modeling Core Practices

By Apostolos Giannakidis DZone Core

More Articles

The Cross-Lingual RAG Problem Nobody Is Talking About

The Benchmark Trap The retrieval-augmented generation (RAG) ecosystem has matured remarkably fast. Vector databases are production-grade, embedding models are cheaper than ever, and retrieval pipelines are being deployed across healthcare, finance, legal, and education systems worldwide. Every major benchmark shows impressive numbers. Almost every major benchmark is in English. This is not a minor oversight. It is a structural blind spot that has allowed a critical class of failures to accumulate in production systems largely undetected. When your evaluation dataset is monolingual and your deployment is multilingual, you are not measuring what you think you are measuring. The gap between benchmark performance and real-world performance for non-English users is not a rounding error — it is, in documented cases, up to 29% accuracy degradation for non-English queries compared to equivalent English ones. That number comes from Oracle AI researchers who studied RAG consistency across languages in enterprise deployments. Twenty-nine percent. In a medical context, that is not a metric. That is a patient safety issue. Where Exactly Does Cross-Lingual RAG Break? The failure is not in one place. It cascades across all three stages of the RAG pipeline, which makes it particularly difficult to diagnose and fix. At Retrieval Most embedding models used in production RAG systems are trained predominantly on English corpora. When a Tamil or Arabic query is embedded, it enters a vector space whose geometry was shaped by English semantics. The nearest neighbors retrieved may appear topically related but carry subtle semantic misalignments that compound downstream. Amazon AGI's XRAG benchmark, published in 2026, was one of the first systematic evaluations of this failure mode. Their findings were stark: in monolingual retrieval settings, where an English knowledge base serves non-English queries, all evaluated models struggled with response language correctness. 
The system retrieved the right document. It still got the answer wrong. At Augmentation The naive fix — retrieve documents in the user's language alongside English documents and concatenate them into the context — introduces a different problem. A French document and a Hindi document about the same topic may express subtly different facts, use different cultural reference points, or carry implicit contradictions that the model has no mechanism to resolve. Concatenation without alignment is not multilingual RAG. It is multilingual noise. At Generation This is the most insidious failure mode. Research has consistently shown that large language models tend to reason internally in English even when processing non-English inputs. The model receives a Tamil query, retrieves relevant context, and then effectively thinks in English before generating a Tamil response. Cultural grounding, local conventions, and contextual meaning are lost at the final and most consequential step. The result is a response that may be grammatically correct in Tamil but conceptually rooted in English assumptions — wrong units of measurement, unfamiliar care protocols, culturally inappropriate framing. The Research Is There. The Attention Is Not. A small but growing body of research is directly addressing this problem. It deserves far more attention than it is currently receiving in mainstream AI engineering conversations. XRAG (Amazon AGI, 2026) introduced one of the first dedicated benchmarks for cross-lingual RAG evaluation, covering monolingual and multilingual retrieval scenarios with relevancy annotations per retrieved document. Their finding that cross-lingual reasoning — not just language generation — is the core challenge reframes the problem in an important way. This is not a translation problem. It is a reasoning problem. 
CroSearch-R1 (Beijing Jiaotong University/Université de Montréal, SIGIR 2026) proposed using reinforcement learning, specifically Group Relative Policy Optimization (GRPO), to dynamically align multilingual knowledge during retrieval. Rather than treating documents in different languages as competing contexts, their framework integrates them as complementary evidence. Results showed measurable improvements in cross-lingual RAG effectiveness across multiple language pairs. CrossRAG (University of Edinburgh, EACL 2026) took a different approach — translating retrieved documents into a common language before generation rather than translating the query before retrieval. Their experiments showed that this document-side translation strategy significantly outperforms query-side translation, particularly for low-resource languages, because it preserves the semantic richness of retrieval while giving the generation model a consistent linguistic context to reason over. BordIRLines (ACL 2025) introduced a dataset of territorial disputes across 49 languages to study cross-lingual RAG robustness in culturally sensitive scenarios. Their finding that retrieving multilingual documents actually improves response consistency over monolingual retrieval — when done correctly — is an important signal that the solution lies in better multilingual architecture, not in defaulting to English-only retrieval. Together, these papers paint a clear picture: the problem is real, measurable, and solvable. What is missing is the engineering community treating it as a first-class concern. Who Is Actually Affected The framing of this as a technical NLP problem undersells its human stakes. Consider the populations for whom English-centric RAG is not an inconvenience but a genuine barrier: A patient in rural Tamil Nadu queries a hospital AI system about post-surgery medication. A student in rural Nigeria is trying to use an AI tutoring system to access global research in Yoruba. 
A refugee querying a legal AI system about asylum rights in their native Dari. A farmer in rural India is asking an agricultural advisory AI about crop disease treatment in Marathi. In every one of these cases, a RAG system that was benchmarked at 90%+ accuracy in English may be operating at 60-70% accuracy in the language that actually matters to the user. The people least able to absorb the consequences of AI errors are the ones most exposed to them. This is not an edge-case population. Over 6.5 billion people speak a language other than English as their primary language. The majority of the world is the edge case in most RAG deployments. What Good Cross-Lingual RAG Looks Like The research points toward a few clear architectural principles for building RAG systems that work equitably across languages. Shared semantic embedding spaces over language-specific ones. Models like mE5, LaBSE, and multilingual-E5-large represent meaningful progress here — they map semantically equivalent content across languages into nearby regions of vector space, reducing the retrieval gap for non-English queries without requiring query translation. Explicit cross-lingual knowledge alignment rather than naive concatenation. The CroSearch-R1 approach of using RL to integrate multilingual evidence as complementary knowledge is a significant step forward. The goal is a retrieval-augmented context that is linguistically unified before the generation model ever sees it. Document-side translation over query-side translation when translation is necessary at all. CrossRAG's findings suggest that translating retrieved documents into a common language preserves more semantic fidelity than translating the user's query into English. This is counterintuitive but empirically supported. Culture-aware generation as a design goal, not an afterthought. Language and culture are not separable. 
A RAG system that generates linguistically correct but culturally inappropriate responses has not solved the problem — it has reframed it. A Proposal Worth Exploring The building blocks for genuinely equitable cross-lingual RAG exist today. What does not yet exist is an intentional, end-to-end architecture that assembles them with language equity as a first-class design principle rather than a post-hoc consideration. We call this architectural vision PolyRAG — a framework that coordinates multilingual semantic retrieval, reinforcement learning-based cross-lingual knowledge fusion, and culture-aware generation into a unified pipeline. The goal is not to make RAG work slightly better in non-English languages. It is to eliminate the architecture-level reasons why it fails in the first place. Each of the three components draws from independently validated research. What remains is the engineering work of intentionally combining them, rigorously benchmarking them across low- and high-resource language pairs, and releasing the results openly so the broader community can build on them. The Conversation We Should Be Having The RAG community has done extraordinary work optimizing retrieval latency, chunk strategies, reranking approaches, and hallucination reduction. Almost all of it assumes English. The question worth asking in 2026 is simple: what would RAG look like if we designed it for everyone from the start?

By Janani Annur Thiruvengadam DZone Core
GenAI Isn't Solving the Problem Most Development Teams Actually Have

It was an afternoon when one of our reconciliation flows started throwing NullPointerExceptions in production. The fix, once we found it, was two lines. Finding those two lines took nearly six hours. Three engineers and endless log grepping. Tracing through an integration application with JSF UI that predates most of the libraries we take for granted today. No modern APIs exposed. No clean service boundary to isolate the problem. Just a chain of legacy integration points that required someone to hold the full mental map of the system in their head to understand what was breaking where. Six months into using generative AI (GenAI) extensively in software delivery, I keep coming back to those production bugs. They still feel like the norm, not the exception. The Conference Demo vs. Your Monday Morning The keynote demos are genuinely impressive. The presenter opens a laptop, talks to an AI assistant, and a fully functional web service materializes. The crowd cheers. It looks like the future. And it is — for some problems. But most of the development work I do, the work I actually live in, does not look like spinning up a new service. It looks like debugging a reconciliation failure buried inside a legacy integration platform. It looks like a cron batch job that failed silently and left no useful trace. It looks like a data transformation bug hidden four systems deep in a component last documented sometime around 2018 by somebody who no longer works for the organization. There are still systems working with XMLs. The demos are not wrong. They just solved a problem that I rarely have. The Productivity Paradox After six months with GenAI tools, I have landed on something that feels uncomfortably true: Developers were never slow at writing code. They were slow at figuring out where to write it. Think about the last bug you fixed. How long did the actual code change take once you understood the problem? Five minutes? Ten? The hours were spent elsewhere. 
Reading unfamiliar code. Tracing execution paths. Building a mental model of the system. Validating assumptions. Eliminating false leads. The bottleneck was understanding — cognitive overhead. Not typing. GenAI has undeniably accelerated code generation. Boilerplate code, unit tests, documentation drafts, simple feature implementations — all of these can now be produced significantly faster than before. I’ve seen feature development accelerate in my own team. But the dominant bottleneck in most enterprise systems was never code generation. It was comprehension. The hidden dependencies. The opaque patterns that only become visible after somebody has spent enough time on a codebase to develop intuition about it. That part stayed exactly where it was. We got dramatically faster at the part that was already manageable. A Real Example: When Context Lives Outside the Code Recently, I investigated a reconciliation issue in which transactions appeared incorrect in the UI. The AI was genuinely useful at first. It quickly highlighted suspicious code paths within the application and suggested plausible root causes. None of them were entirely correct. After several hours, we eventually traced the issue through an integration database, a legacy transformation component, and an upstream system that communicated through this legacy integration layer. The component predated modern API standards, and I could not write any MCP or connect to its database in any way. The final code change took minutes. Understanding the path the data had taken through the system consumed most of the day. The interesting lesson was not that the AI failed. The interesting lesson was that the AI only had visibility into part of the problem. The knowledge required to solve the issue lived across multiple systems, integration layers, historical design decisions, and several teams. The real challenge here was reconstructing context. 
GenAI Amplifies What Already Exists One thing surprised me during the last six months. GenAI amplifies whatever it touches. Not just the good parts. If your applications are well documented, integration contracts are clear, and operational practices are mature, AI can deliver meaningful productivity gains. The model has enough context to reason effectively. If your environment contains technical debt, unclear ownership boundaries, fragile deployment processes, or institutional knowledge locked inside a handful of experienced engineers, AI surfaces those weaknesses just as effectively. I’ve seen AI-generated changes pass every unit test and then fail immediately when they hit an integration layer. The problem wasn’t the generated code. The problem was that the integration contract existed only in somebody’s head. That issue existed long before GenAI arrived. AI simply made it impossible to ignore. GenAI has become a forcing function for problems that were always there. Pipeline complexity, team silos, and missing runbooks did not appear with AI. In many ways, GenAI acts like an amplifier. Good engineering practices become more valuable. Weak engineering practices become more visible. If you were already carrying process debt, GenAI lands in that environment and makes the debt harder to defer. This observation aligns closely with findings from the DORA research program. Its reports have increasingly highlighted that AI’s impact is heavily influenced by the surrounding engineering system rather than the tooling itself. Maybe We’ve Been Measuring the Wrong Thing Most AI productivity discussions focus on output. Lines of code written. Features completed. Pull requests merged. Someone recently told me, for example, that they had written a service during a train journey. But enterprise software rarely fails because developers could not type quickly enough. It fails because knowledge is fragmented. Teams cannot trace dependencies. Integration contracts are undocumented.
Critical operational knowledge exists only inside someone’s head. If GenAI is teaching us anything, it may be that software engineering productivity has always been a knowledge-management problem disguised as a coding problem. The more I work with AI, the more I suspect that this distinction matters. The Real Unlock: Context, Not Code I have been trying to articulate where productivity genuinely improves. The best answer I have today is simple: Productivity is fewer hours spent chasing failures you cannot explain. The contained problems, bugs where the scope is clear and the relevant context is available, are noticeably faster to solve. Describe the issue, provide the relevant code, and prompt well. The AI often gets you surprisingly close to the answer. The difficult problems are different. A failure appears in one system but originates three hops upstream. The relevant context is spread across source code, operational dashboards, integration platforms, databases, support tickets, and undocumented assumptions accumulated over years. AI cannot reason about information it cannot access. The real opportunity is not generating more code. It is helping developers navigate complexity and reducing the search space. Surfacing hidden patterns. Making system behavior more understandable. That is fundamentally a context problem. Not a coding problem. The Bottleneck Simply Moves Another pattern I’ve observed is that AI often shifts bottlenecks rather than eliminating them. Feature development that previously took several days may now take hours. But deployment approvals, compliance reviews, security assessments, testing processes, release windows, and governance controls have not accelerated at the same pace. The queue simply moves downstream. Code gets generated faster. Organizational throughput does not necessarily improve. In highly regulated environments, the distance between “code written” and “code deployed” often remains the dominant constraint.
If anything, AI has made that gap more visible. The development teams can move faster. The surrounding delivery system frequently cannot. It is not uncommon for releases to be moved because of end-of-month activities or business peak activities. The Next Enterprise Problem: Agent Sprawl Another challenge is already emerging. For years, organizations struggled with tool sprawl. Now we are creating agent sprawl. One team uses GitHub Copilot. Another uses Claude, and a third uses Cursor. Someone else is experimenting with MCP servers, vector databases, and workflow orchestration platforms. Teams are building custom internal agents to help the platform. Every agent develops its own context boundary, permissions model, knowledge source, and operational behavior. Over time, this begins to look familiar. We have seen similar patterns before with microservices, cloud platforms, and integration technologies. Multiple solutions for the same problem with slight variations. Without standardization, governance, and ownership, complexity grows faster than value. Access to AI is no longer the limiting factor; integration and consistency increasingly are. What This Means For Engineering Teams The AI story is becoming less about individual tools and more about organizational architecture. The teams that will benefit most from GenAI are unlikely to be the teams with the most sophisticated prompts. They will be the teams that can feed AI enough context and have invested in:
• Strong observability
• Well-defined integration contracts
• Discoverable architectural knowledge
• Consistent engineering practices
In other words, the foundations that good engineering organizations have always needed. The uncomfortable reality is that AI cannot compensate for missing context. It can only amplify whatever context already exists. And I do not have a clean answer for how you achieve that in a large organization. I am not sure anyone does yet.
Final Thoughts I still believe GenAI represents a genuine shift in software engineering. I am faster than I was a year ago. My team is faster than it was a year ago. Many activities that once felt repetitive now take a fraction of the time. That is real progress. But I still think about that production issue. Six hours searching for two lines of code. Three engineers on a call. A JSF application that none of us originally designed. A chain of integration points that only became understandable after piecing together knowledge from multiple systems and multiple people. GenAI did not change that day. It could not. The dominant problem was never code generation. The problem was that the knowledge required to understand the system was fragmented and largely invisible. The biggest productivity gains over the next few years may not come from better models. They may come from making systems more legible. Better observability. Clearer integration contracts. Architectural decisions that live somewhere other than inside somebody’s head. GenAI did not create the need for those things. It simply made it much harder to pretend they were optional. Have you seen a similar pattern in your organization? Where does AI work brilliantly for contained problems but struggle at integration boundaries? I’d be interested in hearing what has worked, and what hasn’t, in your teams. References
• DORA Report 2025: State of AI-assisted Software Development
• Paradis et al.: How much does AI impact development speed? An enterprise-based randomized controlled trial (Google)
• Afroz et al.: Developer Productivity with GenAI

By Gaurav Gaur DZone Core
When Your Documentation Manages Itself: mdship and AI-Assisted Markdown

If you write technical documentation in markdown, you already know the tension: some parts of your document are hand-written prose, while others — a table of contents, an included code snippet, a rendered diagram — are generated from somewhere else. How you handle that boundary says a lot about your workflow. Most documentation toolchains resolve it the same way preprocessors like PET or Jamal do: separate the source from the output. You maintain a template file, run a build step, and get a rendered document as the result. Clean, predictable, and easy to reason about — but it adds a build step, and the output file is not the thing you actually edit or share. mdship takes a different approach. It is a command-line tool and MCP server that edits your markdown in place: it reads the file, updates specific sections, and writes the result back to the same file. Everything else — your prose, your headings, your structure — is untouched. No separate output file, no build pipeline. The document you see is the document you ship. Think of it less like a preprocessor and more like a very opinionated editor that knows how to regenerate a table of contents, pull in a code snippet from another file, or render a Mermaid diagram — all within the file you are already editing. One File: The Trade-Off Working in a single file has real advantages for technical writers. The managed content — including snippets, generated TOC entries — is visible inline while you are editing. You can read the full document as your readers will see it, without switching to a preview mode or running a build. There is no output file to track separately, and markdown-aware tools like GitHub or your IDE render it correctly wherever it lives. The downside is equally real: because managed and hand-written content share the same file, it is easy to accidentally edit a section that is meant to be regenerated. You fix a typo in an included code snippet; on the next run, your fix is gone. 
You add a note inside a generated TOC block; mdship overwrites it without warning. Preprocessor tools sidestep this entirely. The source is one file, the output is another, and you never edit the output directly. The separation of concerns is clean. But you pay for it: every change requires a build step, the output is not portable without that step, and contributors who are not familiar with the toolchain may not know which file to edit. Neither model is universally better. mdship makes the pragmatic choice that for most documentation workflows, a single file with good guardrails beats a clean architecture that requires a build. Content Integrity: The Guardrail The guardrail is a checksum. Every time mdship writes content into a managed section — a TOC block, an INCLUDE block, a MERMAID block — it records a checksum of that content inside the opening placeholder marker, under a key called _content_generated_. On the next run, before overwriting anything, it verifies that the checksum still matches. If it does not, mdship stops and reports an error instead of silently discarding your edits. Plain Text ERROR: Placeholder TOC content was manually edited. Hash mismatch detected. Delete _content_generated_ line to override and accept data loss. This turns an accidental overwrite — which would otherwise be invisible until you notice the missing content — into an explicit decision. You can delete the _content_generated_ line to tell mdship "I know, proceed anyway," or you can pass --force on the command line to skip the check for a single run. Either way, you are opting in, not being surprised. AI-Generated Sections: The Same Idea, Extended The same pattern extends naturally to sections written by an LLM. mdship supports an <!--AI--> placeholder: an HTML comment embedded in the markdown file that contains a prompt. 
When you invoke the /ai-placeholder skill in Claude Code, it reads the prompt and writes the generated content between the opening and closing markers, directly into the file, in place, just like any other mdship operation. The workflow has three steps, enforced by the skill:
• Check: before writing anything, the skill calls mdship ai-check via MCP to verify that the existing content has not been manually edited since it was last generated. If the checksum does not match, the skill stops and reports the conflict to you rather than overwriting your edits.
• Generate: if the check passes (or there is no checksum yet, meaning the section is new), the LLM reads the prompt and writes the content.
• Seal: after writing, the skill calls mdship ai-fix via MCP to record a new checksum for the freshly generated content, protecting it against accidental edits until the next intentional update.
The MCP integration means these calls happen automatically, as part of the skill's defined behavior, not as something the LLM has to remember to do. The Prompt Is Documentation, Too There is a subtler benefit to this approach that is easy to overlook. The prompt that instructs the LLM remains embedded in the file as a non-rendered HTML comment, right above the content it produced. It does not live in a commit message, a Jira ticket, or a separate prompt library that may be hard to find six months later. It is part of the document. This has practical consequences. If you need to regenerate a section (because the underlying API changed, a referenced file was updated, or you simply want a fresh pass), you re-run the same prompt against the same file. The instruction is already there; you do not have to reconstruct it. The prompt can also reference external files: other documentation pages, source code, configuration files. If those change, rerunning the prompt automatically picks up the changes. The document becomes self-updating in the sense that the machinery to update it is built in.
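The check and seal steps described above can be illustrated with a minimal Java sketch. This is not mdship's implementation: the class and method names are hypothetical, SHA-256 is an assumed hash (the article does not say which algorithm mdship uses), and the marker format that stores the checksum is left out entirely. The sketch only shows the contract: refuse to overwrite a managed section whose content no longer matches its recorded checksum, unless the caller forces it or no checksum exists yet.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical illustration of a checksum guardrail; not mdship's actual code.
final class ManagedSectionGuard {

    // "Seal": compute a hex-encoded SHA-256 of the section content.
    // The choice of SHA-256 is an assumption made for this sketch.
    static String checksum(String content) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            byte[] hash = digest.digest(content.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : hash) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 is not available", e);
        }
    }

    // "Check": overwriting is safe when the caller forces it, when the section
    // has never been sealed, or when the content still matches its checksum.
    static boolean safeToOverwrite(String currentContent, String recordedChecksum, boolean force) {
        if (force || recordedChecksum == null) {
            return true;
        }
        return recordedChecksum.equals(checksum(currentContent));
    }

    public static void main(String[] args) {
        String generated = "- [Intro](#intro)\n";
        String seal = checksum(generated);

        // Untouched section: prints true
        System.out.println(safeToOverwrite(generated, seal, false));
        // Hand-edited section: prints false
        System.out.println(safeToOverwrite(generated + "my note\n", seal, false));
    }
}
```

The design point is that a mismatch is treated as a stop condition rather than a warning, which is what turns a silent overwrite into an explicit decision.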
Conclusion mdship's in-place editing model and its LLM integration are two expressions of the same design choice: keep everything in one file, protect it with checksums, and let the tooling manage the regeneration cycle rather than the author. For technical writers, this means fewer context switches, no build step, and a document that carries both its content and the instructions for maintaining that content in a single portable file. The trade-off — shared space for managed and hand-written content — is managed by the checksum guardrail, which turns silent overwrites into explicit decisions. Whether the content is generated by mdship itself or by an LLM following an embedded prompt, the contract is the same: write it, seal it, and trust that the next update will ask before it overwrites.

By Peter Verhas DZone Core
From printTriangularNumber to Duff’s Device: Mastering Java Switch Statements Old and New

In this blog post, we will see how the humble Java switch statement evolved from a fall-through curiosity into a powerful expression, and how understanding its mechanics unlocks classic techniques like Duff's Device.

Java's switch statement has evolved from a fall-through-prone construct into a modern expression syntax introduced in Java 14. The post traces this evolution using a concrete example, a method that computes triangular numbers by intentionally allowing execution to cascade through cases without break statements. The post also connects this behavior to Duff's Device, a 1983 loop-unrolling technique that uses deliberate fall-through to handle remainder elements before processing full blocks. A comparison of old and new switch syntax outlines trade-offs, and practical guidance is offered on when each form is appropriate.

The Accidental Discovery

I was prepping for the OCP Java 21 exam and stumbled across a tricky question. A method named question2 used a switch statement without any break statements. The output surprised me at first. Once I traced through it, I renamed the method to printTriangularNumber. That one rename told the whole story. This post dives into why.

The Old Switch Statement

The traditional switch statement has been part of Java since day one. The syntax looks like this:

Java
    int day = 3;
    switch (day) {
        case 1:
            System.out.println("Monday");
            break;
        case 2:
            System.out.println("Tuesday");
            break;
        case 3:
            System.out.println("Wednesday");
            break;
        default:
            System.out.println("Unknown");
            break;
    }

As shown above, every case ends with a break. Without it, execution does not stop. It keeps going into the next case. The old switch works on the integral types byte, short, char, and int (and their wrapper classes), on enum types, and, since Java 7, on String.

Fall-Through: Feature or Bug?

The most misunderstood behavior in switch is fall-through. When you omit break, execution literally falls into the next case.
Java
int x = 2;
switch (x) {
    case 3: System.out.println("three");
    case 2: System.out.println("two");   // jumps here
    case 1: System.out.println("one");   // falls through
    default: System.out.println("done"); // falls through
}

Output:

Plain Text
two
one
done

Most developers treat this as a bug waiting to happen. They are not wrong. Forgetting a break is one of the most common Java mistakes. But intentional fall-through is a different story. It is a deliberate tool. And printTriangularNumber is the perfect example.

printTriangularNumber: Fall-Through in Action

Here is the method I renamed from question2 during my OCP prep:

Java
private static void printTriangularNumber(int n) {
    int res = 0;
    switch (n) {
        case 5: res += 5;
        case 4: res += 4;
        case 3: res += 3;
        case 2: res += 2;
        case 1: res += 1;
        default: break;
    }
    System.out.println(res == 0 ? "Ok, bye." : res);
}

Let us trace through n = 4:

Jumps to case 4, adds 4. res = 4
Falls to case 3, adds 3. res = 7
Falls to case 2, adds 2. res = 9
Falls to case 1, adds 1. res = 10
Hits default, breaks
Output: 10

The pattern for each input:

n | Result | Formula
1 | 1 | 1
2 | 3 | 2+1
3 | 6 | 3+2+1
4 | 10 | 4+3+2+1
5 | 15 | 5+4+3+2+1

This is n * (n + 1) / 2, the triangular number formula. The fall-through is doing the summation for you. Each case accumulates the remaining values by simply not stopping. For n = 0 or any value above 5, no case matches, default fires immediately, and res stays 0. The ternary prints "Ok, bye.". I personally find it a beautiful example of using language semantics intentionally. This is also the kind of question the OCP exam loves to throw at you.

The New Switch Expression (Java 14+)

Java 14 introduced switch expressions as a standard feature. The arrow syntax -> eliminates fall-through entirely. Each arm is independent.

Java
int day = 3;
String name = switch (day) {
    case 1 -> "Monday";
    case 2 -> "Tuesday";
    case 3 -> "Wednesday";
    default -> "Unknown";
};
System.out.println(name); // Wednesday

A few things to notice here:

Switch is now an expression.
It returns a value.
The arrow -> replaces : and break together.
No fall-through. Each arm executes independently.
Multiple labels on a single arm: case 1, 7 -> "Weekend";

You can also use it inline:

Java
System.out.println(switch (day) {
    case 1, 7 -> "Weekend";
    default -> "Weekday";
});

Much cleaner. Much safer.

Switch Expressions With Yield

Sometimes you need more than a single expression in an arm. That is where yield comes in.

Java
int n = 4;
int result = switch (n) {
    case 1, 2 -> n * 10;
    case 3, 4 -> {
        int temp = n * n;
        System.out.println("Computing for: " + n);
        yield temp; // value produced by this block arm
    }
    default -> 0;
};
System.out.println(result); // 16

Think of yield as the return statement for a switch block arm. You need it whenever the arm has multiple statements inside {}. A common mistake is using return instead of yield inside a switch expression block. That does not compile: a switch expression has to produce a value, so the compiler rejects any attempt to return out of it. (return is only legal inside a switch statement, where it leaves the enclosing method.) Always use yield inside switch expression blocks.

Duff's Device: Fall-Through Taken to the Extreme

Now that we understand fall-through well, let us look at the most famous intentional use of it: Duff's Device. Tom Duff invented this in 1983 to speed up memory copy operations by reducing loop branch overhead. The trick is to unroll the copy loop and use a switch to jump into the middle of it based on the remainder.
In Java, we replicate it in two clean phases since Java does not allow interleaved switch+loop syntax:

Java
public static void duffCopy(int[] src, int[] dst, int n) {
    int i = 0;
    int rem = n % 4;

    // Phase 1: handle remainder via fall-through
    switch (rem) {
        case 3: dst[i] = src[i]; i++;
        case 2: dst[i] = src[i]; i++;
        case 1: dst[i] = src[i]; i++;
        case 0: break;
    }

    // Phase 2: full blocks of 4
    int fullBlocks = (n - rem) / 4;
    while (fullBlocks-- > 0) {
        dst[i] = src[i]; i++;
        dst[i] = src[i]; i++;
        dst[i] = src[i]; i++;
        dst[i] = src[i]; i++;
    }
}

Let us trace through n = 13:

rem = 13 % 4 = 1
Switch jumps to case 1, copies 1 element. i = 1
fullBlocks = (13 - 1) / 4 = 3
Loop runs 3 times, copying 4 elements each time
Total: 1 + 12 = 13 elements

The Python equivalent makes the two phases explicit:

Python
def duff_copy(src, n):
    dst = [None] * n
    rem = n % 4
    for i in range(rem):  # Phase 1: remainder
        dst[i] = src[i]
    i = rem
    while i < n:          # Phase 2: full blocks
        dst[i] = src[i]
        dst[i+1] = src[i+1]
        dst[i+2] = src[i+2]
        dst[i+3] = src[i+3]
        i += 4
    return dst

The connection to printTriangularNumber is direct. Both use fall-through intentionally. In printTriangularNumber, the switch jumps to the right case and accumulates downward. In Duff's Device, the switch jumps to the right case and copies the remainder before the main loop takes over.

Old vs. New Switch at a Glance

Feature | Old Switch (:) | New Switch (->)
Fall-through | Yes (default) | No
Returns value | No | Yes
break needed | Yes | No
Multiple labels | Yes since Java 14 (case 1, 2:) | Yes (case 1, 2 ->)
Block with yield | No | Yes
case null | Yes (Java 21+) | Yes (Java 21+)
OCP exam topic | Yes | Yes

Without a case null label, both forms throw a NullPointerException when the selector is null.

Which One Should You Use?

For new code, always prefer the switch expression with ->. It is safer, cleaner, and more expressive. Your reviewers will thank you. Reserve the old switch with fall-through only when you genuinely need the cascading behavior, like in printTriangularNumber or a hand-tuned loop like Duff's Device. In those cases, add a comment explaining the intent.
Otherwise, the next developer (including future you) will assume the break is missing by accident. My personal observation: the OCP Java 21 exam tests both heavily. Knowing when fall-through is intentional versus accidental is the key distinction examiners probe. Make sure you can trace through any switch block without running it. Happy testing! What is your take: is intentional fall-through clever engineering or a maintenance nightmare waiting to happen? Drop your thoughts below!
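To make the comparison concrete, here is a minimal sketch that puts the fall-through version next to an arrow-style equivalent and checks both against n * (n + 1) / 2. The class and method names (TriangularSwitchDemo, triangularFallThrough, triangularExpression) are made up for illustration and are not from the exam question; the code assumes Java 14 or later.

```java
class TriangularSwitchDemo {

    // Old-style switch: deliberate fall-through accumulates n + (n-1) + ... + 1.
    public static int triangularFallThrough(int n) {
        int res = 0;
        switch (n) {
            case 5: res += 5; // intentional fall-through
            case 4: res += 4; // intentional fall-through
            case 3: res += 3; // intentional fall-through
            case 2: res += 2; // intentional fall-through
            case 1: res += 1; // intentional fall-through
            default: break;
        }
        return res;
    }

    // Arrow-style switch expression: arms cannot fall through,
    // so the sum has to be stated explicitly (here via the closed formula).
    public static int triangularExpression(int n) {
        return switch (n) {
            case 1, 2, 3, 4, 5 -> n * (n + 1) / 2;
            default -> 0;
        };
    }

    public static void main(String[] args) {
        for (int n = 0; n <= 6; n++) {
            System.out.println(n + ": fall-through=" + triangularFallThrough(n)
                    + ", expression=" + triangularExpression(n));
        }
    }
}
```

The two methods agree for every input, which is a useful sanity check when migrating old switch code: if the arrow version needs a formula or helper to reproduce the result, the original was relying on fall-through on purpose.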

By NaveenKumar Namachivayam DZone Core CORE
Context Rot: Why Your AI Agent Gets Worse the Longer It Works

AI-powered features often behave perfectly during testing and quietly degrade in production. The model has not changed. The prompts have not changed. Latency looks normal. Error rates are clean. Yet the responses gradually feel off, slightly disconnected, missing nuance, referencing things that are no longer relevant to the task at hand. This pattern has a name: context rot. It does not throw exceptions. It does not appear in dashboards. It is one of the more subtle failure modes in production AI systems, and understanding it early makes a meaningful difference in the quality of what gets built. How Attention Works in LLMs To understand context rot, just enough of the underlying mechanic is needed. Before an LLM generates each new token, it looks at every token in the context and decides how much weight to give each one. This is called attention. The key insight: attention scores are normalized, and they sum to 1.0 across all tokens. That means attention is a fixed budget. When the context has 500 tokens, each important piece of information might receive 0.15 or 0.20 of the total attention. When the context has 50,000 tokens, that same important piece might receive only 0.002, even if it is equally critical to the task. Java // Simplified illustration — not actual LLM code public float[] generateNextToken(String[] contextTokens) { float[] scores = new float[contextTokens.length]; for (int i = 0; i < contextTokens.length; i++) { // How relevant is each past token to what we are generating? scores[i] = computeRelevance(currentState, contextTokens[i]); } // Scores must sum to 1.0 — a fixed attention budget float[] weights = softmax(scores); return weightedCombination(contextTokens, weights); } Every token added, relevant or not, slightly dilutes the attention available for everything else. That is the seed of context rot. 
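The dilution can be seen in a toy calculation. The sketch below is an assumption-laden illustration, not how any real model computes attention: it uses a single softmax over made-up relevance scores, with one "important" token scoring higher than identical filler tokens. The names AttentionDilutionDemo, importantTokenWeight, and the score values are hypothetical.

```java
import java.util.Arrays;

class AttentionDilutionDemo {

    // Numerically stable softmax: outputs are non-negative and sum to 1.0.
    public static double[] softmax(double[] scores) {
        double max = Double.NEGATIVE_INFINITY;
        for (double s : scores) {
            max = Math.max(max, s);
        }
        double sum = 0.0;
        double[] weights = new double[scores.length];
        for (int i = 0; i < scores.length; i++) {
            weights[i] = Math.exp(scores[i] - max);
            sum += weights[i];
        }
        for (int i = 0; i < weights.length; i++) {
            weights[i] /= sum;
        }
        return weights;
    }

    // Weight received by one important token placed among
    // (contextLength - 1) filler tokens that all share fillerScore.
    public static double importantTokenWeight(int contextLength,
                                              double importantScore,
                                              double fillerScore) {
        if (contextLength < 1) {
            throw new IllegalArgumentException("contextLength must be >= 1");
        }
        double[] scores = new double[contextLength];
        Arrays.fill(scores, fillerScore);
        scores[0] = importantScore;
        return softmax(scores)[0];
    }

    public static void main(String[] args) {
        // Same token, same score; only the amount of surrounding context changes.
        for (int n : new int[] {500, 5_000, 50_000}) {
            System.out.printf("context=%d tokens, weight on key token=%.5f%n",
                    n, importantTokenWeight(n, 4.0, 0.0));
        }
    }
}
```

The key token's score never changes, yet its share of the budget shrinks as filler is added, because the denominator of the softmax grows with every extra token.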
Context Position and Attention

A well-known multi-document question-answering experiment revealed something that should give every engineer building AI systems reason to pause. The correct answer was hidden at different positions across a long context, and retrieval accuracy was measured purely by position:

Answer at the beginning: ~75% accuracy
Answer at the end: ~72% accuracy
Answer in the middle: ~55% accuracy

A 20 percentage point drop caused entirely by where the information sat, not by its quality or relevance. The information was present. The model could technically see it. It simply was not attending to it properly.

This is known as the Lost-in-the-Middle effect. It is an emergent property of how transformers are trained. Models learn to attend strongly to the beginning and end of their inputs, where the most signal-dense content tends to appear in human writing. The middle of a long context becomes an attention dead zone as a natural consequence of how these models are trained, not as an oversight.

Does this still apply to modern models? The honest answer is: yes, with important nuance. Newer models have largely resolved the effect for simple factoid retrieval; finding a specific fact at a specific position in a long context is something recent architectures handle well. The problem persists, and arguably intensifies, on multi-step reasoning tasks where the model must synthesize information across several documents simultaneously. That is precisely the category most production AI systems fall into, so the practical risk remains significant even as benchmark numbers improve.

What Context Rot Looks Like in Practice

Scenario 1: The wandering coding agent. An agent is asked to fix a bug. It reads 15 files, explores 3 wrong leads, and backtracks. Each file, each search result, each dead end accumulates in context. By the time the agent finds the right file, buried in the middle of 20,000 tokens, attention is spread thin.
The analysis of the one file that actually matters is noticeably weaker than it would have been with a clean context. Scenario 2: The RAG pipeline that drifts. A retrieval pipeline fetches 10 document chunks per query, roughly 5,000 tokens. For most queries, this works fine. But longer queries trigger larger system prompts and conversation history. Total context grows to 40,000 tokens, and the documents retrieved third and fourth, sitting in the middle, fall into the attention dead zone. The model answers confidently, drawing on what it can see well. A crucial nuance from chunk 4 gets missed. The pattern is always the same: no error, no warning, just answers that are subtly less accurate than they should be. How to Detect It Step 1: Log context length alongside every LLM call. What cannot be measured cannot be managed. Step 2: Run a positional accuracy test. Place a key fact at different positions in a realistic context and check whether the model retrieves it correctly. Java public void positionalAccuracyTest(LlmClient client, String keyFact, String fillerText) { double[] positions = {0.1, 0.5, 0.9}; // beginning, middle, end for (double pos : positions) { int split = (int) (fillerText.length() * pos); String context = fillerText.substring(0, split) + "\nKEY: " + keyFact + "\n" + fillerText.substring(split); String response = client.complete(context, "Summarise the most important information from the context."); boolean found = response.toLowerCase().contains(keyFact.toLowerCase()); System.out.printf("Position %d%%: %s%n", (int)(pos * 100), found ? "RECALLED" : "MISSED"); } } If the model passes at 10% and 90% but fails at 50%, context rot is measurable in that system at that context length. Step 3: Alert on context length thresholds. Set a warning at around 50,000 tokens and a hard alert at 100,000. These are starting points — the positional accuracy test above will help calibrate the right numbers for a specific model and task type. 
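Steps 1 and 3 can be sketched together as a small threshold check that runs before each call. Everything here is illustrative: ContextLengthMonitor, Level, and estimateTokens are hypothetical names, and the four-characters-per-token estimate is a rough assumption that should be replaced by the tokenizer of the model actually in use.

```java
class ContextLengthMonitor {

    enum Level { OK, WARNING, ALERT }

    private final int warnTokens;
    private final int alertTokens;

    ContextLengthMonitor(int warnTokens, int alertTokens) {
        if (warnTokens <= 0 || alertTokens < warnTokens) {
            throw new IllegalArgumentException("expected 0 < warnTokens <= alertTokens");
        }
        this.warnTokens = warnTokens;
        this.alertTokens = alertTokens;
    }

    // Classify a context size against the configured thresholds.
    public Level classify(int contextTokens) {
        if (contextTokens >= alertTokens) {
            return Level.ALERT;
        }
        if (contextTokens >= warnTokens) {
            return Level.WARNING;
        }
        return Level.OK;
    }

    // Crude estimate (assumption: about 4 characters per token), rounded up.
    public static int estimateTokens(String text) {
        return (text.length() + 3) / 4;
    }

    public static void main(String[] args) {
        // Starting points from the text: warn at 50,000 tokens, alert at 100,000.
        ContextLengthMonitor monitor = new ContextLengthMonitor(50_000, 100_000);
        String context = "example prompt ".repeat(20_000);
        int tokens = estimateTokens(context);
        System.out.println("context_tokens=" + tokens + " level=" + monitor.classify(tokens));
    }
}
```

Logging the token count and level next to latency and error rate on every call gives the raw data needed to recalibrate the two thresholds after running the positional accuracy test.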
Context Rot Is Also a Cost Problem

Most conversations about context rot focus on quality, and rightly so. But at any meaningful scale, it is equally a financial problem, and that dimension tends to get overlooked until the infrastructure bill arrives.

LLM providers charge by the token. Every token in the context window is billed on every single call. A context that has grown to 80,000 tokens costs roughly 8x more per call than one held at 10,000 tokens, for the same task, often with worse output quality. That is not a trade-off; it is strictly worse in both dimensions simultaneously. The exact cost per token varies by provider and model tier, but the ratio holds universally: longer context means a proportionally larger bill.

The compute reality makes this more pronounced. Transformer attention scales quadratically with context length. Doubling the number of tokens does not double the compute required; it roughly quadruples it. At low volumes, this is invisible. With millions of calls per day, it becomes one of the largest line items in an AI system's operating cost. The numbers are illustrative, but the ratio is the point.

Context rot at scale is not a quality inconvenience. It is a budget problem. Compaction, precise retrieval, and subagent isolation are not just engineering best practices; they are cost controls.

4 Practical Mitigations

1. Compact early; do not wait until quality degrades. Summarize older conversation turns before the context gets large, not after the damage is done.
Java public List<Message> compactIfNeeded(List<Message> messages, LlmClient client) { int limit = 30_000; if (estimateTokens(messages) < limit) return messages; // Need at least a system prompt + messages to summarise + recent turns if (messages.size() < 7) return messages; // Everything except system prompt and last 5 turns List<Message> older = messages.subList(1, messages.size() - 5); String summary = client.complete("Summarise concisely: " + format(older)); List<Message> compacted = new ArrayList<>(); compacted.add(messages.get(0)); // system prompt compacted.add(new Message("system", summary)); compacted.addAll(messages.subList(messages.size() - 5, messages.size())); return compacted; } 2. Use subagents for exploration. When an agent needs to search or explore, do it in a dedicated subagent with its own context window. Only the compact result, not the exploration trace, returns to the parent agent. Noise stays isolated. 3. In RAG, retrieve less and rerank. Three precisely relevant chunks consistently outperform ten loosely relevant ones. Retrieval quantity does not equal retrieval quality. Fetch a wider candidate set, rerank by relevance, and pass only the top results to the model. 4. Position critical content deliberately. Given what is known about the attention curve, the most important context belongs at the beginning or end, not sandwiched in the middle. The system prompt and the current user query naturally occupy those positions. Keep them there, and be intentional about what fills the space between. What This Means at Each Level For early-career engineers: when an AI feature works in local testing but feels off in production, check context length first. Adding llm.context_tokens to an observability stack, alongside latency and error rate, is a small change with a meaningful signal. For tech leads and architects: context is not a free resource. 
Every design session for an LLM-powered feature should include a clear answer to "what is in this context window and why?" If that question cannot be answered clearly, the design is incomplete. For engineering managers and leaders: context rot does not appear in standard dashboards. Error rate and latency can look perfectly healthy while response quality silently degrades. Correlating context length with downstream quality metrics, task success rates, and user satisfaction is the monitoring work that production AI systems now require. Conclusion Context rot is one of those concepts that feels advanced until it is encountered in production, and then it feels like something that should have been understood from day one. The core reality is simple: transformer attention is a finite, dilutable resource. Every token added to a context window reduces the focus available for everything else. When contexts grow long, and important information ends up in the middle, quality degrades in ways that are real, measurable, and unfortunately silent. The good news is that it is manageable. Compact early. Isolate exploration into subagents. Be precise with retrieval. Position critical content deliberately. None of these requires advanced machine learning knowledge; they are engineering disciplines applied to a new kind of resource. The mental model that tends to help most is treating context the way experienced engineers treat memory: allocate it deliberately, release what is no longer needed, and keep the working set small and focused. The models are already capable of doing remarkable work, if given a clean signal and kept free of noise.
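As a closing worked example of the cost argument made earlier, the sketch below computes the two ratios for the 80,000 versus 10,000 token case: 8x on a per-token bill and 64x for the quadratic attention term. The class name, method names, call volume, and the price of 1.00 per million tokens are placeholders chosen for illustration, not real provider pricing, and the quadratic figure covers only the attention portion of the compute.

```java
class ContextCostDemo {

    // Per-token billing is linear: cost ratio equals the token ratio.
    public static double billingRatio(int longTokens, int shortTokens) {
        return (double) longTokens / shortTokens;
    }

    // Self-attention work grows with the square of the sequence length.
    public static double attentionComputeRatio(int longTokens, int shortTokens) {
        double ratio = billingRatio(longTokens, shortTokens);
        return ratio * ratio;
    }

    // Daily input-token spend for a given call volume and price per million tokens.
    public static double dailyCost(long callsPerDay, int tokensPerCall, double pricePerMillionTokens) {
        return callsPerDay * (double) tokensPerCall / 1_000_000.0 * pricePerMillionTokens;
    }

    public static void main(String[] args) {
        int bloated = 80_000;
        int compact = 10_000;
        double placeholderPrice = 1.00;   // hypothetical price per million input tokens
        long callsPerDay = 1_000_000L;    // hypothetical volume

        System.out.println("billing ratio:   " + billingRatio(bloated, compact));
        System.out.println("attention ratio: " + attentionComputeRatio(bloated, compact));
        System.out.println("daily cost, bloated context: " + dailyCost(callsPerDay, bloated, placeholderPrice));
        System.out.println("daily cost, compact context: " + dailyCost(callsPerDay, compact, placeholderPrice));
    }
}
```

Swapping in a real price and call volume turns this into a quick estimate of what compaction is worth before any engineering time is spent on it.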

By Vineet Bhatkoti
Top Java Security Vulnerabilities and How to Prevent Them in Modern Java

With the increasing number of security threats, organizations have invested heavily in cybersecurity initiatives to protect their applications, infrastructure, and sensitive data. Security vulnerabilities are rarely introduced intentionally. Most of them creep into applications through shortcuts, overlooked edge cases, outdated libraries, or bad coding habits.

Modern Java has significantly improved its security capabilities, but no framework or JVM version can completely protect an application from insecure coding practices. As developers, we still need to understand where vulnerabilities originate and how to prevent them before they reach production.

In this article, I summarize some of the most common Java security vulnerabilities and practical techniques used to prevent them. These are the same security best practices and lessons learned that I frequently share with new members of my team. I am sharing them here in the hope that they can serve as a practical handbook for Java developers looking to build more secure applications.

1. SQL Injection

SQL injection remains one of the oldest and most dangerous vulnerabilities. It occurs when user input is directly concatenated into SQL statements. Consider the following example:

Java
String query = "SELECT * FROM users WHERE username = '" + username + "'";
Statement stmt = connection.createStatement();
ResultSet rs = stmt.executeQuery(query);

If an attacker enters the following value as the username, the query can be manipulated to return unintended results:

SQL
admin' OR '1'='1

Prevention

Always use parameterized queries.

Java
String query = "SELECT * FROM users WHERE username = ?";
PreparedStatement stmt = connection.prepareStatement(query);
stmt.setString(1, username);
ResultSet rs = stmt.executeQuery();

Prepared statements separate data from executable SQL, so user input is never parsed as part of the statement.

2. Hardcoded Secrets

One of the most common findings during security reviews is hardcoded credentials.
Java
private static final String API_KEY = "abcd123456789";

This may seem harmless during development, but once committed to source control, secrets often remain exposed indefinitely.

Prevention

Store secrets externally.

Java
String apiKey = System.getenv("PAYMENT_API_KEY");

Better alternatives are dedicated secret stores such as AWS Secrets Manager, Azure Key Vault, HashiCorp Vault, or Kubernetes Secrets. Secrets should never live inside source code repositories.

3. Insecure Deserialization

Java serialization has been responsible for numerous security incidents. Example:

Java
ObjectInputStream input = new ObjectInputStream(request.getInputStream());
Object obj = input.readObject();

The danger is that attackers can craft malicious serialized objects that execute unexpected code during deserialization.

Prevention

Avoid Java serialization whenever possible. Prefer formats such as JSON, XML (with secure parsing), or Protocol Buffers. Example using Jackson:

Java
ObjectMapper mapper = new ObjectMapper();
User user = mapper.readValue(json, User.class);

Using structured formats reduces attack surfaces significantly.

4. Cross-Site Scripting (XSS)

Although often associated with front-end applications, backend services can accidentally enable XSS vulnerabilities when user-generated content is returned without sanitization. Example:

Java
String comment = request.getParameter("comment");
response.getWriter().write(comment);

If the user submits the following comment, the browser executes the script:

HTML
<script>alert('Hacked')</script>

Prevention

Always encode output. Using Spring:

Java
String safeComment = HtmlUtils.htmlEscape(comment);

Additionally, validate inputs, sanitize rich text, and implement Content Security Policies (CSP).

5. Path Traversal Attacks

File download functionality often introduces path traversal vulnerabilities. Example:

Java
String file = request.getParameter("file");
Path path = Paths.get("/documents/" + file);

An attacker could submit the following value and potentially access sensitive files:
Shell
../../../etc/passwd

Prevention

Normalize and validate paths.

Java
Path base = Paths.get("/documents");
Path resolved = base.resolve(file).normalize();
if (!resolved.startsWith(base)) {
    throw new SecurityException("Invalid file path");
}

Never trust file names coming directly from user input.

6. Weak Password Storage

Storing passwords improperly remains surprisingly common. Bad practice:

Java
String passwordHash = DigestUtils.md5Hex(password);

MD5 and SHA-1 are no longer considered secure for password storage.

Prevention

Use adaptive hashing algorithms. Example with BCrypt:

Java
BCryptPasswordEncoder encoder = new BCryptPasswordEncoder();
String hash = encoder.encode(password);

BCrypt salts every hash automatically and has a configurable work factor. Other strong alternatives include Argon2, PBKDF2, and scrypt.

7. Dependency Vulnerabilities

Modern Java applications often contain more third-party code than custom code. A secure application can still become vulnerable because of outdated dependencies.

Prevention

Integrate dependency scanning into CI/CD pipelines. Example Maven plugin:

XML
<plugin>
    <groupId>org.owasp</groupId>
    <artifactId>dependency-check-maven</artifactId>
</plugin>

Additionally, tools such as Snyk can automatically identify known vulnerabilities. We have been using Snyk for the last couple of years, and it is effective. Regular dependency updates should be part of every release cycle.

8. Improper Logging of Sensitive Data

Developers often log information for troubleshooting without considering security implications. Example:

Java
logger.info("Login request received for user={} password={}", username, password);

This exposes credentials inside log files.

Prevention

Mask or exclude sensitive information.

Java
logger.info("Login request received for user={}", username);

Never log passwords, access tokens, credit card information, protected health information (PHI), or other personally identifiable information (PII).
This is especially important in regulated industries such as healthcare, like ours.

9. Insufficient Authentication and Authorization

Authentication verifies identity, and authorization determines access. Many applications perform authentication correctly but fail to enforce authorization consistently. Example:

Java
@GetMapping("/admin/users")
public List<User> getUsers() {
    return userService.findAll();
}

Without authorization checks, any authenticated user might gain access.

Prevention

Use role-based security.

Java
@PreAuthorize("hasRole('ADMIN')")
@GetMapping("/admin/users")
public List<User> getUsers() {
    return userService.findAll();
}

Security should be enforced at every layer, not just the UI.

10. Lack of Input Validation

Many vulnerabilities originate from accepting unexpected input. Example:

Java
String age = request.getParameter("age");
int userAge = Integer.parseInt(age);

Invalid input can cause exceptions or unexpected behavior.

Prevention

Validate all external input.

Java
@Min(18)
@Max(120)
private Integer age;

Bean Validation provides a simple and consistent approach for validating request payloads. Never assume user input is safe.

Final Thoughts

Security is not a feature that can be added at the end of a project. It needs to be part of the development process from the very beginning. The vulnerabilities discussed here are not theoretical. They are among the most common findings during security assessments, penetration tests, and production incident investigations.

Fortunately, modern Java provides mature frameworks, libraries, and tools that make secure development significantly easier than it was a decade ago. The key is building security awareness into everyday development practices:

Use parameterized queries
Protect secrets properly
Validate all inputs
Keep dependencies updated
Apply strong authentication and authorization
Log responsibly
Continuously scan for vulnerabilities

Security is ultimately about reducing risk.
Small improvements applied consistently across a codebase can prevent incidents that would otherwise become expensive lessons later.
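As one runnable illustration of these practices, the path traversal check from section 5 can be wrapped in a small helper using only java.nio. This is a sketch under stated assumptions: SafePathResolver and resolveInside are hypothetical names, the /documents base directory comes from the example above, and the check is purely lexical, so it does not follow symbolic links (calling Path.toRealPath on an existing file is one way to cover that case).

```java
import java.nio.file.Path;
import java.nio.file.Paths;

class SafePathResolver {

    // Resolve a user-supplied name against baseDir and reject anything
    // that ends up outside of it after normalization.
    public static Path resolveInside(Path baseDir, String userSuppliedName) {
        Path base = baseDir.toAbsolutePath().normalize();
        Path resolved = base.resolve(userSuppliedName).normalize();
        // Path.startsWith compares whole path elements, so "/documents-evil"
        // is not treated as being inside "/documents".
        if (!resolved.startsWith(base)) {
            throw new SecurityException("Invalid file path");
        }
        return resolved;
    }

    public static void main(String[] args) {
        Path base = Paths.get("/documents");
        String[] candidates = {
            "report.pdf",
            "../../../etc/passwd",
            "sub/../../documents-evil/x"
        };
        for (String name : candidates) {
            try {
                System.out.println("allowed:  " + resolveInside(base, name));
            } catch (SecurityException e) {
                System.out.println("rejected: " + name);
            }
        }
    }
}
```

Comparing Path objects rather than strings matters here: a plain String.startsWith("/documents") would accept a sibling directory such as /documents-evil.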

By Muhammed Harris Kodavath
Amazon CodeWhisperer to Q Developer to Kiro: The Rise of Agentic Coding

The Abrupt End of Amazon Q Developer

In May 2026, AWS dropped a bombshell: Amazon Q Developer IDE plugins and paid subscriptions will reach end-of-support on April 30, 2027, with new signups blocked as of May 15, 2026. The successor? Kiro, AWS's next-generation AI IDE that reframes how engineers build software from scratch.

If you're a backend engineer who has been relying on Q Developer for code completion, inline chat, and security scanning inside VS Code or JetBrains, the clock is ticking. But before you begrudgingly migrate, it's worth understanding why this transition is happening, what Kiro actually offers, and whether the trade-offs are worth it, especially in production backend contexts like microservices, distributed systems, and observability pipelines.

Historical Context: From CodeWhisperer to Q Developer to Kiro

AWS's AI coding journey started with Amazon CodeWhisperer (launched in preview in 2022), which was a single-model code suggestion tool: think GitHub Copilot, but AWS-native. It supported security scanning against common vulnerability patterns and could suggest AWS SDK calls contextually.

In April 2024, AWS folded CodeWhisperer into the broader Amazon Q branding, an umbrella AI assistant that spanned not just code but AWS console assistance, documentation search, and operational queries. Q Developer became the IDE-facing arm of that product.

The problem? Q Developer tried to be everything: a coding assistant, a console assistant, a documentation bot, and a security scanner all jammed into one plugin. Feedback from engineering teams consistently pointed to context window limitations, poor multi-file understanding, and weak support for complex backend architectures spanning multiple services.

Kiro is AWS's response.
Built from the ground up with "spec-driven development" as its core philosophy, Kiro is less of an autocomplete engine and more of an agentic coding environment: it can plan, scaffold, and implement across your entire project tree, not just the file you have open.

Architecture Comparison

The architectural difference is significant. Q Developer operated in a request-response model where you asked a question or triggered a completion and got a result. Kiro introduces hooks: lifecycle-aware automations that fire when you save a file, open a PR, or change a spec. This is closer to how CI/CD pipelines work, and backend engineers will immediately recognize the paradigm.

Feature-by-Feature Breakdown

Feature | Amazon Q Developer | Amazon Kiro
Multi-file context | Limited (single file primary) | Full project tree
Agentic task execution | No | Yes (plan → implement → test)
Spec-driven development | No | Yes (SPEC.md driven)
MCP integration | No | Yes (external tool calls)
Security scanning | Yes (CodeWhisperer rules) | Yes (enhanced)
JetBrains support | Yes | Yes
VS Code support | Yes | Yes
AWS Free Tier access | Yes | Yes (via AIdeas Competition)
Paid subscription | $19/mo (deprecated) | Separate Kiro pricing
End of support | April 30, 2027 | Active

Production Code Example 1: Spec-Driven Microservice Scaffolding With Kiro

One of Kiro's most powerful features is its SPEC.md-driven workflow. Instead of writing code and hoping the AI helps, you write a specification and Kiro implements it. Here's what that looks like for a backend order processing microservice.
TypeScript
// SPEC.md concept implemented as TypeScript types
// Kiro reads your spec and generates this scaffolding
import { Logger } from '@aws-lambda-powertools/logger';
import { Tracer } from '@aws-lambda-powertools/tracer';
import { DynamoDBClient, PutItemCommand, GetItemCommand } from '@aws-sdk/client-dynamodb';
import { SQSClient, SendMessageCommand } from '@aws-sdk/client-sqs';
import { marshall, unmarshall } from '@aws-sdk/util-dynamodb';

const logger = new Logger({ serviceName: 'order-service', logLevel: 'INFO' });
const tracer = new Tracer({ serviceName: 'order-service' });
const ddb = tracer.captureAWSv3Client(new DynamoDBClient({}));
const sqs = tracer.captureAWSv3Client(new SQSClient({}));

interface Order {
  orderId: string;
  customerId: string;
  items: Array<{ sku: string; qty: number; price: number }>;
  status: 'PENDING' | 'CONFIRMED' | 'SHIPPED' | 'CANCELLED';
  createdAt: string;
}

interface CreateOrderResult {
  success: boolean;
  orderId?: string;
  error?: string;
}

// Kiro-generated handler with full error handling + structured logging
export const createOrder = async (
  order: Omit<Order, 'orderId' | 'status' | 'createdAt'>
): Promise<CreateOrderResult> => {
  const segment = tracer.getSegment();
  const subsegment = segment?.addNewSubsegment('createOrder');
  try {
    const orderId = `ORD-${Date.now()}-${Math.random().toString(36).slice(2, 7).toUpperCase()}`;
    const newOrder: Order = {
      ...order,
      orderId,
      status: 'PENDING',
      createdAt: new Date().toISOString(),
    };
    logger.info('Creating order', { orderId, customerId: order.customerId, itemCount: order.items.length });

    // Persist to DynamoDB
    await ddb.send(new PutItemCommand({
      TableName: process.env.ORDERS_TABLE!,
      Item: marshall(newOrder),
      ConditionExpression: 'attribute_not_exists(orderId)', // idempotency guard
    }));

    // Publish to downstream processing queue
    await sqs.send(new SendMessageCommand({
      QueueUrl: process.env.ORDER_QUEUE_URL!,
      MessageBody: JSON.stringify(newOrder),
      MessageGroupId: order.customerId, // FIFO ordering per customer
      MessageDeduplicationId: orderId,
    }));

    logger.info('Order created and queued', { orderId });
    return { success: true, orderId };
  } catch (error) {
    const err = error as Error;
    logger.error('Failed to create order', { error: err.message, stack: err.stack });
    subsegment?.addError(err);
    return { success: false, error: err.message };
  } finally {
    subsegment?.close();
  }
};

What Q Developer would do: Suggest inline completions line-by-line based on your cursor position.

What Kiro does: Reads your SPEC.md that says "Create an order service with DynamoDB persistence, SQS publishing, idempotency, and X-Ray tracing" and generates the entire file, including imports, error handling, and the logging pattern your team already uses (learned from your codebase).

Production Code Example 2: Using Kiro Hooks for Automatic Test Generation

Kiro's hook system is where backend engineers will find the most leverage. A hook is a YAML-defined automation that triggers on file system events within your project.

YAML
# .kiro/hooks/auto-test.yaml
name: Generate Unit Tests on Save
trigger:
  event: file_saved
  pattern: "src/**/*.ts"
  exclude: "**/*.test.ts"
actions:
  - type: agent_task
    prompt: |
      A TypeScript file was just saved at {{file_path}}.
      Review the exported functions. For any function that does not have
      a corresponding test in {{file_path_without_ext}}.test.ts, generate
      comprehensive unit tests using Vitest. Include:
      - Happy path tests
      - Error boundary tests (network failures, malformed input)
      - Edge cases for empty arrays and null values
      Mock the AWS SDK v3 clients with aws-sdk-client-mock.
    output_file: "{{file_path_without_ext}}.test.ts"
    mode: merge # Don't overwrite existing tests, only append missing ones

TypeScript
// Auto-generated test from the hook above (Vitest)
import { describe, it, expect, vi, beforeEach } from 'vitest';
import { mockClient } from 'aws-sdk-client-mock';
import { DynamoDBClient, PutItemCommand } from '@aws-sdk/client-dynamodb';
import { SQSClient, SendMessageCommand } from '@aws-sdk/client-sqs';
import { createOrder } from './order-service';

const ddbMock = mockClient(DynamoDBClient);
const sqsMock = mockClient(SQSClient);

describe('createOrder', () => {
  beforeEach(() => {
    ddbMock.reset();
    sqsMock.reset();
    process.env.ORDERS_TABLE = 'test-orders';
    process.env.ORDER_QUEUE_URL = 'https://sqs.us-east-1.amazonaws.com/123/orders.fifo';
  });

  it('should create order and return orderId on success', async () => {
    ddbMock.on(PutItemCommand).resolves({});
    sqsMock.on(SendMessageCommand).resolves({ MessageId: 'msg-123' });
    const result = await createOrder({
      customerId: 'cust-001',
      items: [{ sku: 'SKU-A', qty: 2, price: 29.99 }],
    });
    expect(result.success).toBe(true);
    expect(result.orderId).toMatch(/^ORD-/);
  });

  it('should return error when DynamoDB PutItem fails', async () => {
    ddbMock.on(PutItemCommand).rejects(new Error('ProvisionedThroughputExceededException'));
    const result = await createOrder({
      customerId: 'cust-001',
      items: [{ sku: 'SKU-A', qty: 1, price: 9.99 }],
    });
    expect(result.success).toBe(false);
    expect(result.error).toContain('ProvisionedThroughputExceededException');
  });

  it('should handle empty items array gracefully', async () => {
    ddbMock.on(PutItemCommand).resolves({});
    sqsMock.on(SendMessageCommand).resolves({});
    const result = await createOrder({ customerId: 'cust-001', items: [] });
    expect(result.success).toBe(true);
  });
});

This hook fires every time you save a .ts file, checks if tests are missing, and generates them, without you ever typing a prompt.
Production Code Example 3: Migrating Q Developer Inline Chat Workflows to Kiro

If you've built muscle memory around Q Developer's /dev inline chat, here's how equivalent workflows map in Kiro.

TypeScript

// Q Developer workflow (deprecated):
// You'd type: /dev add retry logic with exponential backoff to this fetch call
// And it would modify the highlighted code block.

// Kiro equivalent: Add to your SPEC.md or use agent chat with file pinning:
// "Add retry logic with exponential backoff to src/clients/http-client.ts"

// The result Kiro produces: production-grade with observability
import { Logger } from '@aws-lambda-powertools/logger';

const logger = new Logger({ serviceName: 'http-client' });

interface RetryOptions {
  maxAttempts: number;
  baseDelayMs: number;
  maxDelayMs: number;
  retryableStatusCodes: number[];
}

const DEFAULT_RETRY_OPTIONS: RetryOptions = {
  maxAttempts: 3,
  baseDelayMs: 100,
  maxDelayMs: 5000,
  retryableStatusCodes: [429, 500, 502, 503, 504],
};

export async function fetchWithRetry<T>(
  url: string,
  options: RequestInit = {},
  retryOpts: Partial<RetryOptions> = {}
): Promise<T> {
  const opts = { ...DEFAULT_RETRY_OPTIONS, ...retryOpts };
  let lastError: Error | null = null;

  for (let attempt = 1; attempt <= opts.maxAttempts; attempt++) {
    try {
      const response = await fetch(url, options);
      if (!response.ok && opts.retryableStatusCodes.includes(response.status)) {
        const delay = Math.min(opts.baseDelayMs * Math.pow(2, attempt - 1), opts.maxDelayMs);
        logger.warn('Retryable HTTP error, backing off', {
          url, status: response.status, attempt, delayMs: delay,
        });
        await new Promise(res => setTimeout(res, delay));
        continue;
      }
      if (!response.ok) {
        throw new Error(`HTTP ${response.status}: ${response.statusText}`);
      }
      return response.json() as Promise<T>;
    } catch (error) {
      lastError = error as Error;
      if (attempt === opts.maxAttempts) break;
      const delay = Math.min(opts.baseDelayMs * Math.pow(2, attempt - 1), opts.maxDelayMs);
      logger.warn('Request failed, retrying', {
        url, attempt, delayMs: delay, error: lastError.message });
      await new Promise(res => setTimeout(res, delay));
    }
  }

  logger.error('All retry attempts exhausted', { url, maxAttempts: opts.maxAttempts });
  throw lastError ?? new Error('Unknown fetch error after retries');
}

When to Migrate Now vs. Wait

Migrate now if:
  • You're starting a new service or greenfield project: Kiro's spec-driven approach saves the most time at project inception
  • Your team does heavy test generation: the hook system is a net productivity win
  • You're building MCP-integrated tooling or AWS-native agentic workflows

Wait if:
  • You have a heavily customized Q Developer security scanning ruleset: give the Kiro security scanner time to mature
  • You're on a locked-down enterprise network: Kiro's agentic features require broader outbound connectivity than Q Developer's plugin model

Performance and Productivity Metrics

Metric | Amazon Q Developer | Amazon Kiro (early data)
Avg. context window (tokens) | ~16K | ~128K+
Multi-file edits per session | 1-2 | 10-20+
Test coverage improvement | ~15% | ~35% (with hooks)
Time to scaffold new service | ~2-3 hrs manual | ~20-40 min spec-driven
Security scan languages | 15 | 20+

Summary

The Q Developer → Kiro transition isn't just a rebranding. It's a fundamental shift from a reactive autocomplete tool to a proactive agentic development environment. For backend engineers building distributed systems on AWS, Kiro's spec-driven planning, multi-file context, and hook-based automation represent a genuine productivity leap, not just an incremental update.

Start your migration now. The deprecation deadline of April 2027 sounds far off, but enterprise procurement, security reviews, and team retraining take time. Get ahead of it.

By Jubin Abhishek Soni DZone Core CORE
OpenAPI, ORM, SVG, and Lottie

This is the third follow-up to Friday's release post. Saturday's was about how you iterate; yesterday's was about new platform APIs in the core; today's is about a run of pieces that change how you write the structural parts of an app. The pieces are an OpenAPI client generator, a SQLite ORM, JSON and XML mappers, a component binder with validation, build-time SVG and Lottie transcoders, and a declarative router with deep links.

All ride on a single build-time codegen pipeline: a Maven-plugin pass that reads annotations or declarative source files at build time and emits typed Java that compiles into your binary. No reflection, no service loader, no Class.forName. The "How it works" section at the end of this post covers the codegen plumbing once you have seen what it powers.

OpenAPI Client Generation

This is the headline of the release for any team that talks to a backend. A new cn1:generate-openapi-client Mojo reads an OpenAPI 3.x JSON spec (a URL or a local file) and writes typed Codename One client code that compiles into your app:
  • One @Mapped POJO per components.schemas entry.
  • One <Tag>Api.java class per OpenAPI tag, with one fluent method per operation.
  • Every method routes through Rest.<verb> + Mappers.toJson + fetchAsMapped / fetchAsMappedList, so the generated surface integrates with the rest of the framework instead of dragging in a separate HTTP stack.

Wire it into the project's pom.xml:

XML

<plugin>
  <groupId>com.codenameone</groupId>
  <artifactId>codenameone-maven-plugin</artifactId>
  <executions>
    <execution>
      <id>petstore-client</id>
      <goals><goal>generate-openapi-client</goal></goals>
      <configuration>
        <specUrl>https://petstore3.swagger.io/api/v3/openapi.json</specUrl>
        <basePackage>com.example.petstore</basePackage>
      </configuration>
    </execution>
  </executions>
</plugin>

mvn generate-sources picks the spec up, downloads it, and writes one file per schema and one per tag under target/generated-sources/.
The Petstore reference spec exercised end-to-end produces six model classes (Pet, Order, Customer, Tag, Category, User) and three API classes (PetApi, StoreApi, UserApi), and the nine generated .class files compile cleanly against codenameone-core. Documented at the OpenAPI codegen Maven goal. In application code you call the generated Api class the same way you would call any other Java method: Java PetApi pets = new PetApi(); // Returns AsyncResource<Pet>; resolves with the deserialised object. pets.getPetById(42).onResult((pet, err) -> { if (err == null) Log.p("Got " + pet.getName()); }); // Returns AsyncResource<List<Pet>>. pets.findPetsByStatus("available").onResult((list, err) -> { if (err == null) { for (Pet p : list) Log.p(p.getName()); } }); // POST with a request body. addPet takes a Pet, returns a Pet. Pet candidate = new Pet(); candidate.setName("Mittens"); candidate.setStatus("available"); pets.addPet(candidate).onResult((created, err) -> { /* ... */ }); There is no hand-rolled ConnectionRequest setup, no manual JSON parsing, no string-typed request bodies. The generated client takes a typed Pet, serializes it with Mappers.toJson(...), fires the right HTTP verb, deserializes the response with Mappers.fromJson(...), and surfaces the result through the framework's AsyncResource so your callback fires on the EDT. For teams who already publish an OpenAPI spec as part of their backend (most modern backend frameworks do this automatically; FastAPI, Spring's springdoc-openapi, NestJS, ASP.NET Core, Go's gnostic), the practical effect is that the mobile client's bindings stay in sync with the backend without anyone hand-writing a single network call. Update the spec, re-run mvn generate-sources, and the new and changed endpoints land in your app as typed Java; the IDE picks up immediately. 
It is the kind of change that is most useful when you do not know you have it: pull a fresh spec, rebuild, and your IDE highlights every place in the codebase that called a renamed endpoint or passed the wrong type to a parameter. SQLite ORM @Entity marks the class; @Id and @Column shape the schema; @DbTransient opts a field out: Java @Entity public class TodoItem { @Id @Column long id; @Column String title; @Column(name = "completed_at") Date completedAt; @DbTransient Object cachedView; } Dao<TodoItem> dao = EntityManager.open("todos.db").dao(TodoItem.class); dao.createTable(); dao.insert(new TodoItem(0, "Read the post", null)); List<TodoItem> open = dao.find("completed_at IS NULL", new Object[] {}); TodoItem byId = dao.findById(42); dao.delete(byId); The generated DAO does the typed work underneath. No reflection in insert; the generated code calls setString(1, e.title) and setLong(2, e.id) directly against the SQLite PreparedStatement. Validation at build time catches missing @Id, fields that look like relationships but are not yet supported, and abstract entity classes; the build fails with a class name and a reason. For JPA/Hibernate developers, the API is intentionally familiar. @Entity, @Id, @Column, and @Transient (here renamed @DbTransient to avoid colliding with java.beans.Transient) carry the same meaning they do under javax.persistence / jakarta.persistence. The EntityManager name is the same. Dao#findById, Dao#findAll, Dao#find(where, params), Dao#insert, Dao#update, Dao#delete line up with the basic JPA repository contract. The query language is plain SQL (there is no JPQL or Criteria DSL), but the annotation surface, the lifecycle, and the runtime methods will feel like a long-lost friend to anyone with server-side Java persistence experience. JSON/XML Mapping @Mapped marks a class as a transferable POJO. @JsonProperty and @XmlElement (plus @XmlRoot, @XmlAttribute, @JsonIgnore, @XmlTransient) shape the wire format. 
The runtime entry points are Mappers.toJson(...), Mappers.fromJson(...), Mappers.toXml(...), Mappers.fromXml(...): Java @Mapped public class User { @JsonProperty("user_id") long id; @JsonProperty String name; @JsonProperty("created_at") Date createdAt; @JsonIgnore String passwordHash; } String json = Mappers.toJson(user); User back = Mappers.fromJson(json, User.class); The same @Mapped POJO is the type the typed Rest helpers accept: Java Rest.get("https://api.example.com/users/42") .fetchAsMapped(User.class) .onResult((user, err) -> { /* ... */ }); Rest.get("https://api.example.com/users") .fetchAsMappedList(User.class) .onResult((users, err) -> { /* ... */ }); Rest.fetchAsJsonList (top-level JSON arrays, no {"root":[...]} envelope trick), JSONWriter (the complement of JSONParser, with fluent builders and streaming variants for Writer and OutputStream), and URLImage.setDefaultBearerToken (auth headers on image fetches) all ship alongside. For JAXB developers, the XML surface (@XmlRoot, @XmlElement, @XmlAttribute, @XmlTransient) is a direct port of the long-established javax.xml.bind.annotation surface. The same model class can be both @XmlRoot-decorated and @JsonProperty-decorated, which gives you a single source of truth for both wire formats. The JSON surface adopts the Jackson convention (@JsonProperty, @JsonIgnore) that nearly every modern JVM JSON binding (Jackson, Moshi, kotlinx-serialization) inherited. Component Binding With Validation The fourth annotation processor on the same pipeline is the component binder. @Bindable marks a model class; @Bind(name = "userField") ties a field to a component on a form by the component's name. 
Field-level validation annotations compose with @Bind on the same field: Java @Bindable public class SignupModel { @Bind(name = "userField") @Required @Length(min = 3) private String user; @Bind(name = "emailField") @Required @Email private String email; @Bind(name = "ageField") @Numeric(min = 13, max = 120) private String age; @Bind(name = "roleField") @ExistIn({ "admin", "editor", "viewer" }) private String role; } The matching form sets a name on each component so the binder can find them: Java TextField user = new TextField(); user.setName("userField"); TextField email = new TextField(); email.setName("emailField"); TextField age = new TextField(); age.setName("ageField"); ComboBox<String> role = new ComboBox<>("admin", "editor", "viewer"); role.setName("roleField"); Button submit = new Button("Sign up"); Form form = new Form("Sign Up", BoxLayout.y()); form.add(user).add(email).add(age).add(role).add(submit); form.show(); SignupModel model = new SignupModel(); Binding binding = Binders.bind(model, form); binding.getValidator().addSubmitButtons(submit); Binding is the handle: refresh() re-reads the model into the components, commit() writes the components back, disconnect() tears the listeners down. Multiple validation annotations on a single field compose via Validator.addConstraint(Component, Constraint...) and GroupConstraint (first failure wins). @Validate(MyClass.class) is the escape hatch for hand-written Constraint implementations. The validation set: @Required, @Length, @Regex, @Email, @Url, @Numeric, @ExistIn, @Validate. The new BindAttr enum lets @Bind target a specific attribute of the component (TEXT, UIID, SELECTED, ...) when the default ("write a String field into the component's text") is not what you want. 
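The "first failure wins" composition mentioned above can be pictured with a small plain-Java sketch. It is only an illustration of the idea under stated assumptions: FirstFailureWins, Rule, required, and minLength are hypothetical names invented here, not the Codename One Constraint or GroupConstraint types.

```java
import java.util.List;
import java.util.Optional;
import java.util.function.Function;

// Hypothetical stand-ins for illustration only; not the framework's validation classes.
final class FirstFailureWins {
    /** A rule maps a raw value to an error message, or to empty when the value passes. */
    interface Rule extends Function<String, Optional<String>> {}

    static Rule required() {
        return v -> (v == null || v.isEmpty())
                ? Optional.of("Value is required")
                : Optional.empty();
    }

    static Rule minLength(int min) {
        return v -> (v != null && v.length() >= min)
                ? Optional.empty()
                : Optional.of("Must be at least " + min + " characters");
    }

    /** Runs the rules in declaration order and reports only the first failure. */
    static Optional<String> validate(String value, List<Rule> rules) {
        for (Rule rule : rules) {
            Optional<String> error = rule.apply(value);
            if (error.isPresent()) {
                return error; // later rules are not evaluated
            }
        }
        return Optional.empty();
    }
}
```

With the rules required() followed by minLength(3), an empty string fails both, but only the "required" message is reported, which is usually what a form wants to show next to a field.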
SVG at Build Time Drop an SVG into src/main/css/, alongside theme.css: Shell src/main/css/ theme.css star.svg gradient_circle.svg path_arrow.svg rounded_button.svg wave.svg pro_badge.svg After the next build, every SVG is a regular Codename One Image. An SVG handled by the transcoder is a vector image, but it is still an Image. Everywhere a raster Image works (Label.setIcon, Button.setIcon, BorderLayout.NORTH, the toolbar, a MultiButton's leading icon, a CSS background: url(...) rule), the SVG works too. The difference is that it stays crisp at any size: the same source file is sharp at a 16-point list-row icon, a 64-point hero header, and a 256-point launch screen, on every DPI bucket. A grid of the static SVGs from the hellocodenameone fixture, rendered through the new pipeline: Sizing in Millimeters The SVG transcoder's most useful feature is also the one most easily missed: size every SVG in millimeters from CSS. SVGs in the wild routinely declare odd width / height attributes (a 1024×1024 export of a 24×24 icon, no dimensions at all, design-pixel values from one specific framework). Pinning the rendered size in millimeters sidesteps all of that. CSS HomeIcon { background: url(home.svg); cn1-svg-width: 6mm; cn1-svg-height: 6mm; bg-type: image_scaled_fit; } LogoBanner { background: url(logo.svg); cn1-svg-width: 32mm; cn1-svg-height: 12mm; } A 6 mm icon is 6 mm tall on a 1× desktop, 6 mm on a high-DPI handset, and 6 mm on a 4K tablet. The transcoder routes both values through Display.convertToPixels() at install time, the same way font-size: 3mm already behaves elsewhere in Codename One CSS. No design-pixel guesswork, no DPI bucket to choose, no scaling surprise when the artist re-exports the source SVG at a different resolution. If a project does not use CSS for theming, the two-float constructor on the generated class takes millimeters directly: new com.codename1.generated.svg.Home(6f, 6f). 
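Why a millimeter size stays physically constant across screens comes down to plain arithmetic: pixels = millimeters / 25.4 × pixels per inch. The sketch below shows only that arithmetic and is not how Display.convertToPixels() is implemented; MillimeterSizing, mmToPixels, and the density values in the usage note are assumptions chosen for the example.

```java
// Illustrative arithmetic only; the real conversion is done by the framework at install time.
final class MillimeterSizing {
    static final double MM_PER_INCH = 25.4;

    /**
     * Converts a physical length in millimeters to whole pixels for a display
     * with the given pixel density (pixels per inch).
     */
    static int mmToPixels(double millimeters, double pixelsPerInch) {
        return (int) Math.round(millimeters / MM_PER_INCH * pixelsPerInch);
    }
}
```

Under these assumptions, a 6 mm icon needs about 38 pixels on a 160 ppi screen and about 113 pixels on a 480 ppi screen: different pixel counts, same physical size.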
Coverage and What We Still Want Feedback On The transcoder is a maven/svg-transcoder/ module that parses SVG with javax.xml StAX. No Batik, no Flamingo, no external dependencies. Coverage targets what real-world icon SVGs use: rect (rounded corners included), circle, ellipse, line, polyline, polygon, the full path grammar (M / L / H / V / C / S / Q / T / A / Z plus relative-coordinate and smooth-curve reflection), groups with affine transforms (translate, scale, rotate, skew, matrix), linear gradients via LinearGradientPaint, fill, stroke, stroke-width, linecap, linejoin, opacity. SMIL animations are supported in the same pipeline: <animate>, <animateTransform> (translate, scale, rotate), and <set>. Time values interpolate against wall-clock time on every paint, with from / to / values / begin / dur / repeatCount / fill="freeze" honored. Text and clip-path landed in the follow-up PR for the static SVG fixtures, and both are visible in the screenshot above (the "Codename One / build-time SVG" wordmark in the rounded button, the "PRO" badge text, and the clip-path-shaped rounded-corner badge underneath). <text> and <tspan> work with single-style fills and transforms; <clipPath> referenced via clip-path="url(#id)" works against rect, circle, and path clip shapes (nested clip refs are ignored). What is still not supported: SVG filter primitives, <mask> (treated as a clip, so alpha masking falls back to opaque), <radialGradient> (falls back to the first-stop color), and CSS-in-SVG (style rules inside the SVG document; the transcoder reads presentation attributes and the inline style="..." attribute, but a <style> element with selectors is not parsed). If you hit an SVG that does not transcode the way you expect, please open an issue at github.com/codenameone/CodenameOne/issues and attach the source file. The fastest way to extend the coverage is for us to run the failing case through the test fixtures and watch the output. 
Every SVG we ship test goldens for started as somebody else's "this doesn't render right" report. Caveat on iOS: The transcoded SVGs use the framework's shape API (fillShape, drawShape, LinearGradientPaint). The full surface is implemented on the Metal renderer. The deprecated GL ES 2 pipeline does not have parity on every operation, so an SVG drawn under ios.metal=false will often render with visible artifacts (missing gradients, clipped fills, distorted paths) rather than the placeholder you might expect. Now that Metal is the default for new iOS builds as of last Friday, this is a non-issue on most apps; if you have explicitly pinned ios.metal=false, expect some visual regressions on SVG content and let us know which. The coverage matrix and troubleshooting are in the SVG Transcoder in the developer guide. Lottie at Build Time The same pipeline carries Lottie. Drop a Bodymovin export into the same src/main/css/: JSON src/main/css/ theme.css pulse.json spinner.json After the next build, both are real Image instances on every platform that exposes the shape API. The same vector-everywhere story as SVG: a Lottie animation renders crisply at any size and slots into any Image slot in the framework. Java Image pulse = Resources.getGlobalResources().getImage("pulse"); Image spinner = Resources.getGlobalResources().getImage("spinner"); Animation runs against wall-clock time on every paint, with no Timer and no allocation in the hot path. A capture of the hellocodenameone Lottie fixture in motion: The Lottie transcoder lives in maven/lottie-transcoder/. It parses Bodymovin JSON with no external dependencies (the framework's built-in JSON parser carries the load) and lowers each file into the same SVGDocument model the SVG path uses. The same JavaCodeGenerator emits the same GeneratedSVGImage subclass, and the same SVGRegistry registers it under the source filename. 
No new Image base class, no new registry, no per-port wiring, since the SVG path's JavaSE reflective load and iOS / Android Stub weaving already cover the new format. Coverage in v1: shape layers (rc / el / sh) with solid fills and strokes; layer transforms (anchor, position, scale, rotation, opacity); animated rotation, position, and scale collapsed to a two-keyframe loop; solid-color layers as filled rects. Most icon-grade Bodymovin exports lower cleanly. Complex character animations from After Effects with image references, masks, and effects do not, and the transcoder logs which layers it dropped so the source of any blank output is obvious. Same ask as for SVG: if a Lottie / Bodymovin file does not transcode the way you expect, please open an issue at github.com/codenameone/CodenameOne/issues and attach the source .json. The transcoder grows one shape family at a time from the cases the community reports. The same iOS caveat applies: the renderer leans on the shape API, so the deprecated GL ES 2 pipeline shows artifacts on the more elaborate Lottie animations. Use the Metal default (now on by default for new iOS builds). Deep Links and Routing Two pieces of plumbing for apps that handle URLs from outside themselves (notification taps, marketing links, share targets, Universal Links from Safari and the equivalent App Links from Chrome on Android). Deep Links Codename One has had deep-link support for a long time through Display.setProperty("AppArg", url). The platform plumbing already writes the incoming URL into that property on cold launch, and an app-resume sets it again on warm launch; reading it back from start() works fine for a small number of patterns. Where the AppArg-only approach gets fragile is consistency. 
The cold and warm paths execute different lifecycle code, the value is a flat string with no parsing, and the trickiest case is the one where a user lands in the middle of the app via a link and then continues to interact: their next navigation needs to compose with the entry point, the back-stack needs to make sense as if they had arrived through the usual flow, and "fall off the edge of the app" on back is a common bug. With a hand-rolled AppArg reader it is easy to miss one of these and ship a half-working flow.

This release introduces a typed DeepLink and a single handler that fires for both cold and warm launches:

Java

Display.getInstance().setDeepLinkHandler(link -> {
    // link is a normalised DeepLink: scheme, host, path,
    // segments, query map, fragment. Same shape cold or warm.
    if ("/users".equals(link.path()) && link.segments().size() == 2) {
        showUserDetailForm(link.segments().get(1));
        return true;
    }
    return false;
});

AppArg still works for projects that depend on it, but the new handler is what we recommend going forward. The handler runs on a consistent lifecycle path on both cold and warm starts, and the parsed DeepLink value carries the scheme, host, path segments, query map, and fragment, so app code does not need to roll its own URL parser.

Routing

For projects that handle more than a handful of URL patterns, the second piece is the declarative router in com.codename1.router. We built it on the same build-time codegen pipeline as the ORM and the mappers (the router was actually the first concrete consumer of the new preprocessor), so the two surfaces compose: a deep-link handler that delegates to the router becomes a one-liner. Each form declares its own path with a @Route annotation:

Java

@Route("/")
public class HomeForm extends Form { /* ... */ }

@Route("/users/:id")
public class UserDetailForm extends Form {
    public UserDetailForm(RouteMatch match) {
        String userId = match.param("id");
        // build UI for user `userId`
    }
}

@Route("/about")

Router.navigate("/users/42") resolves the path, instantiates UserDetailForm, and shows it. The deep-link handler now collapses to:

Java

Display.getInstance().setDeepLinkHandler(link -> Router.navigate(link.toString()));

Each form owns its own routing rule. Adding or moving a screen is a one-class change. The "what screens does this app have, and at what paths?" question is answered by an IDE search for @Route, not by reading every form constructor in the project.

For Spring developers, the shape is familiar by design. @Route plays the same role as Spring MVC's @RequestMapping: a class-level declaration that announces "this controller handles URLs of this shape". The :id parameter syntax mirrors Spring's {id} path-variable syntax; RouteMatch.param("id") is the same kind of accessor as Spring's @PathVariable. The mental model carries over from server-side Java with almost no friction. The same recognition is available to anyone with React Router, Vue Router, or Angular Router experience; the :param convention is the cross-framework default.

The build-time processor validates that each annotated class extends Form, that the path starts with /, that the constructor is accessible, and that there are no duplicate patterns. Any rule violation fails the build with a class name and a reason, not at runtime with a stack trace.
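To make the :param convention concrete, here is a minimal plain-Java sketch of matching a path such as /users/42 against a pattern such as /users/:id. It is a hypothetical illustration, not the code the router's build-time processor generates; RoutePatternSketch and match are made-up names, and a real router also has to deal with query strings, encoding, and pattern precedence.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch of segment-by-segment pattern matching; not the framework's router.
final class RoutePatternSketch {
    /**
     * Matches a concrete path against a pattern like "/users/:id".
     * Returns the captured parameters, or empty when the path does not match.
     */
    static Optional<Map<String, String>> match(String pattern, String path) {
        String[] want = segments(pattern);
        String[] got = segments(path);
        if (want.length != got.length) {
            return Optional.empty();
        }
        Map<String, String> params = new LinkedHashMap<>();
        for (int i = 0; i < want.length; i++) {
            if (want[i].startsWith(":")) {
                // A ":name" segment captures whatever appears in that position.
                params.put(want[i].substring(1), got[i]);
            } else if (!want[i].equals(got[i])) {
                return Optional.empty(); // literal segment mismatch
            }
        }
        return Optional.of(params);
    }

    /** Splits a path into segments, ignoring leading and trailing slashes. */
    private static String[] segments(String s) {
        String trimmed = s.replaceAll("^/+|/+$", "");
        return trimmed.isEmpty() ? new String[0] : trimmed.split("/");
    }
}
```

Matching "/users/:id" against "/users/42" yields a map with id mapped to "42", which is the kind of value an accessor like RouteMatch.param("id") would hand to the form constructor.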
The rest of the router surface covers the kind of thing that has become table stakes in modern client routing:
  • Route guards run before navigation completes and can cancel or redirect.
  • Per-tab navigation stacks via TabsForm, where each tab keeps its own back stack.
  • Location listeners so anything in the app can subscribe to "the route changed".
  • Form.setPopGuard(PopGuard) intercepts hardware back, toolbar back, or Router.pop() with a chance to ask "are you sure?".
  • Sheet.showForResult() returns an AsyncResource<T> that auto-cancels with null if the user dismisses the sheet.

The API is opt-in. Apps that prefer the existing Form.show() / Form.showBack() flow keep using that; nothing changes.

For the link-publishing side, an AasaBuilder emits the iOS apple-app-site-association JSON and an AssetLinksBuilder emits the Android assetlinks.json. The full setup walk-through (entitlements, the Android intent-filter, the .well-known/ upload on your origin server) is at Routing and Deep Links in the developer guide.

The JavaScript port bridges the router into window.history so navigating the in-app router pushes a real entry into the browser's session history. Back and forward in the browser drive the router; reloading the page lands at the deep-link URL; sharing the URL out of the address bar takes a colleague to the same in-app location.

How It Works: The Build-Time Codegen Pipeline

Everything above sits on a single Maven-plugin pass. The plugin has an AnnotationProcessor SPI and two new Mojos: cn1:generate-annotation-stubs (in generate-sources) and cn1:process-annotations (in process-classes). The orchestrator ASM-scans target/classes, dispatches to every registered processor, validates the annotated classes, and emits a typed runtime artifact next to each one plus a tiny Index class that registers everything with a public runtime registry. Adding a new processor later is a matter of dropping it into META-INF/services with no orchestrator changes.
The reason this runs against bytecode rather than against source text is that the source-regex prototype was scrapped early. The bytecode pass sees the JVM's view of the project (extends Form is a thing the JVM actually knows, not a pattern we have to hope the user wrote a specific way), rule violations come back with class names and reasons, and the build fails fast before any generated .class lands on disk. The infrastructure shares the ASM passes that the BytecodeComplianceMojo's existing String rewrites already use. A small stub source is emitted under target/generated-sources/cn1-annotations/ during generate-sources so application code that references the generated registry resolves at compile time. The real .class overwrites the stub later in process-classes. Standard "compile against a stub, link against the real thing" pattern; it just works inside a single Maven build instead of needing a multi-module split. cn1-core ships a no-op stub of each generated index (RoutesIndex, MappersIndex, BindersIndex, DaosIndex), so application code compiles even when the project has no annotated classes. The build-time processor shadows each stub with the real implementation before packaging. The SVG and Lottie transcoders sit on a parallel pipeline (declarative graphics files in place of annotations), but they emit the same shape of code and obey the same constraints. The practical effect is that the kind of code that historically required reflection at runtime (with all the obfuscation hazards and surprise allocations that come with that) now happens once at build time and produces direct, dead-code-eliminable, rename-safe symbol references. Wrapping Up That closes this release's post series. We already have some pretty big features lined up for this Friday's release post; the headline pieces are the most substantial things to land in months and are worth checking back for. Back to the weekly index.

By Shai Almog DZone Core CORE
Why Infrastructure Efficiency Is Becoming the New Cloud Profitability Metric

Infrastructure efficiency is rapidly becoming one of the most important factors determining profitability for cloud providers, managed service providers, and SaaS companies. For years, infrastructure growth followed a simple formula: add more servers, more storage, and more capacity whenever demand increased. That model worked when hardware prices consistently declined, and inefficiencies could be absorbed through growth.

Those conditions no longer exist. Today, providers face rising costs for memory, enterprise SSDs, GPUs, power, cooling, and colocation, while customers continue to expect lower pricing, better performance, stronger SLAs, and faster service delivery.

Several industry shifts have fundamentally changed infrastructure economics. Changes in virtualization licensing models have increased costs for many organizations. AI adoption has driven demand for GPUs, high-capacity memory, and high-performance storage. Power and colocation costs continue to rise globally, while sovereign cloud initiatives are creating demand for regional infrastructure that must compete economically with hyperscale cloud providers. The challenge is clear: infrastructure costs are rising faster than revenue.

What Does a Workload Really Cost?

Infrastructure efficiency ultimately comes down to a simple question: what does it cost to deliver a workload? Customers do not buy servers, storage systems, or software licenses. They buy virtual machines, Kubernetes clusters, databases, AI environments, SaaS applications, and business services. The true cost of delivering those workloads includes much more than infrastructure hardware:
  • Software licensing
  • Power and cooling
  • Colocation
  • Network connectivity
  • Storage
  • Capacity buffers
  • Staffing and operations
  • Support and SLA commitments

The providers that achieve the lowest cost per workload while maintaining performance and service quality gain a significant competitive advantage.
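As a rough worked example of the question above, cost per workload can be computed by summing every monthly cost component and dividing by the number of workloads delivered. The sketch below is only an illustration: WorkloadCost and costPerWorkload are hypothetical names, and the figures in the usage note are invented for the arithmetic, not data from any provider.

```java
import java.util.Map;

// Hypothetical illustration of the cost-per-workload arithmetic; all inputs are assumptions.
final class WorkloadCost {
    /**
     * Divides the total monthly cost across all components (hardware, licensing,
     * power, staffing, and so on) by the number of workloads delivered in that month.
     */
    static double costPerWorkload(Map<String, Double> monthlyCosts, int workloadsDelivered) {
        if (workloadsDelivered <= 0) {
            throw new IllegalArgumentException("workloadsDelivered must be positive");
        }
        double total = 0;
        for (double cost : monthlyCosts.values()) {
            total += cost;
        }
        return total / workloadsDelivered;
    }
}
```

With invented monthly figures of 40,000 for hardware, 15,000 for licensing, 10,000 for power and cooling, and 35,000 for staffing, spread over 500 workloads, the result is 200 per workload. Looking at hardware alone would have suggested 80, which is why the non-hardware components matter.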
As infrastructure costs continue to increase, "cost per workload delivered" is becoming a useful framework for evaluating efficiency. Unlike traditional metrics focused solely on hardware utilization or licensing costs, this approach considers the complete economics of delivering customer-facing services.

Beyond Infrastructure Utilization

Infrastructure efficiency is not measured only by CPU, memory, or storage utilization. Operational metrics often have an equally significant impact on the cost of delivering workloads. Examples include administrator-to-server ratio, administrator-to-VM ratio, workload deployment times, incident resolution times, and the number of infrastructure platforms that must be maintained.

Cost alone is also a misleading metric. A workload delivered at lower cost may also deliver lower performance, higher contention, or slower support response times. A virtual machine with two vCPUs does not necessarily provide the same amount of usable compute across platforms. CPU oversubscription ratios, noisy-neighbor effects, storage latency, network performance, and support commitments all influence the actual customer experience. The relevant metric is not simply cost per workload, but cost per workload delivered at a defined SLA.

Architectural Choices and Efficiency

Infrastructure architecture plays a major role in determining workload economics. Traditional infrastructure environments often combine separate virtualization, storage, networking, monitoring, backup, and orchestration platforms. While this approach offers flexibility, it can also increase operational complexity, encourage overprovisioning, and create management overhead.

As a result, many organizations are moving toward more integrated infrastructure models, including hyperconverged infrastructure (HCI) and software-defined platforms that consolidate multiple functions into a unified operational framework. The goal is not merely consolidation.
The real objective is to reduce operational overhead, improve resource utilization, simplify scaling, and lower long-term total cost of ownership.

This becomes particularly important for sovereign cloud initiatives. Unlike hyperscalers that benefit from massive global scale, regional cloud providers often need to achieve competitive economics within a specific country or market while maintaining local data residency, compliance, and operational control. In these environments, maximizing infrastructure efficiency is often critical to long-term profitability.

Infrastructure Efficiency Metrics Worth Tracking

Organizations evaluating infrastructure efficiency should look beyond traditional utilization metrics and monitor indicators that directly affect workload economics, including:

  • Cost per virtual machine
  • Cost per container
  • Cost per Kubernetes cluster
  • Cost per AI workload
  • Storage efficiency ratios
  • Power consumption per workload
  • Administrator-to-server ratio
  • Workload deployment times
  • Mean time to resolution (MTTR)
  • Resource utilization across compute and storage environments

These metrics provide a more accurate view of infrastructure performance than hardware utilization alone.

Why AI Changes the Equation

The emergence of AI workloads has made infrastructure efficiency even more important. GPU resources are expensive, but GPUs alone do not determine the economics of AI infrastructure. Storage performance, networking efficiency, workload orchestration, and operational processes all directly impact GPU utilization and overall service profitability. In many environments, the challenge is no longer acquiring GPUs. It is ensuring that the surrounding infrastructure can keep them fully utilized.

As GPU, storage, and power costs continue to rise, organizations are increasingly focused on maximizing the value extracted from every infrastructure resource.
AI infrastructure economics are becoming less about acquiring the largest amount of hardware and more about achieving the highest utilization and operational efficiency from existing investments.

Measuring Infrastructure Economics

One of the challenges with infrastructure efficiency is that it often remains invisible until it is measured. Many organizations focus on software licensing when evaluating infrastructure costs, but licensing is only one part of the equation. Utilization rates, storage efficiency, operational overhead, power consumption, hardware refresh cycles, staffing requirements, and SLA commitments often have a much greater impact on long-term economics. This is why Total Cost of Ownership (TCO) modeling is becoming increasingly important. Effective infrastructure evaluations should account for:

  • Software costs
  • Hardware acquisition
  • Energy consumption
  • Colocation expenses
  • Storage efficiency
  • Staffing requirements
  • Operational complexity
  • Support and maintenance costs

Organizations that perform these broader analyses often discover that the greatest opportunities for savings come not from individual licensing decisions but from improving overall workload economics.

Conclusion

The next phase of cloud infrastructure optimization is unlikely to be driven by capacity growth alone. As infrastructure costs continue to rise and customer expectations continue to increase, providers must focus on delivering more workloads with fewer resources while maintaining performance and service quality. In that environment, infrastructure efficiency becomes more than a technical objective. It becomes a business metric. The organizations that can achieve the lowest cost per workload delivered at a defined service level will be best positioned to protect margins, remain competitive, and build sustainable cloud and AI services for the future.

By Tetiana Fydorenchyk
Intelligent Matching and Semantic Search for Marketplace Applications Using OpenAI and .NET

Marketplace platforms are fundamentally matching systems. Whether the platform connects:

  • Students and tutors
  • Freelancers and clients
  • Buyers and sellers
  • Consultants and companies

the overall user experience usually depends on how accurately the platform can connect relevant people.

At early stages, traditional search systems are often enough. Basic SQL filtering, category-based navigation, and keyword matching can solve many initial requirements without major issues. The situation changes once the platform grows and users begin writing longer, intent-based queries instead of simple keywords. For example, a user may search for:

“online calculus tutor for engineering preparation”

while a marketplace listing may contain:

“advanced mathematics mentor for university students”

Even though both sides are highly relevant, a traditional keyword-based search engine may completely fail to connect them because the wording is different. This is one of the areas where semantic search architectures become extremely valuable.

Why Traditional Marketplace Search Starts Breaking Down

Most marketplace platforms initially rely on:

  • SQL LIKE queries
  • Full-text search
  • BM25 ranking
  • Tag filtering

These approaches work reasonably well when search queries are short and predictable. However, real users rarely search using the exact same terminology as listing owners. A few common examples:

| User Query      | Marketplace Listing      |
|-----------------|--------------------------|
| math tutor      | calculus mentor          |
| IELTS coach     | English speaking trainer |
| React mentor    | frontend architect       |
| startup advisor | business consultant      |

The issue here is not syntax. It is semantic meaning. Traditional search engines are effective at matching identical words, but much weaker at understanding contextual similarity between different phrases.

Intent Matters More Than Exact Keywords

One thing that becomes visible fairly quickly in marketplace applications is that users tend to search with intent instead of isolated keywords.
For example:

“senior React mentor for interview preparation”

The user is probably not simply searching for “React.” The actual intent may include:

  • Mentorship
  • Senior-level expertise
  • Interview coaching
  • Frontend architecture experience

Traditional keyword search systems struggle to interpret these relationships properly. Even when partially relevant results appear, ranking quality often becomes inconsistent as queries become longer and more contextual.

Semantic Search Approaches the Problem Differently

Semantic search systems do not treat text as isolated keywords. Instead, text is converted into vector representations called embeddings. These embeddings represent contextual meaning rather than exact wording. As a result, the following phrases can become mathematically close to each other even when they do not contain identical words:

  • “Math tutor”
  • “Calculus mentor”
  • “Engineering mathematics coach”

This allows marketplace applications to perform much more flexible matching.

A Typical Semantic Search Architecture

```text
User Query
    ↓
Embedding Generation
    ↓
Vector Search
    ↓
Similarity Scoring
    ↓
Hybrid Ranking
    ↓
Marketplace Results
```

The important detail here is that the following are converted into embeddings before similarity calculations happen:

  • Marketplace listings
  • User queries

.NET Technologies Used in the Architecture

A typical .NET-based semantic search stack may include the following components:

| Area            | Technology                   |
|-----------------|------------------------------|
| API Layer       | ASP.NET Core                 |
| Background Jobs | Hosted Services / Quartz.NET |
| Queue System    | RabbitMQ / Azure Service Bus |
| Database        | SQL Server / PostgreSQL      |
| Vector Storage  | pgvector / Pinecone          |
| Cache           | Redis                        |
| Logging         | Serilog                      |
| Monitoring      | OpenTelemetry                |
| AI Integration  | OpenAI API                   |

One thing that becomes obvious during implementation is that the OpenAI API itself is usually only a small part of the overall system.
The larger engineering effort often involves:

  • Indexing
  • Ranking
  • Caching
  • Asynchronous processing
  • Operational monitoring
  • Retry handling

Marketplace Listing Model

A simplified marketplace listing model may look like this:

```csharp
public class MarketplaceListing
{
    public long Id { get; set; }
    public string Title { get; set; } = string.Empty;
    public string Description { get; set; } = string.Empty;
    public string CategoryName { get; set; } = string.Empty;
    public string Location { get; set; } = string.Empty;
    public bool IsActive { get; set; }
    public bool IsDeleted { get; set; }
    public DateTime CreatedOn { get; set; }
    public DateTime? LastIndexedOn { get; set; }

    // Text that is sent to the embedding model for this listing.
    public string SearchText =>
        $"{Title} {Description} {CategoryName} {Location}";
}
```

The SearchText property combines multiple searchable fields into a single semantic context before embedding generation.

Generating Embeddings With OpenAI

A simplified embedding service implementation in .NET may look like this:

```csharp
using System.Linq;
using System.Net.Http.Headers;
using System.Text;
using System.Text.Json;
using Microsoft.Extensions.Configuration;

public class OpenAIEmbeddingService : IEmbeddingService
{
    private readonly HttpClient _httpClient;
    private readonly IConfiguration _configuration;

    public OpenAIEmbeddingService(
        HttpClient httpClient,
        IConfiguration configuration)
    {
        _httpClient = httpClient;
        _configuration = configuration;
    }

    public async Task<float[]> GenerateEmbeddingAsync(
        string input,
        CancellationToken cancellationToken = default)
    {
        var apiKey = _configuration["OpenAI:ApiKey"];

        using var request = new HttpRequestMessage(
            HttpMethod.Post,
            "https://api.openai.com/v1/embeddings");
        request.Headers.Authorization =
            new AuthenticationHeaderValue("Bearer", apiKey);

        var body = new
        {
            model = "text-embedding-3-small",
            input = input
        };
        request.Content = new StringContent(
            JsonSerializer.Serialize(body),
            Encoding.UTF8,
            "application/json");

        using var response =
            await _httpClient.SendAsync(request, cancellationToken);
        response.EnsureSuccessStatusCode();

        var json = await response.Content.ReadAsStringAsync(cancellationToken);
        using var document = JsonDocument.Parse(json);

        // The response holds one entry in "data" per input string;
        // a single input was sent, so only the first entry is read.
        return document
            .RootElement
            .GetProperty("data")[0]
            .GetProperty("embedding")
            .EnumerateArray()
            .Select(x => x.GetSingle())
            .ToArray();
    }
}
```

Dependency Injection

```csharp
builder.Services.AddHttpClient<IEmbeddingService, OpenAIEmbeddingService>();
builder.Services.AddScoped<ISemanticSearchService, SemanticSearchService>();
```

Background Indexing

One issue that appears very quickly in production systems is latency. Generating embeddings synchronously during listing creation or updates may slow down the request lifecycle significantly. Because of this, many systems move embedding generation into:

  • Background workers
  • Queues
  • Asynchronous indexing pipelines

A simple hosted worker example:

```csharp
public class ListingEmbeddingWorker : BackgroundService
{
    private readonly IServiceProvider _serviceProvider;
    private readonly ILogger<ListingEmbeddingWorker> _logger;

    public ListingEmbeddingWorker(
        IServiceProvider serviceProvider,
        ILogger<ListingEmbeddingWorker> logger)
    {
        _serviceProvider = serviceProvider;
        _logger = logger;
    }

    protected override async Task ExecuteAsync(
        CancellationToken stoppingToken)
    {
        while (!stoppingToken.IsCancellationRequested)
        {
            try
            {
                // A new scope per pass so scoped services (for example a
                // DbContext) are not held for the lifetime of the worker.
                using var scope = _serviceProvider.CreateScope();
                var service = scope.ServiceProvider
                    .GetRequiredService<IListingEmbeddingService>();

                await service.IndexPendingListingsAsync(stoppingToken);
            }
            catch (OperationCanceledException)
                when (stoppingToken.IsCancellationRequested)
            {
                // Shutdown was requested; this is not a failure.
                break;
            }
            catch (Exception ex)
            {
                _logger.LogError(
                    ex,
                    "Listing embedding worker failed.");
            }

            await Task.Delay(
                TimeSpan.FromMinutes(5),
                stoppingToken);
        }
    }
}
```

Vector Similarity

Once embeddings are generated, similarity calculations can be performed.
The most common approach is cosine similarity:

$$\cos(\theta)=\frac{A\cdot B}{\|A\|\,\|B\|}$$

A simplified helper implementation may look like this:

```csharp
public static class VectorSimilarityHelper
{
    public static double CosineSimilarity(
        float[] vectorA,
        float[] vectorB)
    {
        if (vectorA.Length != vectorB.Length)
        {
            throw new ArgumentException(
                "Vectors must have the same number of dimensions.");
        }

        double dotProduct = 0;
        double magnitudeA = 0;
        double magnitudeB = 0;

        for (int i = 0; i < vectorA.Length; i++)
        {
            // Accumulate in double to limit rounding error on long vectors.
            dotProduct += (double)vectorA[i] * vectorB[i];
            magnitudeA += (double)vectorA[i] * vectorA[i];
            magnitudeB += (double)vectorB[i] * vectorB[i];
        }

        // A zero vector has no direction; treat it as "not similar"
        // instead of dividing by zero and returning NaN.
        if (magnitudeA == 0 || magnitudeB == 0)
        {
            return 0;
        }

        return dotProduct /
            (Math.Sqrt(magnitudeA) * Math.Sqrt(magnitudeB));
    }
}
```

Semantic Similarity Alone Is Usually Not Enough

One thing that became obvious during testing was that semantic similarity alone sometimes produced weak ranking behavior. For example, the following could still receive high semantic scores:

  • Inactive listings
  • Outdated profiles
  • Low-quality marketplace entries

Because of this, most production marketplace systems eventually move toward hybrid ranking models. A simplified ranking formula may look like this:

$$\text{FinalScore} = 0.45\,\text{SemanticScore} + 0.25\,\text{KeywordScore} + 0.15\,\text{PopularityScore} + 0.10\,\text{FreshnessScore} + 0.05\,\text{ConversionScore}$$

This combines:

  • Semantic relevance
  • Keyword relevance
  • Popularity
  • Freshness
  • Conversion metrics

There is usually no perfect ranking formula. In practice, ranking becomes an ongoing optimization problem.
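The ranking formula above can be sketched in a few lines of C#. This is a minimal illustration, not the scoring code of any particular platform: the `RankingSignals` record, the `HybridRanker` class, and the linear `Freshness` decay with its 180-day window are all hypothetical names and choices introduced here, and each input score is assumed to be normalized to the range 0 to 1 before it is combined.

```csharp
using System;

// Hypothetical container for the five signals named in the formula.
// Each score is assumed to be normalized to [0, 1] by the caller.
public record RankingSignals(
    double SemanticScore,
    double KeywordScore,
    double PopularityScore,
    double FreshnessScore,
    double ConversionScore);

public static class HybridRanker
{
    // Weights copied from the formula in the text; they sum to 1.0.
    private const double SemanticWeight = 0.45;
    private const double KeywordWeight = 0.25;
    private const double PopularityWeight = 0.15;
    private const double FreshnessWeight = 0.10;
    private const double ConversionWeight = 0.05;

    // Weighted sum of the five signals. Inputs are clamped so that one
    // badly scaled signal cannot dominate the final score.
    public static double FinalScore(RankingSignals s) =>
        SemanticWeight * Clamp01(s.SemanticScore) +
        KeywordWeight * Clamp01(s.KeywordScore) +
        PopularityWeight * Clamp01(s.PopularityScore) +
        FreshnessWeight * Clamp01(s.FreshnessScore) +
        ConversionWeight * Clamp01(s.ConversionScore);

    // Illustrative freshness signal (an assumption, not from the article):
    // linear decay from 1.0 for a listing updated now to 0.0 at maxAgeDays.
    public static double Freshness(
        DateTime lastUpdatedUtc,
        DateTime nowUtc,
        double maxAgeDays = 180)
    {
        var ageDays = (nowUtc - lastUpdatedUtc).TotalDays;
        return Clamp01(1.0 - ageDays / maxAgeDays);
    }

    private static double Clamp01(double value) =>
        Math.Clamp(value, 0.0, 1.0);
}
```

Keeping the weights as named constants in one place makes them easy to adjust as ranking is tuned, which fits the point that the formula is a starting position and not a fixed answer.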
Example Semantic Search Service

```csharp
using System.Text.Json;
using Microsoft.EntityFrameworkCore;

public class SemanticSearchService : ISemanticSearchService
{
    private readonly AppDbContext _dbContext;
    private readonly IEmbeddingService _embeddingService;

    public SemanticSearchService(
        AppDbContext dbContext,
        IEmbeddingService embeddingService)
    {
        _dbContext = dbContext;
        _embeddingService = embeddingService;
    }

    public async Task<List<SearchResultDto>> SearchAsync(
        string query,
        CancellationToken cancellationToken)
    {
        var queryVector =
            await _embeddingService.GenerateEmbeddingAsync(
                query,
                cancellationToken);

        // Loads every stored embedding into memory and scores it in
        // process. Simple, but only workable for small catalogs.
        var listings = await _dbContext.ListingEmbeddings
            .Include(x => x.Listing)
            .ToListAsync(cancellationToken);

        var results = listings
            .Select(x => new
            {
                Entry = x,
                Vector = JsonSerializer.Deserialize<float[]>(x.VectorJson)
            })
            // Skip rows whose stored vector is missing or was produced
            // with a different number of dimensions.
            .Where(x => x.Vector is not null
                && x.Vector.Length == queryVector.Length)
            .Select(x => new SearchResultDto
            {
                ListingId = x.Entry.ListingId,
                Title = x.Entry.Listing.Title,
                SimilarityScore =
                    VectorSimilarityHelper.CosineSimilarity(
                        queryVector,
                        x.Vector!)
            })
            .OrderByDescending(x => x.SimilarityScore)
            .Take(50)
            .ToList();

        return results;
    }
}
```

Production Challenges

One thing that often gets underestimated in semantic search discussions is operational complexity. The AI layer itself is usually easier than the surrounding production engineering. A few examples include:

  • Embedding costs
  • Queue management
  • Indexing latency
  • Retry handling
  • Stale embeddings
  • Cache invalidation
  • Multilingual relevance
  • Ranking quality optimization

For example, if each of the following triggers embedding generation, API costs can grow much faster than expected:

  • Listing update
  • Profile edit
  • Search query

Caching and embedding reuse become important fairly early in the process.

Final Thoughts

Semantic search is not really about replacing traditional search entirely.
In most production systems, the better approach is usually to combine these into a layered ranking architecture:

  • Semantic relevance
  • Keyword matching
  • Behavioral scoring
  • Freshness
  • Business metrics

OpenAI embeddings and .NET provide a practical foundation for building these types of marketplace systems, especially for platforms where relevance quality directly affects user experience and conversion rates. One interesting observation after introducing semantic matching is that users generally spend less time trying to “guess the correct keywords.” The platform becomes significantly better at understanding what users actually mean instead of simply matching individual words.
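The caching and embedding reuse mentioned under Production Challenges can be sketched as a thin wrapper around whatever generates embeddings. This is a minimal in-memory illustration under stated assumptions: `CachedEmbeddingProvider` and `ComputeKey` are hypothetical names, the generator is passed in as a delegate so the sketch does not depend on a specific service, and a production system would more likely use a shared cache such as Redis with an expiry policy.

```csharp
using System;
using System.Collections.Concurrent;
using System.Security.Cryptography;
using System.Text;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical wrapper that reuses embeddings for text it has seen before.
public sealed class CachedEmbeddingProvider
{
    private readonly Func<string, CancellationToken, Task<float[]>> _generate;
    private readonly ConcurrentDictionary<string, float[]> _cache = new();

    public CachedEmbeddingProvider(
        Func<string, CancellationToken, Task<float[]>> generate)
    {
        _generate = generate;
    }

    public async Task<float[]> GetAsync(
        string input,
        CancellationToken cancellationToken = default)
    {
        var key = ComputeKey(input);

        if (_cache.TryGetValue(key, out var cached))
        {
            return cached;
        }

        // Two concurrent callers with the same text may both reach this
        // point and both call the generator; the sketch accepts that cost.
        var vector = await _generate(input, cancellationToken);
        _cache[key] = vector;
        return vector;
    }

    // Normalizes casing and surrounding whitespace, then hashes the text so
    // the cache key has a fixed size regardless of the input length.
    public static string ComputeKey(string input)
    {
        var normalized = input.Trim().ToLowerInvariant();
        var hash = SHA256.HashData(Encoding.UTF8.GetBytes(normalized));
        return Convert.ToHexString(hash);
    }
}
```

The same key can be stored next to a listing's embedding, so an edit that leaves the searchable text unchanged does not trigger a new API call.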

By Omer Yilmaz

Culture and Methodologies

Agile

Career Development

Methodologies

Team Management

A Practical Guide to Temporal Workflow Design Patterns

June 18, 2026 by Akhil Madineni

WebSocket Debugging Without a Proxy — A Browser-First Workflow

June 17, 2026 by Dan Pan

Cutting Data Pipeline Costs and Data Freshness Issues With Netflix Maestro and Apache Iceberg: A Practical Tutorial

June 16, 2026 by Intiaz Shaik

Data Engineering

AI/ML

Big Data

Databases

IoT

GenAI Isn't Solving the Problem Most Development Teams Actually Have

June 19, 2026 by Gaurav Gaur DZone Core CORE

Automating Power Automate: How to Ensure Cloud Flows Are Active After Every Pipeline Deployment

June 19, 2026 by karthik nallani chakravartula

Testing Strategies for Web Development Code Generated by LLMs

June 19, 2026 by Sandesh Basrur

Software Design and Architecture

Cloud Architecture

Integration

Microservices

Performance

Automating Power Automate: How to Ensure Cloud Flows Are Active After Every Pipeline Deployment

June 19, 2026 by karthik nallani chakravartula

Testing Strategies for Web Development Code Generated by LLMs

June 19, 2026 by Sandesh Basrur

Your AI Coding Agent Can't Steal What It Never Had: The Docker Sandbox Isolation Story

June 19, 2026 by Shamsher Khan DZone Core CORE

Coding

Frameworks

Java

JavaScript

Languages

Tools

From Open SQL to CDS Views: Rewriting SAP Data Access for Performance at Scale

June 19, 2026 by Deepika Paturu

Jakarta NoSQL: Why JPA Is Not Enough for the AI Era

June 19, 2026 by Otavio Santana DZone Core CORE

Your AI Coding Agent Can't Steal What It Never Had: The Docker Sandbox Isolation Story

June 19, 2026 by Shamsher Khan DZone Core CORE

Testing, Deployment, and Maintenance

Deployment

DevOps and CI/CD

Maintenance

Monitoring and Observability

Automating Power Automate: How to Ensure Cloud Flows Are Active After Every Pipeline Deployment

June 19, 2026 by karthik nallani chakravartula

Testing Strategies for Web Development Code Generated by LLMs

June 19, 2026 by Sandesh Basrur

Your AI Coding Agent Can't Steal What It Never Had: The Docker Sandbox Isolation Story

June 19, 2026 by Shamsher Khan DZone Core CORE

Popular

AI/ML

Java

JavaScript

Open Source

GenAI Isn't Solving the Problem Most Development Teams Actually Have

June 19, 2026 by Gaurav Gaur DZone Core CORE

Testing Strategies for Web Development Code Generated by LLMs

June 19, 2026 by Sandesh Basrur

The Cross-Lingual RAG Problem Nobody Is Talking About

June 19, 2026 by Janani Annur Thiruvengadam DZone Core CORE
