Programming languages allow us to communicate with computers, and they operate like sets of instructions. There are numerous types of languages, including procedural, functional, object-oriented, and more. Whether you’re looking to learn a new language or trying to find some tips or tricks, the resources in the Languages Zone will give you all the information you need and more.
This post walks through building and running a real-world agentic workflow with Agentican and Quarkus. Specifically, it builds an agentic workflow that automates market research and information sharing:

- Identify the top vendors within a market category.
- Research the positioning and strengths of each vendor.
- Classify the findings as either standard or urgent.
- Draft a brief to share with others in the company.

Prerequisites

- Quarkus
- Java 25
- Maven (or Gradle)
- LLM provider API key

Step 1: Add the Dependency

Create a Quarkus app, and add the Agentican Quarkus runtime module:

XML

    <dependency>
        <groupId>ai.agentican</groupId>
        <artifactId>agentican-quarkus-runtime</artifactId>
        <version>0.1.0-alpha.3</version>
    </dependency>

Step 2: Define Agents, Skills, and the Workflow

Create an `agentican-catalog.yaml` file on the classpath. This is where you describe:

- Who does the work (agents)
- What they need to do it (skills)
- How they will do it (workflows)

YAML

    agents:
      - id: researcher
        name: researcher
        role: |
          Expert at finding accurate, sourced information about companies
          and markets. Quotes sources. Distinguishes opinion from fact.
      - id: writer
        name: writer
        role: |
          Synthesizes research into structured, concise briefs.
          Avoids hedging language. Cites concrete evidence.

    skills:
      - id: web-search
        name: web-search
        instructions: |
          When a question requires external information, call the search
          tool first. Quote sources in your answer.

Update the `agentican-catalog.yaml` file to define the workflow.

YAML

    workflows:
      - id: market-brief
        name: market-brief
        description: Research vendors in a market and produce a structured brief
        outputStep: deliver
        params:
          - name: topic
            description: Market to research
            required: true
          - name: vendor_count
            description: Number of vendors
            defaultValue: "5"
        steps:
          - name: identify
            agent: researcher
            skills: [web-search]
            instructions: |
              Identify the top {{param.vendor_count}} vendors in {{param.topic}}.
              Return a JSON array of vendor names (names only, no commentary).
          - name: deep-dive
            type: loop
            over: identify
            steps:
              - name: analyze
                agent: researcher
                skills: [web-search]
                instructions: |
                  Deep-dive vendor {{item}}: positioning, key strengths, recent news.
                  Quote sources.
          - name: classify
            agent: writer
            instructions: |
              Read the per-vendor deep-dives below. If any vendor has launched a
              competitive feature in the last 30 days, return the single word
              'urgent'. Otherwise return 'standard'.

              Deep-dives:
              {{step.deep-dive.output}}
            dependencies: [deep-dive]
          - name: deliver
            type: branch
            from: classify
            default: standard
            branches:
              - name: urgent
                steps:
                  - name: urgent-brief
                    agent: writer
                    instructions: |
                      Synthesize a vendor brief flagged URGENT for executive review.
                      Lead with the recent competitive moves.
                      Topic: {{param.topic}}
                      Deep-dives: {{step.deep-dive.output}}
              - name: standard
                steps:
                  - name: standard-brief
                    agent: writer
                    instructions: |
                      Synthesize a vendor brief.
                      Topic: {{param.topic}}
                      Deep-dives: {{step.deep-dive.output}}

A few things worth flagging:

- `agent: researcher` references the agent for a step; skills are referenced by name, too.
- `outputStep` designates the step whose output becomes the workflow's typed result.
- `{{param.X}}` interpolates workflow inputs into step instructions.
- `{{step.X.output}}` interpolates an upstream step's output.
- `{{item}}` is the current value inside a loop iteration.
- `type: loop` steps take an `over` reference (a step that produced a list, or a list-typed param).
- `type: loop` steps run their nested steps once per item, in parallel, and on virtual threads.
- `type: branch` steps take a `from` reference (a step whose output is used to select a branch).
- `branches` lists mutually exclusive steps (or sets of steps), with `default` handling unrecognized values.

The framework loads `agentican-catalog.yaml` from the classpath, or you can define where it's loaded from:

Properties files

    agentican.catalog-config=/etc/agentican/agentican-catalog.yaml

Note: Agents, skills, and workflows can be defined via a fluent builder API as well.
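To make the interpolation rules above concrete, here is a minimal sketch of placeholder substitution in plain Java. It is not Agentican's implementation: the `TemplateSketch` class, its `render` method, and the choice to fail on an unknown placeholder are all assumptions made for illustration, and it assumes placeholders are written as `{{name}}`.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative only: a hypothetical renderer, not part of Agentican.
final class TemplateSketch {

    // Matches {{name}}, where name may contain letters, digits, '_', '.', and '-'.
    private static final Pattern PLACEHOLDER =
            Pattern.compile("\\{\\{\\s*([\\w.-]+)\\s*\\}\\}");

    // Replaces every placeholder with its value; unknown placeholders are an error.
    static String render(String template, Map<String, String> values) {
        Matcher matcher = PLACEHOLDER.matcher(template);
        return matcher.replaceAll(match -> {
            String key = match.group(1);
            String value = values.get(key);
            if (value == null) {
                throw new IllegalArgumentException("No value for placeholder: " + key);
            }
            // Escape '$' and '\' so the value is inserted literally.
            return Matcher.quoteReplacement(value);
        });
    }

    public static void main(String[] args) {
        String instructions =
                "Identify the top {{param.vendor_count}} vendors in {{param.topic}}.";
        System.out.println(render(instructions, Map.of(
                "param.vendor_count", "5",
                "param.topic", "data observability platforms")));
        // prints: Identify the top 5 vendors in data observability platforms.
    }
}
```

Failing fast on a missing value is only one possible design; a real engine might instead leave the placeholder untouched or substitute a default.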
Step 3: Configure the Models

Agentican reads the engine configuration from `application.properties`. The minimum is one LLM:

Properties files

    agentican.llm[0].api-key=${ANTHROPIC_API_KEY}

The provider defaults to `anthropic`, and the model defaults to `claude-sonnet-4-5`. Want OpenAI instead?

Properties files

    agentican.llm[0].provider=openai
    agentican.llm[0].api-key=${OPENAI_API_KEY}
    agentican.llm[0].model=gpt-4o-mini

Want to mix and match? Configure `name`s and reference them per-agent in the YAML catalog:

Properties files

    agentican.llm[0].name=default
    agentican.llm[0].api-key=${ANTHROPIC_API_KEY}
    agentican.llm[1].name=efficient
    agentican.llm[1].provider=openai
    agentican.llm[1].api-key=${OPENAI_API_KEY}
    agentican.llm[1].model=gpt-4o-mini

Step 4: Create a Typed Workflow Instance

Define the workflow input and output records:

Java

    import java.util.List;

    public record ResearchParams(String topic, int vendorCount) {}

    public record VendorBrief(String topic, List<Vendor> vendors) {
        public record Vendor(String name, String positioning, List<String> strengths) {}
    }

Then inject the typed workflow, and call it from a REST endpoint:

Java

    import jakarta.inject.Inject;
    import jakarta.ws.rs.POST;
    import jakarta.ws.rs.Path;
    import jakarta.ws.rs.PathParam;
    // plus the Agentican imports for Workflow and @AgenticanWorkflow

    @Path("/market-brief")
    public class VendorBriefResource {

        @Inject
        @AgenticanWorkflow(name = "market-brief")
        Workflow<ResearchParams, VendorBrief> brief;

        @POST
        @Path("/{topic}")
        public VendorBrief generate(@PathParam("topic") String topic) {
            return brief.start(new ResearchParams(topic, 5)).await();
        }
    }

Now, test the endpoint:

Shell

    curl -X POST http://localhost:8080/market-brief/data%20observability%20platforms

A few things worth flagging, since they are what set this apart from a generic "call an LLM" library:

- `ResearchParams.vendorCount` becomes the workflow parameter `vendor_count` via SNAKE_CASE mapping.
- `start()` returns a `WorkflowRun<VendorBrief>`, and `await()` parses the output step's text into a `VendorBrief`.
- `@AgenticanWorkflow(name = "market-brief")` resolves the registered workflow at injection time.
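The SNAKE_CASE mapping mentioned in the list above can be pictured with a small, self-contained sketch. This is not how Agentican necessarily does it: `SnakeCaseSketch`, `toSnakeCase`, and `toParams` are hypothetical names, and the sketch only assumes that each record component is exposed under its snake_case name.

```java
import java.lang.reflect.RecordComponent;
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative only: a hypothetical mapper, not Agentican's code.
final class SnakeCaseSketch {

    // Same shape as the article's input record.
    record ResearchParams(String topic, int vendorCount) {}

    // "vendorCount" -> "vendor_count"
    static String toSnakeCase(String camelCase) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < camelCase.length(); i++) {
            char c = camelCase.charAt(i);
            if (Character.isUpperCase(c) && i > 0) {
                out.append('_');
            }
            out.append(Character.toLowerCase(c));
        }
        return out.toString();
    }

    // Reads each record component through its accessor, in declaration order.
    static Map<String, Object> toParams(Record input) {
        Map<String, Object> params = new LinkedHashMap<>();
        for (RecordComponent component : input.getClass().getRecordComponents()) {
            try {
                Object value = component.getAccessor().invoke(input);
                params.put(toSnakeCase(component.getName()), value);
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException("Cannot read " + component.getName(), e);
            }
        }
        return params;
    }

    public static void main(String[] args) {
        System.out.println(toParams(new ResearchParams("data observability platforms", 5)));
        // prints: {topic=data observability platforms, vendor_count=5}
    }
}
```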
Note: `WorkflowRun` itself exposes `future()` for a `CompletableFuture<R>`, and there's a `ReactiveWorkflow<P, R>` Mutiny variant for Vert.x stacks.

Step 5: Add Agent Tools

Agentican ships two integrations out of the box:

MCP (Model Context Protocol)

There is one config block per server. Tools are auto-discovered:

Properties files

    agentican.mcp[0].slug=github
    agentican.mcp[0].name=GitHub
    agentican.mcp[0].url=https://mcp.github.com/sse
    agentican.mcp[0].headers.Authorization=Bearer ${GITHUB_TOKEN}

Composio

100+ SaaS toolkits, including Slack, Notion, Linear, Salesforce, GitHub, and Google Workspace:

Properties files

    agentican.composio.api-key=${COMPOSIO_API_KEY}
    agentican.composio.user-id=user-123

Tools are referenced by name within agent steps:

YAML

    steps:
      - name: research
        agent: researcher
        tools: [github_search_repositories]
        instructions: "Profile open-source vendors in {{param.topic}}."

Structured agentic workflows for the JVM.

Where to Go Next

- Getting Started: install, configure, and run workflows
- Core Concepts: architecture, terminology, and data flow
- Workflows & Steps: CDI surface, beans, qualifiers, override patterns
- Agents: defining agents, skills, and roles
- Getting Started (Quarkus): dependency setup, config, first task
- CDI Integration: injection, qualifiers, lifecycle events, bean overrides
- REST API: endpoints, SSE streaming, WebSocket, error codes
- Observability: Micrometer metrics, OTel tracing, Prometheus queries
Artificial intelligence is rapidly transforming software development. Many developers now use AI-powered tools to generate code, but the next advancement is integrating AI directly into applications. Modern systems increasingly use large language models (LLMs) to answer questions, automate workflows, summarize information, and enhance user experiences. Software engineers must therefore combine traditional enterprise development practices with AI capabilities while ensuring reliability, scalability, and maintainability.

This evolution offers Jakarta EE developers a significant opportunity. Jakarta EE provides a mature platform for enterprise applications, with standards for dependency injection, RESTful services, configuration, persistence, and cloud-native development. By integrating Jakarta EE with LangChain4j, developers can access advanced AI models through a straightforward Java API, adding intelligent features without leaving the familiar Jakarta EE environment.

In this article, we will build a simple "Hello World" AI application to demonstrate how easily a large language model can be integrated into a Jakarta EE application using LangChain4j.

Configuring LangChain4j With Jakarta EE Technologies

Before developing your first AI-powered application, it is important to understand LangChain4j’s role in the Java ecosystem and its popularity for AI integration. LangChain4j serves as an orchestration layer between Java applications and AI providers. It simplifies AI integration by offering a consistent programming model, regardless of the underlying vendor.

If you have worked with Spring Data or Jakarta Data, this concept will be familiar. With Spring Data and Jakarta Data, developers define repository interfaces and use annotations to specify behavior. Implementation details are handled by a provider that generates the concrete implementation and manages database communication.
This allows developers to focus on business logic rather than low-level database operations.

LangChain4j uses a similar approach for artificial intelligence. Instead of writing HTTP clients, building JSON payloads, and managing provider-specific APIs, developers define Java interfaces representing AI capabilities. LangChain4j then generates the implementation and manages communication with the chosen AI provider.

LangChain4j can be viewed as the AI equivalent of Jakarta Data or Spring Data, with the AI provider dependency functioning like a JDBC driver. Switching from one AI provider to another, such as from OpenAI to a different provider, usually only requires updating the dependency and configuration, while the application code remains largely unchanged.

While this article uses a Java SE application for simplicity, the same approach applies to Jakarta EE, Spring Boot, Quarkus, Helidon, Micronaut, and other Java platforms.

Project Dependencies

The first step is to create a Maven Quickstart project and add the required dependencies for CDI, Eclipse MicroProfile Config, and LangChain4j:

XML

    <dependency>
        <groupId>io.smallrye.config</groupId>
        <artifactId>smallrye-config-core</artifactId>
        <version>3.17.2</version>
        <scope>compile</scope>
    </dependency>
    <dependency>
        <groupId>io.smallrye.config</groupId>
        <artifactId>smallrye-config</artifactId>
        <version>3.17.2</version>
    </dependency>
    <dependency>
        <groupId>org.jboss.weld.se</groupId>
        <artifactId>weld-se-core</artifactId>
        <version>6.0.4.Final</version>
    </dependency>
    <dependency>
        <groupId>dev.langchain4j.cdi</groupId>
        <artifactId>langchain4j-cdi-portable-ext</artifactId>
        <version>${langchain4j-cdi.version}</version>
    </dependency>
    <dependency>
        <groupId>dev.langchain4j.cdi.mp</groupId>
        <artifactId>langchain4j-cdi-config</artifactId>
        <version>${langchain4j-cdi.version}</version>
    </dependency>
    <dependency>
        <groupId>dev.langchain4j</groupId>
        <artifactId>langchain4j-open-ai</artifactId>
        <version>1.15.0</version>
    </dependency>

This example uses the langchain4j-open-ai dependency, which serves as the provider-specific driver for communicating with OpenAI models. The application code remains independent of the provider implementation.

Configuring the AI Provider

LangChain4j integrates with Eclipse MicroProfile Config, allowing you to externalize all provider settings. Create a microprofile-config.properties file and add the following configuration:

Properties files

    dev.langchain4j.cdi.plugin.chat-model.class=dev.langchain4j.model.openai.OpenAiChatModel
    dev.langchain4j.cdi.plugin.chat-model.config.api-key=<<API_KEY>>
    dev.langchain4j.cdi.plugin.chat-model.config.model-name=gpt-5

This configuration specifies the chat model implementation, the authentication API key, and the model that will process prompts.

A key advantage of this approach is flexibility. If you choose another provider in the future, you typically only need to replace the provider dependency and update the configuration. The application code often remains unchanged, reinforcing the provider dependency’s role as similar to that of a JDBC driver in traditional data access.

For this sample, you can place the API key directly in the configuration file or provide it through environment variables. In production, use environment variables, secret managers, or vault solutions. Never commit API keys to source control, as exposed credentials can lead to unauthorized use, unexpected costs, and security risks.

Your First AI Service

With the project configured, we can now build our first AI-powered service. As is customary in software development, we will begin with a “Hello World” example. Rather than printing a static message, we will send a question to an AI model and display its response.

This example uses the simplest contract: a String as input and a String as output.
Although real-world applications typically use more complex domain objects, starting with plain text helps us focus on the core LangChain4j programming model and understand how to create and use AI services.

The first step is defining an AI service interface:

Java

    import dev.langchain4j.cdi.spi.RegisterAIService;
    import jakarta.enterprise.context.ApplicationScoped;

    @RegisterAIService
    @ApplicationScoped
    public interface AssistantService {

        String chat(String prompt);
    }

This interface does not include an implementation. LangChain4j generates the implementation automatically at runtime. The @RegisterAIService annotation directs LangChain4j to create an AI-backed implementation for this interface. The @ApplicationScoped annotation makes the generated implementation available as a CDI bean, which can be injected or accessed like any other Jakarta EE component.

The method signature defines the AI contract. When the chat method is called, the parameter serves as the prompt for the AI model, and the returned value contains the generated response. In this example, both the request and response are simple strings.

Next, we need a client application to consume this service:

Java

    import jakarta.enterprise.context.control.RequestContextController;
    import jakarta.enterprise.inject.se.SeContainer;

    public class App {

        public static void main(String[] args) {
            try (SeContainer container = jakarta.enterprise.inject.se.SeContainerInitializer
                    .newInstance()
                    .initialize()) {

                RequestContextController requestContextController =
                        container.select(RequestContextController.class).get();
                requestContextController.activate();

                AssistantService assistantService =
                        container.select(AssistantService.class).get();
                String response = assistantService.chat("What is the capital of France?");
                System.out.println("Assistant response: " + response);

                requestContextController.deactivate();
            }
        }
    }

The application starts a CDI container using Weld SE, which provides dependency injection in a Java SE environment.
After initializing the container, we activate the request context and obtain an instance of AssistantService from CDI. Although there is no concrete implementation in the codebase, CDI returns a fully functional service generated by LangChain4j. When the chat method is called, LangChain4j sends the prompt to the configured AI model, waits for the response, and converts the result into a Java String.

Running the application produces an output similar to the following:

Plain Text

    Assistant response: Paris is the capital of France.

The exact wording may vary because large language models are probabilistic systems. Unlike traditional methods that always return the same result for a given input, AI models may produce slightly different responses while maintaining the same meaning.

While using strings is useful for learning the fundamentals, enterprise applications rarely exchange raw text between layers. Business applications typically use structured data, domain objects, commands, and responses to ensure stronger contracts and better maintainability. In the next section, we will enhance this example by replacing raw strings with dedicated input and output classes, enabling LangChain4j to map between Java objects and AI interactions in a more type-safe and expressive manner.

Working With Structured Input and Output

The previous example showed a basic AI interaction: a string input produces a string output. While this illustrates the fundamentals, real-world applications rarely use unstructured text alone. Enterprise systems typically exchange well-defined objects that represent business concepts, making code more expressive, maintainable, and type-safe.

LangChain4j’s key strength is its ability to map Java objects directly to AI interactions. It automatically converts structured input into prompts and transforms AI responses into strongly typed Java objects, eliminating the need for manual serialization and parsing.
Developers can work with domain concepts instead of raw text.

To demonstrate this, we will build a simple book recommendation engine. Given a book title and author, the AI will suggest three books that logically follow in a learning journey.

We begin by defining the input object:

Java

    public record BookRequest(String title, String author) {
    }

This record captures the user’s input. Instead of manually creating a textual prompt, we provide a structured Java object with the book’s title and author.

Next, we define the domain model representing a recommended book:

Java

    import java.util.List;

    public record Book(
            String title,
            String author,
            String description,
            List<String> keywords) {
    }

This record includes more than just the title and author: it also provides a short description and a set of keywords to further characterize the book. Each recommendation then pairs a book with the reason why it was selected:

Java

    public record Recommendation(Book book, String reason) {
    }

Finally, we create a wrapper object that represents the complete response returned by the AI service:

Java

    import java.util.List;

    public record NextReadBooks(List<Recommendation> recommendations) {
    }

At this stage, we have a complete domain model for both the request and the expected response. Next, we define the AI service:

Java

    import dev.langchain4j.cdi.spi.RegisterAIService;
    import dev.langchain4j.service.SystemMessage;
    import jakarta.enterprise.context.ApplicationScoped;

    @ApplicationScoped
    @RegisterAIService
    public interface NextReadBookService {

        @SystemMessage("""
                Recommend up to 3 books that should naturally follow the provided book in a learning journey.

                Recommendations should prioritize:
                - conceptual progression
                - complementary knowledge
                - technical depth
                - thematic similarity

                For each recommendation provide:
                - title
                - author
                - concise description
                - relevant keywords
                - a short recommendation reason

                Keep recommendations concise, technically relevant, and focused on software engineering and architecture learning.
                """)
        NextReadBooks nextReadBooks(BookRequest bookRequest);
    }

This example also introduces the concept of a system message. The @SystemMessage annotation provides instructions that guide the model’s behavior. Unlike user input, which varies with each request, the system message serves as a permanent set of rules for AI responses. Here, we instruct the model to recommend up to three books, explain each recommendation, and return the information using our defined Java records.

The method signature uses only domain objects: BookRequest as input and NextReadBooks as output. There is no need for manual JSON handling, prompt creation, or response parsing, as LangChain4j manages these tasks automatically.

The application code remains straightforward:

Java

    import jakarta.enterprise.context.control.RequestContextController;
    import jakarta.enterprise.inject.se.SeContainer;

    public class BookApp {

        public static void main(String[] args) {
            try (SeContainer container = jakarta.enterprise.inject.se.SeContainerInitializer
                    .newInstance()
                    .initialize()) {

                RequestContextController requestContextController =
                        container.select(RequestContextController.class).get();
                requestContextController.activate();

                var bookService = container.select(NextReadBookService.class).get();
                BookRequest request = new BookRequest(
                        "The Great Gatsby",
                        "F. Scott Fitzgerald");
                var recommendations = bookService.nextReadBooks(request);

                for (var recommendation : recommendations.recommendations()) {
                    System.out.println(
                            "Recommended book: " + recommendation.book().title()
                                    + " by " + recommendation.book().author());
                    System.out.println(
                            "Reason: " + recommendation.reason());
                }

                requestContextController.deactivate();
            }
        }
    }

When executed, LangChain4j converts the BookRequest into a prompt, sends it to the model, validates the response against the target structure, and maps the result back into NextReadBooks. For developers, this interaction is similar to calling a standard Java service.

This approach offers clear advantages over raw string-based interactions. The code is easier to understand, IDE autocompletion enhances productivity, and refactoring is safer because inputs and outputs are explicit domain models. The application can also adapt more easily to new business requirements.

So far, our examples have used explicit user requests and static system instructions. However, modern AI applications often need additional context beyond user input. In the next section, we will explore how to enrich AI interactions with external knowledge and context, enabling the model to produce more accurate and relevant responses aligned with the application’s domain.
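The idea of an interface whose implementation only exists at runtime can be illustrated with the JDK's own `java.lang.reflect.Proxy`. The sketch below is a general illustration of that technique under stated assumptions, not a description of LangChain4j internals: `AiServiceProxySketch`, its `create` method, and the stubbed "model" function are hypothetical, and it only supports methods that take one argument and return a String.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.function.Function;

// Illustrative only: shows runtime implementation of an interface via a
// dynamic proxy. The "model" is a plain function standing in for an LLM call.
final class AiServiceProxySketch {

    // Same contract as the article's first AI service.
    interface AssistantService {
        String chat(String prompt);
    }

    // Builds an implementation of 'type' that forwards the first argument of
    // every interface method to 'model' and returns the model's answer.
    static <T> T create(Class<T> type, Function<String, String> model) {
        InvocationHandler handler = (proxy, method, args) -> {
            if (method.getDeclaringClass() == Object.class) {
                // Minimal handling of toString/hashCode/equals on the proxy itself.
                return switch (method.getName()) {
                    case "toString" -> type.getSimpleName() + " proxy";
                    case "hashCode" -> System.identityHashCode(proxy);
                    default -> proxy == args[0];
                };
            }
            return model.apply(String.valueOf(args[0]));
        };
        return type.cast(Proxy.newProxyInstance(
                type.getClassLoader(), new Class<?>[] {type}, handler));
    }

    public static void main(String[] args) {
        // A stub model: no network call, it just echoes the prompt.
        AssistantService assistant =
                create(AssistantService.class, prompt -> "stub answer to: " + prompt);
        System.out.println(assistant.chat("What is the capital of France?"));
        // prints: stub answer to: What is the capital of France?
    }
}
```

A real AI service layer additionally has to turn arguments into prompts and parse the model's text back into the declared return type; the sketch leaves both out on purpose.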
Java structured concurrency has been under development for a span of 5 years, weaving through 8 (!) distinct JEPs (JEP 428, JEP 437, JEP 453, JEP 462, JEP 480, JEP 499, JEP 505, JEP 525). To me, this feels rather excessive for what could be considered a fairly concise feature. My goal here is to experiment with an alternative approach that leverages Java's tried-and-tested, robust functionality available since JDK 1.5. It's possible this pathway could achieve better outcomes than what is proposed in JEP 505, which, from my perspective, introduces a suite of redundant interfaces and classes that replicate pre-existing ones. No doubt, developers need some governance, even in a relatively safe development environment like Java, with its automatic garbage collection, memory management, and strict typing. No matter how safe the provided path is, developers will still make mistakes, such as dereferencing nulls, using out-of-bound indexes, swallowing exceptions, and who knows what else. And, undoubtedly, concurrency is the hardest thing to get right — it's an endless source of bugs. But first, let me introduce some helper code that we will use throughout the article. 
Java

    // Example Proto
    package net.tascalate.concurrentx;

    import java.util.concurrent.Callable;
    // the later examples also need java.util.*, java.util.concurrent.*,
    // java.util.stream.Stream, java.math.BigInteger, and java.time.LocalDateTime

    public class FuturesDemo {
        static final ScopedValue<String> DEMO_SV = ScopedValue.newInstance();

        // This emulates long-running calls that we need to execute asynchronously:
        // all we do is return the value after the delay,
        // or throw the supplied exception to emulate an error
        private static <T> Callable<T> produceValue(T value, long delay) {
            return () -> {
                var start = System.currentTimeMillis();
                try {
                    System.out.println(">> Waiting value: " + value +
                        " (SCOPED VALUE IS " + DEMO_SV.orElse("<UNBOUND>") + ")");
                    Thread.sleep(delay);
                    System.out.println(">> Producing value: " + value);
                    if (value instanceof Exception) {
                        throw (Exception)value;
                    } else {
                        return value;
                    }
                } finally {
                    var finish = System.currentTimeMillis();
                    System.out.println(">> Exiting " + value + ", " +
                        Thread.currentThread() + ", done in " +
                        (finish - start) + "ms, vs " + delay + "ms specified");
                }
            };
        }

        public static void main(String[] argv) {
            // implementation will be here
        }
    }

According to Oracle, the majority of Java developers tend to approach concurrent execution in the following way (excerpt courtesy of JEP 505, modified to use the helper code from above):

Java

    // Example A - "unstructured concurrency"
    public static void main(String[] argv)
            throws InterruptedException, ExecutionException {
        var executor = Executors.newVirtualThreadPerTaskExecutor();
        var start = System.currentTimeMillis();
        try {
            Future<String> a = executor.submit(
                produceValue("A", 1000));
            Future<LocalDateTime> b = executor.submit(
                produceValue(LocalDateTime.now(), 1500));
            Future<BigInteger> c = executor.submit(
                produceValue(BigInteger.valueOf(42), 500));
            var result = List.of(a.get(), b.get(), c.get());
            System.out.println("*** ALL result: " + result);
        } finally {
            var finish = System.currentTimeMillis();
            System.out.println(
                "*** Exiting main, executed in " + (finish - start) + "ms");
            executor.shutdownNow();
        }
    }

Here, a range of critical problems lurk, several of which are detailed in the "Motivation" section of the JEP.

In contrast to the above example, Oracle proposes the use of its structured concurrency API as a solution that, hypothetically, addresses these concerns:

Java

    // Example B -- structured concurrency
    @SuppressWarnings("preview")
    public static void main(String[] argv)
            throws InterruptedException, ExecutionException {
        var start = System.currentTimeMillis();
        try (var scope = StructuredTaskScope.open(
                StructuredTaskScope.Joiner.allSuccessfulOrThrow())) {
            var a = scope.fork(produceValue("A", 1000));
            var b = scope.fork(produceValue(LocalDateTime.now(), 1500));
            var c = scope.fork(produceValue(BigInteger.valueOf(42), 500));
            scope.join();
            var result = List.of(a.get(), b.get(), c.get());
            System.out.println("*** ALL result: " + result);
        } catch (StructuredTaskScope.FailedException ex) {
            System.out.println("*** ALL exception: " + ex.getCause());
        } finally {
            var finish = System.currentTimeMillis();
            System.out.println(
                "*** Exiting main, executed in " + (finish - start) + "ms");
        }
    }

Let’s shift our focus back to the original code. After putting in diligent QA efforts, writing useful tests with good code coverage, and completing a thorough code review, what’s the developer’s next move?
Most likely, they'll refine the initial code block to resemble the updated version below:

Java

    // Example C - fixed "unstructured concurrency" from Example A
    public static void main(String[] argv)
            throws InterruptedException, ExecutionException {
        Future<String> a = null;
        Future<LocalDateTime> b = null;
        Future<BigInteger> c = null;
        var executor = Executors.newVirtualThreadPerTaskExecutor();
        var start = System.currentTimeMillis();
        try {
            a = executor.submit(produceValue("A", 1000));
            b = executor.submit(produceValue(LocalDateTime.now(), 1500));
            c = executor.submit(produceValue(BigInteger.valueOf(42), 500));
            var result = List.of(a.get(), b.get(), c.get());
            System.out.println("*** ALL result: " + result);
        } finally {
            var finish = System.currentTimeMillis();
            // cancel whatever is still running; a no-op for completed futures
            Stream.of(a, b, c)
                  .filter(Objects::nonNull)
                  .forEach(f -> f.cancel(true));
            System.out.println(
                "*** Exiting main, executed in " + (finish - start) + "ms");
            executor.shutdownNow();
        }
    }

At a glance, this approach seems fairly effective: any remaining Futures are canceled in the event of an intermediate error, and all execution threads are properly terminated. However, there's still a fair amount of boilerplate code, which remains cumbersome to implement consistently.

No problem, let's extract the common functionality into a reusable class. Please see the TaskScope class in the Gist.
By doing so, the code undergoes a noticeable transformation:

Java

    // Example D - fixed "unstructured concurrency" from Example A
    // with a reusable TaskScope class
    public static void main(String[] argv)
            throws InterruptedException, ExecutionException {
        var start = System.currentTimeMillis();
        try (var scope = new TaskScope(
                Executors.newVirtualThreadPerTaskExecutor())) {
            var a = scope.fork(produceValue("A", 1000));
            var b = scope.fork(produceValue(LocalDateTime.now(), 1500));
            var c = scope.fork(produceValue(BigInteger.valueOf(42), 500));
            var result = List.of(a.get(), b.get(), c.get());
            System.out.println("*** ALL result: " + result);
        } finally {
            var finish = System.currentTimeMillis();
            System.out.println(
                "*** Exiting main, executed in " + (finish - start) + "ms");
        }
    }

Upon inspecting the Gist sources (which you absolutely should, for understanding), you’ll notice something important: this implementation relies on Java version 1.8, released over 12 years ago. Furthermore, if it did not use java/util/stream/Stream, it could even run on JDK 1.5!

But hold on, why incorporate java/util/stream/Stream here? Quite frankly, it's the core of the proposal. Take Example D above: it efficiently handles just one scenario, namely, waiting for all tasks to finish while throwing an error if any fail along the way. Support for different scenarios requires something a bit more sophisticated.

The TaskScope implementation shared in the Gist translates a queue of completed Futures (irrespective of whether completion came via a result, error, or cancellation) directly into a Stream. Curious why this may be useful?
Let's rewrite this boring example once again:

Java

    // Example E - same as Example D but with Stream pipeline
    public static void main(String[] argv) {
        var start = System.currentTimeMillis();
        try (var scope = new TaskScope(
                Executors.newVirtualThreadPerTaskExecutor())) {
            scope.fork(produceValue("A", 1000));
            scope.fork(produceValue(LocalDateTime.now(), 1500));
            scope.fork(produceValue(BigInteger.valueOf(42), 500));
            var result = scope.completions()
                              .map(Future::resultNow)
                              .toList();
            System.out.println("*** ALL result: " + result);
        } finally {
            var finish = System.currentTimeMillis();
            System.out.println(
                "*** Exiting main, executed in " + (finish - start) + "ms");
        }
    }

This way, we just convert all the completed futures into the list of results and keep our fingers crossed that there were no errors.

Alternatively, let’s turn all successfully completed futures into a result list, disregarding potential errors entirely. No exceptions will ever be thrown within this scope:

Java

    var result = scope.completions()
                      .filter(f -> f.state() == Future.State.SUCCESS)
                      .map(Future::resultNow)
                      .toList();

Or simply find the first result available:

Java

    var result = scope.completions()
                      .filter(f -> f.state() == Future.State.SUCCESS)
                      .map(Future::resultNow)
                      .findAny()
                      .orElse("<NONE>");

Or, alternatively, select no more than the first N results:

Java

    var N = 5;
    var result = scope.completions()
                      .filter(f -> f.state() == Future.State.SUCCESS)
                      .map(Future::resultNow)
                      .limit(N)
                      .toList();

In the last two examples, any remaining futures will automatically be terminated once the try-with-resources block in the main method exits.
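Since the Gist itself is not reproduced here, the following is a minimal sketch of how a scope with a lazy stream of completed futures could be built from JDK classes alone. It is not the author's TaskScope: `MiniTaskScope`, `resultOrNull`, and `demo` are hypothetical names, the stream covers only the tasks forked before `completions()` is called, and the sketch avoids `Future.state()` and virtual threads so that it also runs on JDKs older than 19.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.Callable;
import java.util.concurrent.CompletionException;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.FutureTask;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.stream.IntStream;
import java.util.stream.Stream;

// Hypothetical stand-in for the Gist's TaskScope; names and behavior are assumptions.
final class MiniTaskScope implements AutoCloseable {
    private final ExecutorService executor;
    private final List<Future<?>> forked = new ArrayList<>();
    private final BlockingQueue<Future<?>> completed = new LinkedBlockingQueue<>();

    MiniTaskScope(ExecutorService executor) {
        this.executor = executor;
    }

    // FutureTask.done() fires on success, failure, and cancellation alike,
    // so every forked task eventually lands in the 'completed' queue.
    <T> Future<T> fork(Callable<T> task) {
        FutureTask<T> future = new FutureTask<T>(task) {
            @Override
            protected void done() {
                completed.add(this);
            }
        };
        forked.add(future);
        executor.execute(future);
        return future;
    }

    // Lazy, single-use stream of futures in completion order. Each element
    // blocks until the next task finishes; short-circuiting stops the waiting.
    Stream<Future<?>> completions() {
        return IntStream.range(0, forked.size())
                        .<Future<?>>mapToObj(i -> takeCompleted());
    }

    private Future<?> takeCompleted() {
        try {
            return completed.take();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new CompletionException(e);
        }
    }

    // Cancels whatever is still running and stops the executor.
    @Override
    public void close() {
        forked.forEach(f -> f.cancel(true));
        executor.shutdownNow();
    }

    // Result of a completed future, or null if it failed or was cancelled.
    // On JDK 19+, Future.state() and resultNow() express this directly.
    static Object resultOrNull(Future<?> done) {
        try {
            return done.isCancelled() ? null : done.get();
        } catch (ExecutionException | InterruptedException e) {
            return null;
        }
    }

    // "First successful result wins": the failing task is skipped, and the
    // slow task is cancelled when the scope closes.
    static String demo() {
        try (var scope = new MiniTaskScope(Executors.newCachedThreadPool())) {
            scope.fork(() -> { Thread.sleep(300); return "slow"; });
            scope.fork(() -> { Thread.sleep(100); return "fast"; });
            scope.fork(() -> { throw new IllegalStateException("boom"); });
            return scope.completions()
                        .map(MiniTaskScope::resultOrNull)
                        .filter(Objects::nonNull)
                        .map(Object::toString)
                        .findFirst()
                        .orElse("<NONE>");
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints: fast
    }
}
```

Treating a null result as "no success" keeps the sketch short but conflates failures with tasks that legitimately return null; a fuller version would filter on the future's state instead.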
Clearly, we can also handle errors while gathering results and terminate prematurely if the code logic doesn't permit intermediate errors:

Java

    var result = scope.completions()
                      .peek(f -> {
                          if (f.state() == Future.State.FAILED)
                              throw new CompletionException(f.exceptionNow());
                      })
                      .map(Future::resultNow)
                      .limit(2)
                      .toList();

If you're already acquainted with JEP 505, you’ll understand what is being replaced here: StructuredTaskScope.Joiner. Now, you can mimic any type of "join" behavior without the need to subclass or implement StructuredTaskScope.Joiner. The Stream pipeline API over the completions queue serves as an expressive tool to achieve this out of the box. Plus, with the introduction of Gatherers, there’s room for truly ad hoc scenarios, such as managing result windows (think fixed-size batches of completed results processed as soon as they are ready).

It’s also worth noting that in JEP 505, certain StructuredTaskScope.Joiner implementations produce streams as their output. However, it’s the Joiner that determines when all forks have finished processing and opens the resulting stream post-join. In the alternative methodology described here, the decision of where and how joins occur resides within user-defined scope-flow logic. It’s a lazy, on-demand process, guided by conditions that may take more into account than just Future results. For instance, elements like internal object state or in-scope variables can directly influence decisions about which results to collect and which errors, if any, can be disregarded in the operation.

Now to the real challenge. A notable limitation of the code given is its inability to propagate context, namely, the current ScopedValue bindings. This characteristic is sometimes cited as a primary strength of the JEP 505 StructuredTaskScope. To be fair, one might argue it's an unfair advantage, one that exists solely because JDK-internal mechanisms make it achievable.
Current bindings are captured and propagated by using jdk/internal/misc/ThreadFlock, a utility inaccessible to code outside of the JDK. Perhaps, in a more ideal universe, there is a JDK 25 equipped with the following official API for java/util/concurrent/ThreadFactory, introducing possibilities for bridging this gap: Java public interface ThreadFactory { abstract Thread newThread(Runnable code); default ThreadFactory captureContext() { ThreadFactory delegate = this; Object currentScopedValueBindings = SomeInternalClass.captureValueBindingsForTheCurrentThread(); return new ThreadFactory() { public Thread newThread(Runnable code) { Thread result = delegate.newThread(code); SomeInternalClass.applyValueBindings(result, currentScopedValueBindings); return result; } }; } } But that's not the case for us. Thankfully, the classes from the java/util/concurrent package offer immense customizability and are remarkably adaptable tools (a big nod to Dr. Douglas S. Lea for this). So you can find another class, TaskScopeContextual, in the same Gist. This class adapts StructuredTaskScope to the ExecutorService API, solely aimed at propagating ScopedValue bindings for forked tasks. 
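The capture-then-apply idea behind the hypothetical captureContext() can be tried out today if a ThreadLocal stands in for ScopedValue. This is a sketch of the decorator pattern only, not of how TaskScopeContextual works: CONTEXT, captureContext, and readOnWorker are illustrative names, and a ThreadLocal is used precisely because user code cannot install ScopedValue bindings on another thread.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;

class ContextCapturingFactorySketch {

    /** Stand-in for a ScopedValue; a ThreadLocal can be set from ordinary user code. */
    static final ThreadLocal<String> CONTEXT = new ThreadLocal<>();

    /** Wraps a factory so that threads it creates start with the caller's CONTEXT value. */
    static ThreadFactory captureContext(ThreadFactory delegate) {
        String captured = CONTEXT.get(); // snapshot taken on the thread calling captureContext
        return task -> delegate.newThread(() -> {
            CONTEXT.set(captured);       // re-established on the new thread before the task runs
            try {
                task.run();
            } finally {
                CONTEXT.remove();
            }
        });
    }

    /** Sets the context on the calling thread and reads it back from a worker thread. */
    static String readOnWorker(String value) {
        CONTEXT.set(value);
        ExecutorService executor = Executors.newSingleThreadExecutor(
                captureContext(Executors.defaultThreadFactory()));
        try {
            Callable<String> readContext = CONTEXT::get;
            return executor.submit(readContext).get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            executor.shutdown();
            CONTEXT.remove();
        }
    }

    public static void main(String[] args) {
        System.out.println(readOnWorker("VALUE_DEFINED_IN_MAIN"));
    }
}
```

The snapshot is taken when the factory is wrapped, not when a thread is created, which mirrors the shape of the wished-for API above: the context of the code that opens the scope is what forked tasks observe.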
The following example highlights all the advantages of employing this alternative structured scope design: Java // Example F - true structured concurrency with context passing public static void main(String[] argv) { var start = System.currentTimeMillis(); ScopedValue.where(DEMO_SV, "VALUE_DEFINED_IN_MAIN").call(() -> { try (var scope = new TaskScopeContextual()) { scope.fork(produceValue("A", 1000)); scope.fork(produceValue("B", 2000)); scope.fork(produceValue("C", 2000)); scope.fork(produceValue("D", 2000)); var timeout = scope.fork(produceValue(null, 2500)); scope.fork(produceValue("E", 2000)); scope.fork(produceValue("F", 3000)); scope.fork(produceValue("G", 3000)); var result = scope.completions() .takeWhile(f -> f != timeout) .filter(f -> f.state() == Future.State.SUCCESS) .limit(6) .map(Future::resultNow) .sorted() .toList(); System.out.println("*** ALL result: " + result); } finally { var finish = System.currentTimeMillis(); System.out.println( "*** Exiting main, executed in " + (finish - start) + "ms"); } return null; }); } Take note of the elegant handling of timeouts with Streams. Unlike the current approach in JEP 505, there's no necessity to incorporate it into the API. In summary, here’s a recap: There's no requirement for StructuredTaskScope.Subtask — the existing java/util/concurrent/Future API already does the job adequately. Consequently, the inclusion of StructuredTaskScope.Subtask.State is redundant — even with the current JEP 505, Future.State is more than sufficient. StructuredTaskScope.Joiners demand subclassing for all but the simplest cases. A java/util/stream/Stream pipeline over the completed futures would serve as a much more convenient solution. The StructuredTaskScope.FailedException feels unnecessary — even in the current API, java/util/concurrent/CompletionException fulfills the same purpose just fine. 
Built-in StructuredTaskScope timeouts possess timing characteristics that are challenging to predict (e.g., try adding lengthy blocking calls before the initial fork). It's far simpler and more controlled to handle timeouts explicitly. I'm really interested to hear readers' opinions. Do you share my ideas, or do you support the JDK team's statement that Futures "are counterproductive in structured concurrency" (see the "Alternatives" section of JEP 505)? Would you say that the well-known and adaptable Stream API is superior to Joiners, or that a strict set of Joiners is simpler?
If you've ever inherited a Spark job that runs in 35 minutes and someone asks you to make it faster, you know the routine. You start by checking partition counts, then file sizes, then shuffle stages, then broadcast hints. You find a handwritten OPTIMIZE schedule from 2022, a Z-ORDER on the wrong column, and a cluster sized for last year's data volume. By the time you've made the job fast, you've absorbed three new things to maintain. The next person to inherit it will absorb four.

This pattern, call it the hand-tuning treadmill, is what the declarative optimization story on Databricks is trying to break. It's not a single feature; it's a cluster of capabilities that collectively let teams describe what a table should look like and let the engine handle the physical optimizations. What follows is the practical view of those patterns: where they fit, what they replace, and how to migrate without a rewrite weekend.

1. The Hand-Tuning Treadmill: Why Imperative Optimization Doesn't Scale

Before getting into the declarative side, it's worth being concrete about what "imperative Spark optimization" actually means in production. The shape is consistent across teams I've audited:

Layout decisions frozen on day one. Somebody picks a partition column when the table is created. The data shape changes a year later. Nobody re-partitions because the migration is scary. Query plans drift toward full scans.
Maintenance jobs that nobody owns. An OPTIMIZE / Z-ORDER / VACUUM script lives in a notebook scheduled at 3 AM. It runs on a cluster that's slightly mis-sized. When data volume grows, the job runs into the morning workload, and people complain about latency.
Cluster sizing as a guess. Worker count is a heuristic from a senior engineer's memory of last year's spike. Half the time it's too big, half the time it's too small, and the cost discussion gets emotional.
Hint-driven plans. 
Broadcast hints, repartition hints, coalesce(N), sprinkled through pipelines to fix yesterday's problem, kept indefinitely because removing them feels risky.

None of these are bugs. They're symptoms of the imperative model: the team owns the layout, the maintenance, the sizing, and the plan tuning. In small pipelines, ownership is fine. At scale, it becomes the bottleneck that the team can't outsource.

2. What "Declarative" Means in the Spark Optimization Context

Declarative is a word that gets used in two different ways here, and it's worth pulling them apart. Within Lakeflow pipelines (formerly DLT), it means "describe the tables, not the steps": the engine builds the DAG and runs it. But in the broader optimization story, declarative also means "describe the desired property of the table or workload, not the operations to maintain it":

Layout: I want this table clustered by these columns; figure out when and how to re-cluster.
Maintenance: I want this table optimized and vacuumed; figure out the schedule.
Ingestion: I want all new files in this path picked up exactly once; figure out checkpointing and listing.
Quality: These rows must satisfy these expectations; enforce them and report what gets dropped.
Compute: I want this query fast and not wasteful; size and scale appropriately.

Each one of those bullets corresponds to a piece of the declarative stack. Used together, they replace a remarkable amount of the boilerplate that has historically lived in Spark pipelines. The mental shift: You stop writing operations against the table and start writing properties of the table. The engine becomes the actor; you become the editor.

3. The Declarative Optimization Stack on Databricks

The chart below maps each thing the team declares to the engine capability that handles it, ending at the physical Delta table. It's the picture I draw on whiteboards when teams ask, "What's the order to adopt these in?"

Figure 1. 
The declarative optimization stack: each user-facing intent at the top maps to a continuous engine behavior, which keeps the underlying Delta tables well-clustered, compacted, and statistically up-to-date — without human intervention. Two things are worth highlighting in this picture. First, every box in the engine row is something that runs continuously, not on a cron — there is no daily "optimization window" anymore. Second, the bottom layer is identical to what you'd get from any well-tuned imperative pipeline: 256 MB Parquet files with current statistics. The declarative path doesn't change what good looks like; it changes who does the work to keep things looking good. 4. Layout: Liquid Clustering Replaces Hand-Maintained Z-ORDER Liquid Clustering is the change with the largest practical impact, because partition-key choices are where most lakehouse pipelines accumulate the most technical debt. The declarative version: you specify the columns the data is most often filtered or joined by, and the engine maintains a layout that supports those access patterns — incrementally, as new data arrives, without a full rewrite. When access patterns change, you change the cluster columns, and the engine re-clusters in the background. Defining Liquid-Clustered Tables SQL -- New table, clustered by the columns most commonly filtered on. -- No more PARTITIONED BY, no more guessing at partition cardinality. CREATE TABLE prod.gold.daily_totals ( account_id STRING, region STRING, ingest_date DATE, daily_total DECIMAL(18,2), txn_count BIGINT ) USING DELTA CLUSTER BY (region, ingest_date, account_id); -- Even better: let the engine pick the clustering columns by -- observing real query patterns over time. CREATE TABLE prod.gold.events_clustered USING DELTA CLUSTER BY AUTO AS SELECT * FROM prod.silver.events; Migrating an Existing Partitioned/Z-ORDER Table SQL -- Convert a legacy partitioned table to liquid clustering. 
-- Existing data files are not rewritten immediately; the engine -- rebalances incrementally on subsequent writes + maintenance. ALTER TABLE prod.silver.transactions CLUSTER BY (account_id, ingest_date); -- Force the first clustering pass for a freshly converted table OPTIMIZE prod.silver.transactions FULL; Why this matters: the recurring 2 AM Slack thread of "can we re-partition this table?" goes away. Layout becomes a property you change with one DDL statement, not a multi-week rewrite project. 5. Maintenance: Predictive Optimization Replaces Cron-Driven OPTIMIZE/VACUUM Predictive optimization is the part that retired the most legacy code in the pipelines I've migrated. Once enabled at the catalog or schema level, the engine monitors each table's read and write patterns and decides on its own when to compact files, re-cluster, vacuum, and refresh statistics. The big win isn't the operations themselves — the imperative pipeline could already run those — it's that the timing is observed-driven, not schedule-driven. Tables that get heavy ingestion get more frequent maintenance. Cold tables get left alone. SQL -- Turn it on at the catalog level once; new tables inherit. 
ALTER CATALOG prod SET PREDICTIVE OPTIMIZATION = ENABLED; -- Or at the schema level for a phased rollout ALTER SCHEMA prod.gold SET PREDICTIVE OPTIMIZATION = ENABLED; -- Inspect what the engine has been doing on a given table SELECT operation, operation_metrics.numFilesAdded AS files_added, operation_metrics.numFilesRemoved AS files_removed, operation_metrics.numOutputBytes AS output_bytes, timestamp FROM (DESCRIBE HISTORY prod.gold.daily_totals) WHERE userMetadata IS NULL -- engine-driven, not user AND operation IN ('OPTIMIZE', 'VACUUM') AND timestamp >= current_timestamp() - INTERVAL 7 DAYS ORDER BY timestamp DESC; What you should delete after enabling this: the nightly notebook that runs OPTIMIZE on every table in a schema, the VACUUM cron job, the ANALYZE TABLE wrapper, and the alerting that wakes someone up when those jobs run long. None of them are needed anymore, and leaving them on creates duplicate work that the engine and the cron will fight over. 6. Ingestion: Auto Loader Replaces Listing-Based File Detection Auto Loader is the declarative answer to the perennial "which files have we processed already?" problem. Instead of listing a directory, comparing it to a state file, and figuring out the new bits, you describe the source location and the format and let the engine maintain its own incremental state. It uses cloud-native event notifications (S3 events, ADLS notifications, or efficient directory listing as a fallback), and the checkpoint is just another piece of state the engine owns. Python from pyspark.sql.functions import current_timestamp # Streaming ingest from S3 with schema inference + evolution. # Replaces hand-maintained checkpointing, listing logic, and # whatever file-tracking table the team built two years ago. 
(spark.readStream .format("cloudFiles") .option("cloudFiles.format", "json") .option("cloudFiles.inferColumnTypes", "true") .option("cloudFiles.schemaLocation", "s3://acme-checkpoints/txns_schema") .option("cloudFiles.schemaEvolutionMode", "addNewColumns") .load("s3://landing/txns/") .withColumn("_ingest_ts", current_timestamp()) .writeStream .format("delta") .option("checkpointLocation", "s3://acme-checkpoints/txns_writer") .trigger(availableNow=True) # batch-style; runs to completion .toTable("prod.bronze.txns")) Two notes from production. First, schemaEvolutionMode is the option that prevents the silent-data-loss class of bugs when partner schemas change; pick the policy explicitly rather than letting it default. Second, trigger(availableNow=True) gives you batch ergonomics on a streaming source — the job runs until it has consumed everything and exits, which is what most teams actually want for daily ingestion. 7. Transforms and Quality: Declarative Pipelines Replace Bare Spark + External DQ The final piece is the transformation layer. Lakeflow pipelines (the rebrand of Delta Live Tables) let you declare each table as a Python or SQL definition, and add expectations as a first-class concept. The engine derives the DAG from the dependencies and enforces the expectations on every write — the data quality framework, the lineage layer, and the orchestration glue collapse into a single artifact. 
Python import dlt from pyspark.sql.functions import sum as _sum, col @dlt.table( name="silver_txns", table_properties={ "delta.enableChangeDataFeed": "true", "delta.tuneFileSizesForRewrites": "true", }, cluster_by=["account_id", "ingest_date"], ) @dlt.expect_or_drop("non_null_amount", "amount IS NOT NULL") @dlt.expect_or_fail("valid_currency", "currency IN ('USD','EUR','GBP')") @dlt.expect("unique_txn", "txn_id IS NOT NULL") def silver_txns(): return (dlt.read_stream("bronze_txns") .dropDuplicates(["txn_id"])) @dlt.table(name="gold_daily_totals") def gold_daily_totals(): return (dlt.read("silver_txns") .groupBy("ingest_date", "account_id", "region") .agg(_sum("amount").alias("daily_total"))) The decorators do four things at once: define the table, declare its layout (cluster_by), declare its quality rules, and let the engine infer that gold_daily_totals depends on silver_txns from the dlt.read call. There is no DAG file. There is no separate Great Expectations suite. Lineage is generated for free in Unity Catalog, including column-level edges. If you want to query how the expectations have been performing — useful for SLO dashboards or alerting — the event log surfaces it directly: SQL -- Pass / fail / drop counts per expectation, last 24 hours SELECT flow_name, details:flow_progress.data_quality.expectations[0].name AS exp_name, details:flow_progress.data_quality.expectations[0].passed_records AS passed, details:flow_progress.data_quality.expectations[0].failed_records AS failed, details:flow_progress.data_quality.expectations[0].dropped_records AS dropped, timestamp FROM event_log("<pipeline-id>") WHERE event_type = 'flow_progress' AND timestamp >= current_timestamp() - INTERVAL 1 DAY ORDER BY timestamp DESC; 8. Putting It Together: Where to Start, What to Measure Adopting all of this at once is a recipe for pain. 
The order I've seen work, and a small set of metrics to verify the change is paying off:

| Step | Adopt | Retire | Verify with |
| 1 | Predictive optimization at schema level | Nightly OPTIMIZE / VACUUM jobs | Reduction in maintenance-cluster cost |
| 2 | Liquid clustering on top 5 tables | Static partitioning + Z-ORDER | p95 query latency on the same workloads |
| 3 | Auto Loader for 1-2 ingestion pipelines | Custom file-tracking + listing logic | End-to-end data freshness |
| 4 | Lakeflow pipelines for new pipelines only | External DQ + DAG glue (for new work) | Lines of pipeline code per table |
| 5 | Serverless compute for SQL warehouses + DLT | Hand-sized job clusters | Cost-per-query, scale-up time |

What you do not need to migrate: imperative pipelines that already work and aren't growing. Declarative patterns are about new work and high-pain hot spots, not a heroic rewrite of every notebook ever shipped.

9. Honest Limitations and Where Imperative Still Wins

Four places where the declarative model still bites, worth knowing before you commit:

Procedural logic still belongs in Jobs. If your pipeline is really a sequence of API calls with branching error handling, that's a Lakeflow Job (or external code), not a declarative table. Don't try to bend dlt around it.
Predictive optimization needs observation time. On a table that's a week old, the engine hasn't seen enough patterns to make great decisions. For tables under heavy initial load, an explicit OPTIMIZE FULL after the first big ingest still helps.
Cluster-by-column choice still matters. CLUSTER BY AUTO is great for stable workloads with predictable filters. For tables whose access pattern is genuinely heterogeneous across teams, an explicit cluster-by based on the dominant query is usually faster.
Hint-driven escapes are still allowed. If a particular query benefits from a /*+ BROADCAST(t) */ hint and AQE isn't catching it, the hint is fine. Just keep them rare and document why. 
Conclusion The declarative optimization story isn't a single feature you toggle — it's a quiet shift in who owns the boring parts of a Spark pipeline. Layout, maintenance, ingestion bookkeeping, plan tuning, cluster sizing, data quality enforcement: every one of those was traditionally a thing the team owned and paid for in toil. The current Databricks stack lets you express each as an intent and let the engine handle the operations underneath. Adopt them in order, retire what they replace, and the optimization treadmill slows from a daily concern to a quarterly review. That's the actual win, and it's the reason the declarative paradigm has gone from a Lakeflow detail to the default mental model for new pipelines on Databricks.
For decades, Jakarta EE has addressed the challenge of building enterprise systems that endure technological change. The platform has evolved from monoliths to microservices, from application servers to Kubernetes, and from relational databases to distributed data platforms, all while maintaining its core strength: compatibility. Jakarta EE 12 marks another significant transition, shifting the focus beyond cloud-native infrastructure and APIs to prioritize data. Modern enterprise systems now operate in diverse environments that extend beyond relational databases and synchronous CRUD applications. Current architectures integrate SQL, document databases, graph engines, key-value stores, event streams, vector databases, and AI-driven workflows. The primary challenge is to provide a unified programming model that manages fragmented data ecosystems without vendor lock-in or frequent application rewrites. Jakarta EE 12 addresses this by elevating querying, data access, initialization, and semantic consistency to platform-level concerns. This release marks the beginning of the “Data Age” for Enterprise Java. Central to this evolution is Jakarta Query, a unified semantic query model that connects Jakarta Persistence, Jakarta Data, and Jakarta NoSQL through a common abstraction. Rather than having each specification define its own querying semantics, Jakarta EE 12 introduces a shared language that spans multiple persistence technologies while supporting specialized execution models. This architectural shift reduces ecosystem fragmentation and delivers a more consistent developer experience for polyglot persistence systems. Jakarta EE 12 also extends beyond traditional dependency injection and request processing. CDI now offers more predictable startup and lifecycle management, which is essential for cloud-native deployments, serverless runtimes, AI orchestration, and agent-based architectures. 
With Java 21 as the new platform baseline, Jakarta EE is positioned as a modern platform that supports long-lived, adaptive systems in a data- and AI-driven world. This article will examine how Jakarta EE 12 transforms the enterprise Java ecosystem through Jakarta Query, Jakarta Data, Jakarta NoSQL, CDI 5.0, Persistence 4.0, and new initiatives such as Jakarta Agentic AI. We will also discuss how these specifications form a unified platform strategy that simplifies enterprise development while maintaining the stability and interoperability that have made Java a leading software ecosystem. The Evolution of Enterprise Complexity Software architecture has consistently evolved to address complexity. Initially, organizations relied on centralized mainframes, where applications, infrastructure, and data resided in a single environment. The shift to client-server and three-tier models introduced distributed systems, separating presentation, business logic, and persistence into distinct layers. Today, cloud-native systems span clusters, distributed networks, containers, Kubernetes, edge devices, and globally replicated databases. Modern enterprise software functions as an ecosystem of interconnected services across infrastructure that developers may not fully control. This evolution has significantly increased the cognitive demands on software engineers and architects. Today’s technology landscape includes a wide array of frameworks, runtimes, databases, APIs, messaging systems, orchestration platforms, and AI-driven tools. Developer experience is now a competitive market, with platforms promising productivity, simplicity, and scalability. Engineers must continually balance trade-offs among performance, consistency, scalability, operational complexity, and vendor lock-in. The industry also faces the “hype effect,” where technologies gain popularity before their long-term impacts are fully understood. As systems became more distributed, architectural styles proliferated. 
Traditional layered architectures now exist alongside microservices, event-driven systems, CQRS, orchestration platforms, microkernels, and domain-driven designs. Each style addresses specific challenges. Microservices enhance deployment independence, event-driven systems improve scalability and resilience, and CQRS manages complex read-and-write workloads. However, this variety has led to fragmentation. Developers must now master not only programming languages and frameworks, but also distributed systems theory, consistency models, observability, fault tolerance, asynchronous communication, and operational automation. Data complexity has evolved similarly. For decades, enterprise applications relied primarily on relational databases and SQL. Today, organizations use document databases, graph databases, key-value stores, wide-column engines, streaming systems, vector databases, and combinations of these. This trend, known as polyglot persistence, reflects the fact that different data models address different business needs. For example, recommendation engines may require graph traversal, financial systems depend on transactional consistency, and AI systems increasingly use vector similarity search. As a result, enterprise development now extends beyond writing business logic. Engineers must manage distributed architectures, multiple persistence models, cloud-native infrastructure, security, asynchronous communication, and increasingly, AI-driven workflows. In this environment, standards are essential. Growing complexity makes fragmentation a significant long-term risk. Without common abstractions and interoperable APIs, organizations risk costly migrations, vendor lock-in, and operational instability. Jakarta EE 12 addresses these challenges. Instead of treating persistence, querying, dependency injection, and runtime behavior as separate concerns, the platform adopts a unified model for modern distributed systems. 
Its goal is not to eliminate architectural diversity, but to offer a stable and coherent foundation that supports it. Why Jakarta EE Still Matters Enterprise Java has evolved for nearly three decades. Launched in the late 1990s, Java EE aimed to standardize enterprise application development amid a fragmented landscape of proprietary technologies. The ecosystem progressed from J2EE to Java EE and, now, to Jakarta EE under the Eclipse Foundation. Each transition mirrored broader industry shifts, including the emergence of web applications, distributed systems, cloud-native computing, and AI-driven architectures. Java’s dominance in enterprise environments stems from more than the language itself. Its success lies in uniting two elements that rarely coexist: open standards and open source. Many ecosystems offer only one. Some are open source but lack governance and interoperability. Others provide standards but evolve slowly or lose touch with developer needs. Jakarta EE bridges these worlds, delivering both specification-driven consistency and open-source innovation. Historically, standards have been essential for human scalability. Shared languages enabled cooperation, writing systems preserved knowledge, and standard units like the metric system supported global trade and science. Software faces similar challenges. As systems expand and teams become more distributed, shared abstractions and interoperability are crucial. Standards reduce ambiguity, improve team communication, and allow technologies to evolve without frequent rewrites. This is especially important in enterprise environments, where systems often outlast the technologies used to build them. Enterprise applications are rarely rewritten. Banks, governments, healthcare providers, airlines, and retailers operate systems that may persist for decades while evolving internally. In this context, open standards and open source are strategic choices. 
They reduce operational lock-in, improve vendor portability, support long-lived systems, and enable incremental modernization rather than risky rewrites. Jakarta EE addresses these needs by not imposing a single architecture, runtime, or deployment model. The platform supports monoliths, modular systems, microservices, reactive architectures, and cloud-native deployments. It integrates seamlessly with modern frameworks and runtimes, including those many developers use daily, often without realizing Jakarta EE specifications underpin them. Technologies such as Spring, Quarkus, Micronaut, Hibernate, Tomcat, and Payara implement, extend, or depend directly on Jakarta EE specifications. This combination is precisely what makes Jakarta EE especially relevant today. In a market filled with rapidly changing frameworks and infrastructure trends, Jakarta EE offers stability without stagnation. The platform evolves thoughtfully, maintaining compatibility while adapting to new realities such as cloud-native computing, polyglot persistence, and AI-driven systems. Jakarta EE 11 established a modern foundation, with specifications such as Jakarta Data. Jakarta EE 12 builds on this, moving Enterprise Java into what can be called the Data Age. Jakarta EE 12 and the Rise of Unified Data Access A key change in Jakarta EE 12 is the acknowledgment that data access can no longer be limited to relational databases. Modern enterprise applications now span SQL databases, NoSQL engines, distributed caches, event streams, and AI-focused data stores. The primary challenge has shifted from persistence alone to ensuring consistent developer interaction across diverse data systems. Jakarta EE 12 addresses this by introducing a unified semantic model for querying and data access. 
Central to this is Jakarta Query, a new abstraction that serves as a common query foundation for Jakarta Persistence, Jakarta Data, and Jakarta NoSQL. Rather than each specification defining separate query semantics, Jakarta Query provides a shared language for filtering, ordering, restrictions, and query composition across multiple persistence technologies. Enterprise Java has evolved through several generations of query languages, from JDBC’s direct SQL focus to JPA’s JPQL and various framework-specific abstractions. These independent developments have led to fragmentation. Jakarta EE 12 seeks to address this by separating semantic intent from execution strategy, enabling developers to use a common conceptual model for queries while allowing each technology to optimize execution as needed. This is especially important in polyglot persistence architectures. Relational databases optimize joins and transactions, document databases offer schema flexibility, and graph databases emphasize relationship traversal. Jakarta Query does not eliminate these differences but provides a consistent developer experience across technologies, reducing reliance on vendor-specific APIs. Jakarta Data 1.1 exemplifies this approach with its fluent, type-safe query model. Developers can dynamically compose queries using semantic restrictions and ordering rules in Java, rather than relying on string-based query construction. Java List<Product> found = products.findAll( Restrict.all( _Product.type.equalTo(ProductType.PHYSICAL), _Product.price.greaterThan(10.00f), _Product.name.contains("Jakarta") ), Order.by( _Product.price.desc(), _Product.name.asc() ) ); This approach enhances readability and reduces runtime query errors often found in string-based query languages. More importantly, it aligns queries with the domain model, supporting a core principle of domain-driven enterprise applications. Jakarta Data 1.1 also extends the repository model beyond basic CRUD operations. 
Stateful repositories now include lifecycle-oriented operations, such as persist, merge, refresh, detach, and remove, within their abstractions. Java @Repository public interface Products extends DataRepository<Product, String> { @Persist void add(Product product); @Merge Product merge(Product product); @Remove void remove(Product product); @Refresh void reload(Product product); @Detach void detach(Product product); } This evolution is significant because repositories are no longer just convenience wrappers for persistence operations. They now serve as standardized data access contracts, consistently supporting both query semantics and entity lifecycle management across implementations. More broadly, Jakarta EE 12 is guiding enterprise Java toward a unified data platform. Instead of requiring developers to switch mental models between persistence technologies, the platform unifies how applications express intent for querying, filtering, lifecycle management, and data interaction. As distributed systems and polyglot persistence become more prevalent, this semantic consistency may become a key architectural advantage for Enterprise Java.
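The idea of separating semantic intent from execution strategy can be illustrated without any Jakarta dependency. The sketch below is plain Java and makes no claim about the Jakarta Query or Jakarta Data APIs: Op, Restriction, toSqlWhere, and matches are hypothetical names chosen for illustration. One list of restrictions is interpreted twice, once as a parameterized SQL WHERE clause and once as an in-memory filter over a map-shaped document.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class SemanticQuerySketch {

    /** Hypothetical operators; each carries the symbol a SQL backend would use. */
    enum Op {
        EQ("="), GT(">");

        final String sqlSymbol;

        Op(String sqlSymbol) {
            this.sqlSymbol = sqlSymbol;
        }
    }

    /** Backend-neutral statement of intent: attribute, operator, value. Not a Jakarta type. */
    record Restriction(String attribute, Op op, Object value) {}

    /** Strategy 1: render the restrictions as a parameterized SQL WHERE clause. */
    static String toSqlWhere(List<Restriction> restrictions) {
        return restrictions.stream()
                .map(r -> r.attribute() + " " + r.op().sqlSymbol + " ?")
                .collect(Collectors.joining(" AND "));
    }

    /** Strategy 2: evaluate the same restrictions in memory against a document. */
    static boolean matches(Map<String, Object> document, List<Restriction> restrictions) {
        return restrictions.stream().allMatch(r -> {
            Object actual = document.get(r.attribute());
            if (actual == null) {
                return false;
            }
            return switch (r.op()) {
                case EQ -> actual.equals(r.value());
                case GT -> ((Number) actual).doubleValue() > ((Number) r.value()).doubleValue();
            };
        });
    }

    /** The intent: physical products priced above 10. */
    static List<Restriction> physicalAndPricedAboveTen() {
        return List.of(
                new Restriction("type", Op.EQ, "PHYSICAL"),
                new Restriction("price", Op.GT, 10.0));
    }

    public static void main(String[] args) {
        List<Restriction> query = physicalAndPricedAboveTen();
        System.out.println(toSqlWhere(query));
        System.out.println(matches(Map.of("type", "PHYSICAL", "price", 12.5), query));
    }
}
```

The query definition never mentions SQL or documents; each backend decides how to execute it, which is the property the shared query model is described as aiming for.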
In the SAP ABAP world, BAPIs (Business Application Programming Interfaces) have long been the standard for programmatic access to business functionality. These are essentially RFC-enabled function modules representing operations on business objects. However, with the advent of SAP S/4HANA and the new ABAP RESTful Application Programming Model (RAP), the classic BAPI approach is being phased out in favor of modern, RAP-based APIs. SAP now provides a new generation of released APIs built on RAP, often referred to as RAP BO interfaces or behavior definitions, which align with the clean core and cloud-ready strategy of S/4HANA. Why Replace BAPIs With RAP-Based APIs? For expert ABAP developers, the motivation to migrate from BAPIs to RAP interfaces comes down to modernization, maintainability, and future-proofing. In fact, SAP considers RAP business object interfaces the modern alternative to classic BAPIs. Here are some key reasons to make the switch: Clean Core and Upgrade Safety RAP interfaces are part of SAP’s clean core strategy. They are released and maintained by SAP, ensuring that your custom code calls stable, upgrade-safe APIs instead of custom wrappers or unofficial function modules. This means less retrofitting when you upgrade your S/4HANA system. Cloud Readiness In SAP S/4HANA Cloud, many traditional BAPIs are not whitelisted or available. RAP-based APIs, on the other hand, are designed for cloud use and adhere to strict SAP release contracts. They can be used in both private and public cloud editions as officially supported integration points. Modern API Design RAP uses contemporary ABAP development paradigms, CDS views for data modeling, behavior definitions (BDEF) for business logic, and OData/HTTP exposure by design. Instead of procedural function calls, you interact with entities through a standardized interface. This leads to clearer, more maintainable code and aligns with how SAP Fiori apps communicate. 
In essence, RAP interfaces provide a modern, RESTful facade over business objects, whereas BAPIs are older RPC-style calls.

Built-in Transaction Handling

Classic BAPIs often require manual transaction control. RAP introduces the Entity Manipulation Language (EML), which handles operations in a transactional context. You can bundle multiple operations on related entities and then perform one commit for all of them, which simplifies the code and reduces errors. We’ll see this in the example below.

Extensibility and Lifecycle Management

RAP business object interfaces allow you to expose only what is needed to consumers, supporting versioning and gradual enhancements. They act as a stable facade while the underlying logic can evolve without breaking external contracts. This is analogous to how BAPIs provided stable interfaces in the past, but RAP makes it more flexible. In short, RAP BO interfaces let you keep your core objects clean and internal, exposing a controlled API layer to the outside world.

Given these benefits, SAP recommends using a released RAP API whenever one is available for your scenario. Next, let's walk through a hands-on example of replacing a common BAPI with a RAP-based API in an S/4HANA 2022 system.

ABAP RAP in S/4HANA 2022 at a Glance

SAP S/4HANA 2022 includes the ABAP Platform 2022, which significantly expands RAP’s capabilities and the number of standard business objects exposed via RAP. Many operations that used to be performed with BAPIs now have equivalent released RAP BO interfaces provided by SAP. These interfaces are essentially pre-built RAP business objects that SAP delivers for developers to use instead of calling older BAPIs.
For example:

- Bank creation: BAPI_BANK_CREATE to I_BANKTP (RAP BO interface)
- Cost center creation: BAPI_COSTCENTER_CREATE to I_COSTCENTERTP_2 (RAP BO interface)
- Sales order creation: BAPI_SALESORDER_CREATEFROMDAT2 to I_SALESORDERTP (RAP BO interface successor)

These are just a few examples; SAP provides a growing list of such RAP-based APIs for S/4HANA 2022 and beyond. Each interface typically comes with documentation and code snippets to demonstrate how to consume it. If you are unsure whether a particular BAPI has a RAP equivalent, SAP’s Clean Core Object search tool can help identify if a released RAP interface (BDEF) exists for that business object.

Hands-On Example: Modernizing a BAPI to RAP

To make this concrete, let’s walk through replacing a classic BAPI with a RAP-based implementation. We’ll use the task of creating a Sales Order as our example. In the past, an ABAP developer might use the BAPI BAPI_SALESORDER_CREATEFROMDAT2 to programmatically create a sales order in SAP ECC or S/4HANA. In S/4HANA 2022, SAP provides the RAP business object interface I_SALESORDERTP as the official successor for this BAPI. We’ll demonstrate how to identify the replacement and implement the create operation using RAP.

Step 1: Identify the RAP Replacement Interface

First, determine if a released RAP API exists for the BAPI in question. In our case, I_SALESORDERTP is the released RAP interface corresponding to the Sales Order business object (as a replacement for BAPI_SALESORDER_CREATEFROMDAT2). This interface is delivered by SAP as part of S/4HANA 2022. You can find such mappings via SAP documentation or tools; for instance, SAP's help docs and community posts confirm that I_SALESORDERTP covers create/read/update operations for sales orders via RAP. Once the appropriate interface is identified, you can plan to refactor your code to use it.
Step 2: Review the Classic BAPI Usage (For Comparison)

Let's briefly recap how the classic BAPI would be used to create a sales order. Typically, one would call the function module and then commit the work, for example:

ABAP

    " The header, item, and partner data are assumed to be filled elsewhere.
    DATA: ls_order_header TYPE bapisdhd1,
          lt_items        TYPE TABLE OF bapisditm,
          lt_partners     TYPE TABLE OF bapiparnr,
          lt_return       TYPE TABLE OF bapiret2.

    CALL FUNCTION 'BAPI_SALESORDER_CREATEFROMDAT2'
      EXPORTING
        order_header_in = ls_order_header   " header data structure
      TABLES
        order_items_in  = lt_items          " item lines to create
        order_partners  = lt_partners       " sold-to party and other partners
        return          = lt_return.        " return messages

    " BAPIs report problems in the return table, so check it for errors and aborts
    IF NOT line_exists( lt_return[ type = 'E' ] )
       AND NOT line_exists( lt_return[ type = 'A' ] ).
      " If no errors, commit the transaction to finalize the create
      CALL FUNCTION 'BAPI_TRANSACTION_COMMIT'
        EXPORTING
          wait = 'X'.
    ENDIF.

In the above sketch, we populate structures for the sales order header, items, and partners, call the BAPI, then explicitly call BAPI_TRANSACTION_COMMIT to ensure the new order is saved. We also have to check the return table for errors or messages. This imperative style works, but it requires handling commits manually and ensuring the data structures match what the BAPI expects. It also doesn’t automatically integrate with OData or modern Fiori UI frameworks without creating a custom OData service on top.

Step 3: Implement the RAP-Based Create via EML

Now, let's perform the same operation using the RAP interface I_SALESORDERTP and ABAP’s Entity Manipulation Language. With RAP, you don't call a function module; instead, you use EML statements on the interface’s entities. Suppose we want to create a sales order with one line item. We can do it in one shot as follows:

ABAP

    MODIFY ENTITIES OF i_salesordertp
      ENTITY salesorder
        CREATE
          FIELDS ( salesordertype salesorganization distributionchannel
                   organizationdivision soldtoparty purchaseorderbycustomer )
          WITH VALUE #( ( %cid  = 'H001'                           " temporary ID for the new order
                          %data = VALUE #(
                            salesordertype          = 'TA'         " Order type
                            salesorganization       = '1010'       " Sales Org
                            distributionchannel     = '10'         " Distribution Channel
                            organizationdivision    = '00'         " Division
                            soldtoparty             = '0010100001' " Sold-to Customer ID
                            purchaseorderbycustomer = 'PO-12345'   " Customer PO reference
                          ) ) )
        CREATE BY \_item                                           " now specify the item(s)
          FIELDS ( product requestedquantity )
          WITH VALUE #( ( %cid_ref = 'H001'                        " link to the order with temp ID
                          %target  = VALUE #( ( %cid              = 'I001' " temp ID for new item
                                                product           = 'MAT-1000'
                                                requestedquantity = 5 ) ) ) )
      MAPPED   DATA(ls_mapped)     " maps each %cid to the key of the created instance
      FAILED   DATA(ls_failed)
      REPORTED DATA(ls_reported).

In this code snippet, we use the MODIFY ENTITIES OF i_salesordertp statement to perform a deep create of a SalesOrder along with one SalesOrderItem. The response structures are declared inline, so no separate DATA statements are needed for them. A few things to note from an engineer’s perspective:

- We specify the fields we want to set for the sales order and its item, similar to filling out the BAPI’s parameters, but here it's done with a structured syntax using WITH VALUE #(...) expressions. This ensures compile-time checking of field names and data types.
- The use of %cid (content ID) and %cid_ref is a mechanism to link the parent and child in one request. We assign a temporary ID 'H001' to the new sales order, and then reference it (%cid_ref = 'H001') when creating the item, so that the framework knows this item belongs to the new order. This allows creating a header and its items together in one call (a deep insert), which is handled by RAP.
- The FAILED DATA(ls_failed) and REPORTED DATA(ls_reported) clauses capture errors and messages from the operation, analogous to checking the BAPI return table. If the create fails, ls_failed contains the keys (or %cid values) of the failed instances together with the failure cause, while ls_reported holds the corresponding messages. MAPPED DATA(ls_mapped) maps each %cid to the key of the instance that was created.

At this point, the sales order exists only in the RAP transactional buffer. To persist it to the database, we need to execute a commit in the RAP context.

Step 4: Finalize With Commit

In an interactive RAP scenario (like a Fiori app), the commit is managed by the framework when the user saves their changes. But in our manual ABAP code (above), we explicitly trigger the commit. In RAP, this is done using COMMIT ENTITIES rather than the classic BAPI_TRANSACTION_COMMIT. For example:

ABAP

    COMMIT ENTITIES
      RESPONSE OF i_salesordertp
        FAILED   DATA(ls_save_failed)
        REPORTED DATA(ls_save_reported).

    IF sy-subrc <> 0.
      " Handle the error (e.g., log the messages in ls_save_reported)
    ENDIF.

COMMIT ENTITIES finalizes all pending changes made by EML statements and sets sy-subrc to a non-zero value if the save fails. Here, we also capture any final errors or messages in ls_save_failed/ls_save_reported for error handling. After this commit, the sales order is persisted. SAP RAP supports late numbering, meaning the real keys might only be assigned at commit time, a detail to be aware of. In that case, the final sales order number is not part of the reported data: to read it in the same program, use the COMMIT ENTITIES BEGIN ... COMMIT ENTITIES END variant and call CONVERT KEY OF i_salesordertp between the two statements to translate the preliminary key from ls_mapped into the final key.

Now we have successfully created a sales order using the RAP API instead of the classic BAPI. The code is more declarative and leverages the RAP framework for integrity checks and relationships. We didn't have to call any function modules directly; everything was done through the RAP interface of the Sales Order business object.

Engineer’s Perspective: Key Takeaways

From the above process, a few important differences stand out when replacing BAPIs with RAP-based APIs:

- Finding the right interface: It may require a bit of research to find the correct RAP interface name for a given legacy BAPI. SAP provides documentation to help with this.
Once found, using the RAP interface ensures you are calling an SAP-supported API that will remain stable across updates.
- Refactoring effort: Replacing a BAPI call with a RAP call is not a one-to-one code replacement; it often involves refactoring. You will define data structures according to RAP’s CDS-based types and use EML syntax. There is a learning curve, but the benefit is cleaner ABAP code that is easier to maintain. ABAP Development Tools (ADT) can help by autocompleting structure components and EML keywords, since the interface and its types are known in the dictionary.
- Behavior and validation: When you use RAP interfaces, you automatically leverage SAP’s implemented behavior logic. For instance, any checks or defaulting logic SAP built into the Sales Order RAP BO will execute. You no longer manually call multiple BAPIs or perform intermediate checks; the RAP framework triggers all the necessary business logic as part of the modify/commit process. This leads to fewer errors and a more consistent outcome with standard SAP behavior.
- Transaction management: As shown, instead of BAPI_TRANSACTION_COMMIT, you use COMMIT ENTITIES. This aligns with the unit of work paradigm in RAP. It also allows bundling multiple operations.
- Upgrade and extensibility: By adopting RAP-based APIs, your custom code is positioned for the future. SAP will continue to enhance these interfaces, and you can opt in to those changes by switching to a new interface version or extending the CDS views. In contrast, classic BAPIs in S/4HANA are largely in maintenance mode; they exist mainly for backward compatibility and might not cover new business scenarios introduced in S/4HANA. Moving to RAP ensures you can take advantage of new SAP innovations with minimal effort.

Conclusion

Replacing BAPIs with RAP-based APIs is a strategic move for any ABAP developer working on S/4HANA 2022 and beyond. It aligns your custom developments with SAP’s modern, cloud-ready programming model.
As we’ve seen, a classic scenario like sales order creation can be accomplished with RAP’s EML in a way that is more in line with modern development practices, yielding cleaner and more robust code. SAP itself positions RAP Business Object interfaces as the evolution of BAPIs, a way to build stable, public APIs that keep the core clean and extensible. While there is an upfront effort to learn RAP and refactor existing code, the payoff is significant in the long run, with better maintainability, fewer upgrade headaches, and the ability to run your extensions in the cloud. Importantly, whenever SAP provides a released RAP API to replace a classic BAPI, you should take that path. By doing so, you leverage SAP’s latest technology and ensure your solutions remain supported in future releases.
A Practical Walkthrough

Most distributed systems eventually hit a wall with their messaging layer, whether it's Kafka's tight coupling between compute and storage, RabbitMQ's limited replay capabilities, or the operational overhead of managing multiple tools for queuing and streaming. Apache Pulsar was engineered to address these gaps from the ground up. In this article, we'll dissect a working Go-based demo that wires together a Pulsar producer, consumer, and Prometheus monitoring layer into a cohesive, observable messaging pipeline. The full source is on GitHub.

Why Pulsar Deserves a Closer Look

Pulsar's architecture makes a deliberate trade-off that most messaging systems avoid: it physically separates the broker tier (which handles routing, subscriptions, and protocol) from the storage tier (Apache BookKeeper, which handles persistence). This isn't just an implementation detail. It means you can independently autoscale message routing capacity without touching your storage cluster, and vice versa.

Beyond the architecture, a few capabilities stand out for engineering teams:

- Multi-tenancy at the protocol level: tenants, namespaces, and topics form a three-level hierarchy, making it practical to run a single Pulsar cluster for multiple teams or services without namespace collisions.
- Four distinct subscription semantics: Exclusive, Shared, Failover, and Key_Shared give you precise control over how messages are distributed across consumer instances, something Kafka's consumer group model doesn't natively offer.
- Cursor-based message retention: Pulsar keeps a message until every subscription's cursor has moved past it, rather than relying only on a time-based retention window. A consumer that falls behind doesn't lose messages; it simply catches up from its last acknowledged position.
- Native schema enforcement: The built-in schema registry validates message payloads at the broker level before they reach consumers, catching contract violations at the boundary rather than deep inside application logic.
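To make the three-level hierarchy concrete, here is a minimal sketch of how a fully qualified topic name is composed and split. Pulsar topic names follow the `persistent://tenant/namespace/topic` scheme, and a short name such as `my-topic` resolves to `persistent://public/default/my-topic`. The `TopicName` and `ParseTopic` helpers, along with the `billing` and `shipping` tenants, are hypothetical names used only for illustration; they are not part of the Pulsar client.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// scheme is the prefix of a persistent Pulsar topic name.
const scheme = "persistent://"

// TopicName joins tenant, namespace, and topic into a fully qualified name.
// Hypothetical helper: the Pulsar client simply accepts the resulting string.
func TopicName(tenant, namespace, topic string) string {
	return scheme + tenant + "/" + namespace + "/" + topic
}

// ParseTopic splits a fully qualified persistent topic name into its levels.
func ParseTopic(name string) (tenant, namespace, topic string, err error) {
	if !strings.HasPrefix(name, scheme) {
		return "", "", "", errors.New("not a fully qualified persistent topic: " + name)
	}
	parts := strings.SplitN(strings.TrimPrefix(name, scheme), "/", 3)
	if len(parts) != 3 || parts[0] == "" || parts[1] == "" || parts[2] == "" {
		return "", "", "", errors.New("expected tenant/namespace/topic: " + name)
	}
	return parts[0], parts[1], parts[2], nil
}

func main() {
	// Two teams can use the same topic name without colliding,
	// because the tenant level keeps them apart.
	fmt.Println(TopicName("billing", "prod", "events"))
	fmt.Println(TopicName("shipping", "prod", "events"))

	// The demo's short topic names live in the default tenant and namespace.
	tenant, ns, topic, err := ParseTopic("persistent://public/default/my-topic")
	fmt.Println(tenant, ns, topic, err)
}
```

Passing the fully qualified string as the `Topic` option is what places a producer or consumer in a specific tenant and namespace.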
What the Demo Builds

The project is structured as three independent Go binaries, each with a single responsibility:

    ├── producer/
    │   ├── main.go        # HTTP server → Pulsar publisher
    │   ├── go.mod
    │   └── go.sum
    ├── consumer/
    │   └── main.go        # Pulsar subscriber → message processor
    ├── monitor/
    │   └── main.go        # Pulsar producer + Prometheus metrics server
    ├── prometheus.yml     # Scrape configuration
    └── README.md

All three use `github.com/apache/pulsar-client-go/pulsar`, the official, Apache-maintained Go client. The client is not a thin wrapper; it implements the full Pulsar binary protocol, connection pooling, producer batching, and automatic Prometheus metric registration.

Component 1: The Producer

The producer exposes a single HTTP endpoint. An incoming HTTP request triggers a Pulsar publish operation, decoupling the caller from any direct knowledge of the messaging infrastructure.

Go

    package main

    import (
        "context"
        "log"
        "net/http"

        "github.com/apache/pulsar-client-go/pulsar"
    )

    var client pulsar.Client

    func main() {
        var err error
        client, err = pulsar.NewClient(pulsar.ClientOptions{
            URL: "pulsar://localhost:6650",
        })
        if err != nil {
            log.Fatal(err)
        }
        defer client.Close()

        http.HandleFunc("/publish", func(w http.ResponseWriter, r *http.Request) {
            msg := r.URL.Query().Get("msg")
            if err := publishMessage(msg); err != nil {
                // Report the failure to the caller instead of returning 200
                log.Printf("publish failed: %v", err)
                http.Error(w, "failed to publish message", http.StatusInternalServerError)
                return
            }
            w.Write([]byte("msg successfully published"))
        })

        if err := http.ListenAndServe(":8080", nil); err != nil {
            log.Fatal(err)
        }
    }

    // publishMessage creates a producer per call to keep the demo short.
    func publishMessage(msg string) error {
        producer, err := client.CreateProducer(pulsar.ProducerOptions{
            Topic: "my-topic",
        })
        if err != nil {
            return err // let the handler respond; do not exit the process
        }
        defer producer.Close()

        _, err = producer.Send(context.Background(), &pulsar.ProducerMessage{
            Payload: []byte(msg), // publish the caller's message, not a fixed string
        })
        return err
    }

A few architectural observations worth unpacking:

The HTTP-to-Pulsar bridge pattern is deliberately pragmatic.
Rather than requiring every upstream service to embed a Pulsar client, you expose a thin HTTP adapter. This is particularly valuable when integrating with systems that speak HTTP natively, such as CI/CD pipelines, third-party webhooks, or legacy services that can't easily adopt a new client library.

`pulsar.NewClient` establishes a connection pool, not a single TCP connection. The client maintains persistent connections to the broker and handles reconnection, load balancing across broker nodes, and TLS negotiation transparently. Deferring `client.Close()` releases those connections when `main` returns normally; note that `log.Fatal` exits the process without running deferred calls.

`producer.Send` with `context.Background()` is synchronous: it returns once the broker has acknowledged the message. The Pulsar client batches outgoing messages by default (configurable via `BatchingMaxMessages` and `BatchingMaxPublishDelay`), which improves throughput under load mainly when messages are sent concurrently or through `SendAsync`.

For production use, the producer instance should be created once at startup and reused across requests. Creating a new producer per request incurs connection overhead and bypasses the batching optimization entirely.

Component 2: The Consumer

The consumer subscribes to a topic and processes messages with explicit acknowledgment. The subscription type chosen here, `pulsar.Shared`, has meaningful implications for how the system scales.

Go

    package main

    import (
        "context"
        "fmt"
        "log"

        "github.com/apache/pulsar-client-go/pulsar"
    )

    func main() {
        client, err := pulsar.NewClient(pulsar.ClientOptions{
            URL: "pulsar://localhost:6650",
        })
        if err != nil {
            log.Fatal(err)
        }
        defer client.Close()

        consumer, err := client.Subscribe(pulsar.ConsumerOptions{
            Topic:            "topic-1",
            SubscriptionName: "my-sub",
            Type:             pulsar.Shared,
        })
        if err != nil {
            log.Fatal(err)
        }
        defer consumer.Close()

        // Receive and acknowledge ten messages, then exit
        for i := 0; i < 10; i++ {
            msg, err := consumer.Receive(context.Background())
            if err != nil {
                log.Fatal(err)
            }
            fmt.Printf("Received message with Id: %#v -- content: '%s'\n",
                msg.ID(), string(msg.Payload()))
            consumer.Ack(msg)
        }

        if err := consumer.Unsubscribe(); err != nil {
            log.Fatal(err)
        }
    }

Subscription Semantics in Depth

Pulsar's subscription model is one of its most differentiating features.
Here's how the four types behave at the broker level:

- Exclusive: The broker enforces that only one consumer holds the subscription at any time. A second consumer attempting to subscribe with the same name will receive an error. This guarantees strict message ordering but eliminates horizontal scaling.
- Shared: The broker distributes messages across all active consumers in round-robin order. Any number of consumers can join or leave the subscription dynamically. This is the right choice for stateless workloads where processing order doesn't matter and throughput is the priority.
- Failover: The broker designates one consumer as the active receiver. Others remain connected but idle, ready to take over if the active consumer disconnects. This preserves ordering while providing high availability, a pattern common in financial transaction processing.
- Key_Shared: The broker routes messages with the same key consistently to the same consumer instance. This enables stateful processing (e.g., per-user session aggregation) without external coordination, as long as the consumer count remains stable.

Why Explicit Acknowledgment Matters

`consumer.Ack(msg)` signals to the broker that the message has been processed, so the subscription's cursor can move past it and the message no longer needs to be retained for that subscription. If the consumer process crashes between `Receive` and `Ack`, the broker will redeliver the message to another consumer in the subscription. This is the mechanism behind at-least-once delivery.

For workloads that require exactly-once semantics, Pulsar supports transactional acknowledgment, where the `Ack` and any downstream writes are committed atomically. That's a more advanced topic, but the foundation is the same `Ack` call shown here.

Component 3: The Monitor

The monitor is architecturally the most interesting component. It runs two HTTP servers concurrently, one for the application endpoint and one for Prometheus metrics, and uses a Pulsar producer to generate observable traffic.
Go

    package main

    import (
        "context"
        "fmt"
        "log"
        "net/http"
        "strconv"

        "github.com/apache/pulsar-client-go/pulsar"
        "github.com/prometheus/client_golang/prometheus/promhttp"
    )

    func main() {
        client, err := pulsar.NewClient(pulsar.ClientOptions{
            URL: "pulsar://localhost:6650", // same broker port as the producer
        })
        if err != nil {
            log.Fatal(err)
        }
        defer client.Close()

        // Expose the default Prometheus registry, where the Pulsar client
        // registers its collectors, under /metrics.
        http.Handle("/metrics", promhttp.Handler())

        prometheusPort := 2112
        go func() {
            if err := http.ListenAndServe(":"+strconv.Itoa(prometheusPort), nil); err != nil {
                log.Fatal(err)
            }
        }()

        producer, err := client.CreateProducer(pulsar.ProducerOptions{
            Topic: "topic-1",
        })
        if err != nil {
            log.Fatal(err)
        }
        defer producer.Close()

        ctx := context.Background()
        webPort := 8082
        http.HandleFunc("/produce", func(w http.ResponseWriter, r *http.Request) {
            msgID, err := producer.Send(ctx, &pulsar.ProducerMessage{
                Payload: []byte("hello-world"),
            })
            if err != nil {
                // A failed publish should not terminate the whole service
                log.Printf("publish failed: %v", err)
                http.Error(w, "publish failed", http.StatusInternalServerError)
                return
            }
            log.Printf("Published message: %v", msgID)
            fmt.Fprintf(w, "Message Published: %v", msgID)
        })

        if err := http.ListenAndServe(":"+strconv.Itoa(webPort), nil); err != nil {
            log.Fatal(err)
        }
    }

How the Pulsar Client Registers Prometheus Metrics

When `pulsar.NewClient` is called, the Go client automatically registers a set of Prometheus collectors with the default `prometheus.DefaultRegisterer`. No additional instrumentation code is required for the collectors themselves. The client does not expose them over HTTP, though: the `/metrics` route has to be registered with `promhttp.Handler()`, after which the metrics are served on whatever port `http.DefaultServeMux` is bound to, in this case port `2112`.

The metrics exposed include:

- `pulsar_client_producers_opened` / `pulsar_client_producers_closed`: producer lifecycle counters
- `pulsar_client_consumers_opened` / `pulsar_client_consumers_closed`: consumer lifecycle counters
- `pulsar_client_messages_published_total`: cumulative publish count per topic
- `pulsar_client_publish_latency_seconds`: histogram of end-to-end publish latency
- `pulsar_client_bytes_published_total`: total bytes written to the broker

Exact metric names can vary between client versions; the `/metrics` output is authoritative.

Running the Prometheus metrics server in a goroutine while the main goroutine handles the application HTTP server is idiomatic Go concurrency. Both servers share the same `http.DefaultServeMux`, so a single `http.Handle("/metrics", ...)` registration makes the metrics reachable on the metrics port. The same sharing means `/metrics` is also served on the application port and `/produce` on the metrics port; use separate `http.ServeMux` instances if the two should be isolated.

Prometheus Scrape Configuration

YAML

    scrape_configs:
      - job_name: pulsar-client-go-metrics
        scrape_interval: 10s
        static_configs:
          - targets:
              - localhost:2112

The `scrape_interval: 10s` is a reasonable starting point for development. In production, you would typically align this with your alerting resolution requirements: a 30-second interval is common for dashboards, while 10 seconds or less is appropriate for latency-sensitive alerting rules.

With these metrics flowing into Prometheus, you can build Grafana panels that surface producer throughput, consumer lag, and publish latency percentiles, the three signals that matter most when diagnosing messaging pipeline issues.

Running the Full Pipeline

Prerequisites

- Apache Pulsar standalone
- Go 1.18+
- Prometheus (optional)

Startup Sequence

1. Launch Pulsar in standalone mode:

Shell

    bin/pulsar standalone

2. Start the producer service:

Shell

    cd producer && go run main.go

3. Start the consumer service:

Shell

    cd consumer && go run main.go

4. Start the monitor service:

Shell

    cd monitor && go run main.go

5. Start Prometheus:

Shell

    prometheus --config.file=prometheus.yml

6. Publish a message via the producer endpoint:

Shell

    curl "http://localhost:8080/publish?msg=test_pulsar_message_publish_event"
    # msg successfully published

7. Trigger the monitor producer:

Shell

    curl http://localhost:8082/produce
    # Message Published: (messageId)

8. Inspect raw Prometheus metrics:

Shell

    curl http://localhost:2112/metrics | grep pulsar

Engineering Takeaways

- Decouple publish triggers from client library dependencies. The HTTP-to-Pulsar adapter pattern used in the producer is not just a demo convenience. It is a legitimate architectural boundary.
Services that need to emit events don't need to know anything about Pulsar's protocol, topic naming, or client configuration. They make an HTTP call; the adapter handles the rest.
- Match subscription type to processing semantics, not just throughput. A common mistake is defaulting to `Shared` for everything because it scales horizontally. If your processing logic is stateful (for example, aggregating events per user session), `Key_Shared` gives you the same horizontal scalability while preserving per-key ordering without any application-level coordination.
- Treat the acknowledgment boundary as your consistency boundary. Everything between `Receive` and `Ack` is your processing window. Any side effects (database writes, downstream API calls, cache updates) that happen in this window must be idempotent, because Pulsar will redeliver the message if the consumer fails before acknowledging. Design your processing logic around this constraint from the start, not as an afterthought.
- Near-zero-cost observability is a genuine advantage. Because `pulsar-client-go` registers its Prometheus collectors automatically, you get throughput, latency, and connection health data from the moment your application starts; the only code you write is the one line that exposes the `/metrics` handler. This is a meaningful operational advantage over client libraries that require manual metric registration.

Extending the Demo

The current implementation is intentionally minimal. Here are technically meaningful extensions worth exploring:

- Schema enforcement: Replace raw `[]byte` payloads with Pulsar's schema-aware producer/consumer API. Using `pulsar.NewAvroSchema` or `pulsar.NewJSONSchema` moves payload validation to the broker, preventing malformed messages from ever reaching consumers.
- Dead-letter topic routing: Configure a dead-letter policy on the consumer (the `DLQ` field of `ConsumerOptions`, of type `pulsar.DLQPolicy`, in the Go client) to automatically route messages that exceed a maximum redelivery count to a separate topic.
This prevents poison-pill messages from blocking the subscription indefinitely.
- Producer batching tuning: Set `BatchingMaxMessages`, `BatchingMaxSize`, and `BatchingMaxPublishDelay` on `ProducerOptions` to optimize the throughput/latency trade-off for your specific workload profile.
- Graceful shutdown: Add `os/signal` handling to flush in-flight messages and close the producer cleanly before the process exits. The current `defer client.Close()` handles the happy path but does not run when the process is terminated by a signal such as `SIGINT` or `SIGTERM` (and nothing can intercept `SIGKILL`).
- Kubernetes-native deployment: Package each component as a container and deploy using the official Pulsar Helm chart. The producer and monitor can be exposed as Kubernetes Services; the consumer can run as a Deployment with HPA scaling based on the `pulsar_client_messages_published_total` metric exported to a custom metrics adapter.

Conclusion

The `apache-pulsar` project demonstrates that building a production-grade messaging pipeline with Apache Pulsar and Go doesn't require much code. It requires understanding the right abstractions. The producer-consumer-monitor triad covers the three concerns that matter in any event-driven system: getting data in, getting data out, and knowing what's happening in between.

Pulsar's decoupled storage, flexible subscription semantics, and built-in observability make it a strong candidate for teams that have outgrown simpler messaging systems and need more precise control over delivery guarantees, scaling behavior, and operational visibility.

Source Code

https://github.com/shivik/apache-pulsar-demo
This article provides a comprehensive guide to achieving zero-downtime deployments for Java-based applications on Kubernetes. We cover deployment strategies, Kubernetes primitives, Java-specific considerations, session state handling, database migrations, traffic shifting techniques, CI/CD pipelines, GitHub Actions, Jenkins with automated rollbacks, observability (Prometheus, Grafana, Jaeger), Helm/ArgoCD examples, testing strategies (canary analysis, chaos, smoke tests), and troubleshooting.

Deployment Strategies

Kubernetes offers several strategies for deploying new versions without downtime:

Rolling Update

Incrementally replace old pods with new ones, maintaining availability. The Kubernetes Deployment object uses rolling updates by default. You can control maxUnavailable and maxSurge to tune the rollout.

Blue-Green Deployment

Run two separate environments: Blue is the current version and Green is the new one. Only one serves live traffic at a time. Once the Green version is verified, switch the Service or Ingress to point at Green, then scale down Blue. This allows instant rollback by redirecting traffic back to Blue. Argo Rollouts defines a blue/green strategy with an active and a preview Service. Traffic flows only to the active version until promotion.

Canary Deployment

Gradually shift a small percentage of traffic to the new version. Start with a few pods of v2, monitor, then incrementally increase. Tools like Istio or Argo Rollouts can control traffic weights. Without such tools, the split follows the pod ratio: running 9 v1 pods and 1 v2 pod behind the same Service sends roughly 10% of traffic to v2. Argo defines a canary rollout with setWeight steps and pauses for analysis.

Shadow/Mirroring

The new version receives a copy of live requests for testing under real load, but its responses are not returned to users. This is low risk but does not assist in rollback decisions since users don’t see the new behavior.

Kubernetes Primitives for Zero Downtime

Deployment

A Deployment naturally performs rolling updates.
By default, it creates a new ReplicaSet and scales it up while scaling down the old one, at a pace controlled by maxUnavailable and maxSurge. This ensures some pods always serve traffic. To use blue/green, you would deploy two separate Deployments (e.g., app-blue, app-green) and switch Services.

Service and Ingress

A Service fronts pods. For blue/green, you can point a single Service at either the blue or green pods. Ingress can also switch between backend services. For example, changing the Service's label selector redirects traffic from the blue pods to the green pods.

PodDisruptionBudget

Ensures a minimum number of pods stay running during voluntary disruptions such as node drains. For instance, setting minAvailable: 1 ensures at least one pod remains available while nodes are drained for maintenance, which avoids complete downtime. A PodDisruptionBudget does not govern the pace of a Deployment's rolling update; maxUnavailable does that.

Horizontal Pod Autoscaler (HPA)

Scales pods based on CPU/memory or custom metrics. It automatically updates a workload to match demand. An HPA can be attached to the Deployment so that if traffic spikes during a rollout, new pods will be created to handle the load. Example:

YAML

    apiVersion: autoscaling/v2
    kind: HorizontalPodAutoscaler
    metadata:
      name: myapp-hpa
    spec:
      scaleTargetRef:
        apiVersion: apps/v1
        kind: Deployment
        name: myapp
      minReplicas: 2
      maxReplicas: 10
      metrics:
        - type: Resource
          resource:
            name: cpu
            target:
              type: Utilization
              averageUtilization: 50

Liveness and Readiness Probes

Critical for zero downtime. A liveness probe checks if the app is alive; if it fails, the kubelet restarts the container. A readiness probe tells Kubernetes whether the app is ready to serve traffic. During startup or shutdown, the readiness probe should fail, causing the pod to be removed from the Service load balancer. Spring Boot Actuator provides /actuator/health for this.
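In Kubernetes YAML:

YAML

    livenessProbe:
      httpGet:
        path: /actuator/health/liveness
        port: 8080
      initialDelaySeconds: 15
      periodSeconds: 10
    readinessProbe:
      httpGet:
        path: /actuator/health/readiness
        port: 8080
      initialDelaySeconds: 5
      periodSeconds: 5

Spring Boot enables the health/liveness and health/readiness groups automatically when it detects that it is running on Kubernetes; in other environments, set management.endpoint.health.probes.enabled=true. Quarkus and Micronaut have similar health endpoints.

Spring Boot supports graceful shutdown by setting server.shutdown=graceful and tuning spring.lifecycle.timeout-per-shutdown-phase. This causes the embedded server (Tomcat, Jetty, or Undertow) to stop accepting new requests and wait up to the timeout for active requests to complete.

Java

    import org.springframework.context.SmartLifecycle;
    import org.springframework.stereotype.Component;

    @Component
    public class ShutdownListener implements SmartLifecycle {

        // volatile: start/stop and isRunning may be called from different threads
        private volatile boolean running;

        @Override
        public void start() {
            running = true;
        }

        @Override
        public void stop() {
            // Called while the application context shuts down; release resources here
            running = false;
        }

        @Override
        public boolean isRunning() {
            return running;
        }
    }

Quarkus provides graceful shutdown configuration. By setting quarkus.shutdown.timeout=10s, Quarkus will wait up to 10 seconds for current requests to finish before exiting. You can annotate a bean method with @Shutdown to run cleanup code. Micronaut has @EventListener for ShutdownEvent:

Java

    import io.micronaut.context.event.ShutdownEvent;
    import io.micronaut.runtime.event.annotation.EventListener;
    import jakarta.inject.Singleton;

    @Singleton
    public class ShutdownBean {

        @EventListener
        void onShutdown(ShutdownEvent event) {
            // Cleanup code goes here
        }
    }

Kubernetes Hooks

You can use a preStop hook in the Deployment's pod spec to run a command before the container receives SIGTERM. The hook belongs to the container, while terminationGracePeriodSeconds is set on the pod:

YAML

    spec:
      terminationGracePeriodSeconds: 30
      containers:
        - name: myapp
          lifecycle:
            preStop:
              exec:
                command: ["/bin/sh", "-c", "sleep 5"]

The grace period (default 30s) should be tuned to let the app finish. The Kubernetes documentation describes the sequence: the pod enters the Terminating state and is removed from Service endpoints, the preStop hook runs, the container receives SIGTERM, and if it is still running when terminationGracePeriodSeconds expires, it receives SIGKILL. The grace period starts when termination begins, so it also has to cover the time spent in the preStop hook.

JVM Tuning

Set -XX:+ExitOnOutOfMemoryError so that the JVM exits instead of hanging after an OutOfMemoryError. Tune thread pools so they drain quickly. Monitor GC pause times, and consider using a low-latency GC to minimize pauses before shutdown.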
Session and State Handling
To maintain zero downtime when pods are replaced:
- Stateless services: Best practice is to keep services stateless. Store session state or user data in an external store, such as Redis or a database. This way, any pod can handle any request, and pods can be replaced without losing the user session.
- Sticky sessions: If an app uses in-memory sessions, you can enforce sticky sessions in one of two ways.
  - Service affinity: Set sessionAffinity: ClientIP on the Service. Kubernetes routes requests from the same client IP to the same pod.
  - Ingress affinity: Use Ingress annotations to bind a user's requests to one pod.
  However, sticky sessions introduce risk (in-memory sessions are lost when their pod is replaced) and work poorly with autoscaling.
- StatefulSets: For truly stateful workloads, use a StatefulSet with stable identities. StatefulSets pair pods with PersistentVolumes, but they do not provide zero downtime by themselves.

GitHub Actions CI/CD Pipeline
A pipeline for zero-downtime deployment:
YAML
name: Deploy
on:
  push:
    branches: [ main ]
jobs:
  build:
    runs-on: ubuntu-latest
    permissions: { contents: read, packages: write }  # needed to push to ghcr.io
    outputs:
      image: ${{ steps.tag.outputs.image }}
    steps:
      - uses: actions/checkout@v2
      - uses: actions/setup-java@v3
        with: { distribution: 'temurin', java-version: '17' }
      - name: Build
        run: mvn clean package -DskipTests
      - name: Docker Build & Push
        run: |
          docker build -t ghcr.io/myorg/myapp:${{ github.sha }} .
          echo ${{ secrets.GITHUB_TOKEN }} | docker login ghcr.io -u ${{ github.actor }} --password-stdin
          docker push ghcr.io/myorg/myapp:${{ github.sha }}
      - name: Set image tag
        id: tag
        run: echo "image=ghcr.io/myorg/myapp:${{ github.sha }}" >> "$GITHUB_OUTPUT"
  deploy:
    needs: build
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
        with: { path: manifests }
      - name: Install kubectl
        uses: azure/setup-kubectl@v3
      - name: Deploy to Kubernetes
        # assumes cluster credentials (kubeconfig) were configured for this job
        run: |
          kubectl set image deployment/myapp-deployment myapp=${{ needs.build.outputs.image }}
          kubectl rollout status deployment/myapp-deployment

This workflow builds the image, pushes it, and updates the Deployment. The rollout status command waits for all new pods to become ready.
If the new pods fail their health checks, the rollout stalls and the rollout status command exits with an error once the Deployment's progress deadline passes. The old pods that have not yet been replaced keep serving traffic, so a failed rollout does not cause downtime.

Conclusion
Zero-downtime deployment on Kubernetes combines careful architecture with automation: rolling updates and progressive strategies, graceful shutdown and health checks in your Java apps, externalized state, managed database changes, and orchestration through CI/CD pipelines. Kubernetes primitives like Deployments, Services, probes, and HPA, along with tools like Istio or Argo Rollouts, provide the building blocks.
The Aberration We build Java applications like Go or Rust programs. Fat JARs. Docker images. Kubernetes deployments. Everyone does it, so it looks normal. It contradicts Java’s design DNA. Java has always been a language for managed environments. Applets ran inside browsers. Servlets ran inside application servers. EJBs ran inside containers like JBoss and WebLogic. OSGi bundles ran inside runtime containers like Eclipse Equinox. In every generation, the pattern was the same: a managed runtime hosts the application. The application handles business logic. The runtime handles infrastructure. The fat-jar era threw that away. We stopped letting Java be Java. We started bundling web servers, serialization frameworks, service discovery clients, configuration management, health checks, metrics libraries, and logging frameworks into every application. Then we wrapped the result in a Docker container and deployed it to an orchestration platform that reimplements — poorly — the infrastructure management that Java runtimes used to provide natively. This article introduces Pragmatica Aether: a distributed runtime that returns Java to its natural habitat. The application handles business logic. Runtime handles infrastructure. This isn’t radical — it's returning to what Java was designed for. The Problem: Infrastructure Wearing a Business Logic Mask Think of what a typical Java microservice carries. A web server (Tomcat, Netty, Undertow). A serialization framework (Jackson, Gson). A dependency injection container (Spring, Guice). A service discovery client (Eureka, Consul). Health check endpoints. Configuration management (Spring Cloud Config, Consul KV). A metrics library (Micrometer, Dropwizard). A logging framework (Logback, Log4j2). Retry logic (Resilience4j). Circuit breakers. HTTP client configuration. The application is wearing a heavy winter coat of infrastructure, armed to the teeth to survive in a hostile environment. Now consider the coupling this creates. 
Update the Java version: rebuild and test every service. Change your message broker from RabbitMQ to Kafka: modify, rebuild, and redeploy every application that touches messaging. Add a new observability tool: update dependencies in every microservice. Switch cloud providers: rewrite configuration, SDK calls, and deployment manifests across the entire fleet. Each change ripples through dozens or hundreds of services because infrastructure is entangled with business logic at the dependency level.

This is the coupling trap. Your application's pom.xml doesn't distinguish between business dependencies and infrastructure dependencies. They compile together, deploy together, and break together. A security patch in Netty requires a new build of every service that embeds a web server, which is all of them.

Framework lock-in worsens this. It isn't a vendor problem; it's an architecture problem. Spring's dependency injection fights with a Kubernetes service mesh for control over service routing and circuit breaking. The framework's configuration system overlaps with Consul KV and Kubernetes ConfigMaps. Your cloud SDK's retry logic conflicts with Resilience4j. Every layer claims authority over the same cross-cutting concerns, and the conflicts surface as subtle bugs in production, not during development. This is an architecture problem. Architectural problems have architectural solutions.

Aether: The Core Idea
What you write: an interface annotated with @Slice, plus the business logic implementation.
Java
@Slice
public interface OrderService {
    Promise<OrderResult> placeOrder(PlaceOrderRequest request);

    static OrderService orderService(InventoryService inventory, PricingEngine pricing) {
        return request -> inventory.check(request.items())
                                   .flatMap(available -> pricing.calculate(available))
                                   .map(priced -> OrderResult.placed(priced));
    }
}

What you don't write: everything else. No HTTP clients: inter-slice calls are direct method invocations via generated proxies.
No service discovery: the runtime tracks where every slice instance lives. No retry logic: retry with exponential backoff and node failover is built in. No circuit breakers: the reliability fabric handles failure automatically. No serialization code: request and response types are serialized transparently. A method call via an imported interface is the only visible contract.

The only hint that the actual call might be remote is a design requirement: slice methods should be idempotent. This isn't a limitation; it's what enables retry, scaling, and fault tolerance to work transparently. The same request, processed by any available instance, produces the same result. Most read operations are naturally idempotent. For writes, standard patterns like idempotency keys and conditional writes handle it cleanly.

Everything else is the environment's job: resource provisioning, scaling, transport, discovery, retries, circuit breakers, configuration, observability, logging, tracing, monitoring, and security. None of these are application concerns, and none should be handled at the business logic level.

The JBCT Leaf pattern serves two purposes here: it documents the design ("what we expect from an external implementation") and encourages exactly one interface per dependency. Different implementations may have different technical properties (performance, latency, memory consumption), but as long as they're compatible with the interface, business logic works unchanged. You write essentially pure business logic that scales from your local machine to a global multi-zone distributed deployment, transparently.

Under The Hood: What Makes It Work
Five architectural decisions make this possible.

Consensus KV Store. A single source of truth for all configuration, deployment state, and service discovery. It is based on the Rabia protocol, a crash-fault-tolerant, leaderless consensus algorithm published in 2021.
Any node can propose; agreement is reached through a two-round voting protocol with a fast path when a supermajority agrees in round one. No external config servers. No etcd. No Consul. Configuration changes propagate through consensus and take effect cluster-wide.

Built-in Artifact Repository. DHT-based storage with configurable replication: 3 replicas with quorum reads/writes in production, full replication in development. Artifacts are chunked into 64KB pieces, distributed across nodes via consistent hashing, and integrity-verified with MD5 and SHA-1 on every resolve. No external Nexus or Artifactory is needed. During development, slices resolve from your local Maven repository. In production, the cluster is self-contained.

ClassLoader Isolation. Each slice runs inside its own SliceClassLoader with child-first delegation. Two slices can use different versions of the same library without conflict. Shared dependencies like Pragmatica Lite core are loaded once in a parent classloader. No dependency conflicts. No classpath hell between slices.

Declarative Deployment. Blueprints (TOML files) describe the desired state: which slices, how many instances.
TOML
id = "org.example:commerce:1.0.0"

[[slices]]
artifact = "org.example:inventory-service:1.0.0"
instances = 3

[[slices]]
artifact = "org.example:order-processor:1.0.0"
instances = 5

Apply with one command: aether blueprint apply commerce.toml. The cluster resolves artifacts, loads slices, distributes instances across nodes, registers routes, and starts serving traffic. The cluster converges to the desired state automatically.

Infrastructure Independence. Aether nodes are identical, so there's only one deployment artifact to manage at the infrastructure level. Node updates and application deployments run on completely independent schedules. Update Java: roll it out across nodes without touching applications. Update the Aether runtime: same.
Update business logic: deploy new slice versions without touching infrastructure. Each independently, each without downtime. This is the fundamental benefit of proper separation: when layers don't share a deployment unit, they don't share a deployment schedule.

Fault Tolerance: The 50% Rule
The system survives the failure of fewer than half the nodes. Performance may degrade until replacements spin up, but functionality remains intact: actual redundancy, not just graceful degradation. A 5-node cluster tolerates 2 simultaneous failures. A 7-node cluster tolerates 3. The same request, processed by any available node, produces the same result.

Quorum requires floor(N/2) + 1 nodes, so as long as a majority is alive, the cluster operates normally. Leader failover is consensus-based and near-instant. Node replacement happens automatically: the Cluster Deployment Manager detects the deficit and provisions a replacement through the NodeProvider interface. The entire recovery sequence, from failure detection through state restoration to serving traffic, completes without human intervention.

When a node fails, the recovery is automatic. Requests to slices on the failed node are immediately retried on healthy nodes. A replacement node is provisioned. It connects to peers, restores consensus state from a cluster snapshot, re-resolves artifacts from the DHT, and reactivates assigned slices. Dead nodes are automatically removed from routing tables. The new leader reconciles the stale state. No human intervention required.

Rolling updates leverage this fault tolerance for zero-downtime deployments with weighted traffic routing:
Shell
aether update start org.example:order-processor 2.0.0 -n 3
aether update routing <id> -r 1:3   # 25% to v2, 75% to v1
aether update routing <id> -r 1:1   # 50/50
aether update complete <id>         # 100% to v2, drain v1

Deploy during business hours. Shift traffic gradually: 10% canary, then 25%, 50%, 75%, 100%. Monitor health metrics at each step.
If health degrades — error rate exceeds thresholds, latency spikes — instant rollback with one command: aether update rollback <id>. Traffic immediately shifts back to the old version. The 3 AM pager alert becomes an audit log entry. For Every Project: Legacy, Greenfield, And Everything Between Legacy Migration Your legacy Java system doesn’t need a complete rewrite. It needs a path forward. Pick a relatively independent part of your system — something hitting limits, something with clear boundaries. Extract an interface. Annotate it with @Slice. Wrap the legacy implementation: Java private Promise<Report> generateReport(ReportRequest request) { return Promise.lift(() -> legacyReportService.generate(request)); } One line to enter the Aether world. Promise.lift() wraps the legacy call, catches exceptions, and returns a proper Result inside a Promise. Your legacy code keeps running. Call sites don't change. You haven't added risk — the initial deployment in Ember runs in the same JVM as your existing application, which means it's no worse than what you have today. You've laid the foundation for removing risk, not adding it. Moving from Ember to a full Aether cluster is a configuration change, not a code change — and that's when the 50% rule starts to apply. From there, it’s the strangler fig pattern. Extract a hot path, deploy it as a slice, route traffic, repeat. Each extracted slice can be gradually refactored using the peeling pattern: first wrap everything in Promise.lift(), then decompose into a Sequencer with each step still wrapped, then peel individual steps into clean JBCT patterns. Tests pass at every step. The lift() calls mark exactly where legacy code remains, making progress visible and remaining work obvious. No rewrite is required. No big bang migration. One sprint to the first slice in production. The migration article covers the full path in detail — from initial wrapping through gradual peeling to clean JBCT code. 
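The wrapping step described above (run the legacy call, catch whatever it throws, hand back a result value instead of an exception) can be sketched in a few lines. The Python below is a language-neutral illustration only: it is not Pragmatica's API, it leaves out the asynchronous part of Promise, and the names Result, lift, and legacy_generate_report are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Generic, Optional, TypeVar

T = TypeVar("T")


@dataclass(frozen=True)
class Result(Generic[T]):
    """Either a value or the error that prevented computing it."""
    value: Optional[T] = None
    error: Optional[Exception] = None

    @property
    def is_success(self) -> bool:
        return self.error is None


def lift(legacy_call: Callable[[], T]) -> Result[T]:
    """Run a legacy call and capture any exception as a failed Result."""
    try:
        return Result(value=legacy_call())
    except Exception as exc:  # legacy code may raise anything
        return Result(error=exc)


def legacy_generate_report(month: int) -> str:
    """Stand-in for untouched legacy code that signals errors by raising."""
    if not 1 <= month <= 12:
        raise ValueError(f"invalid month: {month}")
    return f"report-{month:02d}"


ok = lift(lambda: legacy_generate_report(3))
bad = lift(lambda: legacy_generate_report(13))
print(ok.is_success, ok.value)     # True report-03
print(bad.is_success, bad.error)   # False invalid month: 13
```

The caller never sees an exception; it sees a value that says whether the legacy call worked, which is what makes each remaining lift call an explicit marker of unconverted code.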
Greenfield Development
For new projects, slices enable a granularity that's impossible with traditional microservices. Each slice can be as lean as a single method, and that's the recommended approach. There are no operational or complexity tradeoffs for small slices because Aether handles all the infrastructure overhead. No container to configure, no load balancer to provision, no monitoring to set up per service.

You get per-use-case scaling: one slice serving 50 instances during peak load while another idles at minimum. That kind of granularity would be operationally insane with traditional microservices, each needing its own container, load balancer, monitoring, and deployment pipeline. With Aether, it's the default.

JBCT patterns (Leaf, Sequencer, Fork-Join, Condition, Iteration, and Aspects) compose naturally within slices. Each slice method is a data transformation pipeline: parse input, gather data, process, respond. The patterns provide consistent structure within slices. Slices provide consistent boundaries between them.

The Spectrum
Same slice model, different granularity. A service slice wraps an entire legacy component. A lean slice implements a single method. Both coexist in the same cluster, deployed and scaled independently. The slice is the executable unit. It can be as big or as small as is necessary and convenient. The architecture accommodates both monolith migration and greenfield development simultaneously. Your legacy system gains fault tolerance while new features get maximum deployment flexibility.

Scaling: Two Levels, Three Tiers of Intelligence

Two-Level Horizontal Scaling
Aether scales in two dimensions independently:
- Slice scaling: Spin up more instances of a specific slice on existing nodes. Classes are already loaded, so scaling takes milliseconds, not seconds.
- Node scaling: Add more machines to the cluster. The node connects, restores state, and begins accepting work.
Independent controls, combined effect.
Each node hosts at most one instance of a given slice, so scaling a slice beyond the current node count requires adding nodes first. Add 2 more nodes to a 3-node cluster, then scale a hot slice to 5 instances, one per node. No coordination between the two dimensions is required.

Three-Tier Decision System

Tier 1: Decision Tree (1-second intervals)
Instant reactive decisions based on CPU utilization, request latency, queue depth, and error rate. CPU above 70%? Add an instance. Below 30% sustained? Remove one (if above minimum). Latency exceeding the P95 threshold? Scale up. Error rate above 1% due to timeouts? Scale up. Deterministic, predictable, fast. Handles routine load changes with configurable cooldown periods (30 seconds for scale-up, 5 minutes for scale-down) to prevent oscillation.

Tier 2: TTM Predictor (60-second intervals)
An ONNX-based machine learning model (Tiny Time Mixers) analyzes a 60-minute sliding window of metrics: CPU usage, request rate, P95 latency, and active instances. It forecasts load and adjusts the Decision Tree's thresholds preemptively. If TTM predicts a load increase, it lowers the scale-up CPU threshold by 20% so the reactive tier responds earlier. The cluster scales before the spike arrives, not after.

The key design principle: the cluster always survives on Tier 1 alone. TTM enhances; it doesn't replace. If TTM fails (model load error, insufficient data, inference failure), the Decision Tree continues with default thresholds. The error is logged and recorded in metrics. No scaling disruption.

Tier 3: LLM-based (planned)
Long-term capacity planning and cluster health monitoring. Seasonal pattern prediction, maintenance window planning, anomaly investigation. This tier is not yet implemented; the current system operates with Tiers 1 and 2.

Fault tolerance makes preemptible instances viable for burst scaling. If a spot instance gets reclaimed, the cluster survives, because it was designed for nodes to disappear.
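The Tier 1 rules and the Tier 2 threshold adjustment described above can be sketched as a small pure function. This is an illustration under stated assumptions, not Aether's implementation: the class and function names are hypothetical, the P95 latency threshold of 500 ms is an invented placeholder (the text gives no number), the cooldown is measured from the last scaling action, "sustained" low CPU is not modeled, and the 20% reduction is read as relative (70% becomes 56%).

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Metrics:
    cpu_pct: float              # CPU utilization, 0-100
    p95_latency_ms: float
    timeout_error_rate: float   # fraction of requests failing with timeouts


@dataclass(frozen=True)
class Thresholds:
    cpu_up_pct: float = 70.0
    cpu_down_pct: float = 30.0
    p95_latency_ms: float = 500.0     # assumed value, not from the article
    timeout_error_rate: float = 0.01
    up_cooldown_s: float = 30.0
    down_cooldown_s: float = 300.0


def decide(m, t, instances, min_instances, since_last_change_s):
    """Return +1 (add an instance), -1 (remove one) or 0 (hold)."""
    wants_up = (m.cpu_pct > t.cpu_up_pct
                or m.p95_latency_ms > t.p95_latency_ms
                or m.timeout_error_rate > t.timeout_error_rate)
    if wants_up:
        return 1 if since_last_change_s >= t.up_cooldown_s else 0
    if m.cpu_pct < t.cpu_down_pct and instances > min_instances:
        return -1 if since_last_change_s >= t.down_cooldown_s else 0
    return 0


def with_forecast(t, load_increase_predicted):
    """Tier 2 hook: lower the scale-up CPU threshold when a spike is forecast."""
    if not load_increase_predicted:
        return t  # forecast unavailable or flat: Tier 1 keeps its defaults
    return replace(t, cpu_up_pct=t.cpu_up_pct * 0.8)


defaults = Thresholds()
moderate = Metrics(cpu_pct=60.0, p95_latency_ms=120.0, timeout_error_rate=0.0)
print(decide(moderate, defaults, 3, 2, 60))                       # 0: below 70%
print(decide(moderate, with_forecast(defaults, True), 3, 2, 60))  # 1: above 56%
```

Keeping the forecast as a function that only returns adjusted thresholds mirrors the stated design principle: if the predictor is unavailable, the reactive tier still runs unchanged on its defaults.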
You don’t need a PhD in distributed systems or a dedicated platform team. The scaling system manages itself. Development Experience: From Laptop To Production Three Environments, Zero Code Changes Ember Single-process runtime with multiple cluster nodes running in the same JVM. Fast startup, simple debugging. Deploy your slices alongside your existing application — slices call each other directly in-process. No network overhead. Standard debugger breakpoints work as expected. Perfect for local development and unit testing. Forge A 5-node cluster simulator running on your laptop. Real consensus. Real routing. Real failure scenarios. Kill nodes, crash the leader, trigger rolling restarts — and watch the cluster recover in real time through a web dashboard with D3.js topology visualization, per-node metrics (CPU, heap, leader status), and event timeline. Configurable load generation with TOML-based multi-target configuration lets you stress-test realistic scenarios — set request rates, define body templates, and run duration-limited load tests. Chaos operations include node kill, leader kill, and rolling restart. Forge validates the entire dependency graph before starting anything. Aether Production cluster. Same slices, same code, different scale. Your code doesn’t know which environment it’s running in. Whether inter-slice calls are in-process or cross-network is transparent. Tooling 37 CLI commands cover deployment, scaling, updates, artifacts, observability, controller configuration, and alerts — in both single-command and interactive REPL modes. A web dashboard streams real-time metrics via WebSocket — no polling. 30+ REST management endpoints enable full programmatic control of everything the CLI can do. Prometheus-compatible metrics export (/metrics/prometheus) integrates with existing monitoring stacks. Metrics are push-based at 1-second intervals, with zero consensus overhead — they bypass the consensus protocol entirely. 
Per-method invocation tracking with P50/P95/P99 latency and configurable slow-invocation detection strategies (fixed threshold, adaptive, per-method, composite) surfaces performance issues before users notice. Dynamic aspects let you toggle LOG/METRICS/LOG_AND_METRICS modes per method at runtime via REST API, without redeployment. Test realistic failure scenarios on your laptop. Deploy to production with a config change, not a code change. Maturity Aether is a working system, not a concept paper. 81 end-to-end tests are run against real 5-node clusters in Podman containers, validating cluster formation, quorum establishment, slice deployment and scaling, blueprint application with topological ordering, multi-instance distribution, artifact upload, and cross-node resolution with integrity verification, leader failure and recovery, node restart with state restoration, and orphaned state cleanup after leader changes. The recovery and fault tolerance claims come from automated tests against real clusters, not marketing slides. Let Java Be Java Java’s lineage leads here. From applets managed by browsers, through servlets managed by application servers, through EJBs managed by enterprise containers, through OSGi managed by runtime frameworks, to Aether, managed by a distributed runtime. The fat-jar era was a detour. An understandable one — when Docker emerged, it offered a universal packaging format, and the industry standardized on it regardless of language. Java adopted the patterns of languages that were designed to produce standalone binaries. We started treating Java applications like Go programs with a heavier runtime. But it was never the destination. Java was designed for managed environments. The JVM makes it possible. The runtime manages the application. That’s the lineage. Aether continues it. Two entry points exist today. Wrap your legacy monolith behind a @Slice interface in one sprint and gain fault tolerance without rewriting anything. 
Or start fresh with maximum clarity: lean slices, explicit contracts, per-use-case scaling. Both paths converge on the same runtime, the same cluster, the same operational model. Both paths can coexist, with legacy service slices and new lean slices running side by side. Fault tolerance is not an afterthought; it's the foundation. Scaling is not your problem; it's the environment's. Infrastructure is not your code; it's the runtime's. The heavy winter coat comes off. The application breathes.

Resources
- Pragmatica Aether: project site
- GitHub Repository: source code
The idea of “asking data questions in plain English” has been around for a while, but most implementations never made it into production in a serious way. The usual reason is not the language model itself but everything around it: security boundaries, schema ambiguity, cost control, and the fact that analytics systems are rarely clean enough for unconstrained natural language to work reliably. What has changed in the last couple of years is not that natural language is suddenly perfect, but that data platforms have started bringing computation, metadata, and AI into the same controlled environment. One example of this approach is the way agents are being built directly inside data warehouses like Snowflake. The important detail is not the brand itself, but the architectural pattern: the model, the data, and the execution layer sit together rather than being stitched across multiple systems. That shift changes how analytics tools are designed. Instead of building external “AI layers” on top of a warehouse, teams are embedding the agent logic inside the warehouse itself using tools like Snowpark and managed LLM services such as Snowflake Cortex. The result is a system where natural language is just another input format, not a separate application tier. From Dashboards to Agent-Driven Querying Traditional analytics workflows are structured around predefined models: dashboards, semantic layers, and curated datasets. A user question is usually translated into one of these prebuilt views. If the question does not fit, someone writes SQL or updates the dashboard. Agent-based systems invert this flow. Instead of forcing questions into predefined structures, they attempt to generate the structure dynamically. 
At a high level, the flow looks like this:
1. A user submits a natural language question
2. The system enriches the prompt with schema and access context
3. A model generates SQL or an execution plan
4. The query runs inside the warehouse
5. Results are returned in a structured form or visualized output

The key difference from earlier "text-to-SQL" experiments is that steps two and three are tightly grounded in the database context. The model is not guessing a schema from generic training data. It is being provided with actual table definitions, column descriptions, and sometimes usage statistics. This context injection is what makes the system usable in production. Without it, SQL generation tends to fail in subtle ways: incorrect joins, wrong aggregations, or hallucinated columns.

Agent Architecture Inside the Warehouse
A practical implementation of an analytics agent inside a warehouse typically has three layers.

1. Context and Permission Layer
Before any model is called, the system resolves what the user is allowed to see. This includes:
- Role-based access control
- Row-level filters
- Column masking rules
- Available schemas and tables
This step is often underestimated, but it is what makes the system safe enough for real usage. Without it, natural language becomes a bypass mechanism for data access control.

2. Language Model Translation Layer
Once context is assembled, the prompt is passed into an LLM hosted within the data platform. In Snowflake's case, this is handled through Cortex services, but the pattern is not unique to any vendor.
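Before that model call happens, the context assembly from the first layer might look like the following Python sketch. It is a simplified illustration: the Table class, the build_prompt function, and the sample catalog are hypothetical, column masking is reduced to hiding a column outright, and nothing here is a Snowflake or Cortex API. Filtering the prompt also does not replace enforcement inside the warehouse; it only keeps the model from being shown objects the user cannot query.

```python
from dataclasses import dataclass, field


@dataclass
class Table:
    name: str
    columns: dict                 # column name -> short description
    allowed_roles: set
    masked_columns: set = field(default_factory=set)


def build_prompt(question, role, catalog):
    """Assemble an LLM prompt that only mentions what `role` may see."""
    lines = ["Translate the question into SQL. Use only these tables:"]
    for table in catalog:
        if role not in table.allowed_roles:
            continue  # the table stays invisible to this role
        visible = {name: desc for name, desc in table.columns.items()
                   if name not in table.masked_columns}
        cols = ", ".join(f"{name} ({desc})" for name, desc in visible.items())
        lines.append(f"- {table.name}: {cols}")
    lines.append(f"Question: {question}")
    return "\n".join(lines)


catalog = [
    Table("sales.fact_sales",
          {"region": "sales region",
           "revenue": "net revenue in USD",
           "customer_email": "customer contact"},
          allowed_roles={"analyst", "admin"},
          masked_columns={"customer_email"}),
    Table("hr.salaries",
          {"employee_id": "employee key", "salary": "annual salary"},
          allowed_roles={"admin"}),
]

prompt = build_prompt("Which regions had the highest revenue?", "analyst", catalog)
print(prompt)
```

Because the column descriptions travel with the schema, this is also a natural place to pin down business definitions (for example, that revenue is net of returns) before the model ever writes a query.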
The model's job is not just to produce SQL but to produce SQL that is:
- Syntactically valid
- Aligned with schema constraints
- Consistent with security rules
- Optimized for warehouse execution patterns

For example, a question like, "Show top 10 products by revenue in Q1 2024 grouped by region," might become:
SQL
SELECT region, product_name, SUM(revenue) AS total_revenue
FROM sales.fact_sales
WHERE transaction_date >= '2024-01-01'
  AND transaction_date < '2024-04-01'
GROUP BY region, product_name
ORDER BY total_revenue DESC
LIMIT 10;

Note that this query returns the ten highest region and product combinations overall; a top 10 within each region would need a window function instead. The challenge here is not generating SQL that looks correct, but ensuring it respects business definitions. Revenue, for example, might need to be net of returns or adjusted for currency conversion, depending on the organization.

3. Execution and Governance Layer
Once SQL is generated, it is executed inside the warehouse engine. This is where the architecture becomes important: nothing leaves the system. The same security policies that apply to human-written queries apply here as well. Because execution happens inside the warehouse, audit logs remain consistent. Every agent action can be traced as a query event, which is important for compliance-heavy environments.

Why Snowpark Matters in This Setup
Tools like Snowpark extend this model beyond SQL generation. Instead of limiting the agent to query rewriting, Snowpark allows it to execute Python-based logic directly next to the data. This becomes useful when the question is not purely relational. For example: "Forecast next month's sales for product X." A simple SQL query cannot answer this. The agent can instead generate a Snowpark Python job that:
- Extracts historical time series data
- Converts it into a DataFrame
- Applies a forecasting model such as ARIMA or Prophet
- Writes predictions back into a table

The important point is that the data is never exported to an external notebook environment. The compute moves to the data, not the other way around.
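To make the forecasting step in that job concrete, here is a deliberately small Python sketch. It swaps in an ordinary least-squares linear trend for ARIMA or Prophet so that it runs with the standard library alone, and it uses a plain list where a real job would read from and write back to warehouse tables. The function name and the sample numbers are hypothetical.

```python
def linear_trend_forecast(history, steps_ahead=1):
    """Fit y = a + b*t by ordinary least squares and extrapolate.

    history holds evenly spaced observations, oldest first.
    """
    n = len(history)
    if n < 2:
        raise ValueError("need at least two observations")
    t_mean = (n - 1) / 2
    y_mean = sum(history) / n
    cov = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(history))
    var = sum((t - t_mean) ** 2 for t in range(n))
    slope = cov / var
    intercept = y_mean - slope * t_mean
    return intercept + slope * (n - 1 + steps_ahead)


# Four months of made-up sales for one product, growing by 10 per month.
monthly_sales = [100.0, 110.0, 120.0, 130.0]
print(linear_trend_forecast(monthly_sales))  # 140.0
```

A trend line ignores seasonality, which is why the models named above are the more realistic choice; the point of the sketch is only that the modeling step is ordinary Python that can run where the data lives.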
This pattern also applies to machine learning inference. Pretrained models can be registered as user-defined functions, and the agent can call them like regular SQL functions:
SQL
SELECT feedback_text,
       predict_sentiment(feedback_text) AS sentiment_score
FROM customer_feedback;

From a systems perspective, the agent becomes a planner rather than just a translator. It decides whether SQL is sufficient or whether a Python-based workflow is required.

The Streamlit Layer: Turning Queries Into Applications
While the warehouse handles computation and the agent handles reasoning, users still need an interface. One of the simpler ways to build this layer is with Streamlit. Streamlit is often used because it reduces the overhead of building internal analytics tools. Instead of designing full frontend systems, teams can wrap agent logic into lightweight interactive apps. A minimal pattern looks like this:
Python
import streamlit as st

# run_agent is assumed to be defined elsewhere; it takes the question and
# returns a dict with the generated SQL ("sql") and the result set ("data").

st.title("Data Agent Interface")
query = st.text_input("Ask a question about your data")

if query:
    result = run_agent(query)
    st.subheader("Generated SQL")
    st.code(result["sql"])
    st.subheader("Results")
    st.dataframe(result["data"])

In more mature setups, the Streamlit layer becomes more than a query box. It evolves into a dynamic dashboard generator:
- Charts are generated based on query intent
- Filters are derived from schema metadata
- Results can be saved into reusable views
- Users can refine queries conversationally
This reduces dependency on static dashboards, which are often slow to update and hard to maintain.

Governance Is the Real Constraint, Not AI Capability
A common misconception is that the main challenge in building these systems is model accuracy. In practice, governance is the harder problem. Three constraints usually define whether an agent system is viable:
- Data access control must remain intact. Natural language cannot become a bypass layer for restricted data.
- Query cost must be predictable.
Poorly generated queries can become expensive quickly in large warehouses.
- Results must be reproducible. Two identical questions should not produce different interpretations unless the underlying data changes.

Warehouse-native architectures help with this because they centralize execution. There is no separate "AI data layer" that can drift from governance rules.

Limitations of the Current Approach
Despite progress, these systems are not fully autonomous analytics engines. There are still recurring issues:
- Ambiguous business definitions lead to incorrect aggregations
- Complex joins across poorly modeled schemas still fail frequently
- LLMs may overgeneralize metrics like revenue or churn
- Latency increases when multi-step reasoning is required
In practice, most teams deploy agents as assistants rather than replacements for BI systems. They are good at exploration, not final reporting.

Closing Thoughts
What is emerging is not a replacement for SQL or dashboards, but a new interface layer on top of them. Natural language becomes a routing mechanism that decides how to query or compute over structured data. The interesting architectural shift is that the intelligence layer is moving closer to the data itself. Whether implemented in Snowflake or other platforms, the pattern is consistent: context-aware models, governed execution, and embedded compute through tools like Snowpark and Cortex. Streamlit or similar tools then complete the stack by providing a lightweight interface that can evolve from simple query boxes into full analytical applications. The result is not "analytics without SQL" but something more realistic: analytics where SQL is still present, but no longer the only way in.
Alvin Lee
Founder,
Out of the Box Development, LLC