Integration Resources

DZone's Featured Integration Resources

API Facade vs. Orchestration vs. Eventing, Now With AI in the Loop

By Jubin Abhishek Soni

AI Doesn't Replace Your Architecture; It Becomes Part of It

Picture this. Your team has just integrated a large language model into your enterprise application. The demo looked compelling. The agent interpreted user intent, called several APIs, and returned a coherent result. Everyone in the room was impressed.

Then the questions started. What happens when the LLM misinterprets a request and calls the wrong API? Who owns the business logic embedded in that prompt? If the model changes, does the integration break? How do you audit what the AI decided and why?

These aren't AI questions. They're architecture questions, and they don't go away just because you've added intelligence to the system.

The most important architectural decision you'll make about AI isn't which model to use. It's where the AI sits relative to your existing integration layers. Get that right, and AI becomes a powerful, governable component in a coherent system. Get it wrong, and you'll end up with business logic scattered across prompts, brittle integrations that break when the model updates, and no clear line of accountability when something fails.

The question isn't "Can AI call APIs?" It's "Where should AI sit within your architecture?" There are three architectural roles worth separating clearly.

- API facade. The edge layer that translates external requests into internal operations.
- Workflow orchestration. The layer that manages multi-step business processes and decision logic.
- Event-driven integration. The layer that lets systems react to changes without tight coupling.

Each serves a different purpose, and AI belongs in different places depending on the business problem you're solving. Figure 1 lays out all three roles side by side, including what AI owns and does not own in each one.

Figure 1. Where AI Sits: Three Architectural Roles

The table below gives a quick reference for how the three patterns differ before we walk through each one in detail.
| Pattern | Purpose | Coupling | Determinism | Where AI Fits |
| API Facade | Translate external requests into internal operations | Tight, synchronous | Low, request-driven | Interpreting intent, extracting parameters |
| Workflow Orchestration | Sequence multi-step business processes | Moderate, coordinated | High, explicit branching | Providing probabilistic input to decision points |
| Event-Driven Integration | Let systems react to change asynchronously | Loose, decoupled | Variable, per consumer | Consuming and enriching events, never the bus itself |

This article walks through where AI fits within each pattern, and just as importantly, where it doesn't.

1. Start by Defining What the AI Is Responsible For

Before you touch an integration pattern, answer a more fundamental question. What is the AI actually accountable for in this system?

This sounds obvious but gets skipped constantly. Teams reach for an LLM because it handles natural language well, then gradually load it with responsibilities it shouldn't own, like validating business rules, managing state, enforcing authorization logic, and driving deterministic workflows. The AI ends up doing everything, which means the architecture owns nothing clearly.

Ask these questions before making any integration decisions.

- Is the AI interpreting human input? Natural language understanding, intent classification, and entity extraction are AI-native tasks where models genuinely add value.
- Is the AI making recommendations or decisions? A recommendation, such as "this customer is likely to churn," is a probabilistic output. A decision, such as "cancel this subscription," is a deterministic action with business consequences. These require different ownership models.
- Is the AI coordinating business processes? If yes, be careful. Orchestration logic embedded in prompts is invisible to your governance tooling, untestable in any traditional sense, and will silently drift as the model updates.
- Which steps require human approval?
Any action that is irreversible, regulated, or high stakes should have an explicit human checkpoint that lives in your workflow layer, not inside a prompt.

The cleaner your answer to these questions, the cleaner your integration design will be. Blurry responsibilities produce brittle architectures. Define the boundary first.

2. AI at the API Facade, the Conversational Edge

The API facade pattern sits at the edge of your system. It's the layer that translates external requests into internal operations. Traditionally, this meant REST or GraphQL endpoints that routed structured requests to back-end services. AI belongs here when the primary challenge is bridging the gap between unstructured human intent and structured system operations.

Think of an enterprise procurement assistant. A buyer types, "Reorder the same supplies we used for the Sydney office fit-out, but increase quantity by 20% and flag anything over $5,000 for manager approval." No traditional API handles that sentence on its own. The facade layer is exactly where an LLM adds value. It parses intent, extracts parameters, resolves ambiguity, and maps the request to specific downstream API calls.

What AI does well at the facade includes intent resolution, turning natural language into structured API parameters. It also handles entity extraction, pulling order IDs, product codes, dates, and names from conversational input. It supports contextual disambiguation, using conversation history to resolve references like "that vendor" back to a specific vendor ID mentioned earlier. And it enables response synthesis, taking structured API responses and returning natural language answers.

What AI should not own at the facade is just as important. Authorization logic belongs in your API gateway or identity layer. Rate limiting and throttling are infrastructure concerns, not model concerns.
Core business rules, such as "orders over $5,000 require approval," should live in your workflow layer rather than in a prompt where they're invisible to compliance tooling.

The practical pattern is that AI at the facade acts as a structured parameter extractor. It takes conversational input, produces a clean structured intent object, and hands off to APIs that were designed for deterministic consumption. The model interprets. The API executes.

The example below shows what that structured intent object might look like once the model has parsed the procurement request above.

JSON

    {
      "intent": "create_purchase_order",
      "reference_order": "sydney_office_fitout_2026",
      "quantity_multiplier": 1.2,
      "approval_required_above": 5000,
      "currency": "USD",
      "extracted_from": "conversational_input",
      "confidence": 0.94
    }

Listing 1: Example structured intent object produced at the API facade.

Design your facade APIs to accept both human-readable context and machine-structured parameters. Build explicit validation at the API boundary so that when the model produces a malformed or out-of-range parameter, the error is caught and surfaced clearly, not silently swallowed or, worse, acted upon incorrectly.

3. AI Inside Orchestration, Where Flexibility Meets Business Workflows

Workflow orchestration manages multi-step business processes, including the sequence of steps, branching logic, error handling, retries, and human approval gates. It's the layer that knows how work gets done, in what order, and under what conditions.

The central tension when introducing AI into orchestration is that orchestration is deterministic by design, while AI is probabilistic by nature. A well-governed workflow produces the same output given the same inputs. An LLM does not. Mixing these carelessly produces workflows that are auditable on paper but unpredictable in practice.
The architectural resolution is to keep the orchestration layer deterministic while allowing AI to provide probabilistic inputs into specific decision points. Think of AI as a specialized step inside the workflow, one that produces an output that the workflow then acts on according to explicit, auditable logic.

A claims processing workflow illustrates this well. The overall process (intake, validation, AI-assisted assessment, human review, approval, and payment) is orchestrated deterministically. The AI participates at the assessment step. It analyzes claim documentation and produces a structured output: an estimated validity score, a list of missing documents, and a recommended action. The workflow then applies explicit branching logic. A score above 0.85 triggers auto-approval. A score below 0.4 gets flagged for denial review. Everything in between routes to a human adjudicator. The AI informs. The orchestration decides. Figure 2 shows this flow end to end.

Figure 2. AI Inside Orchestration: Claims Processing Workflow

A few design principles matter here.

Treat AI steps as typed operations with defined inputs and outputs. The orchestration layer should pass a structured payload to the AI and receive a structured response, not an open-ended conversation. This makes the AI step testable, replaceable, and governable. The snippet below shows a minimal example of what a typed contract for an AI step might look like.

TypeScript

    // Typed contract for an AI step inside orchestration.
    // DocumentRef is assumed to be defined elsewhere in the codebase.
    interface ClaimAssessmentInput {
      claimId: string;
      documents: DocumentRef[];
    }

    interface ClaimAssessmentOutput {
      validityScore: number; // 0.0 to 1.0
      missingDocuments: string[];
      recommendedAction: "approve" | "review" | "deny";
    }

Listing 2: Example typed input/output contract for an AI step inside an orchestrated workflow.

Never let the AI own branching logic that has compliance or audit implications.
If a decision must be explainable to a regulator, it should live in the orchestration layer where it's visible, versionable, and logged.

Design explicit human approval gates. In enterprise workflows, AI recommendations that trigger consequential actions, such as financial transactions, customer notifications, or system changes, should route through a human checkpoint unless you've explicitly validated and signed off on full automation.

Build retry and fallback paths. An AI step that fails, times out, or returns a low-confidence result needs a defined fallback, whether that's routing to a human, using a default, or escalating, built into the orchestration rather than handled ad hoc in the calling code.

Platforms like OutSystems, which provide visual workflow design alongside AI integration capabilities, make this separation of concerns tangible. You can see exactly where in the process flow an AI step participates, what it receives, and what happens next based on its output.

4. AI and Event-Driven Architecture, Reacting Without Controlling

Event-driven architecture decouples systems through a shared event bus. Producers emit events when something happens, and consumers subscribe and react without either party knowing the other exists. It's the pattern that makes large distributed systems composable and independently evolvable.

AI fits naturally into event-driven systems, but as a consumer and enricher, not as the event bus itself.

The pattern works like this. A transactional system emits a clean, well-defined business event, such as OrderPlaced, CustomerChurnRiskFlagged, or SupportTicketOpened. An AI consumer subscribes, processes the event asynchronously, and either emits a derived event, like ChurnRiskClassified or TicketCategorized, or writes to a downstream store. Core transaction systems remain untouched.

This architecture has a key property for AI integration, which is isolation.
The AI component can be updated, replaced, or retrained without touching the transactional system that produced the event. The event schema is the contract between them. As long as the AI consumer honors its output schema, the downstream systems don't care what model is running behind it.

AI adds value in event-driven systems in several ways.

- Real-time classification lets an incoming support ticket event trigger AI categorization and routing before a human ever sees it.
- Anomaly detection allows a stream of transaction events to feed an AI consumer that flags unusual patterns and emits a FraudSignalDetected event.
- Content enrichment means a DocumentUploaded event can trigger an AI pipeline that extracts entities, generates a summary, and writes structured metadata back to the event stream.

A few cautions are worth noting too.

- Don't use AI to produce events that trigger irreversible transactional operations without a validation step. An AI-emitted event that directly drives a financial settlement or account closure is a governance risk.
- Keep AI consumers idempotent, since event-driven systems often deliver events at least once, and your AI consumer should produce the same output for the same event input regardless of how many times it processes it.
- Version your event schemas independently of your AI models. When the model changes, the event contract should remain stable. Break this rule, and you'll find yourself coordinating model updates with schema migrations across multiple teams.

5. Design APIs for AI Variability, Not Just Traditional Applications

Traditional API design assumes well-behaved clients. They send valid, structured requests, handle errors predictably, and operate within known parameters. AI agents are different clients. They may generate requests outside expected parameter ranges, retry with slight variations when uncertain, pass natural language fragments where IDs are expected, or call endpoints in unconventional sequences.
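To make the kind of client behavior described above concrete, here is a minimal Python sketch of a boundary check that catches an out-of-range value or a natural language fragment passed where an ID is expected. Everything specific in it is an assumption made for illustration: the field names `order_id` and `quantity`, the `ORD-<six digits>` ID format, and the quantity limit are hypothetical and do not come from any particular API.

```python
import re

# Hypothetical constraints, chosen only for illustration.
ORDER_ID_PATTERN = re.compile(r"ORD-\d{6}")
MAX_QUANTITY = 10_000


def validate_order_request(payload):
    """Return a list of problems in an agent-supplied payload (empty means valid)."""
    problems = []

    order_id = payload.get("order_id")
    if not isinstance(order_id, str) or not ORDER_ID_PATTERN.fullmatch(order_id):
        problems.append({
            "field": "order_id",
            "received": order_id,
            "expected": "string of the form ORD-<six digits>",
        })

    quantity = payload.get("quantity")
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(quantity, bool) or not isinstance(quantity, int):
        problems.append({
            "field": "quantity",
            "received": quantity,
            "expected": "integer",
        })
    elif not 1 <= quantity <= MAX_QUANTITY:
        problems.append({
            "field": "quantity",
            "received": quantity,
            "expected": f"integer between 1 and {MAX_QUANTITY}",
        })

    return problems


# A well-formed request passes; a conversational fragment and a negative
# quantity are each reported with the field, the value received, and the rule.
print(validate_order_request({"order_id": "ORD-000123", "quantity": 5}))
print(validate_order_request({"order_id": "the Sydney order", "quantity": -3}))
```

Returning every problem at once, rather than stopping at the first, is a design choice in this sketch: it gives a retrying client the full picture in a single response.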
This changes how APIs should be designed when AI is a first-class consumer.

Be explicit about parameter constraints and semantics. Document not just the type of a parameter but what it means and what values are valid. An AI agent that doesn't understand that "customer_status" is an enum with five specific values will guess, and it may guess wrong. Explicit schemas with enumerated values and clear descriptions dramatically reduce the error surface.

Return structured, self-describing error responses. When an AI agent calls an API and gets a validation error, the response should tell the agent exactly what was wrong and what correction is expected. A generic 400 with "invalid input" gives the agent nothing to act on. A structured error that says the field "quantity" must be a positive integer, and that a negative value was received, allows the agent to self-correct on retry.

Design for idempotency on write operations. AI agents may retry failed calls. Any write operation that could be called multiple times should be idempotent, meaning calling it twice with the same payload should produce the same result as calling it once. This is a baseline requirement for reliable agentic workflows.

Consider AI-specific API profiles alongside your standard endpoints. Some teams are building enriched API descriptions, effectively structured, semantic documentation that LLMs can consume during function calling or tool use scenarios. These profiles describe not just syntax but intent, preconditions, and expected postconditions. If your platform supports it, these descriptions significantly improve agentic reliability.

6. Preserve Loose Coupling as AI Capabilities Evolve

If there is one thing that is certain about the current AI landscape, it's that it will look different in 18 months. Model capabilities are improving rapidly.
New reasoning architectures, longer context windows, better function calling, and multimodal inputs will change what AI can reliably do, which means the design decisions you make today about where AI participates in your architecture will need to evolve.

The integration architectures that will age best are the ones that treat AI as a replaceable component behind a stable interface, not as a load-bearing structural element that the rest of the system is built around.

Practically, this means a few things.

The interface between your AI component and the rest of the system should be typed and versioned, just like any other service boundary. If you replace the LLM behind that interface with a better model, the orchestration layer and downstream consumers shouldn't need to change.

Business logic should not live in prompts. Prompts that embed business rules, such as approval thresholds, eligibility criteria, or routing conditions, will drift as models are updated and will be invisible to your governance tooling. Extract that logic into the orchestration or rules layer where it can be versioned and audited.

Test AI steps in isolation. Build evaluation harnesses that validate the AI component's outputs against known good test cases. When you upgrade a model, run the evaluation before you promote to production. This is standard software engineering discipline. It just hasn't been applied consistently to AI components yet.

Plan for model-level fallback. If a primary model is unavailable or underperforming, your architecture should support routing to a fallback. This is easier to build in advance than to retrofit during an incident.

The teams that will maintain architectural coherence as AI evolves are the ones that applied the same separation of concerns discipline to AI components that they've always applied to services, databases, and APIs.

7. Build Observability Across AI and Integration Layers

Debugging traditional distributed systems is hard.
Debugging systems where one of the components is an LLM is harder. The failure modes are different. The system may be technically healthy while producing incorrect, inconsistent, or subtly wrong outputs. A 200 OK from an AI step tells you the HTTP call succeeded. It says nothing about whether the response was accurate, relevant, or safe.

Observability in AI-integrated architectures needs to span multiple layers simultaneously.

At the AI component level, teams should capture the full prompt sent to the model, not just the output, along with the raw model response before any parsing or post-processing. Token counts, latency, and model version matter too, as do confidence scores or reasoning traces where the model provides them, and retry attempts or fallback triggers.

At the integration layer, capture which APIs the AI called, with what parameters, and what the responses were. Track workflow step durations and branching decisions, event payloads at each stage of processing, and human review decisions and overrides.

At the business outcome level, ask whether the end-to-end process completed successfully, whether AI-assisted decisions matched expected patterns, and where AI components are producing outputs that require human correction.

Platforms that provide centralized monitoring across application logic, integrations, and workflows, such as OutSystems, reduce the instrumentation burden by giving teams a single observability surface rather than requiring separate tooling for each layer. This matters most during incident response, when you need to quickly trace a failure from a user-visible symptom back through the AI component, through the API calls it made, and into the underlying workflow state.

One practice worth establishing early is shadow mode evaluation. Before promoting AI-assisted decisions to full automation, run the AI in parallel with existing logic and compare outcomes without acting on the AI's output.
This builds confidence in the model's reliability on your specific data distribution before you depend on it in production.

Conclusion. Integration Architecture Is Still the Foundation

AI agents are sophisticated components, but they're still components. They have inputs and outputs. They can fail. They need to be tested, monitored, versioned, and replaced, and crucially, they need to sit somewhere coherent in your architecture.

The teams that will get the most out of AI are the ones that ask the architectural questions first. What is the AI responsible for? Where does its output go? Who owns the logic around it? How will we know when it's wrong?

The answer isn't a different architecture for AI. It's the same architectural discipline that enterprise systems have always required, applied with precision to a new kind of component. API facade, orchestration, and event-driven architecture were built to manage complexity, enforce separation of concerns, and keep systems evolvable. AI makes all three more valuable, not less. The question is simply where, within each, the intelligence belongs.

How to Build a Brand Monitoring Dashboard With SerpApi and Python

By Tomas Murua

Knowing what people say about your product usually means checking Google News, scrolling through YouTube, and digging into different social media threads. That's three tabs, three interfaces, and no way to compare what you find.

This tutorial builds a single dashboard that pulls brand mentions from all three sources using Python and SerpApi. By the end, you'll have a Streamlit app with three tabs, one for news articles, one for YouTube videos, and one for social media and forum discussions. We'll use "serpapi" as the search query, but you can swap the brand or product name.

[Image: Brand monitoring dashboard showing metrics row with total mentions, news articles, YouTube videos, and perspectives counts]

Set Up Your Environment

Requirements:

- Python 3.8+
- SerpApi API Key (the free plan includes 250+ searches/month)
- Dependencies (serpapi, pandas, streamlit, altair)

The serpapi package is the official Python SDK. It handles request signing, retries, and response parsing. The complete code, including a Jupyter notebook version, is available in the SerpApi tutorials repository.

The Pipeline

The app follows the same three-step pattern from the GitHub Issues dashboard: fetch raw data, transform it, and display the analysis.

[Image: Pipeline diagram showing three stages: fetch, transform, and display]

The difference this time is three separate engines running in parallel. Each returns a different response structure, so the transform step normalizes everything into DataFrames before the dashboard consumes it.

Fetch the Data

A single SerpApi client instance works for all three engines:

Python

    import os

    import serpapi

    SERPAPI_KEY = os.environ.get("SERPAPI_KEY", "")
    client = serpapi.Client(api_key=SERPAPI_KEY)

Google News

The Google News API returns articles through the news_results key. Each result includes title, link, source (a dict with name and icon), date, and snippet.
Python

    def fetch_news(client, brand):
        """Fetch news articles mentioning the brand via Google News."""
        results = client.search({
            "engine": "google_news",
            "q": brand,
            "gl": "us",
            "hl": "en",
        })
        return results.get("news_results", [])

For more use cases with this engine, refer to the news monitoring.

YouTube

The YouTube Search API uses search_query instead of q, and the sp parameter controls time filters. The values EgIIAw%3D%3D (this week) and EgIIBA%3D%3D (this month) are YouTube's internal encoding for upload date filters. You can grab these from YouTube's URL bar after applying a filter manually.

We run both filters and deduplicate by link, since the month results include everything from the week:

Python

    YT_FILTER_WEEK = "EgIIAw%3D%3D"
    YT_FILTER_MONTH = "EgIIBA%3D%3D"

    def fetch_youtube(client, brand):
        """Fetch YouTube videos, combining week and month filters."""
        seen = set()
        videos = []
        for sp_filter in (YT_FILTER_WEEK, YT_FILTER_MONTH):
            results = client.search({
                "engine": "youtube",
                "search_query": brand,
                "sp": sp_filter,
            })
            for video in results.get("video_results", []):
                link = video.get("link", "")
                # Skip videos already collected under the other filter.
                if link and link not in seen:
                    seen.add(link)
                    videos.append(video)
        return videos

For more examples using the YouTube API, refer to this link.

Google Perspectives

Google Perspectives API surfaces user-generated content from LinkedIn, Reddit, Quora, and blogs. It uses the standard Google engine, and the results appear under the perspectives key:

[Image: SerpApi search with the Google perspective results]

Python

    def fetch_perspectives(client, brand):
        """Fetch user-generated content (Reddit, LinkedIn, Quora)."""
        results = client.search({
            "engine": "google",
            "q": brand,
            "google_domain": "google.com",
        })
        return results.get("perspectives", [])

Fetch in Parallel

Three sequential API calls take roughly three seconds. Running them in parallel with Python's ThreadPoolExecutor brings that down to about one second.
Each call runs in its own thread while the others wait for their response:

Python

    from concurrent.futures import ThreadPoolExecutor

    import streamlit as st  # needed for the cache decorator below

    @st.cache_data(ttl=300)
    def fetch_all_mentions(brand):
        """Fetch all brand mentions from three engines in parallel."""
        client = serpapi.Client(api_key=SERPAPI_KEY)
        with ThreadPoolExecutor(max_workers=3) as pool:
            news_future = pool.submit(fetch_news, client, brand)
            yt_future = pool.submit(fetch_youtube, client, brand)
            persp_future = pool.submit(fetch_perspectives, client, brand)
            return news_future.result(), yt_future.result(), persp_future.result()

SerpApi also offers a server-side async parameter for large-scale batch processing, where you submit searches and retrieve results later. For our three concurrent calls, client-side threading is simpler and equally effective.

The @st.cache_data(ttl=300) decorator caches results for 5 minutes. Without it, every Streamlit interaction would re-trigger the API calls. This works alongside SerpApi's own 1-hour result cache, which serves identical queries from the cache at no extra search cost unless you explicitly pass no_cache=true. Together, these two layers minimize redundant API calls during development and testing.

For more optimization techniques when working with SerpApi at scale, refer to this blog.

Transform the Data

All three engines return dates as relative strings ("3 hours ago", "2 days ago"). We need a shared parser to convert them into datetime objects for sorting.

Parse Relative Dates

Two details are worth noting. The regex is compiled once and reused since this function runs for every result in all three engines. And the fallback returns datetime.now(timezone.utc) instead of None, so results without a parseable date sort to the top rather than breaking pandas operations.
Python

    import re
    from datetime import datetime, timedelta, timezone

    RELATIVE_DATE_RE = re.compile(
        r"(\d+)\s+(second|minute|hour|day|week|month|year)s?\s+ago",
        re.IGNORECASE
    )

    # Months and years are approximated as 30 and 365 days.
    UNIT_TO_TIMEDELTA = {
        "second": lambda n: timedelta(seconds=n),
        "minute": lambda n: timedelta(minutes=n),
        "hour": lambda n: timedelta(hours=n),
        "day": lambda n: timedelta(days=n),
        "week": lambda n: timedelta(weeks=n),
        "month": lambda n: timedelta(days=n * 30),
        "year": lambda n: timedelta(days=n * 365),
    }

    def parse_relative_date(text):
        """Convert '3 hours ago' into a datetime object."""
        if not text:
            return datetime.now(timezone.utc)
        match = RELATIVE_DATE_RE.search(str(text))
        if not match:
            return datetime.now(timezone.utc)
        amount = int(match.group(1))
        unit = match.group(2).lower()
        delta = UNIT_TO_TIMEDELTA.get(unit, lambda n: timedelta())(amount)
        return datetime.now(timezone.utc) - delta

Build DataFrames

Each engine gets its own transformer. Here's the news version:

Python

    def transform_news(results):
        """Convert raw Google News results into structured records."""
        records = []
        for item in results:
            source = item.get("source") or {}
            source_name = source.get("name", "Unknown") if isinstance(source, dict) else str(source)
            records.append({
                "title": item.get("title", ""),
                "link": item.get("link", ""),
                "source": source_name,
                "date": parse_relative_date(item.get("date", "")),
                "snippet": item.get("snippet", ""),
            })
        return records

The source field can be a dict or a plain string depending on the result, so the isinstance check handles both.

YouTube and Perspectives follow the same pattern, with two differences worth highlighting.
YouTube views come back as strings like "1,234 views", so we strip non-numeric characters before converting:

Python

    views = item.get("views") or 0
    if isinstance(views, str):
        views = int(re.sub(r"[^\d]", "", views) or 0)

Build the Dashboard

The Streamlit interface starts with a form for the brand query and a row of summary metrics across all three sources:

Python

    st.set_page_config(page_title="Brand Monitoring Dashboard", layout="wide")
    st.title("Brand Monitoring Dashboard")

    with st.form("brand_form"):
        brand = st.text_input("Brand or keyword to monitor", value="serpapi")
        submitted = st.form_submit_button("Search")

[Image: Brand or keyword selector to monitor]

After fetching, the dashboard shows four metrics at the top for a quick overview, then splits into three tabs:

Python

    col1, col2, col3, col4 = st.columns(4)
    col1.metric("Total Mentions", total_mentions)
    col2.metric("News Articles", len(news_records))
    col3.metric("YouTube Videos", len(yt_records))
    col4.metric("Perspectives", len(persp_records))

[Image: Dashboard metrics row displaying total mentions across three sources]

News Tab

The News tab pairs an Altair bar chart of top sources with a sortable table. Altair ships with Streamlit, so there's nothing extra to install. We use it instead of st.bar_chart because it gives control over orientation, tooltips, and styling.

Python

    source_df = news_df["source"].value_counts().head(10).reset_index()
    source_df.columns = ["source", "count"]

    source_chart = alt.Chart(source_df).mark_bar(
        cornerRadiusTopRight=4, cornerRadiusBottomRight=4
    ).encode(
        x=alt.X("count:Q", title="Articles"),
        y=alt.Y("source:N", sort="-x", title=""),
        color=alt.value("#4A90D9"),
        tooltip=["source:N", "count:Q"],
    ).properties(height=350)

    st.altair_chart(source_chart, use_container_width=True)

[Image: News tab with horizontal bar chart of top sources and sortable article table]

The table uses st.column_config.LinkColumn so each article title links directly to its source.
YouTube Tab

The YouTube tab shows views by channel and a sorted video table. The chart groups views by channel to surface which creators talk about the brand the most.

Python

    channel_df = yt_df.groupby("channel")["views"].sum().reset_index()
    channel_df = channel_df.sort_values("views", ascending=False).head(10)

    channel_chart = alt.Chart(channel_df).mark_bar(
        cornerRadiusTopRight=4, cornerRadiusBottomRight=4
    ).encode(
        x=alt.X("views:Q", title="Views", axis=alt.Axis(format="~s")),
        y=alt.Y("channel:N", sort="-x", title=""),
        color=alt.value("#4A90D9"),
        tooltip=["channel:N", alt.Tooltip("views:Q", format=",")],
    ).properties(height=350)

[Image: YouTube tab showing views by channel chart and video table]

Perspectives Tab

The Perspectives tab splits the layout between a discussion table on the left and a donut chart of mentions by platform on the right. The donut chart makes it easy to see where conversations happen, whether it's LinkedIn, Reddit, X, etc.

Python

    platform_chart = alt.Chart(platform_df).mark_arc(
        innerRadius=60, outerRadius=120
    ).encode(
        theta=alt.Theta("count:Q"),
        color=alt.Color("source:N", legend=alt.Legend(title="Platform")),
        tooltip=["source:N", "count:Q"],
    ).properties(height=350)

[Image: Perspectives tab with discussions table on the left and donut chart of mentions by platform on the right]

When to Use This Approach

Ideal for:

- Tracking brand mentions across news, video, and social in one view
- Monitoring product launches, PR campaigns, or competitor names
- Building internal dashboards for marketing or DevRel teams

Not recommended for:

- Real-time alerting. The API returns a snapshot, not a stream. For notifications, schedule the script on an interval and compare results.
- Historical analysis. Each engine returns recent results, not a complete archive.

If you want to explore the API response before writing code, the SerpApi Playground lets you test any engine interactively. And if you only need news coverage, the Google News API alone handles most brand monitoring use cases.
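The "schedule the script on an interval and compare results" suggestion above could be sketched roughly as follows. This is an illustration, not part of the tutorial's code: the function name find_new_mentions is hypothetical, and it assumes each record is a dict with a "link" key, as in the transformed records shown earlier.

```python
def find_new_mentions(seen_links, records):
    """Return (new_records, updated_seen) for one scheduled run.

    seen_links: set of links observed in earlier runs.
    records: list of dicts from the current fetch, each with a "link" key.
    """
    updated_seen = set(seen_links)
    new_records = []
    for record in records:
        link = record.get("link", "")
        # Records without a link cannot be tracked between runs, so skip them.
        if link and link not in updated_seen:
            updated_seen.add(link)
            new_records.append(record)
    return new_records, updated_seen


# First run: everything is new. Second run: only the unseen link is reported.
run_1 = [{"link": "https://example.com/a", "title": "A"}]
run_2 = [{"link": "https://example.com/a", "title": "A"},
         {"link": "https://example.com/b", "title": "B"}]

new_1, seen = find_new_mentions(set(), run_1)
new_2, seen = find_new_mentions(seen, run_2)
print([r["title"] for r in new_1], [r["title"] for r in new_2])
```

In a real scheduled job, the set of seen links would need to be persisted between runs (a file or a small database would do), and the new records would be handed to whatever notification channel the team uses.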
Where to Go from Here

This dashboard gives you a live snapshot. The natural next step is turning it into a historical record. Store each fetch in a database (SQLite, PostgreSQL, or even a CSV file), and you can compare mention volume week over week, track which sources cover your brand consistently, and spot trends that a single snapshot can't show.

With historical data in place, you can layer on more analysis. Identify content gaps by looking at which topics earn coverage for competitors but not for you. Track which YouTube channels mention your product and how their view counts trend over time. Flag new platforms or authors that start discussing your brand.

The data is yours to work with however fits your needs. The three engines give you the raw material; what you build on top depends on the questions you're trying to answer.

Conclusion

The full application is about 350 lines in a single Python file. Three API calls, three DataFrames, three tabs. The query input at the top lets you switch brands without changing the code.

What started as a way to check where "serpapi" shows up on the web became a tool that surfaces patterns you would miss when checking manually. The Perspectives tab pulls in LinkedIn posts, Reddit threads, and Quora answers that don't appear in regular news or video searches, and combining them in one view gives you the full picture.

AI Won't Keep You from Hitting the Scalability Wall

By Bru Woodring

HTTP QUERY in Java: The Missing Method for Complex REST API Searches

By Otavio Santana

CORE

Getting Started With RabbitMQ in Spring Boot

By Gunter Rotsaert

CORE

OBO SSO in Java Applications: Securely Calling Downstream APIs on Behalf of a User

Modern enterprise applications rarely operate in isolation. A user may authenticate through a web or mobile application, invoke a Java-based backend API, and that backend may need to call additional downstream services such as microservices or third-party APIs. In these scenarios, simply using the application's identity is often insufficient. The downstream service may need to know which user initiated the request and enforce authorization based on that user's permissions. This is where the OAuth 2.0 On-Behalf-Of (OBO) flow becomes invaluable. In this article, I will summarize how the OBO flow works, where it fits in a modern Java architecture, and how to implement it securely in a Spring Boot application. How Does Each Downstream Service Know Who the Original User Is? One of the first assumptions many engineers make is that the backend can simply reuse its own application credentials when communicating with another service. While this works for machine-to-machine communication, it falls short whenever user-specific authorization is required. Consider a healthcare application where a physician logs into a patient portal and requests medical records. The initial Java API authenticates the request, but retrieving those records may require calling another internal API responsible for patient information. That downstream API needs to know which physician initiated the request before deciding whether access should be granted. If the Java backend uses only its own application identity, the downstream service loses the user context and cannot perform authorization based on the physician's permissions. This is exactly the problem that the OAuth 2.0 On-Behalf-Of (OBO) flow was designed to solve. What Is OBO (On-Behalf-Of) Flow? The OBO flow allows a middle-tier service (API A) to obtain an access token for another downstream service (API B) while preserving the identity and permissions of the signed-in user. 
Instead of API A calling API B using its own application credentials, API A exchanges the user's access token for a new token intended for API B. The flow looks like this:

Plain Text

User
  |
  v
Web/Mobile Applications
  |
  |  Access Token
  v
Java API A
  |
  |  OBO Token Exchange
  v
Identity Provider
  |
  |  New Access Token
  v
Java API B

As a result, API B receives a token representing the actual user, allowing it to perform proper authorization checks.

Why Not Use Client Credentials?

Many developers mistakenly use the Client Credentials flow when calling downstream APIs. While Client Credentials works for service-to-service communication, it does not carry user context. Consider the healthcare application from earlier:

- Dr. Smith logs into a patient portal.
- The Java API retrieves patient records from another service.
- The downstream service must verify Dr. Smith's permissions.

If Client Credentials is used, the downstream service only sees the application identity and loses visibility into the actual user making the request. OBO solves this problem by preserving delegated permissions.

Typical Enterprise Use Cases

OBO is commonly used for:

- Healthcare applications accessing patient records
- Enterprise microservices
- Multi-tier API architectures
- Internal service authorization
- Audit and compliance requirements

Many organizations implementing zero-trust architectures rely heavily on delegated authorization models such as OBO.

Implementing OBO in Spring Boot

Let's assume the following:

- Microsoft Entra ID (Azure AD) is the identity provider.
- API A is a Spring Boot application.
- API B is a downstream service.

Step 1: Add MSAL4J Dependency

XML

<dependency>
    <groupId>com.microsoft.azure</groupId>
    <artifactId>msal4j</artifactId>
    <version>1.15.0</version>
</dependency>

Step 2: Acquire Token On Behalf Of User

The incoming access token is received from the frontend application.
Java String clientId = "YOUR_CLIENT_ID"; String clientSecret = "YOUR_CLIENT_SECRET"; IClientCredential credential = ClientCredentialFactory.createFromSecret(clientSecret); ConfidentialClientApplication app = ConfidentialClientApplication.builder( clientId, credential) .authority( "https://login.microsoftonline.com/TENANT_ID") .build(); UserAssertion userAssertion = new UserAssertion(incomingUserToken); OnBehalfOfParameters parameters = OnBehalfOfParameters.builder( Collections.singleton("api://api-b/.default"), userAssertion) .build(); IAuthenticationResult result = app.acquireToken(parameters).join(); String downstreamAccessToken = result.accessToken(); At this point, the Java application has obtained a new token that can be used to call API B while preserving the user's identity. Step 3: Call the Downstream API Using Spring's RestTemplate: Java HttpHeaders headers = new HttpHeaders(); headers.setBearerAuth(downstreamAccessToken); HttpEntity<Void> request = new HttpEntity<>(headers); ResponseEntity<String> response = restTemplate.exchange( "https://api-b.company.com/patients", HttpMethod.GET, request, String.class); return response.getBody(); API B now receives a delegated token representing the authenticated user. Security Best Practices Implementing OBO correctly is critical. 1. Validate Incoming Tokens Always validate: SignatureIssuerAudienceExpiration Never trust tokens received from clients without validation. 2. Apply Least Privilege Only request scopes required by the downstream API. Bad: Plain Text https://graph.microsoft.com/.default Better: Plain Text User.Read Limiting scopes reduces the blast radius if a token is compromised. 3. Never Log Access Tokens Avoid: Plain Text logger.info(token); Access tokens often contain sensitive claims and permissions. 4. Secure Client Secrets Store secrets in: Azure Key VaultAWS Secrets ManagerHashiCorp Vault Avoid storing secrets in: Plain Text application.properties or source code repositories. 5. 
Implement Token Caching

Repeated token acquisition adds unnecessary latency. Consider caching OBO tokens until they expire. Most enterprise identity libraries already provide token caching support.

Common Mistakes

Some common issues I frequently encounter when onboarding new developers include:

- Using Client Credentials instead of OBO
- Passing user tokens directly to downstream APIs
- Requesting excessive scopes
- Logging JWTs
- Not validating token audiences
- Hardcoding client secrets

These mistakes often lead to authorization failures or security vulnerabilities.

Conclusion

As organizations adopt microservices and API-first architectures, preserving user identity across service boundaries becomes increasingly important. The OAuth 2.0 On-Behalf-Of flow provides a secure and standards-based approach for allowing Java applications to call downstream APIs while maintaining the original user's context and permissions. By implementing OBO correctly, developers can build applications that are more secure, auditable, and aligned with modern zero-trust security principles. For enterprise Java teams, understanding OBO is no longer optional; it is becoming a fundamental requirement for building secure distributed systems.

By Muhammed Harris Kodavath

Building Your API Gateway From OpenAPI Specs: A Spec-Driven Approach

Generating an API Gateway From OpenAPI Specs Five Key Takeaways When your OpenAPI specification becomes the single source of truth, the gap between your API contract and your gateway configuration simply stops existing.Generating the gateway from the spec scales far better than hand-maintaining per-endpoint configuration as your API surface grows into the hundreds.Generated, human-readable service code keeps day-to-day operations manageable — you can read it, reason about it, and trace failures like ordinary software.The genuinely hard part is not the generation; it's the regeneration workflow and the discipline around where custom logic is allowed to live.Adopt the model on new APIs first, prove it's boring and trustworthy, and only then migrate existing ones. The Quiet Way Gateways Rot Every public API gateway I've worked with started its life clean and, over a few years, quietly accumulated a second universe of hand-written configuration sitting alongside the services it fronts. None of it looked dangerous at the time. A path rewrite here. A parameter rename there. A response transform to make an internal field look the way customers expect it to. A content-type translation to bridge two teams that made different choices years apart. Each individual edit was sensible, small, and well-intentioned. The danger was never any single change — it was the accumulation, and more importantly, the separation. That configuration described how the gateway should behave, but it lived in a different place from the thing it was describing: the API's actual contract. Two artifacts, two repositories, two owners, two review processes, two release cadences — all trying to stay in agreement about the same set of endpoints. Anyone who has run a system like this knows how that story ends. The two drift apart. A backend team renames a field and ships their service. The matching gateway mapping doesn't get updated because it's someone else's pull request in someone else's repo. 
Nothing fails loudly. A customer-facing response is simply, silently wrong. And the place you now have to go and debug is the gateway — the one component that every single request flows through, and therefore the one component nobody wants to touch under pressure. This is the real tax of treating gateway configuration as a hand-maintained artifact. It isn't the effort of writing the config. It's the slow, compounding cost of keeping two sources of truth honest with each other, forever, across a growing number of teams who have no structural reason to remember that the other one exists. A Different Premise: The Contract Is the Configuration The shift that fixes this is conceptually simple, even if it takes real engineering to operationalize. Instead of describing the gateway's behavior in a separate configuration language, you let the API's own contract define it. If you already maintain an OpenAPI specification — and most teams serving public APIs do — then that document already contains nearly everything the gateway needs to know. It knows the routes. It knows the HTTP methods. It knows the path and query parameters, the request bodies, the response shapes, the status codes. It is, in effect, a complete and precise description of the public surface. The premise of a specification-driven gateway is to stop treating that document as mere documentation and start treating it as the source from which the gateway is produced. You write the contract once. The gateway is generated from it. There is no second artifact to keep in sync, because there is no second artifact. When the contract changes, the gateway changes, by construction, because one is derived from the other rather than maintained in parallel with it. That single move — collapsing two sources of truth into one — is where almost all of the long-term benefit comes from. Everything else is mechanics. 
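To make the premise concrete, here is a toy sketch of deriving a gateway's route table directly from a specification. It is an illustration under stated assumptions, not how any particular generator works: the function `routes_from_spec` and the sample `spec` dictionary (a hand-written fragment shaped like an OpenAPI 3.0 document) are hypothetical.

```python
# Operation keys that an OpenAPI 3.0 path item may contain.
HTTP_METHODS = {"get", "put", "post", "delete", "options", "head", "patch", "trace"}


def routes_from_spec(spec):
    """Derive (METHOD, path, operationId) tuples from an OpenAPI-shaped dict."""
    routes = []
    for path, item in spec.get("paths", {}).items():
        for method, operation in item.items():
            if method.lower() not in HTTP_METHODS:
                continue  # skip non-operation keys such as "parameters" or "summary"
            routes.append((method.upper(), path, operation.get("operationId")))
    # Sort by path, then method, so the output is stable and easy to diff.
    return sorted(routes, key=lambda route: (route[1], route[0]))


# Hypothetical contract fragment used only for this example.
spec = {
    "openapi": "3.0.3",
    "paths": {
        "/widgets": {
            "get": {"operationId": "listWidgets"},
            "post": {"operationId": "createWidget"},
        },
        "/widgets/{id}": {
            "parameters": [{"name": "id", "in": "path", "required": True}],
            "get": {"operationId": "getWidget"},
        },
    },
}

routes = routes_from_spec(spec)
```

Because the route table is computed rather than written, adding or renaming an operation in the contract changes the derived routes with no second file to update, which is the whole point of collapsing the two sources of truth.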
What "Generating the Gateway" Really Means In practice, you point a generator at the specification, and it produces standardized service code: the routing, the request and response models, the parameter binding, all wired to the operations the spec declares. The open-source tooling for this is mature; a single command turns a specification file into a working, conventional codebase. It looks roughly like this, and this is about as code-heavy as the idea needs to get: Shell openapi-generator-cli generate -i widgets.yaml -g spring -o ./widgets-gateway What comes out the other side is not a black box or an opaque configuration blob. It is ordinary source code — the kind your engineers already know how to read, test, and debug. That property turns out to matter enormously in production. When something misbehaves at two in the morning, the difference between "decode the gateway's configuration DSL and infer what it's doing" and "open the generated method and read it" is the difference between a long incident and a short one. The second property that matters is consistency. Because every API is produced by the same generator from the same kind of input, every API behaves the same way. They log the same way, page the same way, validate the same way, and fail the same way. Cross-cutting behavior — the way you emit metrics, the shape of your error responses, your house conventions — is expressed once, in the generation templates, and then applied uniformly to every endpoint on the platform. What used to be a hundred manual edits scattered across a hundred config files becomes a single change in one place. That uniformity is quietly one of the most valuable things you get, because it makes the entire API surface predictable, and predictability is what lets you operate at scale without heroics. Where the Custom Logic Is Allowed to Live No real gateway is pure generation. 
There is always behavior that the specification doesn't capture cleanly: renaming a public field to its internal equivalent, translating between formats, enforcing authentication, applying rate limits that differ by endpoint. The instinct, when you first hit one of these, is to reach into the generated code and edit it by hand. That instinct is powerful, and it is exactly what you have to design against, because the moment someone hand-edits generated output, the next regeneration either erases their work or forces a painful manual merge, and the whole model starts to feel fragile and untrustworthy.

The discipline that keeps it healthy is to treat the generated code as strictly read-only and to give every piece of custom behavior a designated home outside of it. Field mappings and transforms become declarative descriptors that the build applies on top of the generated services; they sit next to the contract, version alongside it, and never touch the generated files. Custom authentication filters live in clearly marked extension points that the generator is explicitly told to leave alone. Cross-cutting platform concerns (auth, rate limiting, observability) don't get regenerated per API at all; they run as middleware in front of the generated handlers and read their policy from annotations carried in the specification itself, so even the operational rules trace back to the one source of truth.

Stated as a rule for code review, it's a single sentence: never hand-edit generated code. Everything custom is either an override of the generator's templates or an extension point the generator has been told not to overwrite. Hold that line, and the model stays clean for years. Let it slip, and you slowly reinvent the very drift you were trying to escape.

The Regeneration Workflow Is the Actual Work

It's tempting to think the generator is the hard part. It isn't. The generator is a solved problem.
The thing that determines whether this approach survives contact with a real organization is the workflow around it: the pipeline that takes a change to the specification and turns it into a deployed gateway, reliably and visibly, every time.

That pipeline has a few stages, and each one earns its place. The specification gets validated on every change, so a mistake in the spec is caught before it reaches generation. The code gets regenerated in a clean, reproducible step with a pinned tool version, so there's no "works on my machine" drift. The generated output gets diffed, and that diff is surfaced directly in the pull request. This is the stage teams are most tempted to skip and most regret skipping, because being able to see exactly what a one-line spec change did to the gateway is the single thing that earns engineers' trust in the whole system. And then it gets tested, including a contract test that catches an unintended change to the public shape before a customer does.

The cultural shift underneath all of this is that you start treating the specification like source code rather than like documentation: versioned, reviewed, and gated on the pipeline passing, exactly as your application code is. Once teams internalize that the spec is the system, the rest follows naturally.

The Trade-Offs, Stated Honestly

This approach is not free, and pretending otherwise does no one any favors. The regeneration loop has a real cost: if it's slow or flaky, engineers will route around it and start hand-editing, and the moment they do, you've lost the entire benefit. Making that loop fast, reproducible, and trustworthy is not a nice-to-have; it's the price of admission. Generated code also tends to be more verbose than what a person would hand-write. That's the cost of standardization, and while it usually pays for itself many times over, it's a real thing your engineers will notice. And custom logic, as discussed, needs a clear and well-understood home from day one, or it quietly leaks back into the generated files and rots the model from the inside. None of these are reasons not to do it.
They're reasons to do it deliberately, with the workflow and the conventions designed up front rather than discovered painfully later.

How to Adopt It Without a Painful Migration

The worst way to introduce this is all at once, as a big-bang rewrite of an existing, heavily configured gateway. The right way is to start where there's nothing to reconcile: new APIs. Build them spec-first from the beginning, and use them to prove out the generator, the templates, the extension points, and the pipeline in a low-stakes setting. Let the workflow become boring, something teams trust without thinking about it. Then begin migrating existing APIs one resource at a time, leaning on the diff step to confirm that each one behaves identically before and after the switch. Incremental, observable, reversible: that's how a model like this takes root without putting the platform at risk.

Closing Thoughts

Generating your gateway from OpenAPI specifications doesn't make complexity disappear. What it does is move the complexity from a place where it hurts you (brittle, drifting, hand-maintained configuration spread across teams) to a place where it's manageable: a disciplined specification-and-pipeline workflow with a single source of truth. In exchange, you get a gateway whose behavior is defined by the same contract your consumers already read, generated consistently across your entire API surface, debuggable like ordinary software, and safe to change because every change flows through validation, generation, and a visible diff. For a large, multi-team platform, that is a trade well worth making and, in my experience, one you only wish you'd made sooner.

By sahil arora

WebSockets, gRPC, and GraphQL in the Core

Three connectivity features landed together this week, and they belong in one place because they build on each other. WebSockets moved into the core; the GraphQL client uses that same WebSocket support for subscriptions; and gRPC reuses the exact code-generation pattern GraphQL and OpenAPI already follow. This post is a tutorial for all three. By the end, you will have a live chat, a typed GraphQL client, and a typed gRPC client, and you will see how little code each one takes. These features come from PR #5133 (WebSockets) and PR #5141 plus PR #5099 (the typed clients). Part 1: WebSockets, No cn1lib Required WebSockets used to require the cn1-websockets cn1lib. They are now part of the framework as com.codename1.io.WebSocket, implemented natively on every port (a hand-rolled RFC 6455 handshake on JavaSE and Android, NSURLSessionWebSocketTask on iOS, the browser WebSocket on JavaScript), with no third-party dependencies pulled into your build. If you're using cn1-websockets you can keep using it. There's no change required from you. We moved the package up one level, so there's no conflict. Step 1: Open a Connection The new API is a final, fluent class with lambda handlers. You build it, attach handlers, and connect: Java // Good practice although in reality all current Codename One Platforms support WebSockets if (!WebSocket.isSupported()) { return; } WebSocket ws = WebSocket.build("wss://echo.example.com/socket") .onConnect(() -> Log.p("connected")) .onTextMessage(text -> addIncoming(text)) .onClose((code, reason) -> Log.p("closed " + code + " " + reason)) .onError(ex -> Log.e(ex)) .connect(); There is no URL-in-constructor subclassing trap from the old API; the connection is an object you hold. send(...) has a String and a byte[] overload, getReadyState() returns a WebSocketState, and close() does a clean close handshake. Step 2: Build the Chat Screen Here is a compact chat form. 
Outgoing messages are added immediately; incoming ones arrive on the onTextMessage handler, and because the handler can touch the UI we wrap that in callSerially: Java private WebSocket ws; private Container conversation; private void showChat(Form parent) { Form chat = new Form("Live Chat", BoxLayout.y()); conversation = chat.getContentPane(); TextField input = new TextField("", "Message", 20, TextField.ANY); Button send = new Button("Send"); send.addActionListener(e -> { String text = input.getText(); if (text.length() > 0 && ws != null) { ws.send(text); addBubble(text, true); input.clear(); } }); Container bar = BorderLayout.centerEastWest(input, send, null); chat.add(BorderLayout.SOUTH, bar); ws = WebSocket.build("wss://chat.example.com/room/general") .onTextMessage(text -> Display.getInstance() .callSerially(() -> addBubble(text, false))) .connect(); chat.show(); } private void addBubble(String text, boolean mine) { Label bubble = new Label(text); bubble.setUIID(mine ? "ChatBubbleMe" : "ChatBubbleThem"); Container line = FlowLayout.encloseIn(bubble); line.getStyle().setAlignment(mine ? Component.RIGHT : Component.LEFT); conversation.add(line); conversation.animateLayout(150); } That is a working real-time chat. The screen it produces, rendered in the simulator: Step 3: Negotiate a Subprotocol When You Need One If your server speaks a named subprotocol, set it during the handshake and read back what the server chose: Java WebSocket ws = WebSocket.build(url) .subprotocols("graphql-transport-ws") .onConnect(() -> Log.p("using " + ws.getSelectedSubprotocol())) .connect(); That graphql-transport-ws value is not an accident; it is exactly what the GraphQL subscriptions in the next part use. One reason to trust this implementation: our own screenshot CI now runs on it. 
The pipeline that ships rendered PNGs from each device back to the host machine uses a WebSocket as its transport, so the same code your app calls is carrying the binary payloads that validate the framework on every commit. Part 2: A Typed GraphQL Client cn1:generate-graphql turns a GraphQL schema into a typed client, and @GraphQLClient is the interface you write against. The runtime lives in com.codename1.io.graphql, and a GraphQLResponse<T> carries data and errors together so partial results survive. Step 1: Declare the Client Java @GraphQLClient("https://swapi.example.com/graphql") public interface StarWarsApi { @Query("query HeroName($episode: Episode) { hero(episode: $episode) { name homeworld { name } species { name } filmConnection { totalCount } } }") void hero(@Var("episode") Episode episode, OnComplete<GraphQLResponse<HeroData>> callback); @Subscription("subscription OnReview($ep: Episode!) { reviewAdded(episode: $ep) { stars } }") GraphQLSubscription onReview(@Var("ep") Episode ep, GraphQLSubscription.Handler<ReviewData> handler); static StarWarsApi of(String endpoint) { return GraphQLClients.create(StarWarsApi.class, endpoint); } } The build-time processor emits the implementation and a bootstrap that registers it; you never write the HTTP plumbing. The generator has two modes. The precise operations mode emits per-selection types from your operation documents; the schema-only quick-start mode auto-selects fields to a bounded depth (cn1.graphql.maxDepth). Step 2: Call It and Render the Result Java StarWarsApi api = StarWarsApi.of("https://swapi.example.com/graphql"); api.hero(Episode.EMPIRE, response -> { if (!response.isOk()) { return; } Container list = heroForm.getContentPane(); for (Hero h : response.getResponseData().heroes) { MultiButton row = new MultiButton(h.name); row.setTextLine2(h.homeworld + " . 
" + h.species); row.setUIID("HeroRow"); list.add(row); } heroForm.revalidate(); }); The list this populates, rendered in the simulator: Step 3: Subscriptions Ride the Core WebSocket A @Subscription returns a GraphQLSubscription backed by the core WebSocket using the graphql-transport-ws protocol from Part 1. New events arrive on the handler: Java GraphQLSubscription sub = api.onReview(Episode.JEDI, review -> Display.getInstance().callSerially(() -> showStars(review.stars))); // later sub.close(); This is the payoff of putting WebSockets in the core: the GraphQL layer did not need its own socket implementation; it just used the frameworks. Part 3: A Typed gRPC Client cn1:generate-grpc does the same trick for proto3. Point it at your .proto files and it emits hand-editable @ProtoMessage, @ProtoEnum, and @GrpcClient sources; the annotation processor generates the binary protobuf codecs and call sites into target/generated-sources so your source tree stays clean. There is no protoc dependency. Step 1: The Proto Java syntax = "proto3"; service Greeter { rpc SayHello (HelloRequest) returns (HelloReply); } message HelloRequest { string name = 1; } message HelloReply { string message = 1; } Step 2: Call the Generated Client Java GreeterGrpc g = GreeterGrpc.of("https://api.example.com"); HelloRequest req = new HelloRequest(); req.name = "world"; g.sayHello(req, "Bearer " + token, response -> { if (response.isOk()) { renderGreeting(response.getResponseData().message); } }); The wire protocol is gRPC-Web binary (application/grpc-web+proto), the standard variant for mobile and browser clients, which works with Envoy, the official grpcweb Go proxy, and the gRPC-Web filter in modern gRPC servers. Version one covers unary RPCs, all scalar types, nested messages, enums, and repeated fields; streaming, map<K,V>, well-known types, and import are out for now, and the parser errors cleanly when it meets one. 
Enums Bind Across All of It All three connectors share the build-time JSON and XML mapper, and that mapper now binds enums. Previously an enum field was treated as a nested reference, found no mapper, and silently did not serialize. It now writes with name() and reads with valueOf (unknown values decode to null), and it handles List<Enum>, across both JSON and XML. That is why the GraphQL Episode above is a real enum rather than a String, and it is a welcome fix for anyone using @Mapped directly. Keep Your Tokens Out of the Binary The gRPC and GraphQL samples pass a bearer token, so the rule bears repeating: never hard-code a token, and never check it into source or embed it in the app. Fetch it from your backend at runtime and store it with SecureStorage. A shipped binary can be unpacked, so anything baked into it is effectively public. These connectors learn from real specs. If a schema or a proto file does not generate the client you expected, please file an issue at github.com/codenameone/CodenameOne/issues with the source attached. The previous deep dive covered native Mac builds and desktop integration, and the release post has the full index. Tomorrow's post is the new advertising API.

By Shai Almog

CORE

Why Push-Based Systems Fail at Scale — and How Hybrid Fan-Out Fixes It

Real-time systems look simple on architecture diagrams. A user posts content, the backend publishes an event, and connected users instantly receive notifications through persistent WebSocket connections. At small scale, the model works beautifully. At large scale, it becomes one of the fastest ways to melt distributed infrastructure.

Most push-based architectures fail for one reason: they assume traffic is evenly distributed. Production traffic never is. One user may have 50 followers. Another may have 10 million. Designing both scenarios using the same fan-out strategy creates massive operational problems during peak traffic. That is why large-scale platforms evolved from naive push delivery into hybrid push/pull systems optimized around uneven load distribution.

The Naive Push Architecture

The first design most engineers create is straightforward:

- A user publishes a post
- The backend sends the event to a broker
- WebSocket servers receive the event
- Notifications are pushed to all connected followers

On paper, the architecture looks clean. The system appears scalable because:

- WebSockets provide real-time delivery
- Brokers decouple services
- Horizontal scaling seems possible

But hidden underneath the simplicity is a dangerous scaling assumption: every user generates similar traffic patterns. That assumption collapses the moment a celebrity account posts.

The Celebrity Fan-Out Problem

Imagine a user with 10 million followers posting a new update. The system now attempts to:

- Generate millions of delivery events
- Route them through brokers
- Maintain millions of active socket writes
- Deliver updates almost simultaneously

The bottleneck is no longer application logic. The bottleneck becomes:

- Broker throughput
- Connection management
- Queue depth
- Network bandwidth
- Retry amplification

This is where many real-time systems fail in production.
As delivery pressure increases:

- Queues begin backing up
- Consumers lag behind
- WebSocket nodes become saturated
- Latency grows from milliseconds into seconds or minutes

Then retries begin. Clients retry because acknowledgments are delayed. Servers retry because deliveries fail. Load balancers redistribute unstable traffic. The system begins amplifying the overload condition itself.

This behavior is common in distributed systems: reliability mechanisms designed to recover from failure end up accelerating collapse under overload.

The architecture appears stable during normal traffic. It fails at the exact moment traffic matters most.

Why Pure Push Architectures Break

The real issue is fan-out-on-write. Every post immediately creates work proportional to follower count. For small accounts, this is inexpensive. For celebrity-scale accounts, a single write operation generates massive downstream pressure:

- Enormous queue pressure
- High-volume socket delivery
- Enormous broker traffic

The system becomes optimized around worst-case fan-out instead of average workload. That is operationally expensive and difficult to stabilize. This is why most large-scale feed systems avoid pure push delivery for all users.

The Hybrid Push/Pull Model

Modern systems solve the problem differently. Instead of treating every account identically, they dynamically switch between:

- Push-on-write
- Pull-on-read

The decision is usually based on follower thresholds.

Push-on-Write for Small Accounts

For smaller accounts:

- Updates are immediately pushed
- Queue workers fan out notifications
- Followers receive low-latency real-time updates

This keeps the user experience fast while infrastructure costs remain manageable.

Pull-on-Read for Large Accounts

For celebrity-scale accounts:

- Posts are stored normally
- Fan-out is avoided
- Feeds are assembled when users open the app

Instead of generating millions of writes immediately, the workload shifts to read time.
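The threshold-based switch between the two modes can be sketched in memory as follows. This is a simplified illustration, not a production design: the class `HybridFeed`, the `PUSH_THRESHOLD` value, and the in-process dictionaries standing in for queues and storage are all assumptions made for the example.

```python
from collections import defaultdict

PUSH_THRESHOLD = 10_000  # hypothetical follower cutoff between push and pull


class HybridFeed:
    def __init__(self, followers, threshold=PUSH_THRESHOLD):
        self.followers = followers            # author -> set of follower ids
        self.threshold = threshold
        self.inboxes = defaultdict(list)      # follower -> posts pushed at write time
        self.outboxes = defaultdict(list)     # author -> posts kept for pull-on-read
        self.following = defaultdict(set)     # follower -> authors they follow
        for author, audience in followers.items():
            for follower in audience:
                self.following[follower].add(author)

    def _is_large(self, author):
        return len(self.followers.get(author, ())) > self.threshold

    def publish(self, author, post_id, timestamp):
        """Fan out on write for small accounts; store once for large ones."""
        post = (timestamp, author, post_id)
        if self._is_large(author):
            self.outboxes[author].append(post)   # one write, no fan-out
            return "pull"
        for follower in self.followers.get(author, ()):
            self.inboxes[follower].append(post)  # work proportional to followers
        return "push"

    def read_feed(self, user, limit=20):
        """Merge pushed posts with posts pulled from large accounts, newest first."""
        posts = list(self.inboxes.get(user, []))
        for author in self.following.get(user, ()):
            if self._is_large(author):
                posts.extend(self.outboxes.get(author, []))
        return sorted(posts, reverse=True)[:limit]


# Tiny threshold so the example exercises both paths.
feed = HybridFeed({"alice": {"u1", "u2"}, "celeb": {"u1", "u2", "u3"}}, threshold=2)
mode_small = feed.publish("alice", "p1", timestamp=1)
mode_large = feed.publish("celeb", "p2", timestamp=2)
u1_feed = feed.read_feed("u1")
```

In the sketch, the large account's post costs a single append at write time, and the merge work is deferred until a follower actually calls `read_feed`.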
Shifting fan-out to read time dramatically reduces broker pressure and prevents large fan-out storms from destabilizing the platform. Twitter/X publicly discussed similar strategies years ago because global push fan-out becomes prohibitively expensive at scale.

The important engineering insight is: push and pull are not competing architectures. They are complementary scaling strategies selected dynamically based on traffic patterns.

Feed Assembly Introduces New Complexity

Once systems adopt pull-on-read, another problem appears: feed assembly. Now the platform must dynamically build personalized feeds using:

- Follower relationships
- Ranking algorithms
- Muted users
- Blocked accounts
- Recent activity
- Recommendation signals

This shifts complexity from writes to reads. To reduce repeated database work, systems commonly introduce:

- Redis timeline caches
- Materialized feed views
- Asynchronous feed builders
- Hot-feed caching layers

The challenge becomes balancing:

- Freshness
- Latency
- Consistency
- Infrastructure cost
- Cache invalidation

The architecture is no longer just "real-time delivery." It becomes distributed workload management.

WebSockets Make Infrastructure Stateful

Many system design discussions stop once WebSockets are introduced. Production systems become significantly harder after that point. WebSockets create stateful infrastructure. Now the platform must know:

- Which user is connected
- Which server owns the connection
- How to recover missed events after reconnects

This changes routing behavior completely. Requests can no longer be routed blindly across stateless servers. Most systems introduce:

- Sticky sessions
- Session affinity
- Distributed connection registries
- Redis pub/sub coordination

Then mobile networks create another challenge: temporary disconnects. A user loses connectivity for three seconds. What happened during that gap? Without replay recovery, notifications disappear permanently.
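One common way to close that gap is a bounded, sequence-numbered replay buffer on the server. The sketch below is an in-memory illustration under stated assumptions: `ReplayBuffer`, `Event`, and `replayAfter` are hypothetical names, and a real deployment would keep the window in a shared store rather than in one process.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.Optional;

/** Hypothetical sketch of a bounded replay window keyed by sequence ID. */
class ReplayBuffer {
    static final class Event {
        final long seq;
        final String payload;
        Event(long seq, String payload) {
            this.seq = seq;
            this.payload = payload;
        }
    }

    private final Deque<Event> window = new ArrayDeque<>();
    private final int capacity;
    private long nextSeq = 1;

    ReplayBuffer(int capacity) {
        this.capacity = capacity;
    }

    /** Assigns the next sequence ID and keeps only the newest `capacity` events. */
    synchronized Event publish(String payload) {
        Event e = new Event(nextSeq++, payload);
        window.addLast(e);
        if (window.size() > capacity) {
            window.removeFirst();
        }
        return e;
    }

    /**
     * Returns the events after lastSeenSeq, or empty if part of the gap has
     * already left the window and the client needs a full refresh instead.
     */
    synchronized Optional<List<Event>> replayAfter(long lastSeenSeq) {
        if (!window.isEmpty() && lastSeenSeq + 1 < window.peekFirst().seq) {
            return Optional.empty();
        }
        List<Event> missed = new ArrayList<>();
        for (Event e : window) {
            if (e.seq > lastSeenSeq) {
                missed.add(e);
            }
        }
        return Optional.of(missed);
    }

    public static void main(String[] args) {
        ReplayBuffer buf = new ReplayBuffer(3);
        for (String p : new String[] {"a", "b", "c", "d", "e"}) {
            buf.publish(p);
        }
        System.out.println("missed after seq 3: " + buf.replayAfter(3).get().size());
        System.out.println("gap too old: " + !buf.replayAfter(1).isPresent());
    }
}
```

Returning an empty `Optional` when the gap is older than the window makes the expiry case explicit, so the client can fall back to reloading its feed rather than silently missing events.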
Replay Buffers and Recovery Logic

Reliable real-time systems usually implement:

- Sequence IDs
- Replay buffers
- Reconnect checkpoints
- Gap recovery logic

When the client reconnects:

1. It sends the last processed sequence ID
2. The server identifies missing events
3. Replay buffers resend missed messages
4. Live streaming resumes

This is where systems move beyond interview-level architecture. The challenge is no longer simply delivering events. The challenge is maintaining continuity during instability. Real-world distributed systems spend enormous engineering effort handling:

- Partial failures
- Reconnect storms
- Duplicate delivery
- Inconsistent network conditions

Operational Tradeoffs Teams Often Underestimate

One of the biggest mistakes in real-time architectures is optimizing only for delivery speed while ignoring operational cost. Push-heavy systems keep large numbers of persistent connections open simultaneously. At global scale, this introduces pressure across multiple infrastructure layers:

- Connection memory usage
- Broker throughput
- Network egress
- Heartbeat traffic
- Reconnect storms during outages

Even healthy systems can become unstable during regional network disruptions. For example, if thousands of mobile clients reconnect at the same time after a temporary outage, WebSocket gateways may suddenly experience authentication spikes, replay requests, and connection churn simultaneously. This often creates secondary overload events long after the original incident is resolved.

This is why mature systems introduce additional controls such as:

- Connection rate limiting
- Replay window expiration
- Backpressure handling
- Circuit breakers
- Adaptive retry strategies

Another overlooked problem is message ordering. In distributed fan-out systems, messages may arrive out of order because events are processed asynchronously across multiple workers or partitions. Without sequence tracking, users may briefly see inconsistent timelines or duplicate notifications.
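Sequence tracking on the receiving side can be sketched as a small tracker that drops duplicates and holds out-of-order events until the gap before them is filled. The names here (`SequencedReceiver`, `accept`) are hypothetical, and the sketch assumes sequence IDs start at 1 and increase by one per event.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

/** Hypothetical sketch: idempotent, in-order delivery driven by sequence IDs. */
class SequencedReceiver {
    private long lastDelivered = 0;                                  // highest contiguous seq delivered
    private final TreeMap<Long, String> pending = new TreeMap<>();   // early arrivals, sorted by seq
    private final List<String> delivered = new ArrayList<>();

    /** Accepts an event in any order; returns false if it is a duplicate. */
    boolean accept(long seq, String payload) {
        if (seq <= lastDelivered || pending.containsKey(seq)) {
            return false; // already delivered or already buffered
        }
        pending.put(seq, payload);
        // Deliver the contiguous run that is now unblocked.
        while (!pending.isEmpty() && pending.firstKey() == lastDelivered + 1) {
            delivered.add(pending.pollFirstEntry().getValue());
            lastDelivered++;
        }
        return true;
    }

    List<String> delivered() {
        return delivered;
    }

    long lastDelivered() {
        return lastDelivered;
    }

    public static void main(String[] args) {
        SequencedReceiver r = new SequencedReceiver();
        r.accept(2, "b"); // arrives early, held back
        r.accept(1, "a"); // fills the gap, both are delivered
        r.accept(2, "b"); // duplicate, ignored
        System.out.println(r.delivered());
    }
}
```

`lastDelivered()` is also the value a client would send as its checkpoint when it reconnects.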
Production-grade systems therefore prioritize the following instead of assuming perfect real-time synchronization:

- Idempotent delivery
- Sequence-aware replay
- Eventual consistency handling

The engineering challenge is not simply pushing events quickly. The challenge is maintaining stability while millions of users interact with the platform under unpredictable traffic conditions.

Final Thoughts

Most distributed systems look elegant until traffic becomes uneven. That is the hidden reality behind large-scale architecture. The difficult part is not handling average load. The difficult part is surviving pathological load without collapsing the platform. Real systems evolve through operational pain:

- Broker saturation
- Retry storms
- Replay failures
- Queue buildup
- Cascading latency amplification

The best architectures are rarely the simplest ones. They are the ones that continue functioning when the system is under maximum stress. In distributed systems, every design is ultimately a negotiation between:

- Latency
- Throughput
- Durability
- Availability
- Cost

Those forces shape every scalable platform on the internet. The systems that survive at scale are not the ones with the cleanest diagrams. They are the ones designed to absorb failure without collapsing under pressure.

References

- Apache Kafka Documentation
- Redis Pub/Sub Documentation
- WebSocket Protocol RFC 6455
- Twitter Scalability Architecture Discussion
- Designing Data-Intensive Applications by Martin Kleppmann
- Google SRE Book: Handling Overload

By Jayapragash Dakshnamurthy

Mac Native Builds, Live Protocols, And Open Issues Under 350

Our focus was all over the place this week with work that targeted many different directions: desktop, monetization, communication, media, and more. This fits with our roadmap of one platform that delivers the promise Java never delivered: WORA for Everything Everywhere. But before we dig into the new features, there's one number I'm particularly proud of…

Open Issues Are Under 350

The open issue count is now below 350 (332 at the time of this writing). That is the result of a deliberate pass through the tracker, closing things that were already fixed, reproducing the ones that were not, and fixing a batch of them outright. Some of the reports we closed this week had been open since 2015.

We got a little side-tracked in the process (it is hard to read an old report without wanting to fix it on the spot), but the direction is set: we want this number to keep dropping, and by a lot. We went over the issue tracker, starting with the oldest issues and working our way forward. You may recall that when we started tracking this, going under the 500-issue mark was a major milestone, and that was just a few weeks back!

What Shipped

Every one of the bigger items has its own deep-dive tutorial. Here is the tour, with the links to the full posts.

Your App Is Now a Native Mac App

Your existing Codename One app can ship as a 100% native Mac app today, with zero porting effort. No rewrite, no bundled JVM, no Electron shell: the project you already have produces a lean native Mac binary the same way it produces your iPhone app, on the same Metal renderer and battle-tested native pipeline. And it arrives feeling like a real Mac app, not a phone in a window: native title bar, native menu bar, interactive scrollbars, and desktop notifications come with it.
Two of this week's features in one shot: the sample uses the new advertising API covered later in this post, running as a native Mac app from the same Java code that produces the iOS and Android builds.

The full tutorial, including the new desktop menu and shortcut APIs and the Mac signing hints, is in Your Codename One App, Now A Native Mac App.

WebSockets in the Core

com.codename1.io.WebSocket is now part of the framework with no cn1lib required, implemented natively on every port. A live connection is a fluent one-liner:

```java
WebSocket ws = WebSocket.build("wss://chat.example.com/room/general")
    .onConnect(() -> Log.p("connected"))
    // Hop to the EDT before touching the UI
    .onTextMessage(text -> Display.getInstance()
        .callSerially(() -> addBubble(text, false)))
    .connect();
```

This is the foundation the GraphQL subscriptions below ride on, and it is trusted enough that our own screenshot CI uses it as the transport for device renders. The hands-on tutorial that builds a live chat with it is WebSockets, gRPC, and GraphQL In The Core.

A Typed GraphQL Client From Your Schema

cn1:generate-graphql turns a GraphQL schema into a typed client: you declare an interface against your operations and the build emits the implementation, with zero HTTP plumbing on your side. A @Subscription gets you live server-pushed updates over the new core WebSocket:

```java
@GraphQLClient("https://swapi.example.com/graphql")
public interface StarWarsApi {
    // One-shot query; the result arrives through the callback
    @Query("query HeroName($episode: Episode) { hero(episode: $episode) { name } }")
    void hero(@Var("episode") Episode episode,
              OnComplete<GraphQLResponse<HeroData>> callback);

    // Live updates pushed by the server over the core WebSocket
    @Subscription("subscription OnReview($ep: Episode!) { reviewAdded(episode: $ep) { stars } }")
    GraphQLSubscription onReview(@Var("ep") Episode ep,
                                 GraphQLSubscription.Handler<ReviewData> handler);
}
```

Note that Episode is a real enum: enum binding landed in the JSON/XML mapper alongside this work, fixing a long-standing gap for @Mapped users too. The full tutorial is in WebSockets, gRPC, And GraphQL In The Core.
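To make "zero HTTP plumbing" concrete, here is roughly the kind of request body any GraphQL client has to produce for an operation like the hero query: a JSON object with a `query` string and a `variables` map, as used by GraphQL over HTTP. This is a pure-JDK sketch and not the code the generator emits; `GraphQLPayload` is a hypothetical helper, it handles string variables only, and the `JEDI` enum value is an assumption about the schema.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch: building a GraphQL-over-HTTP JSON request body by hand. */
class GraphQLPayload {
    /** Quotes a string as a JSON string literal. */
    static String quote(String s) {
        StringBuilder sb = new StringBuilder("\"");
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                default:
                    if (c < 0x20) {
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
            }
        }
        return sb.append('"').toString();
    }

    /** Builds {"query": ..., "variables": {...}} with string-valued variables. */
    static String body(String query, Map<String, String> variables) {
        StringBuilder sb = new StringBuilder("{\"query\":")
            .append(quote(query))
            .append(",\"variables\":{");
        boolean first = true;
        for (Map.Entry<String, String> e : variables.entrySet()) {
            if (!first) {
                sb.append(',');
            }
            sb.append(quote(e.getKey())).append(':').append(quote(e.getValue()));
            first = false;
        }
        return sb.append("}}").toString();
    }

    public static void main(String[] args) {
        Map<String, String> vars = new LinkedHashMap<>();
        vars.put("episode", "JEDI"); // assumed enum value, sent as a JSON string
        System.out.println(body(
            "query HeroName($episode: Episode) { hero(episode: $episode) { name } }", vars));
    }
}
```

A typed client hides exactly this step, along with the POST itself and the mapping of the response back into classes such as HeroData.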
gRPC Clients With No Protoc

cn1:generate-grpc does the same for proto3. Point it at your .proto files and it generates the messages, the binary protobuf codecs, and the client, speaking the standard gRPC-Web wire protocol that works with Envoy and modern gRPC servers. There is no protoc to install and no native dependency; calling a service looks like this:

```java
GreeterGrpc g = GreeterGrpc.of("https://api.example.com");
HelloRequest req = new HelloRequest();
req.name = "world";
g.sayHello(req, "Bearer " + token, response -> {
    if (response.isOk()) {
        renderGreeting(response.getResponseData().message);
    }
});
```

Together with cn1:generate-openapi and cn1:generate-graphql, this means a typed client for practically any backend is one Maven goal away. The walkthrough from proto file to running call is in WebSockets, gRPC, And GraphQL In The Core.

A New Advertising API

Advertising support had quietly rotted across three dead-end legacy mechanisms. A pluggable, format-complete com.codename1.ads subsystem replaces all of them, with a modern AdMob reference provider, GDPR consent and iOS App Tracking Transparency built in, and a simulator placeholder provider so you can exercise every format without a device. A rewarded ad, the format people most often ask about, is now this short:

```java
RewardedAd ad = new RewardedAd("your-rewarded-ad-unit-id");
ad.setAdListener(new AdListener() {
    public void onLoaded() {
        // Show as soon as the ad is ready; grant the reward in the callback
        ad.show(reward -> grantCoins(reward.getType(), reward.getAmount()));
    }
});
ad.load();
```

The full story, including banners, native ads, app-open ads, and the provider SPI, is in A New Advertising API, Built From The Ground Up.

Background Execution and Push

Constraint-based background work, foreground services, push topics, shared-content handling, and a much richer local notification API, all of it usable in the simulator so you can debug these flows on your desktop.
You describe what the work needs, not when to poll:

```java
WorkRequest req = WorkRequest.builder("daily-sync", SyncWorker.class)
    .setRequiresNetwork(true)
    .setRequiresCharging(true)
    .setPeriodic(6 * 60 * 60 * 1000L) // every six hours, in milliseconds
    .build();
BackgroundWork.schedule(req);
```

The walkthrough, from progress notifications and inline replies to push topics and shared content, is Background Work, Push Topics, and Richer Notifications.

Building Screens From Screenshots, and a Simpler Initializr

Generated apps ship with a codename-one agent skill under .claude/skills/, so an AI agent working in your project already knows how to build, test, and screenshot a Codename One UI. PR #5161 teaches it the single most common design task: "make this screen look like this mockup."

The hard part of that task was that the agent had no objective measure of how close it had gotten. The existing screenshot test only compares a render to a baseline the system produced itself, which measures consistency, not correctness. The new CompareToMockup tool is a single-file, pure-JDK CLI that scores a render against a target image and prints a similarity percentage: a STRUCTURAL score (an SSIM-style perceptual measure, robust to font and anti-aliasing noise against a vector mockup) and a PIXEL score (the fraction of pixels within the framework's own three-channel "same pixel" tolerance). It has a region mode so device chrome does not sabotage the score, a --diff heatmap, and a --min gate. That gives the agent a real signal: render, score, read the heatmap, adjust, repeat until the number stops climbing.

A companion DesignImport tool turns a Figma, Sketch, or Adobe XD file (and, after PR #5168, the tokens.css from an HTML or React mockup) into a starter theme.css so the agent adjusts rather than starting from a blank page. The skill can also update itself now through an UpdateSkills tool, so a project generated months ago can pull the current guidance instead of carrying a frozen copy.
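The PIXEL score described above, a fraction of pixels within a per-channel tolerance, is simple enough to sketch with the JDK alone. This is an illustration of the idea and not the CompareToMockup source: `PixelScore` is a hypothetical name, and the tolerance is a parameter here because the framework's actual threshold is not stated in this post.

```java
import java.awt.image.BufferedImage;

/** Hypothetical sketch of a pixel-similarity score between a render and a mockup. */
class PixelScore {
    /** True when one 8-bit channel of two packed RGB values differs by at most tolerance. */
    private static boolean channelClose(int a, int b, int shift, int tolerance) {
        return Math.abs(((a >> shift) & 0xFF) - ((b >> shift) & 0xFF)) <= tolerance;
    }

    /** Fraction (0.0 to 1.0) of pixels whose R, G, and B channels are all within tolerance. */
    static double score(BufferedImage render, BufferedImage mockup, int tolerance) {
        int w = render.getWidth();
        int h = render.getHeight();
        if (w != mockup.getWidth() || h != mockup.getHeight()) {
            throw new IllegalArgumentException("images must have the same dimensions");
        }
        long same = 0;
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                int a = render.getRGB(x, y);
                int b = mockup.getRGB(x, y);
                if (channelClose(a, b, 16, tolerance)
                        && channelClose(a, b, 8, tolerance)
                        && channelClose(a, b, 0, tolerance)) {
                    same++;
                }
            }
        }
        return (double) same / ((long) w * h);
    }

    public static void main(String[] args) {
        BufferedImage render = new BufferedImage(2, 2, BufferedImage.TYPE_INT_RGB);
        BufferedImage mockup = new BufferedImage(2, 2, BufferedImage.TYPE_INT_RGB);
        mockup.setRGB(1, 1, 0xFF0000); // one clearly different pixel
        System.out.println("pixel score: " + score(render, mockup, 4));
    }
}
```

A score like this rewards exact matches only, which is why the tool pairs it with a structural measure that tolerates font and anti-aliasing noise.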
Agents can now automatically update the skills to the latest versions and also describe the content of Codename One GUIs. This is valuable because they can review their work without using vision, which is both more expensive and less accurate. We also added the ability to check component alignment, which LLMs often find difficult. There is also a new linter that I think we should expose to human developers as well in the future. Right now, you can see all of these tools and use them just as an agent would, but they are more CLI-oriented.

PR #5168 also rebuilt the Initializr, the tool that scaffolds a new project, around the Codename One design language, and trimmed it so it is easier to approach. It leads with the essentials (main class, package, and a Java or Kotlin toggle) and tucks IDE, localization, Java version, and current settings into collapsible cards, with a live preview and a single generate bar at the bottom. The four-template picker became the Java/Kotlin toggle, and the accent, rounded-buttons, and custom-CSS controls were dropped. The project model behind it is unchanged, so generated projects are the same; this is purely about lowering the barrier to getting started.

Smaller Fixes Worth Knowing About

Several of these came straight out of the issue-tracker pass:

- cubic-bezier() motion now matches CSS. PR #5122 fixes #1524: Motion.createCubicBezierMotion was feeding its control points into a 1D polynomial directly, so the curve did not match the CSS cubic-bezier() it was modeled on. It does now, so some animations may behave differently.
- Always-tensile on the X axis. PR #5112 closes #1399, the next-oldest open issue (filed March 2015): setAlwaysTensile(true) now applies horizontally as well as vertically. This means you will also see the rubber-band effect on X axis scrolling.
- Validation highlights on tap-away. PR #5123 closes #1459: a field with an invalid value is now highlighted when you tap into a different field, not only when you press the virtual keyboard's next/enter.
- EncodedImage.dispose() actually frees memory. PR #5127 makes dispose() release the decoded image and the encoded bytes (it was a no-op before) and adds isDisposed(). Closes #3733.
- NetworkManager.ping(). PR #5130 adds ping(url, timeoutMillis), a real server-reachability probe to pair with the device-side isConnected(). Closes #3669.
- ImageViewer drag bubbling. PR #5132: a vertical drag on an ImageViewer at zoom 1 now scrolls the parent container instead of being swallowed. Closes #3700. This means you can now include many image viewers in a scrollable Y container.
- Graphics.isVisible(). PR #5129 adds a clip-intersection primitive so a zoomed canvas can cull off-screen content and skip the decode/scale. Closes #3846.
- Screenshot block now covers peers. PR #5107: ios.blockScreenshotsOnEnterBackground=true was hiding the render surface but leaving peer components (such as a BrowserComponent's WKWebView) visible in the app-switcher snapshot. Fixed.
- Better site search. PR #5090 sorts the on-site search index newest-first and stops the giant developer-guide page from crowding out every result. Blog post results in the search now show dates, and a new note explains that the search does not cover the developer guide or Javadoc.

A Note on Contributions

We stopped accepting community pull requests. We want to be precise about why, because it would be easy to read more into this than is there.

This is not about AI-generated PRs. We are not policing how a contribution was written. The real reason is mechanical: our CI does not run correctly against pull requests from forks. The screenshot pipeline, the device runners, and the protocol tests all need credentials and a setup that a forked PR cannot get, so a community PR cannot actually be validated by the same gates we hold our own work to.
This isn't something we can easily solve without introducing major security vulnerabilities to our process. Recently, a change that looked completely safe slipped through and triggered a CI regression that took the builds down. That was on us, not on the contributor, but it convinced us that merging code we cannot fully run is the wrong trade.

So the door is open the other way. Please keep filing issues. A clear issue with a test case is genuinely no trouble for us to pick up, and as the number above shows, we are actively working on the tracker. If you have a fix in mind, describe it in the issue, attach the failing case, and we will carry it through CI ourselves.

To everyone who has sent us patches over the years, and especially the people who contributed recently: thank you. The effort was real and appreciated, and this decision is about our pipeline, not your work.

Upcoming Attractions

Four deep-dive tutorials follow this one, one per day:

- Saturday. Native Mac builds and deeper desktop integration. PRs #5053, #5136, #5170.
- Sunday. WebSockets, gRPC, and GraphQL in the core. PRs #5133, #5099, #5141.
- Monday. The new advertising API. PR #5169.
- Tuesday. Background work, push topics, and richer notifications. PR #5142.

Wrapping Up

The issue tracker is here, and it is the best place to reach us right now. The discussion forum is here, and the Build Cloud console is at /console/. The Playground, Initializr, and Skin Designer are where they have always been.

By Shai Almog

CORE

AI, OAuth, and Other Platform APIs in the Core

This is the second follow-up to June 5's release post. It covers the platform APIs that moved into the framework core this release. There are two headline pieces (AI/LLM and the modern OAuth/OIDC stack) and two smaller pieces (WiFi/connectivity and share-sheet result callbacks). This continues the direction the previous release set when we moved NFC, biometrics, and cryptography into the framework core. The full background on that earlier set is in NFC, Crypto, Biometrics, And A New Build Cloud.

AI: A First-Class LLM Client and a ChatView Component

PR #5035 lands the com.codename1.ai package, the ChatView UI component, the speech and TTS additions, and the build-time dependency injection that wires the native pieces in. PR #5057 lands the developer-guide chapter and the agent-skill addition, so any project generated from the Initializr inherits the new APIs through its bundled AGENTS.md.

LlmClient: The Basic Chat Request

com.codename1.ai.LlmClient is the entry point. The simplest possible use:

```java
LlmClient client = LlmClient.openai(apiKey);

ChatRequest req = new ChatRequest.Builder()
    .model("gpt-4o-mini")
    .system("You are a helpful assistant.")
    .user("What is the capital of France?")
    .temperature(0.7)
    .build();

client.chat(req).onResult((resp, err) -> {
    if (err != null) {
        Log.e(err);
        return;
    }
    Log.p(resp.firstChoice().content());
});
```

LlmClient.openai(...), LlmClient.anthropic(...), LlmClient.gemini(...), LlmClient.ollama(...), and LlmClient.openAiCompatible(baseUrl, apiKey) are the factories. All five are fully implemented native clients. The OpenAI client also drives Ollama, vLLM, llama.cpp, and any other endpoint that speaks the OpenAI wire format, so most local-model stacks plug in through LlmClient.openAiCompatible(...) without a separate driver.

Streaming Chat (What You Actually Want for Chat UIs)

For any UI that types responses out token-by-token, the streaming entry point is the one to reach for.
The callback fires on the EDT, so you can append directly to a text component:

```java
client.chatStream(req, new ChatStreamListener() {
    @Override public void onDelta(ChatDelta d) {
        // Already on the EDT: safe to touch the UI
        responseLabel.setText(responseLabel.getText() + d.contentDelta());
        responseLabel.getParent().revalidateLater();
    }
    @Override public void onComplete(ChatResponse fin) {
        sendButton.setEnabled(true);
    }
    @Override public void onError(Throwable t) {
        Log.e(t);
        sendButton.setEnabled(true);
    }
});
```

Under the hood this is a custom ConnectionRequest subclass that parses SSE line-by-line and dispatches each delta through Display.callSerially. AsyncResource.cancel() kills the socket, so wiring a cancel button into a chat UI takes a single call.

Tool Calls

If you want the model to call back into your app, Tool / ToolChoice give you OpenAI-style function calling. Define the tool, hand the model your request and the available tools, and the response surfaces structured ToolCall objects you dispatch:

```java
Tool getWeather = Tool.builder()
    .name("get_weather")
    .description("Look up the current weather for a city.")
    .parameter("city", "string", "The city name, e.g. \"Paris\".")
    .build();

ChatRequest req = new ChatRequest.Builder()
    .model("gpt-4o-mini")
    .user("Is it raining in Tel Aviv right now?")
    .tool(getWeather)
    .toolChoice(ToolChoice.AUTO)
    .build();

client.chat(req).onResult((resp, err) -> {
    if (err != null) return;
    for (ToolCall call : resp.firstChoice().toolCalls()) {
        if ("get_weather".equals(call.name())) {
            String city = call.argument("city").asString();
            String json = lookupWeather(city);
            // Loop the result back into the conversation
            client.chat(req.replyWithToolResult(call, json))
                .onResult((followUp, e) -> updateUi(followUp));
        }
    }
});
```

The shape mirrors the OpenAI function-calling contract one for one, so anything you have written against the OpenAI API directly maps across without rethinking.

Embeddings

LlmClient.embed(...) returns a vector for any input string.
Useful for similarity search against a local SQLite store (tomorrow's post will cover the new ORM that pairs with this):

```java
EmbeddingRequest er = new EmbeddingRequest.Builder()
    .model("text-embedding-3-small")
    .input("Codename One is a cross-platform mobile framework.")
    .build();

client.embed(er).onResult((emb, err) -> {
    float[] vector = emb.firstVector();
    // store, search, compare
});
```

Image Generation

DALL-E and a Replicate scaffold are surfaced through ImageGenerator:

```java
ImageGenerator gen = ImageGenerator.openAiDallE(apiKey);
gen.generate("A red bicycle leaning against an olive tree", "1024x1024")
    .onResult((img, err) -> {
        if (err != null) return;
        myImageComponent.setIcon(img);
    });
```

Working Against Ollama in the Simulator (No API Charges)

JavaSEPort pings localhost:11434 at startup. If it finds Ollama, it sets the cn1.ai.ollamaDetected property. With cn1.ai.simulatorRedirect=auto (or =ollama) every LlmClient.openai(...) call routes through the local Ollama endpoint instead of OpenAI's. Production code does not change. The iteration loop, your tests, and your offline debugging stop costing money and stop needing an internet connection.

In common/codenameone_settings.properties:

```properties
simulator.cn1.ai.simulatorRedirect=auto
```

(The simulator. prefix scopes the property to the JavaSE simulator path.)

Then run Ollama locally with whichever model your code expects (ollama run llama3.2 or similar) and your existing LlmClient.openai(...) calls go to localhost.

How to Handle API Keys

A direct word on credentials before any of the above sees production. LLM provider API keys (OpenAI, Anthropic, Gemini, your Auth0 / Firebase configs) are bearer tokens with a budget attached. They must never be checked into source control, embedded in your app binary, or hard-coded. A leaked key can be extracted from any APK or IPA in minutes and used to drain your account.
The correct shape is to fetch the key from your own backend over an authenticated request, then store it on the device using the platform's keychain / keystore. The framework provides both pieces:

- com.codename1.crypto.SecureStorage (from the previous release) is the cross-platform wrapper over iOS Keychain Services and Android EncryptedSharedPreferences. Values are encrypted at rest using the platform's hardware-backed protection class where one is available.
- This release adds single-argument get / set / remove(account, ...) overloads next to the existing biometric-gated methods. The new overloads store the value without a per-read Face ID / Touch ID prompt, which is what you want for an LLM API key (you read it on every network call; a biometric prompt every time is not workable). The biometric-gated methods are still there for credentials you do want to gate per use.

A reasonable shape:

```java
private static AsyncResource<String> getOpenAiKey() {
    // Fast path: the key is already in the keychain / keystore
    String cached = SecureStorage.get("openai_api_key");
    if (cached != null) {
        return AsyncResource.complete(cached);
    }
    // Otherwise fetch it from your own backend and cache it
    return Rest.get(myServer + "/v1/credentials/openai")
        .bearerToken(userSessionToken())
        .fetchAsString()
        .onResult((key, err) -> {
            if (err == null) {
                SecureStorage.set("openai_api_key", key);
            }
        });
}
```

Your server gates the credential request behind the user's session, your app caches the result on the keychain, and the key never sits anywhere a reverse-engineering pass could find it. If your server rotates the key, invalidate the cache and refetch.

Existing biometric-gated SecureStorage calls keep working unchanged. The new overloads are additive.

ChatView: A Ready-Made Streaming Chat UI

com.codename1.components.ChatView is the matching UI component. Scrollable message list, ChatBubble for the per-message bubble (theme-aware UIIDs so it picks up the iOS Modern / Material 3 native themes consistently), ChatInput for the bottom input bar, and a one-line bindToLlm(...)
that wires the input to a streaming chat request:

```java
ChatView view = new ChatView();
getOpenAiKey().onResult((key, err) -> {
    view.bindToLlm(LlmClient.openai(key),
        new ChatRequest.Builder()
            .model("gpt-4o-mini")
            .system("You are a friendly tutor for "
                + "Codename One developers.")
            .build());
});

Form f = new Form("Chat", new BorderLayout());
f.add(BorderLayout.CENTER, view);
```

The result is a standard mobile chat layout, picked up from whichever native theme the project uses.

If you want more control than bindToLlm(...) gives you (custom message styling, a "thinking" placeholder, hand-rolled retry, persistence to your own model class), drive the view by hand:

```java
ChatView view = new ChatView();
ConversationStore store = ConversationStore.open("tutor-thread");
view.setMessages(store.load());

LlmClient client = LlmClient.openai(apiKeyFromKeychain);

view.setInputListener(userText -> {
    ChatMessage userMsg = ChatMessage.user(userText);
    view.appendMessage(userMsg);
    store.append(userMsg);

    // Empty assistant bubble that the stream fills in
    ChatMessage assistant = ChatMessage.assistant("");
    view.appendMessage(assistant);

    ChatRequest req = new ChatRequest.Builder()
        .model("gpt-4o-mini")
        .messages(store.load())
        .build();

    client.chatStream(req, new ChatStreamListener() {
        @Override public void onDelta(ChatDelta d) {
            view.appendToLastMessage(d.contentDelta());
        }
        @Override public void onComplete(ChatResponse fin) {
            store.append(ChatMessage.assistant(view.lastMessage().content()));
            view.setInputEnabled(true);
        }
        @Override public void onError(Throwable t) {
            view.appendToLastMessage(" [error: " + t.getMessage() + "]");
            view.setInputEnabled(true);
        }
    });
});
```

appendToLastMessage(...) is the streaming entry point; it marshals through callSerially so deltas land on the EDT in order. ConversationStore persists the thread (the default backing is Storage; pluggable via a custom implementation if you would rather keep it in SQLite or push it to your server).
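For readers curious what "parses SSE line-by-line" involves on the streaming path, here is a pure-JDK sketch of the server-sent events framing: `data:` lines accumulate, a blank line ends the event, and lines starting with `:` are comments. It is an illustration only and not the framework's ConnectionRequest subclass; `SseParser` is a hypothetical name, and the `[DONE]` sentinel in the usage is the OpenAI-style end-of-stream convention.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: extracting event payloads from a server-sent events stream. */
class SseParser {
    /** Reads the stream line by line and returns each event's data payload, in order. */
    static List<String> parse(BufferedReader in) throws IOException {
        List<String> events = new ArrayList<>();
        StringBuilder data = new StringBuilder();
        boolean hasData = false;
        String line;
        while ((line = in.readLine()) != null) {
            if (line.isEmpty()) {
                // A blank line terminates the current event
                if (hasData) {
                    events.add(data.toString());
                }
                data.setLength(0);
                hasData = false;
            } else if (line.startsWith(":")) {
                // Comment line, commonly used as a keep-alive; ignored
            } else if (line.startsWith("data:")) {
                String value = line.substring(5);
                if (value.startsWith(" ")) {
                    value = value.substring(1); // one optional space after the colon
                }
                if (hasData) {
                    data.append('\n'); // multiple data lines join with newlines
                }
                data.append(value);
                hasData = true;
            }
            // Other fields (event:, id:, retry:) are skipped in this sketch
        }
        return events;
    }

    /** Convenience wrapper for in-memory input. */
    static List<String> parseString(String stream) {
        try {
            return parse(new BufferedReader(new StringReader(stream)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String stream = ": ping\n\ndata: {\"delta\":\"Hel\"}\n\ndata: {\"delta\":\"lo\"}\n\ndata: [DONE]\n\n";
        for (String payload : parseString(stream)) {
            System.out.println(payload);
        }
    }
}
```

Because events are emitted in the order their terminating blank lines arrive, handing each payload to a single UI thread preserves delta order, which is the property appendToLastMessage(...) relies on.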
The AI cn1libs

The core LLM stack is paired with a set of opt-in cn1libs that wrap specific on-device capabilities: Google ML Kit features, the TensorFlow Lite runtime, a local Whisper transcription engine, and an on-device Stable Diffusion model. Thirteen new cn1libs ship this release.

These cn1libs are not yet listed in the Codename One Preferences cn1lib picker, so for the moment they are added by hand. Drop the matching dependency block into your project's common/pom.xml and rebuild. The build-time scanner does the rest: the iOS pod or Swift Package, the Android Gradle dependency, the plist usage strings (NSCameraUsageDescription for the vision libraries, NSSpeechRecognitionUsageDescription for Whisper, etc.), and the Android permissions (android.permission.RECORD_AUDIO for audio capture) are all injected automatically the first time the scanner sees the matching class on the classpath.

For each cn1lib below, the dependency block is identical in shape; only the <artifactId> changes. The shared pattern is:

```xml
<dependency>
    <groupId>com.codenameone</groupId>
    <artifactId></artifactId>
    <version>${cn1.version}</version>
</dependency>
```

cn1-ai-mlkit-text: Text Recognition (OCR)

TL;DR. Pull printed or handwritten text out of an image (a photo of a page, a sign, a receipt) entirely on-device.

Platforms. iOS bridges to GoogleMLKit/TextRecognition. Android bridges to com.google.mlkit:text-recognition. The JavaSE simulator returns an unsupported error.

Use cases. Receipt scanning, sign translation pipelines (combine with cn1-ai-mlkit-translate), accessibility tools that read printed text aloud, automated form ingestion.

```java
byte[] jpeg = capturePhotoBytes();
TextRecognizer.recognize(jpeg).onResult((text, err) -> {
    if (err == null) Log.p("OCR: " + text);
});
```

cn1-ai-mlkit-barcode: Barcode and QR Scanning

TL;DR. Decodes QR, EAN, UPC, Data Matrix, PDF417, and the rest of the common 1D / 2D code families from a captured image.

Platforms. iOS bridges to MLKitBarcodeScanning.
Android bridges to com.google.mlkit:barcode-scanning. The JavaSE simulator returns an unsupported error.

Use cases. Inventory scanning, ticket / boarding-pass readers, QR-driven onboarding flows, retail loyalty cards.

```java
byte[] jpeg = capturePhotoBytes();
BarcodeScanner.scan(jpeg).onResult((codes, err) -> {
    if (err == null) {
        for (String code : codes) Log.p("Found: " + code);
    }
});
```

cn1-ai-mlkit-face: Face Detection

TL;DR. Returns bounding boxes for human faces detected in an image. Each face is reported as a packed int[4] (x, y, width, height).

Platforms. iOS bridges to MLKitFaceDetection. Android bridges to com.google.mlkit:face-detection.

Use cases. Auto-crop a contact photo, mosaic / blur bystanders in a group shot, drive a face-tracked overlay for AR-lite filters.

```java
FaceDetector.detect(jpeg).onResult((boxes, err) -> {
    if (err != null) return;
    // Four ints per face: x, y, width, height
    for (int i = 0; i < boxes.length; i += 4) {
        Log.p("face at " + boxes[i] + "," + boxes[i + 1]
            + " " + boxes[i + 2] + "x" + boxes[i + 3]);
    }
});
```

cn1-ai-mlkit-labeling: Image Labeling

TL;DR. "What is in this picture." Returns a list of descriptive labels for the image content.

Platforms. iOS bridges to MLKitImageLabeling. Android bridges to com.google.mlkit:image-labeling.

Use cases. Auto-tagging uploaded photos, content moderation pre-filters, content-based image search.

```java
ImageLabeler.label(jpeg).onResult((labels, err) -> {
    if (err == null) Log.p("labels: " + String.join(", ", labels));
});
```

cn1-ai-mlkit-translate: On-Device Translation

TL;DR. Translate short text between supported language pairs entirely on-device; no server round-trip, no API key, works offline.

Platforms. iOS bridges to MLKitTranslate. Android bridges to com.google.mlkit:translate. Languages are identified by their ISO 639-1 codes (en, fr, es, ...).

Use cases. Offline travel assistants, chat translation, accessibility readers for foreign signage (combine with cn1-ai-mlkit-text).
Java Translator.translate("Where is the train station?", "en", "fr") .onResult((fr, err) -> { if (err == null) Log.p(fr); // "Où est la gare ?" }); cn1-ai-mlkit-smartreply: Short Reply Suggestions TL;DR. Generates short suggested replies for chat conversations, similar to Gmail's Smart Reply chips. Platforms. iOS bridges to MLKitSmartReply. Android bridges to com.google.mlkit:smart-reply. The input is a JSON array of {role, message, timestamp, userId} objects. Use cases. A "quick reply" row above the keyboard in your in-app chat, response suggestions in a CRM inbox. Java String thread = "[{\"role\":\"remote\",\"message\":\"See you at 6?\"," + "\"timestamp\":" + System.currentTimeMillis() + "," + "\"userId\":\"u42\"}]"; SmartReply.suggest(thread).onResult((suggestions, err) -> { if (err == null) { for (String s : suggestions) Log.p("suggestion: " + s); } }); cn1-ai-mlkit-langid: Language Identification TL;DR. Returns the most likely ISO 639-1 code for a given text, or und (undetermined) when the input is too short or ambiguous. Platforms. iOS bridges to MLKitLanguageID. Android bridges to com.google.mlkit:language-id. Use cases. Auto-route a customer-support message to the right team, pick the correct TTS voice for an arbitrary string, pre-screen input before running an expensive translation. Java LanguageIdentifier.identify("Bonjour le monde").onResult((code, err) -> { if (err == null) Log.p(code); // "fr" }); cn1-ai-mlkit-pose: Pose Detection TL;DR. Returns 33 skeletal landmarks per detected pose as a packed float[3 * 33] (x, y, confidence triples). Platforms. iOS bridges to MLKitPoseDetection. Android bridges to com.google.mlkit:pose-detection. Use cases. Fitness apps with form correction, dance/yoga timing analysis, gesture-driven controls. 
Java PoseDetector.detect(jpeg).onResult((landmarks, err) -> { if (err != null || landmarks.length < 99) return; float noseX = landmarks[0], noseY = landmarks[1], noseConf = landmarks[2]; Log.p("nose at (" + noseX + ", " + noseY + ") conf=" + noseConf); }); cn1-ai-mlkit-segmentation: Selfie Segmentation TL;DR. Returns a per-pixel mask separating the person in the foreground from the background as byte[width * height] (0 = background, 255 = foreground). Platforms. iOS bridges to MLKitSegmentationSelfie. Android bridges to com.google.mlkit:segmentation-selfie. Use cases. Background replacement for video calls, sticker / portrait-mode effects, blur-the-background privacy filters. Java SelfieSegmenter.segment(jpeg).onResult((mask, err) -> { if (err == null) applyBackgroundReplacement(mask); }); cn1-ai-mlkit-docscan: Document Scanner TL;DR. Detects a rectangular document in a photo, perspective-corrects it, and writes the cropped JPEG to a temporary file. Returns the file path. Platforms. iOS uses Apple's VisionKit + Core Image rectangle detection (no extra pod). Android uses com.google.android.gms:play-services-mlkit-document-scanner. Use cases. "Scan to PDF" flows, expense apps that capture receipts, contract signing flows, ID-document capture. Java DocumentScanner.scanToFile(jpeg).onResult((path, err) -> { if (err == null) uploadDocument(path); }); cn1-ai-tflite: TensorFlow Lite Interpreter TL;DR. A general-purpose on-device inference engine. Bring your own .tflite model and run it against a float32 input tensor. Platforms. iOS uses TensorFlowLiteSwift (Pods or Swift Package). Android uses org.tensorflow:tensorflow-lite + tensorflow-lite-support. Use cases. Any custom on-device ML model your team trains or pulls from TF Hub. Image classification, simple regression, recommendation pre-filters. 
Java byte[] modelBytes = Util.readFully(Display.getInstance().getResourceAsStream(null, "/model.tflite")); float[] input = featureVector(); Interpreter.run(modelBytes, input).onResult((output, err) -> { if (err == null) Log.p("model returned " + output.length + " values"); }); cn1-ai-whisper: Speech-to-Text via whisper.cpp TL;DR. On-device transcription of a 16 kHz mono WAV file using a ggml-format Whisper model. The cn1lib bundles libwhisper.a. Platforms. iOS uses the Accelerate framework; Android uses a JNI build of the same whisper.cpp core. Models (e.g. ggml-base.bin) are not bundled; ship the one your app expects under the app's resources or download on first launch. Use cases. Voice notes, accessibility transcription, offline dictation, podcast indexing. Java String modelPath = SecureStorage.getFilePath("ggml-base.bin"); String audioPath = recordWavToFile(); WhisperRecognizer.transcribe(modelPath, audioPath) .onResult((text, err) -> { if (err == null) Log.p("heard: " + text); }); cn1-ai-stablediffusion: On-Device Image Generation TL;DR. Generates a JPEG from a text prompt using a bundled Stable Diffusion model. Multi-gigabyte payload, local build only. Platforms. iOS uses Core ML pipelines compiled from the bundled model. Android uses ONNX Runtime. Both configurations exceed the cloud build server's 2 GB upload limit, so this cn1lib triggers the cn1.ai.requiresBigUpload guard and the cloud build aborts with a "build this one locally" message. Add it to a project you build via mvn cn1:buildAndroid / mvn cn1:buildIosXcodeProject on the developer machine. Use cases. Avatar generation in apps where shipping to a cloud API is undesirable (offline-first apps, regulated industries, privacy-sensitive products). 
Java StableDiffusion.generate("a teal hot-air balloon over Lisbon, watercolour", 512, 512, /* steps */ 25) .onResult((jpeg, err) -> { if (err == null) display(Image.createImage(jpeg, 0, jpeg.length)); }); Why These Are cn1libs and Not Part of the Core The core gets the AI plumbing every app that adopts AI at all wants: the LLM client, streaming, the chat UI, the secure storage primitive for credentials, the simulator Ollama redirect for offline iteration. The cn1libs above are specialized verticals. Barcode scanning, document scanning, face detection, smart reply, pose detection, on-device translation, transcription, and on-device image generation are genuinely useful, but only for some apps. They also each bring a non-trivial native dependency. The Google ML Kit Android frameworks are large; the iOS pods carry their own weight; the bundled libwhisper.a and the Stable Diffusion model are big. Pulling all of them into the core would tax every app, whether the feature is used or not. The Stable Diffusion cn1lib in particular is large enough that the cloud build server cannot accept the upload at all (it trips the 2 GB pre-upload guard). That kind of opt-in does not belong in a dependency every app inherits. The corresponding chapter, including the full LlmClient API table, the ChatView reference, the SecureStorage overloads, the simulator Ollama redirect, and the full cn1lib coverage, is at AI, Chat UI, and Speech in the developer guide. OAuth and OIDC: The Modern Identity Stack The in-app-WebView Oauth2 flow that Codename One has shipped since approximately forever was the way every cross-platform mobile framework solved "sign in with Google / Facebook / Microsoft" in the 2010s. It is also the way every one of those identity providers stopped wanting you to solve it. Google has been blocking embedded user agents for years. Apple does not want third-party apps wrapping the Apple ID flow in a WKWebView. Microsoft and Facebook joined the chorus. 
The right answer is the system browser: ASWebAuthenticationSession on iOS, Custom Tabs on Android, with PKCE on the wire. That is what PR #5018 lands. PR #5039 adds a portable WebAuthn / passkey client on top. Sign In With Google (or Any OIDC Provider) com.codename1.io.oidc.OidcClient is the entry point. Point it at the discovery URL of an OIDC provider, hand it the client id and the redirect URI you registered with the provider, ask for tokens: Java OidcConfiguration cfg = OidcConfiguration.discover("https://accounts.google.com"); OidcClient client = OidcClient.builder() .configuration(cfg) .clientId("123-abc.apps.googleusercontent.com") .redirectUri("com.example.myapp:/oauthredirect") .scopes("openid", "email", "profile") .build(); client.signIn().onResult((tokens, err) -> { if (err != null) { OidcException oe = (OidcException) err; if (oe.getCode() == OidcException.USER_CANCELLED) return; Log.e(oe); return; } String idToken = tokens.getIdToken().raw(); String email = tokens.getIdToken().getClaim("email").asString(); proceed(email, idToken); }); Discovery JSON parsed and cached. PKCE S256 challenge generated and verified. State and nonce checked on the callback. ID-token claims decoded for you (we deliberately do not verify the signature client-side; the dev guide is explicit about why and points at the "re-validate on your backend" remedy). Refresh and revoke are first-class. The token store is pluggable via TokenStore; the default is Storage-backed, but a Keychain-backed or in-memory variant is a small class. On iOS the system-browser piece routes through ASWebAuthenticationSession. On Android it routes through androidx.browser.customtabs, with a plain ACTION_VIEW fallback for the rare device with no Custom Tabs provider. AuthenticationServices.framework and androidx.browser:browser are auto-linked when the classpath scanner sees OidcClient in use. 
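To make the "PKCE S256 challenge" step concrete, here is a minimal sketch of what that derivation looks like, following RFC 7636. It is written against plain Java SE classes for illustration only; the class and method names (PkceSketch, newCodeVerifier, s256Challenge) are hypothetical and are not part of the Codename One API, where OidcClient performs this step internally.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.security.SecureRandom;
import java.util.Base64;

// Hypothetical helper for illustration; not a Codename One class.
class PkceSketch {

    /** Creates a random code verifier: 32 random bytes, base64url-encoded without padding. */
    static String newCodeVerifier() {
        byte[] random = new byte[32];
        new SecureRandom().nextBytes(random);
        return Base64.getUrlEncoder().withoutPadding().encodeToString(random);
    }

    /** Derives the S256 challenge: BASE64URL(SHA-256(ASCII(verifier))), per RFC 7636. */
    static String s256Challenge(String verifier) {
        try {
            MessageDigest sha256 = MessageDigest.getInstance("SHA-256");
            byte[] digest = sha256.digest(verifier.getBytes(StandardCharsets.US_ASCII));
            return Base64.getUrlEncoder().withoutPadding().encodeToString(digest);
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is a mandatory algorithm on Java SE, so this should not happen.
            throw new IllegalStateException("SHA-256 unavailable", e);
        }
    }

    public static void main(String[] args) {
        String verifier = newCodeVerifier();
        // The challenge goes out with the authorization request;
        // the verifier is only revealed later, on the token request.
        System.out.println("code_verifier  = " + verifier);
        System.out.println("code_challenge = " + s256Challenge(verifier));
    }
}
```

The client keeps the verifier private and sends only the challenge when it opens the system browser. The provider recomputes the hash when the verifier arrives with the token request, so an intercepted authorization code is useless without the verifier.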
Provider Wrappers: Google, Apple, Microsoft, Facebook, Auth0, Firebase If you would rather not configure OIDC by hand, the existing social classes get a signIn(...) method that drives the same stack with the provider's issuer URL pre-wired: Java GoogleConnect.signIn(googleClientId, "com.example.myapp:/oauthredirect", "openid", "email", "profile") .onResult((tokens, err) -> { /* ... */ }); MicrosoftConnect.signIn(entraClientId, "msauth.com.example.myapp://auth", "User.Read") .onResult((tokens, err) -> { /* ... */ }); Auth0Connect.signIn("tenant.auth0.com", clientId, redirectUri, "openid profile email") .onResult((tokens, err) -> { /* ... */ }); FacebookConnect.signIn(...) follows the same shape against the Facebook OIDC endpoint. FirebaseAuth covers the REST-based Firebase auth surface (email/password, IdP token exchange, refresh) which sits underneath any provider hand-off you might want to drive from app code. Sign In With Apple Sign in with Apple is required on iOS for apps that offer any other social login, and on Android it must fall through to a web flow. com.codename1.social.AppleSignIn handles both transparently: Java AppleSignIn.signIn() .onResult((result, err) -> { if (err != null) return; String idToken = result.getIdToken(); String code = result.getAuthorizationCode(); proceedToBackend(idToken, code); }); On iOS 13 and later this drops directly into the native Apple sheet via ASAuthorizationAppleIDProvider. On non-iOS platforms it falls through to the same OIDC web flow as everything else, so a single line of app code does the right thing on every port. The Maven plugin injects the com.apple.developer.applesignin entitlement on iOS when it sees AppleSignIn in use; Android does not see it because it is not there. Migration From the Legacy Oauth2 com.codename1.io.Oauth2 is now deprecated. 
Existing code still compiles, but the migration is short and almost always shorter than what it replaces: Java // Before Oauth2 oauth = new Oauth2("https://accounts.google.com/o/oauth2/auth", clientId, redirectUri); oauth.setClientSecret(clientSecret); oauth.setScope("openid email profile"); oauth.setBrowserComponent(myBrowserComponent); // tied to a WKWebView String token = oauth.authenticate(); // blocks, opens the web view Java // After OidcClient.builder() .configuration(OidcConfiguration.discover("https://accounts.google.com")) .clientId(clientId) .redirectUri(redirectUri) .scopes("openid", "email", "profile") .build() .signIn() .onResult((tokens, err) -> proceed(tokens.getIdToken().raw())); You stop owning the browser. The OS owns it. The cookies live in the platform's authentication session. The user gets the same login experience they have everywhere else on their device. WebAuthn/Passkeys PR #5039 layers a portable WebAuthn client on top: Java WebAuthnClient client = WebAuthnClient.getInstance(); if (!client.isAvailable()) { fallbackToPassword(); return; } PublicKeyCredentialCreationOptions opts = PublicKeyCredentialCreationOptions.fromServerJson(serverJson); client.create(opts).onResult((cred, err) -> { if (err == null) postToRelyingParty(cred.toJson()); }); W3C JSON wire format in both directions, so the response can be POSTed verbatim to any standard server-side WebAuthn library. iOS 16+ routes through ASAuthorizationPlatformPublicKeyCredentialProvider; Android API 28+ through androidx.credentials.CredentialManager. Provider helpers: Auth0Connect.signInWithPasskey(...) / .registerPasskey(...) and FirebaseAuth.signInWithPasskey(...) / .registerPasskey(...). One thing worth pulling out before you reach for it: if you sign in via OIDC against Google, Apple, Microsoft, Auth0, or Firebase, you usually already get passkeys for free. The identity provider runs the WebAuthn ceremony inside the system browser; OIDC just hands you the resulting tokens. 
So you do not need WebAuthnClient for that case. You need it for apps that run their own relying-party backend, and for apps driving the Auth0 or Firebase passkey grants directly. Full chapter: Authentication and Identity. Connectivity: WiFi, Bonjour, USB, network-type listeners PR #5021 lands four packages for apps that need to do more with the network than open an HTTP socket. The shape: Java WiFi wifi = WiFi.getInstance(); String ssid = wifi.getCurrentSSID(); String bssid = wifi.getBSSID(); String gateway = wifi.getGateway(); String ip = wifi.getIp(); wifi.scan(new ScanOptions().setTimeoutMillis(5000)) .onResult((results, err) -> { /* ... */ }); wifi.connect("MyNetwork", "hunter2", Security.WPA2_PSK) .onResult((success, err) -> { /* ... */ }); com.codename1.io.wifi for WiFi info, scan, and connect. com.codename1.io.wifi.WiFiDirect for peer-to-peer (Android only by platform reality). com.codename1.io.bonjour for mDNS / Zeroconf via BonjourBrowser and BonjourPublisher. com.codename1.io.usb for USB host (Android only). And NetworkManager.addNetworkTypeListener(...) plus NETWORK_TYPE_* constants so an app can react to a transition between cellular, WiFi, ethernet, or "none": Java NetworkManager.getInstance().addNetworkTypeListener(evt -> { int type = evt.getNetworkType(); if (type == NetworkManager.NETWORK_TYPE_NONE) showOfflineBanner(); else if (type == NetworkManager.NETWORK_TYPE_CELLULAR) suppressLargeBackgroundDownloads(); else clearOfflineBanner(); }); iOS does not expose programmatic WiFi scanning to third-party apps; scan() throws UnsupportedOperationException on iOS. iOS also does not expose WiFi Direct or general USB host. None of those are Codename One limitations; they are Apple's. The dev guide is explicit about each platform's limits. Three new compile-time defines (CN1_INCLUDE_WIFI_INFO, CN1_INCLUDE_HOTSPOT, CN1_INCLUDE_BONJOUR) wrap the iOS native code, set only when the classpath scanner sees the matching Java API in use. 
Apps that do not use these APIs do not pay for them at App Store review time. Same pattern as the NFC gating from the previous release. Full reference: Network Connectivity. Share-Sheet Result Callbacks PR #5036 closes a small but persistent gap: Display.share(...) and ShareButton finally tell you what the user did with the share sheet: Java ShareButton btn = new ShareButton(); btn.setTextToShare("Look at this fox"); btn.setImageToShare("/fox.jpg"); btn.setShareResultListener(result -> { switch (result.getStatus()) { case SHARED_TO: track("share_completed", result.getTargetPackage()); break; case DISMISSED: track("share_dismissed"); break; case FAILED: track("share_failed", result.getError()); break; } }); iOS routes through UIActivityViewController.completionWithItemsHandler; Android through Intent.createChooser with an IntentSender callback (API 22+). The framework normalizes the platform values into SHARED_TO(packageName), DISMISSED, or FAILED. Appearing in Other Apps' Share Menus The other half of sharing is the inverse direction: not "let the user share from your app", but "let your app receive content other apps share". If a user is in Safari, Photos, or Mail and taps the share icon, your app should be able to appear as a target there alongside Messages, WhatsApp, and Instagram. On iOS that requires a separate Share Extension target inside the .ipa, with its own bundle, its own Info.plist, an App Group string that links it to the host app, and a ShareViewController that handles the incoming payload. Historically the recommendation was to bootstrap that target by hand in Xcode, copy the resulting files into the Codename One project under ios/app_extensions/, and let the build server's extractor consume them. It worked, but it was a workflow most teams put off because the setup is fiddly. The same PR ships an IOSShareExtensionBuilder Mojo that does all of that for you. 
A typical setup is one Maven command and a one-time configuration block: XML <plugin> <groupId>com.codenameone</groupId> <artifactId>codenameone-maven-plugin</artifactId> <configuration> <iosShareExtension> <bundleIdentifier>com.example.myapp.share</bundleIdentifier> <displayName>MyApp</displayName> <appGroup>group.com.example.myapp</appGroup> <acceptedContent> <content>PUBLIC_URL</content> <content>PUBLIC_IMAGE</content> <content>PUBLIC_TEXT</content> </acceptedContent> </iosShareExtension> </configuration> </plugin> Run mvn cn1:generate-ios-share-extension and the Mojo writes a complete .ios.appext bundle into ios/app_extensions/: the Info.plist with the right NSExtension activation rules for the content types you declared, the App Group entitlement, a minimal ShareViewController.swift that lands the payload in the App Group's UserDefaults(suiteName:), and the matching buildSettings.properties. The result feeds straight into the existing IPhoneBuilder.extractAppExtensions pipeline, so apps that already have a hand-rolled extension keep working unchanged. On the host-app side, you read the payload on launch: Java // Anywhere after Display.init has run; readObject returns Object, so cast it String shared = (String) Storage.getInstance() .readObject("ios.shareExtension.lastPayload"); if (shared != null) { handleSharedPayload(shared); } After the next cloud or local build, your app appears in the iOS share sheet for the content types you declared. No Xcode work, no hand-rolled plist, no App Group string typed in three places. The build-time tooling owns it. Wrapping Up Tomorrow's post covers the architectural change in this release: a build-time bytecode annotation framework, the declarative router that is its first consumer, the SQLite ORM and JSON / XML mappers and component binder built on the same SPI, and the build-time SVG / Lottie transcoder that ships in the same release for related reasons.

By Shai Almog


Code and Connect: MCP + MuleSoft

I often find myself in conversations where the same words keep popping up again and again: Agents, MCP, and A2A. Everyone seems excited about them. But the funny part is that when the topic shifts to MCP (Model Context Protocol), the explanations start to vary. One day, someone confidently said, “An MCP server is basically a tool.” Another person immediately disagreed and replied, “No, no — MCP is more like a client.” Before that debate could settle, someone else joined the conversation and said, “Actually, MCP is just a protocol.” And then another perspective appeared: “Think of it as middleware that sits between an agent and APIs.” At that moment, I realized something interesting: we were all talking about the same concept, yet each of us understood it a little differently. These conversations made me curious. If experienced developers and architects describe MCP in different ways, how confusing must it be for someone who is just starting to explore this space? The more I listened, the more I noticed a pattern — people weren’t wrong, but they were often describing only one piece of the puzzle. That realization is what inspired this blog. In this article, I want to step back from the buzzwords and walk through the concepts in a simple way. What exactly is MCP? Is it a server? A tool? A client? Or something else entirely? And how does it relate to the agents that everyone keeps talking about? Is it applicable only to agents, or is it applicable to assistants also? We will also explore MuleSoft's capability in this space. By the end of this post, my goal is to bring clarity to these terms and show how they connect. Instead of hearing multiple interpretations in different conversations, you’ll be able to see the complete picture of how MCP fits into modern AI and integration architectures. Let's Understand What Anthropic Says About MCP MCP (Model Context Protocol) is an open-source standard for connecting AI applications to external systems. 
Think of MCP like a USB-C port for AI applications. Just as USB-C provides a standardized way to connect electronic devices, MCP provides a standardized way to connect AI applications to external systems. MCP at high level Now let's break down each component and understand it in the simplest way possible. AI Application An AI application can be any application that consists of an LLM, orchestration, and tools (you can think of it as an assistant), or it may consist of more complex components such as agent orchestration, specialized agents, and tools (you can think of it as an agentic application). Tools can be a payment gateway, a data retrieval API, a weather API, a file system, web search, etc. MCP Model Context Protocol is an open protocol that enables seamless integration between AI applications (LLM applications) and external data sources and tools. MCP provides a standardized way to connect LLMs with the context they need. MCP follows a client-server architecture. The key components of this architecture are the MCP Host, the MCP Client, and the MCP Server. Let's extend our previous architecture. MCP architecture MCP Host It is the AI application (or the environment where it runs) that coordinates one or more MCP clients. MCP Client It is a component that establishes a connection with the MCP Server and gets the context for the MCP Host to use. MCP Server It exposes external services that provide context to LLMs. Model Context Protocol consists of two layers: Data layer: The data layer implements a JSON-RPC 2.0 based exchange protocol that defines the message structure and semantics for client-server communication. Transport layer: The transport layer manages communication channels and authentication between clients and servers. 
It handles connection establishment, message framing, and secure communication between MCP participants. MCP supports two transport mechanisms: Stdio transport: Uses standard input/output streams for direct process communication between local processes on the same machine, providing optimal performance with no network overhead. Streamable HTTP transport: Uses HTTP POST for client-to-server messages with optional Server-Sent Events for streaming capabilities. This transport enables remote server communication and supports standard HTTP authentication methods, including bearer tokens, API keys, and custom headers. MCP recommends using OAuth to obtain authentication tokens. Use Case Consider a "Weather Intelligence Agent" that uses the MCP server to call a tool that provides weather information based on a city name. This is a simple use case just to demonstrate how an API is called as a tool using MCP. We will use Postman and Cursor to act as the agent/assistant that calls the Weather API. Let's see how we can implement this use case using MuleSoft: Step 1: MuleSoft provides the MCP Server - Tool Listener connector. We will configure the MCP Server. 
MuleSoft code Refer to the code: XML <?xml version="1.0" encoding="UTF-8"?> <mule xmlns:ee="http://www.mulesoft.org/schema/mule/ee/core" xmlns:http="http://www.mulesoft.org/schema/mule/http" xmlns:mcp="http://www.mulesoft.org/schema/mule/mcp" xmlns="http://www.mulesoft.org/schema/mule/core" xmlns:doc="http://www.mulesoft.org/schema/mule/documentation" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.mulesoft.org/schema/mule/core http://www.mulesoft.org/schema/mule/core/current/mule.xsd http://www.mulesoft.org/schema/mule/mcp http://www.mulesoft.org/schema/mule/mcp/current/mule-mcp.xsd http://www.mulesoft.org/schema/mule/http http://www.mulesoft.org/schema/mule/http/current/mule-http.xsd http://www.mulesoft.org/schema/mule/ee/core http://www.mulesoft.org/schema/mule/ee/core/current/mule-ee.xsd"> <http:listener-config name="HTTP_Listener_config" doc:name="HTTP Listener config" doc:id="251f2d7c-e84b-4974-a1e8-96d9779bc9e9" > <http:listener-connection host="0.0.0.0" port="8081" /> </http:listener-config> <mcp:server-config name="MCP_Server" doc:name="MCP Server" doc:id="289fb886-e732-4274-990e-9876aca405a6" serverName="mule-mcp-server" serverVersion="1.0.0"> <mcp:streamable-http-server-connection listenerConfig="HTTP_Listener_config"/> </mcp:server-config> <http:request-config name="HTTP_Request_config" doc:name="HTTP Request config" doc:id="b31d7d79-b45b-42ec-a970-50eb19a0a702" > <http:request-connection protocol="HTTPS" host="api.weatherstack.com" /> </http:request-config> <flow name="mcp-weahter-intelligence-apiFlow" doc:id="b1c21d3c-18f0-4eac-bb4e-3cf789608580" > <mcp:tool-listener doc:name="MCP Server - Tool Listener" doc:id="4c42c1cb-898d-4fb9-8d0e-edc541fffb75" config-ref="MCP_Server" name="get_weather_information"> <mcp:description ><![CDATA[This tool gets weather information. Check weather details for device by providing the city name as input or paramValue. 
Please use the query.]]></mcp:description> <mcp:parameters-schema ><![CDATA[{ "$schema": "http://json-schema.org/draft-07/schema#", "type": "object", "properties": { "query": { "type": "string", "description": "city for querying weather data" } }, "required": ["query"], "additionalProperties": false }]]></mcp:parameters-schema> <mcp:responses > <mcp:text-tool-response-content text="#[payload.^raw]" priority="1"> <mcp:audience > <mcp:audience-item value="ASSISTANT" /> </mcp:audience> </mcp:text-tool-response-content> </mcp:responses> </mcp:tool-listener> <http:request doc:name="Request" doc:id="d10760de-5f93-4f63-aadc-9bfc491f94e0" config-ref="HTTP_Request_config" path="/current"> <http:query-params ><![CDATA[#[output application/java --- { "access_key" : "96d01954d0c4e444aa781fa10b92caff", "query" : payload.query, "units" : "m" }]]]></http:query-params> </http:request> </flow> </mule> Let's run this code and test it: MCP server started successfully: Deployment log Step 2: Let's use Postman as the MCP client to test it and see if it is working as expected: MCP server and available tools Step 3: Click on Connect: Connected to MCP Server Step 4: Now the MCP client is connected to the MCP server. You need to pass a query parameter as the city name, and you will get the weather details: I am writing this Blog from GOA (The Beach Capital of India). I will use GOA as the City name to retrieve weather information about GOA. Use the tool Step 5: Click on Run, and you will get the response as shown below: Response I have demonstrated it in my local version of code, which is deployed in Anypoint Studio. Let's test the same after deploying it to the runtime manager. I have deployed the code to the runtime manager. Deployed in the Anypoint platform Test result I have demonstrated this using Postman, where Postman worked as an MCP client to connect to the MCP server. 
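For readers curious about what Postman sends on the wire when you click Run, here is a minimal sketch of the JSON-RPC 2.0 tools/call request that an MCP client would post to the server for the get_weather_information tool. The class and method names (McpToolCallSketch, toolCallRequest, jsonEscape) are my own illustration, the request id of 1 is arbitrary, and the initialization handshake that precedes tool calls is left out for brevity.

```java
// Hypothetical helper for illustration; it only builds the request body as a string.
class McpToolCallSketch {

    /** Escapes quotes and backslashes so the value is safe inside a JSON string.
     *  Control characters are not handled in this sketch. */
    static String jsonEscape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    /** Builds a JSON-RPC 2.0 "tools/call" request whose only argument is "query",
     *  matching the parameters schema declared in the Mule flow above. */
    static String toolCallRequest(int id, String toolName, String city) {
        return "{\"jsonrpc\":\"2.0\",\"id\":" + id
                + ",\"method\":\"tools/call\""
                + ",\"params\":{\"name\":\"" + jsonEscape(toolName) + "\""
                + ",\"arguments\":{\"query\":\"" + jsonEscape(city) + "\"}}}";
    }

    public static void main(String[] args) {
        // Prints the body an MCP client would POST to the streamable HTTP endpoint.
        System.out.println(toolCallRequest(1, "get_weather_information", "Goa"));
    }
}
```

The name field must match the name attribute on the mcp:tool-listener, and the arguments object must satisfy the JSON schema in mcp:parameters-schema; since the schema sets additionalProperties to false, any extra argument would be rejected.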
We can extend this further and use Cursor to mimic agentic behavior, where the agent uses the MCP tool to get the answer. Cursor to use MCP For this demonstration I used a no-code/low-code tool, MuleSoft. In the next blog, I will demonstrate the same use case in Python. Watch the video for more details. Let me know if you liked it!

By Ajay Singh

REST-Assured Configuration and Specifications: Writing Maintainable API Tests

When working on API automation projects, one of the first things that becomes repetitive is configuring the same settings for every test. The base URL, content type, request logging, and common response validations often appear in multiple test classes. As the number of tests increases, maintaining these repeated configurations becomes difficult. REST Assured provides specifications to solve this problem. Instead of defining the same settings in every test, common configurations and specifications can be created once and reused throughout the test suite. This article demonstrates a simple approach to configuring REST Assured using a Base Test class along with Request and Response Specifications. What Are REST-Assured Specifications? A specification is a reusable configuration object that contains common request or response settings. So, instead of repeatedly writing: Java given() .baseUri("https://api.example.com") .header("Authorization", "Bearer token") .contentType(ContentType.JSON) The configuration can be defined once and reused across multiple tests. Similarly, common validations can also be written using specifications. Specifications help to reduce code duplication, improve test readability, centralize API configurations, simplify maintenance, and standardize request and response validations. Why Use Specifications? Consider an API test that retrieves order details. Java @Test public void getOrderDetails() { given() .baseUri("https://api.example.com") .when() .get("/orders/2") .then() .statusCode(200); } The test works correctly, but the base URI and common validations, such as the status code, will need to be repeated in every test. A better approach is to move these common settings into reusable specifications. What Problem Does It Solve? In many API automation projects, test cases often contain repeated configuration code. The same base URL, content type, authentication details, headers, and response validations are repeated across multiple test classes. 
While this may not seem like a problem when there are only a few tests, maintaining the test suite becomes difficult as the project grows. Consider a scenario where the API base URL changes from a QA environment to a Staging environment. Without a centralized configuration, every test containing the old URL would need to be updated. Similarly, if a common header or authentication mechanism changes, modifications would be required in multiple places. Request and Response Specifications solve this problem by moving common configurations into reusable objects. Instead of repeating the same setup in every test, the configuration is defined once and reused wherever required. This reduces code duplication, improves readability, and makes the test suite easier to maintain. As a result, test methods can focus on validating business functionality rather than configuring API requests and responses. This leads to cleaner and more maintainable automation code. Creating a SetupSpecification Class The most common configurations should be placed in a separate class. This allows all test classes to inherit the same setup. The following example creates a Request and Response Specification in a separate class using the @BeforeClass annotation. Java public class SetupSpecification { @BeforeClass public void setup () { final RequestSpecification request = new RequestSpecBuilder () .addHeader ("Content-Type", "application/json") .setBaseUri ("http://localhost:3004") .addFilter (new RequestLoggingFilter ()) .addFilter (new ResponseLoggingFilter ()) .build (); final ResponseSpecification response = new ResponseSpecBuilder () .expectResponseTime (lessThan (10000L)) .build (); RestAssured.requestSpecification = request; RestAssured.responseSpecification = response; } } This setup method runs before the test class execution. The Request Specification contains the base URI, content type, and logging configuration. 
Any configuration defined in a Request Specification will be applied to every API request that uses that specification. For example, if the specification includes a common header, authentication token, content type, or query parameter, those values will automatically be sent with all requests that reference the specification. While this promotes reusability and reduces duplication, care should be taken when adding request-specific details to a shared specification. Not all APIs may require the same headers, authentication mechanisms, query parameters, or request bodies. Including such configurations in a common specification can lead to unintended behavior and make tests more difficult to maintain. The Response Specification contains the common validations that are expected from the API response. The expectResponseTime() method validates that the API responds within the specified time limit. Validations can also be added for the status code, headers, content type, cookies, and body. However, it is important to understand that any validation defined in a Response Specification will be applied to every API test that uses that specification. For example, if the specification includes a validation for a 200 status code, all tests using that specification will automatically expect a 200 response. This may not be appropriate for APIs that are expected to return different status codes, such as 201, 204, 400, or 404. The same consideration applies to validations related to headers, content type, cookies, and response body content. Including endpoint-specific validations in a shared specification can reduce flexibility and make tests harder to maintain. A good practice is to keep only the truly common validations in a shared Response Specification and add endpoint-specific assertions within the individual test methods. The statements below make the Request and Response Specifications available globally for the test execution. 
Java
RestAssured.requestSpecification = request;
RestAssured.responseSpecification = response;

As a result, the base URI, the Content-Type header, and the response-time validation do not need to be specified in every test.

Writing a Test Using the Specifications

Once the setup is complete, test classes can extend the SetupSpecification class.

Java
import static io.restassured.RestAssured.given;
import static org.hamcrest.Matchers.equalTo;

import org.testng.annotations.Test;

public class TestGetRequestWithRestAssuredSpecs extends SetupSpecification {

    @Test
    public void getRequestTestWithRestAssuredConfig() {
        final int orderId = 3;

        given().queryParam("id", orderId)
            .when()
            .get("/getOrder")
            .then()
            .statusCode(200)
            .and()
            .assertThat()
            .body("orders[0].id", equalTo(orderId),
                  "orders[0].product_name", equalTo("USB-C Charger"));
    }
}

The Request Specification is automatically applied because it was configured in the SetupSpecification class. All the common request configurations, such as the base URI, headers, content type, and logging settings, are therefore applied to the request. Similarly, the common response-time validation configured in the SetupSpecification class is reused during test execution. The test itself focuses only on endpoint-specific details: it passes the id query parameter, invokes the /getOrder endpoint, and asserts on the status code and body. This approach keeps the test concise and improves maintainability by separating common configuration from test-specific assertions.

Adding Additional Assertions

The Response Specification can handle common validations, while endpoint-specific assertions can still be added in the test.
Java
import static io.restassured.RestAssured.given;
import static org.hamcrest.Matchers.equalTo;

import org.testng.annotations.Test;

public class TestGetRequestWithRestAssuredSpecs extends SetupSpecification {

    @Test
    public void getRequestTestWithRestAssuredConfig() {
        final int orderId = 3;

        given().queryParam("id", orderId)
            .when()
            .get("/getOrder")
            .then()
            .statusCode(200)
            .and()
            .assertThat()
            // Endpoint-specific checks stay in the test, not in the shared specification.
            .body("orders[0].id", equalTo(orderId),
                  "orders[0].product_name", equalTo("USB-C Charger"));
    }
}

In this example, the response body validations for order ID and product name remain inside the test because they are specific to this API endpoint.

Why This Approach Is Useful

As the test suite grows, hundreds of API tests may use the same base URL, content type, authentication, and response validations. Maintaining these configurations in every test class can quickly become difficult. Keeping the Request and Response Specifications in a separate class provides a centralized location for managing common settings. If the API URL changes or additional configurations need to be added, only a single file needs to be updated. This approach also improves readability because the test methods contain only the business validations relevant to the API being tested.

Using Request and Response Specifications Directly in the Test Class

While many automation projects prefer keeping specifications in a separate class, there are situations where creating specifications directly inside the test class makes sense. This approach is useful for smaller projects, proof-of-concept implementations, or when a test class requires its own configuration that is not shared with other tests. In this approach, the Request and Response Specifications are created using the @BeforeClass annotation and are available only within the current test class.
Java
import static io.restassured.RestAssured.given;
import static org.hamcrest.Matchers.equalTo;

import io.restassured.builder.RequestSpecBuilder;
import io.restassured.builder.ResponseSpecBuilder;
import io.restassured.filter.log.RequestLoggingFilter;
import io.restassured.filter.log.ResponseLoggingFilter;
import io.restassured.specification.RequestSpecification;
import io.restassured.specification.ResponseSpecification;
import org.testng.annotations.BeforeClass;
import org.testng.annotations.Test;

public class StringRelatedAssertionTests {

    private static ResponseSpecification responseSpecification;
    private static RequestSpecification requestSpecification;

    @BeforeClass
    public void setupSpecBuilder() {
        // Request details shared by every test in this class.
        final RequestSpecBuilder requestSpecBuilder = new RequestSpecBuilder()
            .setBaseUri("https://api.restful-api.dev/objects")
            .addQueryParam("id", 3)
            .addFilter(new RequestLoggingFilter())
            .addFilter(new ResponseLoggingFilter());

        // Response expectation shared by every test in this class.
        final ResponseSpecBuilder responseSpecBuilder = new ResponseSpecBuilder()
            .expectStatusCode(200);

        responseSpecification = responseSpecBuilder.build();
        requestSpecification = requestSpecBuilder.build();
    }

    @Test
    public void testStringAssertions() {
        given().spec(requestSpecification)
            .get()
            .then()
            .spec(responseSpecification)
            .assertThat()
            .body("[0].name", equalTo("Apple iPhone 12 Pro Max"));
    }
}

In this example, the Request and Response Specifications are created once in the @BeforeClass method and stored in static variables. The Request Specification contains common request details such as the base URI, query parameters, and logging filters, while the Response Specification defines the expected status code.

During test execution, the Request Specification is applied using the spec(requestSpecification) method before sending the request. After the response is received, the Response Specification is applied using spec(responseSpecification) to validate the common response expectations before performing additional assertions on the response body.

Keeping the specifications and test logic within the same class makes the example easy to follow, as both the setup and test execution are located in a single file. However, as the test suite grows and multiple test classes require the same configurations, duplicating specifications across classes can become difficult to maintain.
In such situations, moving the common Request and Response Specifications to a separate class provides better reusability and reduces code duplication. For smaller projects or learning purposes, defining the specifications directly within the test class remains a simple and effective approach.

Summary

REST Assured Specifications help create cleaner and more maintainable API automation tests. A best practice is to define the Request and Response Specifications in a separate class and initialize them using the @BeforeClass annotation. The Request Specification manages settings such as the base URI, content type, and logging, while the Response Specification handles common response validations. By centralizing these configurations, test classes become shorter, easier to read, and simpler to maintain. For API automation frameworks built with REST Assured and TestNG, this pattern provides a clean foundation that scales well as the number of tests increases.

By Faisal Khatri

Implementing Asynchronous Communication Between Microservices Using Kafka and Spring Boot

In a microservices system, tight synchronous coupling turns a small hiccup into a cascading slowdown. Thread pools fill, retries amplify traffic, and suddenly your simple request is blocked on half the fleet. My executive summary: asynchronous messaging with Kafka helps systems keep moving when individual components inevitably slow down or fail. It does this by decoupling producers from consumers, absorbing traffic spikes, and allowing services to evolve without tying their availability directly to one another.

Code Patterns in Spring Boot With Kafka

Spring for Apache Kafka gives me two primitives that feel pleasantly old Spring: KafkaTemplate for sending and @KafkaListener for receiving. That template/listener model is intentionally similar to other Spring integration tech, which keeps application code focused on domain logic instead of raw client plumbing. Below is a compact (but production-shaped) pattern: externalized config via @ConfigurationProperties, a service port for publishing, a REST command endpoint, a consumer with a real error strategy (DLT), and a REST error advice.
Java
// Shown as one listing for brevity; in a real project each public type lives in its own file.
import java.math.BigDecimal;
import java.time.Instant;
import java.util.Map;
import java.util.UUID;

import jakarta.validation.Valid;
import jakarta.validation.constraints.NotBlank;
import jakarta.validation.constraints.NotNull;
import jakarta.validation.constraints.Positive;
import org.apache.kafka.common.TopicPartition;
import org.springframework.boot.context.properties.ConfigurationProperties;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.http.HttpStatus;
import org.springframework.http.ProblemDetail;
import org.springframework.http.ResponseEntity;
import org.springframework.kafka.annotation.KafkaListener;
import org.springframework.kafka.core.KafkaTemplate;
import org.springframework.kafka.listener.DeadLetterPublishingRecoverer;
import org.springframework.kafka.listener.DefaultErrorHandler;
import org.springframework.stereotype.Component;
import org.springframework.util.backoff.FixedBackOff;
import org.springframework.validation.annotation.Validated;
import org.springframework.web.bind.annotation.*;

// === Messaging config (externalized, type-safe) ===
// Register with @ConfigurationPropertiesScan or @EnableConfigurationProperties(OrdersMessagingProps.class).
@ConfigurationProperties(prefix = "messaging.orders")
@Validated
record OrdersMessagingProps(
    @NotBlank String topic,
    @NotBlank String dltTopic
) {}

// === DTO (event contract) ===
public record OrderCreatedEvent(UUID orderId, UUID userId, BigDecimal total, Instant createdAt) {}

// === Service port (keeps domain testable, Kafka swappable) ===
public interface OrderEventPublisher {
    void publishOrderCreated(OrderCreatedEvent event);
}

// === Adapter: Kafka producer ===
@Component
class KafkaOrderEventPublisher implements OrderEventPublisher {
    private final KafkaTemplate<String, OrderCreatedEvent> template;
    private final OrdersMessagingProps props;

    KafkaOrderEventPublisher(KafkaTemplate<String, OrderCreatedEvent> template, OrdersMessagingProps props) {
        this.template = template;
        this.props = props;
    }

    @Override
    public void publishOrderCreated(OrderCreatedEvent event) {
        // Keying by orderId keeps per-order ordering and drives partitioning decisions.
        template.send(props.topic(), event.orderId().toString(), event);
    }
}

// === REST command API (synchronous edge, async core) ===
@RestController
@RequestMapping("/v1/orders")
class OrdersController {
    private final OrderService orderService; // domain port

    OrdersController(OrderService orderService) {
        this.orderService = orderService;
    }

    @PostMapping
    public ResponseEntity<Map<String, Object>> create(@Valid @RequestBody CreateOrderRequest req) {
        UUID orderId = orderService.create(req.userId(), req.total()); // persists + publishes event
        return ResponseEntity.accepted().body(Map.of("orderId", orderId, "status", "ACCEPTED"));
    }

    record CreateOrderRequest(@NotNull UUID userId, @NotNull @Positive BigDecimal total) {}
}

// === Domain service port (implementation can use outbox, transactions, etc.) ===
public interface OrderService {
    UUID create(UUID userId, BigDecimal total);
}

// === Consumer: downstream service reacts to events ===
@Component
class BillingListener {
    @KafkaListener(topics = "${messaging.orders.topic}", groupId = "${spring.kafka.consumer.group-id}")
    void onOrderCreated(OrderCreatedEvent event) {
        // Idempotency belongs here: process-by-key + store processed eventId/orderId to avoid duplicates.
        // Do work (charge card, create invoice, etc.)
    }
}

// === Kafka consumer error handling: retries + DLT ===
@Configuration
class KafkaErrorHandlingConfig {
    @Bean
    DefaultErrorHandler defaultErrorHandler(KafkaTemplate<Object, Object> template, OrdersMessagingProps props) {
        // Failed records go to the DLT, on the same partition number as the source record.
        var recoverer = new DeadLetterPublishingRecoverer(template,
            (rec, ex) -> new TopicPartition(props.dltTopic(), rec.partition()));
        // Backoff and retry policy are configurable; keep it finite to avoid poison-pill loops.
        return new DefaultErrorHandler(recoverer, new FixedBackOff(1000L, 3));
    }
}

// === REST error handling (ProblemDetail) ===
@RestControllerAdvice
class ApiErrors {
    @ExceptionHandler(IllegalArgumentException.class)
    @ResponseStatus(HttpStatus.BAD_REQUEST)
    ProblemDetail badRequest(IllegalArgumentException ex) {
        var pd = ProblemDetail.forStatusAndDetail(HttpStatus.BAD_REQUEST, ex.getMessage());
        pd.setTitle("Invalid request");
        return pd;
    }
}

A few been-burned-before notes on the code above. Spring Kafka’s reference docs are explicit that KafkaTemplate is the convenience wrapper for producing, and DefaultErrorHandler + DeadLetterPublishingRecoverer is a first-class way to route failed records to dead-letter topics after retries. If we want non-blocking retries, Spring Kafka also provides @RetryableTopic, which orchestrates retry topics and a DLT automatically. That is useful when transient failures are common and you want predictable retry delay semantics.
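To make the retry-then-dead-letter flow concrete, here is a plain-Java sketch of the control flow. It is not Spring Kafka's implementation, and the class and method names are hypothetical. It assumes, as I understand the semantics, that the second argument of FixedBackOff counts retries after the first delivery, so FixedBackOff(1000L, 3) allows up to four deliveries before the record is handed to the recoverer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Illustrative only: models "deliver, retry a fixed number of times, then dead-letter".
final class RetryThenDeadLetter<T> {

    private final int maxRetries;       // retries after the first delivery (assumed FixedBackOff meaning)
    private final long backoffMillis;   // fixed pause between attempts
    private final List<T> deadLetters = new ArrayList<>();
    private int deliveries;

    RetryThenDeadLetter(int maxRetries, long backoffMillis) {
        this.maxRetries = maxRetries;
        this.backoffMillis = backoffMillis;
    }

    // Hands the record to the handler until it succeeds or retries are exhausted.
    void deliver(T record, Consumer<T> handler) {
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            deliveries++;
            try {
                handler.accept(record);
                return; // success: nothing to retry or dead-letter
            } catch (RuntimeException ex) {
                if (attempt < maxRetries) {
                    pause(backoffMillis);
                }
            }
        }
        // Retries exhausted: park the record instead of looping forever on a poison pill.
        deadLetters.add(record);
    }

    int deliveries() {
        return deliveries;
    }

    List<T> deadLetters() {
        return List.copyOf(deadLetters);
    }

    private static void pause(long millis) {
        if (millis <= 0) {
            return;
        }
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
        }
    }
}
```

A handler that always throws ends up in deadLetters() after maxRetries + 1 deliveries, which is the finite behavior the comment in KafkaErrorHandlingConfig asks for. Note that this blocking style pauses the consumer between attempts, which is the trade-off @RetryableTopic is meant to avoid.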
Containers and Local Dev With Docker Compose

When I’m chasing down event flow bugs, I like local environments that feel like the old days: one command, deterministic startup order, and no mystery dependencies. Docker Compose is still the quickest way to stand up Kafka alongside your services, and Confluent publishes straightforward Docker-based tutorials and compose examples for running Kafka locally.

For the service image itself, multi-stage builds are the modern classic: compile in a builder stage, and copy the artifact into a slimmer runtime stage. Docker documents multi-stage builds as a way to reduce the final image contents and keep build dependencies out of production.

Dockerfile
# Multi-stage Dockerfile for a Spring Boot service (orders-service)
FROM eclipse-temurin:21-jdk AS build
WORKDIR /workspace
# Copy the build descriptor first so the dependency layer is cached between source changes.
COPY mvnw pom.xml ./
COPY .mvn .mvn
RUN ./mvnw -q -DskipTests dependency:go-offline
COPY src src
RUN ./mvnw -q -DskipTests package

FROM eclipse-temurin:21-jre
WORKDIR /app
COPY --from=build /workspace/target/*.jar app.jar
EXPOSE 8080
ENTRYPOINT ["java","-jar","/app/app.jar"]

And here’s a Compose file that wires up Kafka and Schema Registry, plus an example Spring Boot service. The exact image choices are illustrative; production choices should reflect your own standards and security posture.
YAML
# compose.yaml (local/dev)
services:
  zookeeper:
    image: confluentinc/cp-zookeeper:7.6.0
    environment:
      ZOOKEEPER_CLIENT_PORT: 2181

  kafka:
    image: confluentinc/cp-kafka:7.6.0
    depends_on: [zookeeper]
    ports: ["9092:9092"]
    environment:
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
      # Each listener needs its own port: 29092 inside the Compose network, 9092 from the host.
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka:29092,PLAINTEXT_HOST://localhost:9092
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT
      KAFKA_INTER_BROKER_LISTENER_NAME: PLAINTEXT
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1

  schema-registry:
    image: confluentinc/cp-schema-registry:7.6.0
    depends_on: [kafka]
    ports: ["8081:8081"]
    environment:
      SCHEMA_REGISTRY_HOST_NAME: schema-registry
      SCHEMA_REGISTRY_KAFKASTORE_BOOTSTRAP_SERVERS: PLAINTEXT://kafka:29092

  orders:
    build: ./orders-service
    depends_on: [kafka]
    ports: ["8080:8080"]
    environment:
      SPRING_KAFKA_BOOTSTRAP_SERVERS: kafka:29092
      MESSAGING_ORDERS_TOPIC: orders.events
      MESSAGING_ORDERS_DLTTOPIC: orders.events.dlt
      SCHEMA_REGISTRY_URL: http://schema-registry:8081

Deploying on Kubernetes or AWS

On AWS, the Kafka decision is usually managed or self-managed. If you choose Amazon MSK, the cluster lives in your VPC: you pick subnets across distinct Availability Zones and connect clients using the cluster’s bootstrap brokers. That’s the networking baseline, and it’s not optional. MSK is VPC-first by design.

For authentication/authorization, MSK supports IAM access control, and AWS documents the client configuration for IAM mechanisms. In EKS, I typically pair MSK IAM with IRSA so pods can obtain AWS credentials the AWS way, while ECS services would use task roles instead. Both patterns are documented by AWS; which one applies depends on where your services run.

Kubernetes service discovery is usually the easy part. Services and Pods get DNS names so workloads can call each other by name rather than IP. Kafka itself is reached via bootstrap broker endpoints or via internal Services, but either way, you want the connection strings in externalized config, not hardcoded.
Here’s a minimal Kubernetes Deployment/Service for a Kafka client service. Values like region, account IDs, and MSK endpoints are unspecified placeholders.

YAML
apiVersion: apps/v1
kind: Deployment
metadata:
  name: orders
  namespace: apps
spec:
  replicas: 2
  selector:
    matchLabels: { app: orders }
  template:
    metadata:
      labels: { app: orders }
    spec:
      serviceAccountName: orders-sa  # IRSA-bound (role ARN unspecified)
      containers:
        - name: orders
          image: <UNSPECIFIED_AWS_ACCOUNT_ID>.dkr.ecr.<UNSPECIFIED_REGION>.amazonaws.com/orders:<TAG>
          ports: [{ containerPort: 8080 }]
          env:
            - name: SPRING_KAFKA_BOOTSTRAP_SERVERS
              value: "<UNSPECIFIED_MSK_BOOTSTRAP_BROKERS>"
            - name: MESSAGING_ORDERS_TOPIC
              value: "orders.events"
            - name: MESSAGING_ORDERS_DLTTOPIC
              value: "orders.events.dlt"
          readinessProbe:
            httpGet: { path: /actuator/health/readiness, port: 8080 }
            initialDelaySeconds: 10
---
apiVersion: v1
kind: Service
metadata:
  name: orders
  namespace: apps
spec:
  selector: { app: orders }
  ports:
    - port: 80
      targetPort: 8080

Operationally, MSK exposes metrics into CloudWatch (AWS/Kafka), and broker logs can be delivered to CloudWatch Logs (or S3/Firehose). That combination gives you the classic visibility loop: throughput, lag, under-replicated partitions, and error logs without running your own monitoring plane.

For distributed tracing in async flows, OpenTelemetry is my default vocabulary now. Spring Boot supports OpenTelemetry export via OTLP, and OpenTelemetry defines Kafka semantic conventions so your producer/consumer spans and attributes stay consistent across tools.

CI/CD and the Hard-Earned Field Notes

For CI/CD, I keep it boring: build once, push an immutable image, deploy via a declarative mechanism. AWS Prescriptive Guidance provides a clear GitHub Actions pattern for building Docker images and pushing to Amazon ECR, which is a solid baseline. The region and account values in the workflow below are placeholders until you configure them.
YAML
# .github/workflows/orders.yml
name: orders
on:
  push:
    branches: ["main"]
jobs:
  build_push_deploy:
    runs-on: ubuntu-latest
    permissions:
      id-token: write
      contents: read
    env:
      # Defined at job level so the build and deploy steps use the same image reference.
      IMAGE: <UNSPECIFIED_AWS_ACCOUNT_ID>.dkr.ecr.<UNSPECIFIED_REGION>.amazonaws.com/orders:${{ github.sha }}
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-java@v4
        with:
          distribution: temurin
          java-version: "21"
      - name: Build & test
        run: ./mvnw -q test package
      - name: Configure AWS credentials (OIDC)
        uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: arn:aws:iam::<UNSPECIFIED_AWS_ACCOUNT_ID>:role/<UNSPECIFIED_GHA_ROLE>
          aws-region: <UNSPECIFIED_REGION>
      - name: Login to ECR
        run: |
          aws ecr get-login-password --region <UNSPECIFIED_REGION> \
            | docker login --username AWS --password-stdin <UNSPECIFIED_AWS_ACCOUNT_ID>.dkr.ecr.<UNSPECIFIED_REGION>.amazonaws.com
      - name: Build & push image
        run: |
          docker build -t "$IMAGE" ./orders-service
          docker push "$IMAGE"
      - name: Deploy to EKS (example)
        run: |
          aws eks update-kubeconfig --name <UNSPECIFIED_EKS_CLUSTER> --region <UNSPECIFIED_REGION>
          kubectl -n apps set image deploy/orders orders="$IMAGE"

Now, the part I wish someone had handed me in 2016: Kafka gives you strong tools, but it does not remove distributed-systems truths. You still need safeguards on the consumer side: idempotent processing, disciplined schema management, and clearly defined retry and dead-letter topic behavior.

Kafka’s documentation is careful about the limits of “exactly once” guarantees. Idempotent producers and transactions can strengthen delivery semantics, but achieving true end-to-end exactly-once behavior, especially when external side effects are involved, still depends on deliberate system design.

For schema governance, Kafka itself doesn’t ship a schema registry, but acknowledges third-party registries; in practice, Confluent Schema Registry and Apicurio Registry are common choices.
Both store schemas out-of-band, so messages carry only a schema identifier, and both support evolvable contracts across Avro/JSON Schema/Protobuf depending on your ecosystem.

Conclusion and Best Practices

If you take one lesson from my legacy brain into modern event-driven systems, let it be this: asynchrony is a reliability feature, not a performance trick. Kafka’s durable log and consumer group model decouple uptime and absorb spikes, but you only get the real benefit when you treat schemas as contracts, consumers as idempotent processors, and failure handling as first-class application behavior.

On AWS, the operational baseline is non-negotiable. MSK lives in your VPC across AZ subnets, clients connect via bootstrap brokers, IAM auth is configured explicitly, and observability lives in CloudWatch. Do those fundamentals early, and Kafka stops feeling like a mysterious black box and starts feeling like the dependable workhorse it was built to be.
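Since "consumers as idempotent processors" carries so much of the argument, a minimal sketch of the idea may help. It is plain Java with hypothetical names, and it keeps processed IDs in memory purely for illustration; a real consumer would record the ID in a durable store, ideally in the same transaction as the side effect.

```java
import java.util.Set;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Illustrative idempotent handler: each orderId is processed at most once.
final class IdempotentHandler {

    // In-memory stand-in for a durable "processed events" table (an assumption of this sketch).
    private final Set<UUID> processedOrderIds = ConcurrentHashMap.newKeySet();

    // Runs the work only for an unseen orderId; returns false for a duplicate delivery.
    boolean handle(UUID orderId, Runnable work) {
        if (!processedOrderIds.add(orderId)) {
            return false; // already processed: skip the side effect
        }
        try {
            work.run();
            return true;
        } catch (RuntimeException ex) {
            // Forget the ID so a redelivery can try again after a failure.
            processedOrderIds.remove(orderId);
            throw ex;
        }
    }
}
```

Inside a listener like BillingListener, the call might look like handler.handle(event.orderId(), () -> chargeCard(event)), where chargeCard is a hypothetical method. A redelivered event then becomes a no-op rather than a second charge.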

By Mallikharjuna Manepalli

Phantom APIs Are Eating Your Attack Surface, and Most Security Teams Are Still Looking the Other Way

I've spent the better part of fifteen years staring at API traffic logs for a living, and I can tell you the job has changed twice. The first shift came with microservices, when a handful of monolithic endpoints became thousands of small, chatty interfaces, and nobody could agree on who owned the inventory. The second shift is happening right now, and it's worse because this time the endpoints aren't even being written by people who can explain why they exist. Call them phantom APIs: routes, handlers, and parameters that show up in production but never appear in a spec, a ticket, or a design review. Some get hand-built by a developer in a hurry and are forgotten. Increasingly, though, they're a byproduct of AI code generation — Copilot, Cursor, an internal fine-tuned assistant, whatever your shop has standardized on — quietly scaffolding an admin route, a debug handler, or a permissive query path because that pattern showed up often enough in training data to feel "normal." Nobody asked for it. Nobody reviewed it with fresh eyes, because by the time a human glances at the diff, the suggestion already looks plausible. That's the part that should worry you more than any single CVE: plausibility, not malice, is now the main vector. How a Phantom Gets Born Here's the mechanism, stripped of drama. An engineer asks an AI assistant to "add an endpoint that lets support staff look up account status." The model, trained on millions of internal admin panels, often reaches for the path of least resistance: broad object access, no granular scope check, maybe a debug flag left wired to a query parameter "for testing." It compiles. It passes the smoke tests because the smoke tests check that the feature works, not that it's bounded. It ships. None of that shows up in your OpenAPI document because nobody updated the spec — the AI didn't know one existed, and the human reviewing the pull request was scanning for logic bugs, not authorization boundaries. 
Your API gateway, meanwhile, is busy enforcing policy on the routes it knows about. A path it has never seen just rides along on the same TLS termination and the same network ACLs as everything else, because from the network's point of view, there's nothing unusual happening. The gateway isn't broken. It's just answering a question nobody thought to ask it. I've heard versions of this story from engineers at a logistics platform, a healthcare billing vendor, and a fintech, all in the last year, none of whom wanted their names anywhere near a public postmortem — which is its own data point. Shame keeps these incidents quiet, and quiet incidents are exactly what let the pattern repeat across the industry instead of getting fixed once. The Numbers Stopped Being Theoretical in 2025 If you've been treating "API security" as a slide in next year's budget deck rather than this quarter's incident response calendar, the data from the past twelve months should change your mind. Wallarm's 2026 API ThreatStats Report, which pulled from 67,058 published vulnerabilities and 60 disclosed API breaches across 2025, found that API-related flaws made up 17% of all published vulnerabilities and 43% of the entries CISA added to its Known Exploited Vulnerabilities catalog that year. The technical profile of those flaws is the part that should keep API owners up at night: 97% exploitable with a single request, 99% remotely reachable, and 59% requiring no authentication at all. This isn't an attack surface that rewards patience and tradecraft. It rewards speed, and speed is exactly what AI tooling hands to attackers as readily as it hands to developers. That same report tracked AI-related vulnerabilities jumping from 439 in 2024 to 2,185 in 2025 — a 398% increase — with 315 of those tied specifically to Model Context Protocol implementations, the connective tissue between AI agents and the tools they're allowed to call. MCP didn't exist as a meaningful attack surface two years ago. 
Now it's 14% of all AI-related vulnerability disclosures in a single annual report. I don't think I've watched a category go from nonexistent to material that fast since the early days of container orchestration. IBM's X-Force Threat Intelligence Index 2026 adds the macro view: exploitation of public-facing applications became the single most common initial access vector in 2025, up 44% year over year, and 56% of the roughly 40,000 vulnerabilities X-Force tracked required no authentication to exploit. CybelAngel's own 2025 API threat reporting found that 95% of API attacks that year originated from sessions that were already authenticated — meaning the front door wasn't the problem; what happened after someone walked through it was. Put those two findings side by side, and you get a fairly bleak picture: getting in is easy, and once an attacker is in, the API layer rarely stops them from going sideways. And CrowdStrike's 2026 Global Threat Report puts a number on how little time defenders now have to notice. Average eCrime breakout time — the gap between initial access and lateral movement — fell to 29 minutes in 2025, down from 48 minutes the year before and 98 minutes in 2021. The fastest breakout CrowdStrike observed clocked in at 27 seconds. AI-enabled adversary operations rose 89% year over year, and the company recorded prompt-injection or AI-tool abuse incidents at more than 90 organizations. As Adam Meyers, CrowdStrike's head of counter adversary operations, put it when the report landed, breakout time is now the clearest signal of how intrusions have changed. A phantom API sitting outside your monitoring isn't a slow-burning liability anymore. It's a 27-second one. GraphQL Made This Worse, Not Better GraphQL was supposed to reduce shadow API risk by giving clients one well-documented entry point instead of dozens of REST routes. In practice, it concentrated the risk instead of eliminating it. 
Roughly 70% of organizations now run GraphQL in some form, according to Wallarm's Q2 2025 ThreatStats data, and the same report flagged something that should sound familiar to anyone who's done incident response: zero GraphQL-specific breaches were publicly disclosed that quarter, despite the technology's deep reach into production systems. That's not a sign GraphQL is safe. It's a sign almost nobody is looking closely enough to catch what's happening inside a single, deeply nested query that can touch a dozen resolvers and a dozen authorization decisions in one round trip. A REST endpoint that's missing an authorization check is one bug. A GraphQL resolver tree with the same gap can be a dozen bugs wearing one URL. Shadow and zombie APIs compound the problem from the other direction. Salt Security's 2025 CISO report found that only 19% of CISOs globally have full visibility into their API inventory — just 27% among large enterprises, and a thin 12% among smaller organizations — despite 73% ranking API security as a high or critical priority. Two-thirds of organizations audit for shadow APIs only monthly or quarterly, which leaves a four-to-twelve-week window every single cycle during which an undocumented route can sit there, fully reachable, before anyone goes looking. Salt Labs' own Q1 2025 data found that 99% of organizations had encountered an API security issue in the prior twelve months, and BOLA and injection flaws together accounted for more than a third of everything reported. None of this is exotic. It's the same handful of failure modes, recurring at a scale that AI-assisted development is now accelerating rather than fixing. The Failure Chain, Step by Step Strip away the vendor-report statistics for a second and walk through how this actually plays out on a single team, because the abstraction is where people lose the thread. A developer asks an AI assistant for a quick internal tool: pull account status for support staff, fast, no fuss. 
The assistant generates a working route, and because "working" was the only bar anyone set, it also generates a second, undocumented path the model added on its own initiative — a debug variant that accepts a raw account ID with no scope check, left over from however the model's training data tends to structure admin tooling. The pull request gets reviewed for logic, not for the existence of a route nobody asked for, because nobody is in the habit of reading a diff looking for endpoints that shouldn't exist. It merges. The OpenAPI spec doesn't change because nothing in the toolchain forces it to. The API gateway keeps doing its job — rate limiting, TLS, routing — on every path it's configured to recognize, and the new one simply isn't on that list, so it inherits whatever the underlying framework allows by default rather than anything the security team actually decided. For months, nothing happens because nobody is sending traffic to a path nobody knows about. Then someone does. Maybe it's a script kiddie running a wordlist against common admin paths, maybe it's a scraper, maybe it's one of the AI-driven reconnaissance tools the CrowdStrike and Wallarm data above describe as increasingly common. The request lands. There's no auth check to fail, so there's no log entry resembling a failed login — the kind of signal most SOC dashboards are tuned to catch. There's just a 200 response and a payload of account data. Given that CrowdStrike clocked the fastest 2025 breakout at 27 seconds and the average at 29 minutes, the gap between "endpoint found" and "data gone" is no longer a window anyone can rely on noticing in real time. By the time it surfaces — an anomaly report, a customer complaint, a researcher's disclosure email — the honest answer to "how long has this been exposed" is usually some shrug-worthy variant of "the logs only go back so far." 
That's the chain: AI suggestion → unreviewed scope gap → silent spec drift → gateway blind spot → silent exploitation → discovery after the fact. Every link in it is mundane. None of it requires a sophisticated attacker. That's exactly why it keeps happening.

What I'd Actually Build to Catch It

Description is cheap. Here's the shape of a pipeline I'd put in front of a team that wanted to stop shipping phantom routes instead of just talking about the risk:

Plain Text
CI/CD LAYER (pre-merge, blocking)
  → Generate live OpenAPI spec from the build
  → Diff against the last approved spec
  → Any new route not explicitly annotated/reviewed → FAIL build
  → Flag missing auth decorators, missing rate-limit config, wildcard scopes

RUNTIME LAYER (continuous, post-deploy)
  → Traffic profiler sits behind the gateway, fingerprints every path actually receiving requests
  → Cross-reference live traffic against the approved spec, on a rolling window (hours, not quarters)
  → Anything serving 200s that isn't in the spec → page on-call, not a quarterly report

GATEWAY LAYER (enforcement)
  → Default-deny for any path not present in the signed spec
  → Schema validation on request/response shape, not just route existence
  → Auth/scope check enforced at the gateway, independent of what the service itself does

The CI step is the cheapest control here, and the one most teams skip, because it requires someone to decide that an undocumented route is a build failure, not a Slack message for later. The runtime layer catches what gets past CI anyway: config drift, routes added outside the normal deploy path, anything a human forgot to annotate. The gateway layer is the backstop: even if the first two fail, a default-deny policy means an unrecognized path doesn't get served at all, rather than getting served and merely logged. None of these three layers is sufficient alone. Together, they convert "we hope someone notices" into "the system refuses to let this happen quietly," which is the actual point.
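The CI and gateway layers both reduce to the same set operation, which a few lines of Java can show. This is a sketch under assumptions: routes are represented as "METHOD /path" strings, the class and method names are hypothetical, and extracting routes from a real OpenAPI document or a running build is left out.

```java
import java.util.Set;
import java.util.SortedSet;
import java.util.TreeSet;

// Illustrative spec diff: which routes exist in the build but not in the approved spec?
final class SpecDiff {

    private SpecDiff() {
    }

    // CI layer: returns every built route missing from the approved spec, in sorted order.
    static SortedSet<String> undocumentedRoutes(Set<String> builtRoutes, Set<String> approvedRoutes) {
        SortedSet<String> phantom = new TreeSet<>(builtRoutes);
        phantom.removeAll(approvedRoutes);
        return phantom;
    }

    // Gateway layer: default-deny, so a route is served only if the approved spec lists it.
    static boolean isServed(String route, Set<String> approvedRoutes) {
        return approvedRoutes.contains(route);
    }
}
```

A CI job could fail the build whenever undocumentedRoutes(...) is non-empty, printing the offending routes, while a gateway filter could consult isServed(...) before forwarding a request.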
What Actually Works, and What's Mostly Marketing

The vendor response has been predictably fast and not entirely cynical. Akamai's $450 million acquisition of Noname Security, announced in May 2024 and closed that June, folded one of the better-regarded API discovery platforms directly into a CDN-and-edge company's security stack, a clear bet that API visibility belongs as close to the traffic as possible, not bolted on afterward. Salt Security's 1H 2026 report introduced what it calls Agentic Security Posture Management, aimed squarely at mapping the relationships between LLMs, MCP servers, and the APIs underneath them, specifically to catch what the industry has started calling "Shadow MCP." Whether that label sticks or fades in eighteen months, the underlying instinct is correct: you cannot secure an API layer you can't continuously enumerate, and static documentation reviewed once a quarter is no longer a serious control.

The defenses that actually move the needle, based on what I've watched hold up under real incident response, aren't glamorous:

Runtime discovery over documentation trust. Treat your OpenAPI spec as a claim to be verified against live traffic, not a source of truth. If traffic is hitting a path that isn't in the spec, that's an incident, not a documentation gap.

Spec-diffing in CI, not just in security review. A pull request that introduces a new route should fail a build if that route doesn't appear in an updated, reviewed spec. This is cheap to automate and catches the AI-generated-endpoint problem at the exact moment it's introduced.

Authorization checks that don't trust the session. Given that 95% of API attacks in CybelAngel's 2025 dataset started from an authenticated session, the perimeter check matters far less than the per-object, per-field authorization decision happening on every single call.

AI-assisted review aimed at AI-generated code specifically. Ironically, the same pattern-matching that produces phantom endpoints can be turned around to flag them: diff-aware tooling that specifically interrogates new routes for missing rate limits, missing auth decorators, or unscoped data access, rather than general-purpose linting.

Treat MCP and agent tool definitions as part of your API attack surface, full stop. They're not a side project. They're API endpoints with extra steps, and the ThreatStats data says they're already 14% of AI-related disclosures.

None of these are silver bullets, and I'd be lying if I said any vendor has fully solved this. What I will say, after watching this category for a year now, is that the organizations doing well are the ones that stopped treating "shadow API discovery" as a once-a-quarter audit and started treating it as a property of the deployment pipeline itself, something that gets checked on every merge, the same way a linter or a test suite does. The ones still relying on a documentation review process built for a world where humans wrote every route are going to keep finding out about their phantom APIs the way most teams still do: during an incident, not before one.

The question worth sitting with isn't whether your API inventory has gaps; every inventory does. It's whether you could currently produce, on demand, a complete list of every endpoint serving production traffic right now, including the ones nobody remembers approving. If the honest answer is no, you don't have an API security posture. You have an API security guess, and AI-generated code is making the guess bigger every sprint.
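Of the defenses listed above, the per-object authorization check is the easiest to show in code. The sketch below is a hypothetical policy, not any framework's API: the record shapes, the tenant model, and the accounts:status:read scope name are all assumptions chosen to mirror the support-lookup example from earlier in the piece.

```java
import java.util.Set;

// Illustrative per-object check: being authenticated is not enough to read an account.
final class AccountAccessPolicy {

    // Hypothetical shapes for the caller's session and the object being requested.
    record Caller(String userId, String tenantId, Set<String> scopes) {}

    record Account(String accountId, String ownerUserId, String tenantId) {}

    private AccountAccessPolicy() {
    }

    // Allows the owner, or a support caller holding the scope within the same tenant.
    static boolean canReadStatus(Caller caller, Account account) {
        if (caller == null || account == null) {
            return false; // default-deny when either side is missing
        }
        boolean isOwner = caller.userId().equals(account.ownerUserId());
        boolean isScopedSupport = caller.scopes().contains("accounts:status:read")
                && caller.tenantId().equals(account.tenantId());
        return isOwner || isScopedSupport;
    }
}
```

The point of the design is that the decision takes the specific account as an input. A handler that accepts a raw account ID and skips a call like this is the scope gap described in the failure chain.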

By Igboanugo David Ugochukwu
