High availability is a non-negotiable requirement for mission-critical SAP HANA deployments. When a primary database node goes down without an automated failover in place, the business impact is immediate. RHEL Pacemaker has long been the standard cluster manager for SAP HANA High Availability (HA) on Linux: it detects failures, fences misbehaving nodes, promotes secondaries, and orchestrates the full recovery sequence without manual intervention.

The standard Pacemaker playbook for SAP HANA HA, as described in the official documentation, relies on a virtual IP address (VIP) as the single stable network endpoint for all database traffic. Pacemaker keeps that VIP tied to whichever node is currently the active primary. When a failover happens, the VIP moves. Applications reconnect to the same address and reach the new primary without configuration changes.

The problem is that this approach breaks down on many cloud platforms. Hyperscalers and private cloud environments frequently do not support traditional floating VIPs in the way bare-metal or on-premises networking does. The official RHEL Pacemaker documentation covers the VIP setup in detail and stops there. When VIPs are not available, practitioners are left to work out an alternative on their own.

This article defines a production-ready alternative for exactly this scenario. The approach replaces the floating VIP with a network load balancer (NLB) and uses a Pacemaker-managed health check listener to tell the load balancer which node is the active primary at any given time. This article explains the problem, positions it against existing cloud provider approaches, and walks through the implementation step by step.

How Cloud Providers Address This

The challenge of replacing a floating VIP with a load balancer while still routing traffic exclusively to the active HANA primary is not new. There is published guidance on how to approach it, and the core pattern is consistent across all of it.
One such approach is to use an internal passthrough Network Load Balancer alongside a socat-based health check listener managed as a Pacemaker resource. The listener opens on a dedicated port in the private range (49152–65535), and the NLB probes that port to determine which backend is the primary. The approach uses the Open Cluster Framework (OCF) 'anything' resource agent to manage the socat process inside Pacemaker.

The second approach is to use an Internal Load Balancer with a health probe on port 625XX (where XX is the HANA instance number). A listener on each HANA node responds to the probe, but only the primary has the listener active. In some configurations, HAProxy is used rather than socat as the listener.

The implementation discussed in this article adds a clean variant to this landscape: a native systemd service registered directly as a Pacemaker resource instead of the OCF 'anything' agent or HAProxy, targeting RHEL specifically. The systemd approach keeps the setup self-contained, auditable, and consistent with how most RHEL administrators already manage services. It works on any cloud provider or private cloud environment that supports network load balancers.

Architecture Overview

The diagram below shows the two-node SAP HANA cluster, the network load balancer, and how the health check listener connects them. The NLB's backend pool includes both HANA nodes on the standard HANA port (3XX15, where XX is the HANA instance number), but the health probe targets a separate port, 62500, that only the active primary exposes.

Figure: Overall cluster architecture

The NLB sees both nodes as members of its backend pool. Because only the primary node has anything listening on port 62500, the NLB marks the secondary as unhealthy for routing purposes and sends all traffic to the primary. When Pacemaker promotes the secondary during a failover, it starts the listener on the new primary as part of the same orchestration sequence.
The NLB detects the change on its next health check cycle and shifts all traffic accordingly.

Failover Sequence

The diagram below shows the sequence of events from the moment the primary node fails to the moment applications reconnect through the load balancer.

Figure: Failover sequence from node failure to reconnection

Two timing factors govern the total recovery window. The first is Pacemaker's fencing and promotion sequence, typically 30 to 90 seconds, depending on the STONITH method and HANA replication state. The second is the NLB health check interval, which determines how quickly the load balancer detects the new primary after Pacemaker completes its promotion. For production environments, tuning both values together is worth the effort.

Pacemaker Resource Model

The diagram below maps the Pacemaker resource hierarchy and constraints used in this setup. Understanding the resource model helps clarify why both the colocation and ordering constraints are necessary.

The colocation constraint (score=INFINITY) tells Pacemaker that lb_healthcheck must always run on the same node as the promoted HANA primary. If the promoted primary moves, the health check listener moves with it. The ordering constraint ensures the listener does not start until HANA has fully completed its promotion, preventing the load balancer from routing traffic to a node that is still finishing its takeover sequence.
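To make the two timing factors from the failover sequence concrete, the following is a rough back-of-the-envelope sketch rather than a measurement. It assumes the load balancer needs a number of consecutive successful probes (`healthy_threshold`, a parameter this article does not specify) before it routes traffic to the new primary, and it ignores probe timeouts and application reconnect time.

```python
def detection_delay_s(probe_interval_s: float, healthy_threshold: int) -> float:
    """Upper-bound time for the NLB to mark the new primary healthy.

    Assumption: the NLB needs `healthy_threshold` consecutive successful
    probes, spaced `probe_interval_s` apart, before routing traffic.
    """
    return probe_interval_s * healthy_threshold


def recovery_window_s(promotion_s: float,
                      probe_interval_s: float,
                      healthy_threshold: int = 2) -> float:
    """Estimated total window: Pacemaker promotion plus NLB detection."""
    return promotion_s + detection_delay_s(probe_interval_s, healthy_threshold)


if __name__ == "__main__":
    # 30 and 90 seconds are the typical promotion bounds quoted above.
    for promotion in (30, 90):
        total = recovery_window_s(promotion, probe_interval_s=5, healthy_threshold=2)
        print(f"promotion={promotion}s -> estimated window={total}s")
```

With a 5-second probe and a threshold of 2, the NLB adds at most about 10 seconds on top of the Pacemaker promotion time, which is why the two values are best tuned together.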
Prerequisites

The following must be in place before starting the implementation:

- Two RHEL virtual servers with access to the Red Hat High Availability Add-On repository
- SAP HANA installed on both servers with HANA System Replication configured
- Pacemaker installed and configured through section 5.7 of the official Red Hat SAP HANA HA guide; sections 5.8 and 5.9 (virtual IP configuration) are intentionally skipped
- A network load balancer provisioned with both HANA nodes in the backend pool, backend port set to 3XX15 (where XX is the HANA instance number)
- socat installed on both HANA nodes
- Firewall rules permitting TCP traffic on port 62500 from the NLB health check source addresses

socat is available in standard RHEL repositories. Install it with:

```shell
sudo dnf install socat -y
```

Step-by-Step Implementation

Step 1: Create the Systemd Health Check Service

Run the following command as root on both HANA nodes. It creates a systemd unit file that uses socat to open a TCP listener on port 62500. The listener accepts any connection and returns success immediately; that response is all the load balancer needs.

```shell
cat <<EOF > /etc/systemd/system/lb-healthcheck.service
[Unit]
Description=LB healthcheck listener for active SAP HANA primary
After=network-online.target
Wants=network-online.target

[Service]
Type=simple
ExecStart=/usr/bin/socat TCP4-LISTEN:62500,reuseaddr,fork EXEC:/bin/true
Restart=always
RestartSec=2

[Install]
WantedBy=multi-user.target
EOF
```

Do not enable this service manually. Pacemaker will control its lifecycle entirely.

Step 2: Reload Systemd

After writing the unit file, reload systemd on both nodes so it registers the new service:

```shell
systemctl daemon-reload
```

Step 3: Prevent the Service From Starting Automatically

Explicitly disable and stop the service on both nodes. If both nodes have the listener running simultaneously, the load balancer will consider both healthy and will route traffic to either node, which defeats the entire purpose of the setup.
```shell
systemctl disable lb-healthcheck
systemctl stop lb-healthcheck
```

Step 4: Create the Pacemaker Resource

Register the systemd service as a Pacemaker-managed resource. From this point forward, Pacemaker owns the start, stop, and monitoring of the listener.

```shell
pcs resource create lb_healthcheck \
  systemd:lb-healthcheck \
  op monitor interval=10s timeout=20s
```

Pacemaker will now monitor the listener every 10 seconds and automatically relocate it during failover events.

Step 5: Add the Colocation Constraint

This constraint ensures that the listener always runs on the same node as the promoted SAP HANA primary. Without it, Pacemaker might place the resource on either node.

```shell
pcs constraint colocation add lb_healthcheck \
  with Promoted cln_SAPHanaCon_P01_HDB01 \
  score=INFINITY
```

Replace P01_HDB01 with the actual SID and instance number for the environment. For example, if the SID is PRD and the instance number is 00, use PRD_HDB00.

Step 6: Add the Ordering Constraint

The ordering constraint prevents the health check listener from starting until after the HANA promotion is fully complete. Without this, a race condition could cause the load balancer to route traffic to a node that is still mid-promotion.

```shell
pcs constraint order promote cln_SAPHanaCon_P01_HDB01 \
  then start lb_healthcheck
```

Step 7: Validate the Pacemaker Configuration

Verify that both constraints are correctly registered in the cluster:

```shell
pcs constraint config
```

The output should contain both of the following entries:

```text
Colocation Constraints:
  Started resource 'lb_healthcheck' with Promoted resource 'cln_SAPHanaCon_P01_HDB01'
    score=INFINITY
Order Constraints:
  promote resource 'cln_SAPHanaCon_P01_HDB01' then start resource 'lb_healthcheck'
```

Step 8: Verify Listener Placement

Confirm that only the active primary node is listening on port 62500. Run this command on each node:

```shell
ss -lntp | grep 62500
```

On the primary node, the output should show a LISTEN entry on 0.0.0.0:62500.
On the secondary node, the command should return nothing.

```text
# Expected on PRIMARY node:
LISTEN 0 5 0.0.0.0:62500 0.0.0.0:*

# Expected on SECONDARY node:
# (no output)
```

If both nodes show the listener, the colocation constraint is either missing or incorrect. If neither node shows it, check that the HANA clone resource is in the Promoted state by running pcs status.

Comparison: VIP Approach vs. NLB Health Check Approach

The diagram below summarizes the trade-offs between the traditional VIP approach and the NLB health check approach described in this article.

Figure: Comparison

The VIP approach cuts over faster because there is no dependency on an external health check interval. The IP simply moves to the new primary node. However, it requires the underlying network to support IP address mobility, which cloud environments typically do not.

The NLB approach works across any cloud or private cloud environment that supports network load balancers. The trade-off is that traffic cutover depends on the NLB's health check interval in addition to Pacemaker's promotion time. Documentation from major cloud providers acknowledges this trade-off explicitly: using an NLB with a health check listener is their recommended approach for all SAP HANA HA deployments, and they provide the same socat-based pattern using the OCF 'anything' resource agent. The approach documented here achieves the same outcome using a systemd service, which many operators find more familiar and easier to audit.

Operational Notes and Tuning

A few things are worth keeping in mind when running this setup in production.

NLB health check interval: The shorter the health check interval, the shorter the window between Pacemaker completing its promotion and the NLB redirecting traffic. A 5-second interval is common in cloud provider SAP HA documentation. Setting this too low can cause false positives during normal HANA replication lag.
STONITH configuration: This solution assumes STONITH (fencing) is configured as part of the base Pacemaker setup. Without STONITH, Pacemaker will not promote the secondary during a primary failure. STONITH ensures the failed node is definitively powered off before promotion proceeds, preventing split-brain.

Port 62500 vs. 625XX convention: Some cloud providers use the convention 625XX (where XX is the instance number) for their SAP HANA health check ports, while other cloud documentation recommends using any port in the private range 49152 to 65535. Port 62500 used in this setup falls within that range and does not conflict with standard HANA ports. Teams following a 625XX convention can substitute it if they prefer consistency across environments.

Testing failover: After setup, the full failover sequence should be tested by killing the primary HANA process (not the OS) and verifying the NLB redirects traffic to the new primary within the expected time window. The pcs status command is the primary tool for watching the Pacemaker side of the transition.

Conclusion

The standard RHEL Pacemaker documentation for SAP HANA HA assumes a virtual IP is available. Not all hyperscalers provide one. The solution fills that gap cleanly: replace the VIP with a network load balancer hostname, and use a Pacemaker-managed socat listener to tell the load balancer which node is the primary at any given time.

The core pattern, an NLB health probe targeting a Pacemaker-owned listener, is the same pattern major cloud providers use in their own SAP HA documentation. What this implementation adds is a clean systemd service approach for RHEL, without needing the OCF 'anything' resource agent or additional proxy software.

The setup comes down to eight steps that in essence do four things: write a systemd service, disable it from auto-starting, register it as a Pacemaker resource, and add two constraints.
The constraints — one for colocation, one for ordering — are what tie the listener's lifecycle to the HANA primary promotion sequence and make the whole thing work reliably across failovers. For teams running SAP HANA on RHEL in environments where VIPs are not an option, this is a production-ready path forward that relies entirely on standard RHEL tooling.
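As a complement to the failover test described in the operational notes, the following is a minimal sketch of what the load balancer's TCP probe effectively does: attempt a connection to port 62500 and treat success as "this node is the primary". The function names are illustrative, and any host names passed in are placeholders for the real cluster nodes.

```python
import socket


def listener_is_up(host: str, port: int = 62500, timeout_s: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds.

    This mirrors a plain TCP health probe: no payload is exchanged,
    only the ability to connect is checked.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolved host names.
        return False


def nodes_reporting_primary(hosts, port: int = 62500):
    """Return the subset of hosts whose health check listener is reachable."""
    return [host for host in hosts if listener_is_up(host, port)]
```

Calling `nodes_reporting_primary` with both cluster node addresses should return exactly one entry in a healthy cluster. Two entries suggest a missing colocation constraint, and an empty list suggests that no node is currently promoted or that a firewall is blocking the probe port.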
SBOMs Create Transparency, But Not Without Risk

The Software Bill of Materials, or SBOM, has changed meaning in recent years. It used to be seen as a technical tool for internal inventory management. Regulation has now made it required evidence. The European Cyber Resilience Act will require digital product manufacturers to reliably document the composition of their software. The NIS 2 Directive increases pressure on operators of essential entities to secure their supply chains in a traceable way. The United States Executive Order 14028 made the SBOM a requirement in government procurement as early as 2021. As a result, the bill of materials evolved from a voluntary artifact to a mandatory disclosure.

This rise in importance exposes a conflict of objectives that cannot be resolved, only managed. The bill of materials is designed to establish trust, enable verifiability, and allow quick response to vulnerabilities. Yet it also reveals how a software product is built. It lists third-party components, their versions, and potential vulnerability points. It lets outsiders infer architectural choices and competitively relevant strategies. A complete bill of materials acts as both evidence and blueprint. Publishing it carelessly confuses transparency with surrender. This article argues that the way sharing is controlled, not just the act of sharing, determines whether it helps or harms.

Why Complete SBOMs Contain Sensitive Information

To see why the conflict matters, it helps to examine what a complete bill of materials contains. It is not simply a list of libraries used. It frequently includes precise version numbers, the full transitive dependency chain, sometimes internal package names, references to private artifact sources, and metadata about the generators and build process. Each detail may seem harmless on its own. Taken together, they provide a detailed profile of a product's technical makeup.
For readers of a developer publication, this risk is very clear. Applications built with Maven or Gradle often have deep, branched transitive dependency chains. A single library can pull in dozens more. A complete bill of materials shows these chains in full detail. It allows others to see which vulnerabilities may affect a product. It also shows which internal components the manufacturer uses, which frameworks it avoids, and where it is using outdated versions. A document intended as a security measure can become a manual for attackers. The sensitivity of a bill of materials is not a side issue but a core property.

Least Disclosure: Sharing as Controlled Disclosure

From this understanding comes the key idea: minimal disclosure, or least disclosure. This means you should share only as much as a recipient really needs for their purpose, no more, and you should be able to prove it.

This principle clears up a common misunderstanding. Many assume SBOM sharing means publishing everything. In reality, sharing a bill of materials does not mean making all details broadly available. It is a controlled act. Content, recipient, and context are weighed together. The key question is not whether to share, but what to share, with whom, and under what conditions. This shift separates controlled transparency from unintentional overexposure.

A minimal disclosure approach views the SBOM not as a single document to send but as a database from which to generate specific views for each need. The technical architecture discussed next builds on this idea.

Different Recipients, Different Information Needs

To share only what is needed, you first have to know who you are sharing with, because each group needs different information. You can think of four main groups, and what they need shapes the whole process.

The public typically only needs a basic view. For them, listing the component name, license, and project reference is enough.
This satisfies the need for transparency, especially for open-source software, without revealing internal structure. Customers need more details. They must analyze risks and justify purchases. They rely on version levels and dependency metadata. Auditors and authorities focus on dependability, not detail. They require evidence that is verifiable and complete. Suppliers and internal teams need operational details. They work with deep data to manage and edit bills of materials together.

These differences lead to an important reality. A single, universal SBOM view is too crude in both detail and security. Trying to serve all users the same way usually fails and often ends with the bill of materials being sent around as an email attachment, a practice that lacks control and should be avoided.

Public Transparency vs. Private Exchange

Because the recipients differ, a strong structural separation is needed. Any proper disclosure model must separate public transparency from private exchange. Public transparency is a deliberately limited, open view of the bill of materials. Anyone can access it. Private exchange is the controlled transfer of more detailed information to authorized parties. Do not combine these two modes, whether in technology or organization. If you do, the line between public and private details blurs.

Exodos Labs' model shows this separation well and is used here as an example. It draws a clear line between a public "SBOM Trust Center" and a private "Secure Exchange." The Trust Center gives a continuously updated, defensible public view. The Secure Exchange allows controlled sharing with specific organizations. The architecture's main advantage is its clear separation. It makes overexposure harder by assigning public and private data to separate channels from the start.

Redaction: Several Secure Views From One Bill of Materials

Separating public and private sharing does not fully explain how different views can come from a single database. This is where redaction becomes vital.
Redaction is not only about deleting fields. It reduces, masks, aggregates, or hides information based on the recipient. In practice, internal package sources and private registry references may be removed entirely. Transitive dependencies can be summarized rather than listed. Sensitive build and generator metadata can be hidden from certain recipients.

Several secure views emerge from the full bill of materials. A minimal public view might show only the component name, license, and project reference. An extended view for authorized customers can include version and dependency details. A contractually protected view might be released after a non-disclosure agreement is signed. The example model supports such selective redaction and can create recipient-specific views. The key point is this: do not distribute a complete bill of materials and then cut it down. Instead, generate intentionally designed views from the full data set. Each view ought to match the needs and openness suitable for its audience.

Access Control Beyond Simple Roles

Once you define the views, you must decide how to control access. Simple role models are often not enough. Just being a "customer" or "partner" does not mean someone should see everything. Whether a specific customer can access a certain view depends on more than just their category.

Attribute-based access control is more appropriate: it combines a range of characteristics before releasing a view. These characteristics include the associated organization, the product-related entitlement, the contractual status, the status of a non-disclosure agreement where applicable, the assignment to a specific release, the regulatory context of a request, the release status, and any temporal limitation on access. Only the interaction of these attributes decides which view a requester actually receives. The example model relies precisely on such attribute-based control, combined with the redaction described earlier.
The conceptual added value lies in scalability: whereas rigid roles become unmanageable as the number of recipients and special cases grows, attribute-based rules can be enforced consistently even across large circles of recipients. With this, the question of who decides on disclosure is settled, complementing the earlier question of what is concealed in the first place.

Demonstrability: Auditability and Release Binding

Controlled disclosure requires not only that the right amount of information be given to the right party, but also that this be provable. Demonstrability here has two sides that belong together, because both answer the same fundamental question: what can the parties involved rely upon?

The first side concerns auditability. SBOM sharing is controllable only if it can be traced without gaps: who requested access, who granted it, which view was displayed, and which version was exported. The status of a non-disclosure agreement, its revocation, and temporal limitations likewise belong in this audit trail. An immutable audit trail transforms sharing from a passing file transfer into a provable transaction; in the event of dispute, it replaces assertion with evidence.

The second side concerns the binding of a bill of materials to a concrete artifact. A bill of materials is dependable only when it is unambiguously established which release, which build, which JAR or WAR file, which container image, Git tag, artifact hash, or container digest it belongs to. In the case of a security incident in particular, this assignment decides the capacity to act: without it, it remains open whether the bill of materials at hand actually describes the delivered artifact or a long-superseded state. Auditability thus proves who saw what and when; release binding proves what this view refers to in the first place. Together, the two establish the trust that a bill of materials is meant to instill.
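Release binding can be illustrated with a small check: does the hash recorded in a bill of materials match the artifact that was actually delivered? The sketch below is an assumption-laden example, not part of the model described in this article. It assumes a CycloneDX-style JSON layout in which the described artifact sits under `metadata.component` with a `hashes` list of `alg` and `content` entries, and the function names are hypothetical.

```python
import hashlib
from typing import Optional


def sha256_hex(data: bytes) -> str:
    """Hex-encoded SHA-256 digest of the artifact bytes."""
    return hashlib.sha256(data).hexdigest()


def recorded_sha256(sbom: dict) -> Optional[str]:
    """Return the SHA-256 recorded for the described artifact, if any.

    Assumes a CycloneDX-style layout: metadata.component.hashes is a list
    of {"alg": ..., "content": ...} entries.
    """
    component = sbom.get("metadata", {}).get("component", {})
    for entry in component.get("hashes", []):
        if entry.get("alg") == "SHA-256":
            return entry.get("content")
    return None


def is_bound_to(sbom: dict, artifact: bytes) -> bool:
    """True only if the SBOM records a hash and it matches the artifact."""
    expected = recorded_sha256(sbom)
    return expected is not None and expected.lower() == sha256_hex(artifact)
```

A bill of materials without a recorded hash fails this check by design, which reflects the point above: without the binding, it stays open which artifact the document describes.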
CI/CD Integration and Conclusion

However demanding the mechanisms described may appear, in practice they most often fail for a plain reason: manual maintenance. Bills of materials compiled by hand, updated after the fact, and published on static pages inevitably grow stale and thereby lose their value. The consequence is evident: the generation, validation, versioning, and publication of a bill of materials belong in the build and release pipeline.

For development teams, this means close integration with Maven, Gradle, and CI/CD processes, so that a current bill of materials is generated with every build, automatically checked against quality criteria, and made publicly available. The example model illustrates this by feeding the Trust Center continuously from the supply chain, so that public disclosure always corresponds to the actual state, and the recurring question of which bill of materials is current does not even arise.

Against this background, the typical mistakes that a well-considered approach sidesteps can be named. They range from complete public publication, through dispatch by email, the mixing of public and private views, and the absence of a redaction strategy, to missing release processes, deficient auditing, absent release binding, manually maintained disclosure pages, and no provision for revocation. Each of these mistakes is ultimately a variation on the same fundamental error of equating transparency with maximal disclosure.

It is precisely this equation that must be overcome. Bills of materials are necessary for trust, regulatory compliance, and the security of the software supply chain, yet maximal disclosure does not automatically lead to greater transparency. What is decisive is to provide information that is correct, up to date, and appropriate for the target group, and to do so demonstrably.
Secure bills of materials arise not through complete publication, but through suitable views for the right recipient in the right context. Whoever takes this to heart transforms the bill of materials from a risk into an instrument.
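The idea of generating recipient-specific views from one full data set, gated by attributes rather than fixed roles, can be sketched in a few lines. Everything here is illustrative: the field names, the three view names, and the decision rules are assumptions made for this example and do not describe the example model's actual implementation.

```python
from datetime import date

# Hypothetical full record; "registry" is never exposed by any view.
FULL_SBOM = [
    {
        "name": "example-lib",
        "version": "1.4.2",
        "license": "Apache-2.0",
        "project": "https://example.org/example-lib",
        "dependencies": ["example-core"],
        "build_tool": "example-generator 2.0",
        "registry": "internal-registry.example",
    },
]

# Each view is an allow-list of fields, so unknown fields stay hidden.
VIEW_FIELDS = {
    "public": ("name", "license", "project"),
    "customer": ("name", "license", "project", "version", "dependencies"),
    "nda": ("name", "license", "project", "version", "dependencies", "build_tool"),
}


def render_view(components, view):
    """Build a redacted copy containing only the fields the view allows."""
    fields = VIEW_FIELDS[view]
    return [{key: comp[key] for key in fields if key in comp} for comp in components]


def select_view(requester, today):
    """Pick a view from requester attributes instead of a single role."""
    expires = requester.get("access_expires")
    if expires is not None and today > expires:
        return "public"  # time-limited access has lapsed
    if not requester.get("entitled"):
        return "public"  # no product entitlement
    if requester.get("nda_signed"):
        return "nda"
    return "customer"
```

The allow-list design means a newly added sensitive field stays hidden until someone deliberately adds it to a view, which is the safer default for least disclosure.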
I often find myself in conversations where the same words keep popping up again and again: Agents, MCP, and A2A. Everyone seems excited about them. But the funny part is that when the topic shifts to MCP (Model Context Protocol), the explanations start to vary.

One day, someone confidently said, "An MCP server is basically a tool." Another person immediately disagreed and replied, "No, no, MCP is more like a client." Before that debate could settle, someone else joined the conversation and said, "Actually, MCP is just a protocol." And then another perspective appeared: "Think of it as middleware that sits between an agent and APIs."

At that moment, I realized something interesting: we were all talking about the same concept, yet each of us understood it a little differently.

These conversations made me curious. If experienced developers and architects describe MCP in different ways, how confusing must it be for someone who is just starting to explore this space? The more I listened, the more I noticed a pattern: people weren't wrong, but they were often describing only one piece of the puzzle.

That realization is what inspired this blog.

In this article, I want to step back from the buzzwords and walk through the concepts in a simple way. What exactly is MCP? Is it a server? A tool? A client? Or something else entirely? And how does it relate to the agents that everyone keeps talking about? Is it applicable only to agents, or does it apply to assistants as well? We will also explore MuleSoft's capability in this space.

By the end of this post, my goal is to bring clarity to these terms and show how they connect. Instead of hearing multiple interpretations in different conversations, you'll be able to see the complete picture of how MCP fits into modern AI and integration architectures.

Let's Understand What Anthropic Says About MCP

MCP (Model Context Protocol) is an open-source standard for connecting AI applications to external systems.
Think of MCP like a USB-C port for AI applications. Just as USB-C provides a standardized way to connect electronic devices, MCP provides a standardized way to connect AI applications to external systems.

Figure: MCP at a high level

Now let's break down each component and understand it in the simplest way possible.

AI Application

An AI application can be any application that consists of an LLM, orchestration, and tools (you can think of it as an assistant), or it may consist of more complex components such as agent orchestration, specialized agents, and tools (you can think of it as an agentic application). Tools can be a payment gateway, a data retrieval API, a weather API, a file system, a web search, etc.

MCP

Model Context Protocol is an open protocol that enables seamless integration between AI applications (LLM applications) and external data sources and tools. MCP provides a standardized way to connect LLMs with the context they need.

MCP follows a client-server architecture. The key components of this architecture are the MCP Host, MCP Client, and MCP Server. Let's extend our previous architecture.

Figure: MCP architecture

MCP Host

It is simply the host where the AI application is running.

MCP Client

It is a component that establishes a connection with the MCP Server and gets the context for the MCP Host to use.

MCP Server

It consists of external services that provide context to LLMs.

Model Context Protocol consists of two layers:

- Data layer: The data layer implements a JSON-RPC 2.0 based exchange protocol that defines the message structure and semantics for client-server communication.
- Transport layer: The transport layer manages communication channels and authentication between clients and servers.
It handles connection establishment, message framing, and secure communication between MCP participants.MCP supports two transport mechanisms: Stdio transport: Uses standard input/output streams for direct process communication between local processes on the same machine, providing optimal performance with no network overhead.Streamable HTTP transport: Uses HTTP POST for client-to-server messages with optional Server-Sent Events for streaming capabilities. This transport enables remote server communication and supports standard HTTP authentication methods, including bearer tokens, API keys, and custom headers. MCP recommends using OAuth to obtain authentication tokens. Use Case We can think of "Weather Intelligence Agent," which uses the MCP server to make a call to a tool that provides weather information based on a city name. This is a simple use case just to demonstrate how an API is called as a tool using MCP. We will use Postman and Cursor to mimic as Agent/Assistant, which will call the Weather API. Let's see how we can implement this use case using MuleSoft: Step 1: MuleSoft provides the MCP Server - Tool Listener connector. We will configure the MCP Server. 
[Figure: MuleSoft code]

Refer to the code:

XML
    <?xml version="1.0" encoding="UTF-8"?>
    <mule xmlns:ee="http://www.mulesoft.org/schema/mule/ee/core"
          xmlns:http="http://www.mulesoft.org/schema/mule/http"
          xmlns:mcp="http://www.mulesoft.org/schema/mule/mcp"
          xmlns="http://www.mulesoft.org/schema/mule/core"
          xmlns:doc="http://www.mulesoft.org/schema/mule/documentation"
          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
          xsi:schemaLocation="http://www.mulesoft.org/schema/mule/core http://www.mulesoft.org/schema/mule/core/current/mule.xsd
            http://www.mulesoft.org/schema/mule/mcp http://www.mulesoft.org/schema/mule/mcp/current/mule-mcp.xsd
            http://www.mulesoft.org/schema/mule/http http://www.mulesoft.org/schema/mule/http/current/mule-http.xsd
            http://www.mulesoft.org/schema/mule/ee/core http://www.mulesoft.org/schema/mule/ee/core/current/mule-ee.xsd">

      <!-- HTTP listener that the MCP server is exposed on -->
      <http:listener-config name="HTTP_Listener_config" doc:name="HTTP Listener config" doc:id="251f2d7c-e84b-4974-a1e8-96d9779bc9e9">
        <http:listener-connection host="0.0.0.0" port="8081" />
      </http:listener-config>

      <!-- MCP server using the Streamable HTTP transport -->
      <mcp:server-config name="MCP_Server" doc:name="MCP Server" doc:id="289fb886-e732-4274-990e-9876aca405a6" serverName="mule-mcp-server" serverVersion="1.0.0">
        <mcp:streamable-http-server-connection listenerConfig="HTTP_Listener_config"/>
      </mcp:server-config>

      <!-- Outbound connection to the weather API -->
      <http:request-config name="HTTP_Request_config" doc:name="HTTP Request config" doc:id="b31d7d79-b45b-42ec-a970-50eb19a0a702">
        <http:request-connection protocol="HTTPS" host="api.weatherstack.com" />
      </http:request-config>

      <flow name="mcp-weahter-intelligence-apiFlow" doc:id="b1c21d3c-18f0-4eac-bb4e-3cf789608580">
        <!-- Registers the flow as an MCP tool named get_weather_information -->
        <mcp:tool-listener doc:name="MCP Server - Tool Listener" doc:id="4c42c1cb-898d-4fb9-8d0e-edc541fffb75" config-ref="MCP_Server" name="get_weather_information">
          <mcp:description><![CDATA[This tool gets current weather information for a city. Provide the city name in the query parameter.]]></mcp:description>
          <mcp:parameters-schema><![CDATA[{
            "$schema": "http://json-schema.org/draft-07/schema#",
            "type": "object",
            "properties": {
              "query": {
                "type": "string",
                "description": "city for querying weather data"
              }
            },
            "required": ["query"],
            "additionalProperties": false
          }]]></mcp:parameters-schema>
          <mcp:responses>
            <mcp:text-tool-response-content text="#[payload.^raw]" priority="1">
              <mcp:audience>
                <mcp:audience-item value="ASSISTANT" />
              </mcp:audience>
            </mcp:text-tool-response-content>
          </mcp:responses>
        </mcp:tool-listener>

        <!-- The access key is read from a property (weatherstack.accessKey is an assumed name)
             rather than hard-coded, so the secret stays out of source control. -->
        <http:request doc:name="Request" doc:id="d10760de-5f93-4f63-aadc-9bfc491f94e0" config-ref="HTTP_Request_config" path="/current">
          <http:query-params><![CDATA[#[output application/java
    ---
    {
      "access_key" : p('weatherstack.accessKey'),
      "query" : payload.query,
      "units" : "m"
    }]]]></http:query-params>
        </http:request>
      </flow>
    </mule>

Let's run this code and test it. The MCP server starts successfully:

[Figure: Deployment log]

Step 2: Use Postman as the MCP client to check that the server works as expected.

[Figure: MCP server and available tools]

Step 3: Click Connect.

[Figure: Connected to MCP Server]

Step 4: The MCP client is now connected to the MCP server. Pass the city name as the query parameter to get the weather details. I am writing this blog from Goa (the beach capital of India), so I will use Goa as the city name.

[Figure: Use the tool]

Step 5: Click Run to get the response shown below.

[Figure: Response]

So far I have demonstrated this with my local copy of the code, running in Anypoint Studio. Let's run the same test after deploying the application to Runtime Manager.

[Figure: Deployed in the Anypoint platform]

[Figure: Test result]

I have demonstrated this using Postman, where Postman worked as an MCP client to connect to the MCP server.
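To make the Postman exchange less of a black box, here is a minimal Java sketch of the JSON-RPC 2.0 message an MCP client sends to invoke the tool over the Streamable HTTP transport. It only builds the request and does not send it. The `/mcp` endpoint path is an assumption rather than something taken from the Mule configuration above, the class and method names are hypothetical, and a real client would first perform the MCP initialization handshake, which is omitted here.

```java
import java.net.URI;
import java.net.http.HttpRequest;

final class McpToolCallSketch {

    // Minimal escaping for quotes and backslashes only; use a JSON library in real code.
    static String jsonEscape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    // Builds the JSON-RPC 2.0 body for an MCP "tools/call" request.
    // The "query" argument name matches the tool's parameters schema.
    static String toolsCallBody(int id, String toolName, String city) {
        return "{\"jsonrpc\":\"2.0\",\"id\":" + id
             + ",\"method\":\"tools/call\",\"params\":{\"name\":\"" + jsonEscape(toolName)
             + "\",\"arguments\":{\"query\":\"" + jsonEscape(city) + "\"}}}";
    }

    // Wraps the body in an HTTP POST. The client accepts either a plain JSON
    // response or a Server-Sent Events stream.
    static HttpRequest toolsCallRequest(String endpoint, String body) {
        return HttpRequest.newBuilder(URI.create(endpoint))
            .header("Content-Type", "application/json")
            .header("Accept", "application/json, text/event-stream")
            .POST(HttpRequest.BodyPublishers.ofString(body))
            .build();
    }

    public static void main(String[] args) {
        String body = toolsCallBody(1, "get_weather_information", "Goa");
        System.out.println(body);
        // Assumed endpoint: the listener port from the Mule config plus a hypothetical /mcp path.
        System.out.println(toolsCallRequest("http://localhost:8081/mcp", body).method());
    }
}
```

Seen this way, the competing descriptions from the introduction fit together: the protocol is the message format, the client is whatever sends it, and the server is what registers `get_weather_information` and answers the call.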
We can extend this further and use Cursor to mimic agentic behavior, where the agent uses the MCP tool to get the answer.

[Figure: Cursor to use MCP]

For this walkthrough I used MuleSoft, a no-code/low-code tool. In the next blog, I will demonstrate the same use case in Python. Watch the video for more details. Let me know if you liked it!
When working on API automation projects, one of the first things that becomes repetitive is configuring the same settings for every test. The base URL, content type, request logging, and common response validations often appear in multiple test classes. As the number of tests increases, maintaining these repeated configurations becomes difficult.

REST Assured provides specifications to solve this problem. Instead of defining the same settings in every test, common configurations and specifications can be created once and reused throughout the test suite. This article demonstrates a simple approach to configuring REST Assured using a base test class along with Request and Response Specifications.

What Are REST Assured Specifications?

A specification is a reusable configuration object that contains common request or response settings. So, instead of repeatedly writing:

Java
    given()
        .baseUri("https://api.example.com")
        .header("Authorization", "Bearer token")
        .contentType(ContentType.JSON)

the configuration can be defined once and reused across multiple tests. Common validations can be captured in specifications in the same way.

Specifications help to:

- Reduce code duplication
- Improve test readability
- Centralize API configurations
- Simplify maintenance
- Standardize request and response validations

Why Use Specifications?

Consider an API test that retrieves order details.

Java
    @Test
    public void getOrderDetails() {
        given()
            .baseUri("https://api.example.com")
        .when()
            .get("/orders/2")
        .then()
            .statusCode(200);
    }

The test works correctly, but the base URI and common validations, such as the status code, would need to be repeated in every test. A better approach is to move these common settings into reusable specifications.

What Problem Does It Solve?

In many API automation projects, test cases often contain repeated configuration code. The same base URL, content type, authentication details, headers, and response validations are repeated across multiple test classes.
While this may not seem like a problem when there are only a few tests, maintaining the test suite becomes difficult as the project grows. Consider a scenario where the API base URL changes from a QA environment to a Staging environment. Without a centralized configuration, every test containing the old URL would need to be updated. Similarly, if a common header or authentication mechanism changes, modifications would be required in multiple places. Request and Response Specifications solve this problem by moving common configurations into reusable objects. Instead of repeating the same setup in every test, the configuration is defined once and reused wherever required. This reduces code duplication, improves readability, and makes the test suite easier to maintain. As a result, test methods can focus on validating business functionality rather than configuring API requests and responses. This leads to cleaner and more maintainable automation code. Creating a SetupSpecification Class The most common configurations should be placed in a separate class. This allows all test classes to inherit the same setup. The following example creates a Request and Response Specification in a separate class using the @BeforeClass annotation. Java public class SetupSpecification { @BeforeClass public void setup () { final RequestSpecification request = new RequestSpecBuilder () .addHeader ("Content-Type", "application/json") .setBaseUri ("http://localhost:3004") .addFilter (new RequestLoggingFilter ()) .addFilter (new ResponseLoggingFilter ()) .build (); final ResponseSpecification response = new ResponseSpecBuilder () .expectResponseTime (lessThan (10000L)) .build (); RestAssured.requestSpecification = request; RestAssured.responseSpecification = response; } } This setup method runs before the test class execution. The Request Specification contains the base URI, content type, and logging configuration. 
Any configuration defined in a Request Specification will be applied to every API request that uses that specification. For example, if the specification includes a common header, authentication token, content type, or query parameter, those values will automatically be sent with all requests that reference the specification.

While this promotes reusability and reduces duplication, care should be taken when adding request-specific details to a shared specification. Not all APIs require the same headers, authentication mechanisms, query parameters, or request bodies. Including such configurations in a common specification can lead to unintended behavior and make tests more difficult to maintain.

The Response Specification contains the common validations that are expected from the API response. The expectResponseTime() method validates that the API responds within the specified time limit. Validations can also be added for:

- Status code
- Headers
- Content-Type
- Cookies
- Body

However, it is important to understand that any validation defined in a Response Specification will be applied to every API test that uses that specification. For example, if the specification includes a validation for a 200 status code, all tests using that specification will automatically expect a 200 response. This may not be appropriate for APIs that are expected to return different status codes, such as 201, 204, 400, or 404. The same consideration applies to validations related to headers, content type, cookies, and response body content. Including endpoint-specific validations in a shared specification can reduce flexibility and make tests harder to maintain.

A good practice is to keep only the truly common validations in a shared Response Specification and add endpoint-specific assertions within the individual test methods.

The statements below make the Request and Response Specifications available globally for the test execution.
Java RestAssured.requestSpecification = request; RestAssured.responseSpecification = response; As a result, the base URI and header(Content-Type), and validation to check the response time do not need to be specified in every test. Writing a Test Using the Specifications Once the setup is complete, test classes can extend the SetupSpecification class. Java public class TestGetRequestWithRestAssuredSpecs extends SetupSpecification { @Test public void getRequestTestWithRestAssuredConfig () { final int orderId = 3; given ().when () .queryParam ("id", orderId) .get ("/getOrder") .then () .statusCode (200) .and () .assertThat () .body ("orders[0].id", equalTo (orderId), "orders[0].product_name", equalTo ("USB-C Charger")); } } The Request Specification is automatically applied because it was configured in the SetupSpecification class. It means all the common request configurations, such as the base URI, headers, content type, and logging settings, are automatically applied to the request. Similarly, the common response validations configured for expected response time in the SetupSpecification class are reused during test execution. The test itself focuses only on endpoint-specific details by passing the id query parameter, invoking the /getOrder endpoint. This approach keeps the test concise and improves maintainability by separating common configuration from test-specific assertions. Adding Additional Assertions The Response Specification can handle common validations, while endpoint-specific assertions can still be added in the test. 
Java public class TestGetRequestWithRestAssuredSpecs extends SetupSpecification { @Test public void getRequestTestWithRestAssuredConfig () { final int orderId = 3; given ().when () .queryParam ("id", orderId) .get ("/getOrder") .then () .statusCode (200) .and () .assertThat () .body ("orders[0].id", equalTo (orderId), "orders[0].product_name", equalTo ("USB-C Charger")); } } In this example, the response body validations for order ID and product name remain inside the test because they are specific to this API endpoint. Why This Approach Is Useful As the test suite grows, hundreds of API tests may use the same base URL, content type, authentication, and response validations. Maintaining these configurations in every test class can quickly become difficult. Keeping the Request and Response Specifications in a separate class provides a centralized location for managing common settings. If the API URL changes or additional configurations need to be added, only a single file needs to be updated. This approach also improves readability because the test methods contain only the business validations relevant to the API being tested. Using Request and Response Specifications Directly in the Test Class While many automation projects prefer keeping specifications in a separate class, there are situations where creating specifications directly inside the test class makes sense. This approach is useful for smaller projects, proof-of-concept implementations, or when a test class requires its own configuration that is not shared with other tests. In this approach, the Request and Response Specifications are created using the @BeforeClass annotation and are available only within the current test class. 
Java
    import static io.restassured.RestAssured.given;
    import static org.hamcrest.Matchers.equalTo;

    import io.restassured.builder.RequestSpecBuilder;
    import io.restassured.builder.ResponseSpecBuilder;
    import io.restassured.filter.log.RequestLoggingFilter;
    import io.restassured.filter.log.ResponseLoggingFilter;
    import io.restassured.specification.RequestSpecification;
    import io.restassured.specification.ResponseSpecification;
    import org.testng.annotations.BeforeClass;
    import org.testng.annotations.Test;

    public class StringRelatedAssertionTests {

        private static ResponseSpecification responseSpecification;
        private static RequestSpecification requestSpecification;

        // Builds both specifications once, before any test in this class runs.
        @BeforeClass
        public void setupSpecBuilder () {
            final RequestSpecBuilder requestSpecBuilder = new RequestSpecBuilder ()
                .setBaseUri ("https://api.restful-api.dev/objects")
                .addQueryParam ("id", 3)
                .addFilter (new RequestLoggingFilter ())
                .addFilter (new ResponseLoggingFilter ());

            final ResponseSpecBuilder responseSpecBuilder = new ResponseSpecBuilder ()
                .expectStatusCode (200);

            responseSpecification = responseSpecBuilder.build ();
            requestSpecification = requestSpecBuilder.build ();
        }

        @Test
        public void testStringAssertions () {
            given ().spec (requestSpecification)   // apply the shared request settings
                .get ()
                .then ()
                .spec (responseSpecification)      // apply the shared response checks
                .assertThat ()
                .body ("[0].name", equalTo ("Apple iPhone 12 Pro Max"));
        }
    }

In this example, the Request and Response Specifications are created once in the @BeforeClass method and stored in static variables. The Request Specification contains common request details such as the base URI, query parameters, and logging filters, while the Response Specification defines the expected status code.

During test execution, the Request Specification is applied using the spec(requestSpecification) method before sending the request. After the response is received, the Response Specification is applied using spec(responseSpecification) to validate the common response expectations before performing additional assertions on the response body.

Keeping the specifications and test logic within the same class makes the example easy to follow, as both the setup and test execution are located in a single file. However, as the test suite grows and multiple test classes require the same configurations, duplicating specifications across classes can become difficult to maintain.
In such situations, moving the common Request and Response Specifications to a separate class provides better reusability and reduces code duplication. For smaller projects or learning purposes, defining the specifications directly within the test class remains a simple and effective approach. Summary Rest-Assured Specifications help create cleaner and more maintainable API automation tests. A best practice is to define Request and Response Specification in a separate class and initialize them using the @BeforeClass annotation. The Request Specification manages settings such as the base URI, content type, and logging, while the Response Specification handles common response validations. By centralizing these configurations, test classes become shorter, easier to read, and simpler to maintain. For API automation frameworks built with REST Assured and TestNG, this pattern provides a clean foundation that scales well as the number of tests increases.
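The QA-to-Staging switch described earlier is one place where a centralized base URI pays off. The following is a minimal sketch of how the base URI might be resolved per environment, not a REST Assured feature: the `api.env` system property, the `ApiEnvironment` class, and the environment URLs are all hypothetical, and the code uses only the JDK.

```java
import java.util.Locale;
import java.util.Map;

final class ApiEnvironment {

    // Hypothetical environment names and URLs; replace with real ones.
    private static final Map<String, String> BASE_URIS = Map.of(
        "local", "http://localhost:3004",
        "qa", "https://qa.api.example.com",
        "staging", "https://staging.api.example.com");

    // Returns the base URI for the given environment name, defaulting to "local"
    // when no name is supplied. Unknown names fail fast instead of silently
    // pointing the test suite at the wrong system.
    static String baseUriFor(String env) {
        String key = (env == null || env.trim().isEmpty())
            ? "local"
            : env.trim().toLowerCase(Locale.ROOT);
        String uri = BASE_URIS.get(key);
        if (uri == null) {
            throw new IllegalArgumentException("Unknown environment: " + env);
        }
        return uri;
    }

    // Reads the hypothetical -Dapi.env=... system property.
    static String currentBaseUri() {
        return baseUriFor(System.getProperty("api.env"));
    }

    public static void main(String[] args) {
        System.out.println(currentBaseUri());
    }
}
```

In `SetupSpecification`, the hard-coded `.setBaseUri ("http://localhost:3004")` could then become `.setBaseUri (ApiEnvironment.currentBaseUri ())`, so switching environments is a matter of passing a different `-Dapi.env` value rather than editing test code.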
I used to open identity audits by asking a CISO how many users were on their network. These days, I ask a different question first: how many non-human identities do you have, and when was the last time anyone counted? Most of the time, the answer is a long pause, followed by a number that's wrong, followed by an admission that it's wrong. That pause is the whole story of identity security in 2026. CyberArk's 2025 Identity Security Landscape report, based on a survey of 2,600 security decision-makers across 20 countries, put a hard number on what I'd been seeing anecdotally for two years: machine identities now outnumber human identities by more than 80 to 1 in the average enterprise. Service accounts, API keys, certificates, container workloads, CI/CD pipeline tokens, and now AI agents acting on behalf of users — all of it stacking up faster than anyone is governing it. Clarence Hinton, CyberArk's Chief Strategy Officer, said it plainly when the report came out: the privileged access of AI agents represents an entirely new threat vector. He's not wrong, and the part that should bother you is that "new" undersells how fast it's already arrived. Gartner's framing in its 2026 IAM predictions research is just as blunt: human and machine identities have jointly become the primary attack surface, and the firm expects nearly a third of enterprises to be running AI agents that execute workflows autonomously, at machine speed, by the end of this year. Traditional IAM — built around the assumption that a human logs in, gets a session, and logs out — was never designed for an actor that authenticates itself, chains five API calls together in under a second, and never sleeps. 
The advice Gartner keeps repeating to CISOs boils down to three things: register every machine actor as a first-class identity, automate the entire credential lifecycle instead of trusting humans to rotate things on schedule, and write authorization policy that treats "agent" as its own subject type, not an edge case bolted onto human IAM. Machine and IoT Identities: Stop Treating Them Like an Afterthought Here's the uncomfortable reframe I give every team I work with: a service account is not a lesser version of a user account. It needs its own identity lifecycle — provisioning, attestation, rotation, and deprovisioning — and it needs it whether it's a Kubernetes pod, an IoT sensor on a factory floor, or an AI agent with a standing connection to your CRM. The instinct to issue a long-lived API key once and forget about it is exactly the instinct that's been getting enterprises breached. This is where SPIFFE and SPIRE earn their keep. SPIFFE — the Secure Production Identity Framework For Everyone — graduated to the Cloud Native Computing Foundation's highest maturity tier in August 2022, and adoption since then has only accelerated; known production users include Bloomberg, ByteDance, Pinterest, Block (where the project originated), Uber, and Yahoo Japan, with HashiCorp, Google, IBM, and Intel building on top of it. The pitch is simple, and the implementation is not: every workload gets a cryptographically verifiable SPIFFE ID, short-lived X.509 SVIDs replace long-lived static credentials, and identity gets attested at the node and workload level rather than assumed from a network position. Andrew Moore, Uber's Platform Authentication Tech Lead, described SPIFFE as the "northstar foundation of securing all production interactions" when the project graduated — and having sat through enough postmortems where the root cause was a hardcoded credential in a config file, I understand exactly why he'd put it that way. 
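To make "cryptographically verifiable SPIFFE ID" a little more concrete, here is a small Java sketch that checks the shape of a SPIFFE ID (`spiffe://<trust-domain>/<path>`) before anything else trusts it. It covers only a simplified subset of the ID format rules and does no cryptographic verification of the SVID itself. The `SpiffeId` class is hypothetical, and a real deployment would lean on a SPIFFE library rather than hand-rolled parsing.

```java
import java.util.regex.Pattern;

final class SpiffeId {

    private static final String PREFIX = "spiffe://";
    // Simplified rules: lowercase trust domain, no port, no userinfo.
    private static final Pattern TRUST_DOMAIN = Pattern.compile("[a-z0-9._-]{1,255}");
    private static final Pattern SEGMENT = Pattern.compile("[A-Za-z0-9._-]+");

    final String trustDomain;
    final String path; // empty, or starts with '/'

    private SpiffeId(String trustDomain, String path) {
        this.trustDomain = trustDomain;
        this.path = path;
    }

    // Parses and validates an ID, rejecting anything outside the simplified rules.
    static SpiffeId parse(String id) {
        if (id == null || !id.startsWith(PREFIX)) {
            throw new IllegalArgumentException("SPIFFE ID must start with spiffe://");
        }
        String rest = id.substring(PREFIX.length());
        int slash = rest.indexOf('/');
        String trustDomain = slash < 0 ? rest : rest.substring(0, slash);
        String path = slash < 0 ? "" : rest.substring(slash);

        if (!TRUST_DOMAIN.matcher(trustDomain).matches()) {
            throw new IllegalArgumentException("Invalid trust domain: " + trustDomain);
        }
        if (!path.isEmpty()) {
            // Limit -1 keeps empty segments so "//" and trailing "/" are rejected.
            for (String segment : path.substring(1).split("/", -1)) {
                if (!SEGMENT.matcher(segment).matches()
                        || segment.equals(".") || segment.equals("..")) {
                    throw new IllegalArgumentException("Invalid path segment in: " + id);
                }
            }
        }
        return new SpiffeId(trustDomain, path);
    }

    public static void main(String[] args) {
        SpiffeId id = parse("spiffe://example.org/ns/prod/sa/payments");
        System.out.println(id.trustDomain + " " + id.path);
    }
}
```

The point of the strictness is that the identity is a structured name bound to a short-lived credential, so a policy can match on trust domain and path instead of on a network location or a static secret.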
For IoT specifically, the same principle holds with extra friction: device certificates and public-key-based provisioning at manufacture time beat shared secrets baked into firmware every time, because a shared secret leaked from one device compromises the fleet, while a compromised device certificate compromises one device. The annoying part is that retrofitting this onto an existing IoT deployment is expensive and slow. The expensive part doesn't go away by ignoring it; it just moves from a planned budget line to an incident response invoice. Zero Trust in Practice: What "Per-Call Auth" Actually Means Zero trust as a phrase has been diluted by marketing decks to the point of meaninglessness, so let me be specific about the part that matters here: every single call between services should carry its own authorization decision, independent of network location and independent of whatever broader session or token initiated the chain. A service mesh with mTLS enforced at the sidecar, a Kubernetes admission controller that rejects workloads without valid SPIFFE attestation, an API gateway that checks scope on every request rather than trusting whatever authenticated upstream — that's zero trust as an engineering practice rather than a slogan. The Salesloft Drift breach from August 2025 is the cleanest recent illustration I've seen of what happens when that discipline is missing, and it's worth walking through because almost nobody talks about it as an identity failure, even though that's exactly what it was. Between August 8 and 18, 2025, an intrusion cluster tracked as UNC6395 stole OAuth refresh tokens belonging to Salesloft's Drift chatbot integration with Salesforce. Those tokens didn't just authenticate Drift once — they granted standing, broadly scoped access that let the attackers run systematic queries against Salesforce instances at more than 700 organizations for roughly ten days before anyone shut it down. 
Cloudflare, one of the disclosed victims, found that 104 of its own API tokens had been exposed in the process, embedded in support-ticket text that the attackers specifically went hunting through. The breach later cascaded further: Google's Threat Intelligence Group linked the same stolen token set to a follow-on compromise of Gainsight-published Salesforce apps affecting another 200-plus instances. Salesforce's core platform was never touched. The failure was entirely in how a third-party integration's machine credential was scoped, monitored, and trusted by default — precisely the non-human identity gap that CyberArk's 80:1 statistic is describing in the abstract. That's the case for per-call, per-scope enforcement instead of standing trust in a token: if Drift's OAuth grant had been scoped to the specific objects it actually needed, time-boxed, and subject to anomaly detection on query volume, ten days of unmonitored SOQL queries against 700 organizations' CRM data simply doesn't happen. Decentralized Identity: Further Along Than Most Engineers Realize I'll admit I was skeptical of decentralized identity for years — it had the smell of a solution chasing a problem. That changed somewhat in 2025. On May 15, the W3C's Verifiable Credentials Working Group pushed the Verifiable Credentials Data Model 2.0 to full Recommendation status, alongside six companion specifications covering data integrity, JOSE/COSE-based securing of credentials, controlled identifiers, and revocation via bitstring status lists. Decentralized Identifiers themselves reached an updated 1.1 Recommendation the same year, building on the original DID Core spec from 2022. None of this is vaporware standards-body theater; it's the plumbing underneath the EU's Digital Identity Wallet rollout and a growing number of supply-chain credentialing pilots. Where this intersects with machine identity is the part most zero-trust articles skip: a verifiable credential doesn't have to describe a human. 
An AI agent or a service can hold a VC asserting "this workload is attested by this CI pipeline" or "this device passed this manufacturer's provisioning process," cryptographically signed and independently verifiable without phoning home to a central authority every time. It's still early — more than 150 distinct DID methods exist, and that fragmentation is a genuine interoperability headache — but the standards foundation is no longer the blocker it was three years ago. The blocker now is mostly organizational willingness to pilot something that doesn't look like OAuth. IAM Automation: The Part Nobody Can Skip Anymore Here's where the industry stops having a choice. In April 2025, the CA/Browser Forum unanimously approved Ballot SC-081v3, originally proposed by Apple, which phases public TLS certificate lifespans down from the current 398-day maximum to 200 days starting March 2026, 100 days in March 2027, and 47 days by March 2029. That's roughly an eightfold increase in renewal frequency over four years, on certificates most organizations are still managing through some combination of spreadsheets and tribal knowledge. Manual certificate management was already a liability. At a 47-day cadence, it's not viable at any meaningful scale — full stop. Practically, this means PKI and secrets automation move from "nice to have" to load-bearing infrastructure. HashiCorp Vault and its competitors for dynamic secrets issuance, SPIRE for workload-level short-lived credentials, and CI/CD-integrated certificate lifecycle tooling aren't optional add-ons to a security program anymore — they're the only way the math works once renewal events go from roughly one a year to eight. The teams I've watched handle this transition well started treating certificate and secret rotation as a property of deployment automation, a full year before the CA/Browser Forum vote even landed. 
The teams scrambling now are discovering that "we'll automate it later" was always a deferred cost, not an avoided one.

What I'd Actually Build

Plain Text
    IDENTITY ISSUANCE LAYER
      → SPIFFE/SPIRE issues short-lived SVIDs to every workload, attested at startup
      → Device certs provisioned at manufacture/build time, never shared secrets
      → AI agents registered as distinct identity subjects, not borrowed user sessions

    ENFORCEMENT LAYER (per call, not per session)
      → Service mesh enforces mTLS between every workload, no exceptions for "internal" traffic
      → API gateway validates scope and token freshness on every request
      → Kubernetes admission controller rejects any workload lacking valid attestation

    LIFECYCLE LAYER (automated, not scheduled)
      → Vault-issued dynamic secrets with short TTLs by default
      → Cert rotation pipelines built for 47-day cycles now, not in 2029
      → OAuth grants for third-party SaaS integrations scoped narrowly and reviewed on a fixed cadence, not left standing indefinitely

The issuance layer answers, "How do we know who this is?" The enforcement layer answers, "What is this identity actually allowed to touch, right now?" The lifecycle layer is what keeps the answer to both of those questions from going stale, which, per the Salesloft Drift timeline, is exactly the gap that turns a single over-permissioned integration into a 700-company incident.

None of this is exotic engineering. It's mostly discipline, applied consistently, to a category of identity that most organizations have been quietly ignoring while they perfected human MFA. The uncomfortable truth for 2026 is that the attackers have already noticed where the gap is. The question is whether your machine identities have an owner, a lifecycle, and an expiration date, or whether they're just credentials that happen to still work, sitting in a config file, waiting for someone to go looking for them first.
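For readers who want the enforcement layer in code form, here is a minimal Java sketch of a per-call decision that checks scope and token freshness on every request. The `Token` shape, the scope names, and the 15-minute maximum age are illustrative assumptions, not a description of any particular gateway or product.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.Set;

final class PerCallAuthorizer {

    // Hypothetical token shape: who it was issued to, what it may touch, and when.
    static final class Token {
        final String subject;
        final Set<String> scopes;
        final Instant issuedAt;

        Token(String subject, Set<String> scopes, Instant issuedAt) {
            this.subject = subject;
            this.scopes = scopes;
            this.issuedAt = issuedAt;
        }
    }

    private final Duration maxAge;

    PerCallAuthorizer(Duration maxAge) {
        this.maxAge = maxAge;
    }

    // Decides a single call. The token must be fresh and must carry the exact
    // scope this call needs; nothing is inherited from an earlier call or session.
    boolean allow(Token token, String requiredScope, Instant now) {
        boolean notFromFuture = !token.issuedAt.isAfter(now);
        boolean fresh = notFromFuture
            && Duration.between(token.issuedAt, now).compareTo(maxAge) <= 0;
        return fresh && token.scopes.contains(requiredScope);
    }

    public static void main(String[] args) {
        Instant issued = Instant.parse("2026-01-01T00:00:00Z");
        Token agent = new Token("agent:chatbot", Set.of("crm.contacts.read"), issued);
        PerCallAuthorizer authorizer = new PerCallAuthorizer(Duration.ofMinutes(15));

        System.out.println(authorizer.allow(agent, "crm.contacts.read", issued.plusSeconds(300)));
        System.out.println(authorizer.allow(agent, "crm.cases.read", issued.plusSeconds(300)));
        System.out.println(authorizer.allow(agent, "crm.contacts.read", issued.plusSeconds(3600)));
    }
}
```

In this sketch, a stolen token stops working once it ages past the limit, and it never reaches objects outside its scope. Those are the two properties the standing Drift grant lacked.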
This is the second follow-up to June 5's release post. It covers the platform APIs that moved into the framework core this release. There are two headline pieces (AI/LLM and the modern OAuth/OIDC stack) and two smaller pieces (WiFi/connectivity and share-sheet result callbacks). This continues the direction the previous release set when we moved NFC, biometrics, and cryptography into the framework core. The full background on that earlier set is in NFC, Crypto, Biometrics, And A New Build Cloud.

AI: A First-Class LLM Client and a ChatView Component

PR #5035 lands the com.codename1.ai package, the ChatView UI component, the speech and TTS additions, and the build-time dependency injection that wires the native pieces in. PR #5057 lands the developer-guide chapter and the agent-skill addition, so any project generated from the Initializr inherits the new APIs through its bundled AGENTS.md.

LlmClient: The Basic Chat Request

com.codename1.ai.LlmClient is the entry point. The simplest possible use:

Java
    LlmClient client = LlmClient.openai(apiKey);
    ChatRequest req = new ChatRequest.Builder()
        .model("gpt-4o-mini")
        .system("You are a helpful assistant.")
        .user("What is the capital of France?")
        .temperature(0.7)
        .build();

    client.chat(req).onResult((resp, err) -> {
        if (err != null) {
            Log.e(err);
            return;
        }
        Log.p(resp.firstChoice().content());
    });

LlmClient.openai(...), LlmClient.anthropic(...), LlmClient.gemini(...), LlmClient.ollama(...), and LlmClient.openAiCompatible(baseUrl, apiKey) are the factories. All five are fully implemented native clients. The OpenAI client also drives Ollama, vLLM, llama.cpp, and any other endpoint that speaks the OpenAI wire format, so most local-model stacks plug in through LlmClient.openAiCompatible(...) without a separate driver.

Streaming Chat (What You Actually Want for Chat UIs)

For any UI that types responses out token-by-token, the streaming entry point is the one to reach for.
The callback fires on the EDT, so you can append directly to a text component:

Java
    client.chatStream(req, new ChatStreamListener() {
        @Override
        public void onDelta(ChatDelta d) {
            // Append each streamed fragment to the visible response
            responseLabel.setText(responseLabel.getText() + d.contentDelta());
            responseLabel.getParent().revalidateLater();
        }

        @Override
        public void onComplete(ChatResponse fin) {
            sendButton.setEnabled(true);
        }

        @Override
        public void onError(Throwable t) {
            Log.e(t);
            sendButton.setEnabled(true);
        }
    });

Under the hood this is a custom ConnectionRequest subclass that parses SSE line-by-line and dispatches each delta through Display.callSerially. AsyncResource.cancel() kills the socket, so wiring up a cancel button in a chat UI is a one-line call.

Tool Calls

If you want the model to call back into your app, Tool / ToolChoice give you OpenAI-style function calling. Define the tool, give the request the model name and the available tools, and the response surfaces structured ToolCall objects you dispatch:

Java
    Tool getWeather = Tool.builder()
        .name("get_weather")
        .description("Look up the current weather for a city.")
        .parameter("city", "string", "The city name, e.g. \"Paris\".")
        .build();

    ChatRequest req = new ChatRequest.Builder()
        .model("gpt-4o-mini")
        .user("Is it raining in Tel Aviv right now?")
        .tool(getWeather)
        .toolChoice(ToolChoice.AUTO)
        .build();

    client.chat(req).onResult((resp, err) -> {
        if (err != null) return;
        for (ToolCall call : resp.firstChoice().toolCalls()) {
            if ("get_weather".equals(call.name())) {
                String city = call.argument("city").asString();
                String json = lookupWeather(city);
                // Loop the result back into the conversation
                client.chat(req.replyWithToolResult(call, json))
                    .onResult((followUp, e) -> updateUi(followUp));
            }
        }
    });

The shape mirrors the OpenAI function-calling contract one for one, so anything you have written against the OpenAI API directly maps across without rethinking.

Embeddings

LlmClient.embed(...) returns a vector for any input string.
Useful for similarity search against a local SQLite store (tomorrow's post will cover the new ORM that pairs with this): Java EmbeddingRequest er = new EmbeddingRequest.Builder() .model("text-embedding-3-small") .input("Codename One is a cross-platform mobile framework.") .build(); client.embed(er).onResult((emb, err) -> { if (err != null) return; float[] vector = emb.firstVector(); /* store, search, compare */ }); Image Generation DALL-E and a Replicate scaffold are surfaced through ImageGenerator: Java ImageGenerator gen = ImageGenerator.openAiDallE(apiKey); gen.generate("A red bicycle leaning against an olive tree", "1024x1024") .onResult((img, err) -> { if (err != null) return; myImageComponent.setIcon(img); }); Working Against Ollama in the Simulator (No API Charges) JavaSEPort pings localhost:11434 at startup. If it finds Ollama, it sets the cn1.ai.ollamaDetected property. With cn1.ai.simulatorRedirect=auto (or =ollama) every LlmClient.openai(...) call routes through the local Ollama endpoint instead of OpenAI's. Production code does not change. The iteration loop, your tests, and your offline debugging stop costing money and stop needing an internet connection. In common/codenameone_settings.properties: Properties files simulator.cn1.ai.simulatorRedirect=auto (The simulator. prefix scopes the property to the JavaSE simulator path.) Then run Ollama locally with whichever model your code expects (ollama run llama3.2 or similar) and your existing LlmClient.openai(...) calls go to localhost. How to Handle API Keys A direct word on credentials before any of the above sees production. LLM provider API keys (OpenAI, Anthropic, Gemini, your Auth0 / Firebase configs) are bearer tokens with a budget attached. They must never be checked into source control, embedded in your app binary, or hard-coded in source files. A leaked key can be extracted from any APK or IPA in minutes and used to drain your account. 
The correct shape is to fetch the key from your own backend over an authenticated request, then store it on the device using the platform's keychain / keystore. The framework provides both pieces: com.codename1.crypto.SecureStorage (from the previous release) is the cross-platform wrapper over iOS Keychain Services and Android EncryptedSharedPreferences. Values are encrypted at rest using the platform's hardware-backed protection class where one is available. This release adds single-argument get / set / remove(account, ...) overloads next to the existing biometric-gated methods. The new overloads store the value without a per-read Face ID / Touch ID prompt, which is what you want for an LLM API key (you read it on every network call; a biometric prompt every time is not workable). The biometric-gated methods are still there for credentials you do want to gate per use. A reasonable shape: Java private static AsyncResource<String> getOpenAiKey() { String cached = SecureStorage.get("openai_api_key"); if (cached != null) { return AsyncResource.complete(cached); } return Rest.get(myServer + "/v1/credentials/openai") .bearerToken(userSessionToken()) .fetchAsString() .onResult((key, err) -> { if (err == null) { SecureStorage.set("openai_api_key", key); } }); } Your server gates the credential request behind the user's session, your app caches the result on the keychain, and the key never ships inside the app binary where a reverse-engineering pass could find it. If your server rotates the key, invalidate the cache and refetch. Existing biometric-gated SecureStorage calls keep working unchanged. The new overloads are additive. ChatView: A Ready-Made Streaming Chat UI com.codename1.components.ChatView is the matching UI component. Scrollable message list, ChatBubble for the per-message bubble (theme-aware UIIDs so it picks up the iOS Modern / Material 3 native themes consistently), ChatInput for the bottom input bar, and a one-line bindToLlm(...) 
that wires the input to a streaming chat request: Java ChatView view = new ChatView(); getOpenAiKey().onResult((key, err) -> { view.bindToLlm(LlmClient.openai(key), new ChatRequest.Builder() .model("gpt-4o-mini") .system("You are a friendly tutor for " + "Codename One developers.") .build()); }); Form f = new Form("Chat", new BorderLayout()); f.add(BorderLayout.CENTER, view); The result is a standard mobile chat layout, picked up from whichever native theme the project uses. If you want more control than bindToLlm(...) gives you (custom message styling, a "thinking" placeholder, hand-rolled retry, persistence to your own model class), drive the view by hand: Java ChatView view = new ChatView(); ConversationStore store = ConversationStore.open("tutor-thread"); view.setMessages(store.load()); LlmClient client = LlmClient.openai(apiKeyFromKeychain); view.setInputListener(userText -> { ChatMessage userMsg = ChatMessage.user(userText); view.appendMessage(userMsg); store.append(userMsg); ChatMessage assistant = ChatMessage.assistant(""); view.appendMessage(assistant); ChatRequest req = new ChatRequest.Builder() .model("gpt-4o-mini") .messages(store.load()) .build(); client.chatStream(req, new ChatStreamListener() { @Override public void onDelta(ChatDelta d) { view.appendToLastMessage(d.contentDelta()); } @Override public void onComplete(ChatResponse fin) { store.append(ChatMessage.assistant(view.lastMessage().content())); view.setInputEnabled(true); } @Override public void onError(Throwable t) { view.appendToLastMessage(" [error: " + t.getMessage() + "]"); view.setInputEnabled(true); } }); }); appendToLastMessage(...) is the streaming entry point; it marshals through callSerially so deltas land on the EDT in order. ConversationStore persists the thread (the default backing is Storage; pluggable via a custom implementation if you would rather keep it in SQLite or push it to your server). 
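The streaming path described earlier reads server-sent events line by line and hands each chunk on in arrival order. As a rough, framework-free illustration of that idea (this is not the Codename One implementation; the class name SseAccumulator and its methods are hypothetical), the sketch below collects the `data:` payloads of an event stream and stops at the `[DONE]` sentinel that OpenAI-style streams send. Payloads are kept as opaque strings here; real code would JSON-decode each one to pull out the content delta.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only: accumulates SSE "data:" payloads in order.
public class SseAccumulator {
    private final List<String> payloads = new ArrayList<>();
    private boolean done;

    /** Feeds one raw line from the stream; returns true once the stream has finished. */
    public boolean acceptLine(String line) {
        if (done || line == null) {
            return done;
        }
        // Blank lines separate events; lines starting with ':' are comments or keep-alives.
        if (line.isEmpty() || line.startsWith(":")) {
            return false;
        }
        // Ignore other SSE fields (event:, id:, retry:) in this sketch.
        if (!line.startsWith("data:")) {
            return false;
        }
        String payload = line.substring("data:".length());
        if (payload.startsWith(" ")) {
            payload = payload.substring(1); // the SSE format allows one optional leading space
        }
        if ("[DONE]".equals(payload)) {
            done = true; // end-of-stream sentinel used by OpenAI-style endpoints
            return true;
        }
        payloads.add(payload);
        return false;
    }

    public List<String> payloads() {
        return payloads;
    }

    public boolean isDone() {
        return done;
    }
}
```

A caller would feed it each line read from the response body and forward every new payload to the UI thread, which is the role the post assigns to callSerially.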
The AI cn1libs The core LLM stack is paired with a set of opt-in cn1libs that wrap specific on-device capabilities: Google ML Kit features, the TensorFlow Lite runtime, a local Whisper transcription engine, and an on-device Stable Diffusion model. Thirteen new cn1libs ship this release. These cn1libs are not yet listed in the Codename One Preferences cn1lib picker, so for the moment they are added by hand. Drop the matching dependency block into your project's common/pom.xml and rebuild. The build-time scanner does the rest: the iOS pod or Swift Package, the Android Gradle dependency, the plist usage strings (NSCameraUsageDescription for the vision libraries, NSSpeechRecognitionUsageDescription for Whisper, etc.), and the Android permissions (android.permission.RECORD_AUDIO for audio capture) are all injected automatically the first time the scanner sees the matching class on the classpath. For each cn1lib below, the dependency block is identical in shape; only the <artifactId> changes. The shared pattern is: XML <dependency> <groupId>com.codenameone</groupId> <artifactId><!-- cn1lib artifact id from below --></artifactId> <version>${cn1.version}</version> </dependency> cn1-ai-mlkit-text: Text Recognition (OCR) TL;DR. Pull printed or handwritten text out of an image (a photo of a page, a sign, a receipt) entirely on-device. Platforms. iOS bridges to GoogleMLKit/TextRecognition. Android bridges to com.google.mlkit:text-recognition. The JavaSE simulator returns an unsupported error. Use cases. Receipt scanning, sign translation pipelines (combine with cn1-ai-mlkit-translate), accessibility tools that read printed text aloud, automated form ingestion. Java byte[] jpeg = capturePhotoBytes(); TextRecognizer.recognize(jpeg).onResult((text, err) -> { if (err == null) Log.p("OCR: " + text); }); cn1-ai-mlkit-barcode: Barcode and QR Scanning TL;DR. Decodes QR, EAN, UPC, Data Matrix, PDF417, and the rest of the common 1D / 2D code families from a captured image. Platforms. 
iOS bridges to MLKitBarcodeScanning. Android bridges to com.google.mlkit:barcode-scanning. The JavaSE simulator returns an unsupported error. Use cases. Inventory scanning, ticket / boarding-pass readers, QR-driven onboarding flows, retail loyalty cards. Java byte[] jpeg = capturePhotoBytes(); BarcodeScanner.scan(jpeg).onResult((codes, err) -> { if (err == null) { for (String code : codes) Log.p("Found: " + code); } }); cn1-ai-mlkit-face: Face Detection TL;DR. Returns bounding boxes for human faces detected in an image. Each face is reported as a packed int[4] (x, y, width, height). Platforms. iOS bridges to MLKitFaceDetection. Android bridges to com.google.mlkit:face-detection. Use cases. Auto-crop a contact photo, mosaic / blur bystanders in a group shot, drive a face-tracked overlay for AR-lite filters. Java FaceDetector.detect(jpeg).onResult((boxes, err) -> { if (err != null) return; for (int i = 0; i < boxes.length; i += 4) { Log.p("face at " + boxes[i] + "," + boxes[i + 1] + " " + boxes[i + 2] + "x" + boxes[i + 3]); } }); cn1-ai-mlkit-labeling: Image Labeling TL;DR. "What is in this picture." Returns a list of descriptive labels for the image content. Platforms. iOS bridges to MLKitImageLabeling. Android bridges to com.google.mlkit:image-labeling. Use cases. Auto-tagging uploaded photos, content moderation pre-filters, content-based image search. Java ImageLabeler.label(jpeg).onResult((labels, err) -> { if (err == null) Log.p("labels: " + String.join(", ", labels)); }); cn1-ai-mlkit-translate: On-Device Translation TL;DR. Translate short text between supported language pairs entirely on-device; no server round-trip, no API key, works offline. Platforms. iOS bridges to MLKitTranslate. Android bridges to com.google.mlkit:translate. Languages are identified by their ISO 639-1 codes (en, fr, es, ...). Use cases. Offline travel assistants, chat translation, accessibility readers for foreign signage (combine with cn1-ai-mlkit-text). 
Java Translator.translate("Where is the train station?", "en", "fr") .onResult((fr, err) -> { if (err == null) Log.p(fr); // "Où est la gare ?" }); cn1-ai-mlkit-smartreply: Short Reply Suggestions TL;DR. Generates short suggested replies for chat conversations, similar to Gmail's Smart Reply chips. Platforms. iOS bridges to MLKitSmartReply. Android bridges to com.google.mlkit:smart-reply. The input is a JSON array of {role, message, timestamp, userId} objects. Use cases. A "quick reply" row above the keyboard in your in-app chat, response suggestions in a CRM inbox. Java String thread = "[{\"role\":\"remote\",\"message\":\"See you at 6?\"," + "\"timestamp\":" + System.currentTimeMillis() + "," + "\"userId\":\"u42\"}]"; SmartReply.suggest(thread).onResult((suggestions, err) -> { if (err == null) { for (String s : suggestions) Log.p("suggestion: " + s); } }); cn1-ai-mlkit-langid: Language Identification TL;DR. Returns the most likely ISO 639-1 code for a given text, or und (undetermined) when the input is too short or ambiguous. Platforms. iOS bridges to MLKitLanguageID. Android bridges to com.google.mlkit:language-id. Use cases. Auto-route a customer-support message to the right team, pick the correct TTS voice for an arbitrary string, pre-screen input before running an expensive translation. Java LanguageIdentifier.identify("Bonjour le monde").onResult((code, err) -> { if (err == null) Log.p(code); // "fr" }); cn1-ai-mlkit-pose: Pose Detection TL;DR. Returns 33 skeletal landmarks per detected pose as a packed float[3 * 33] (x, y, confidence triples). Platforms. iOS bridges to MLKitPoseDetection. Android bridges to com.google.mlkit:pose-detection. Use cases. Fitness apps with form correction, dance/yoga timing analysis, gesture-driven controls. 
Java PoseDetector.detect(jpeg).onResult((landmarks, err) -> { if (err != null || landmarks.length < 99) return; float noseX = landmarks[0], noseY = landmarks[1], noseConf = landmarks[2]; Log.p("nose at (" + noseX + ", " + noseY + ") conf=" + noseConf); }); cn1-ai-mlkit-segmentation: Selfie Segmentation TL;DR. Returns a per-pixel mask separating the person in the foreground from the background as byte[width * height] (0 = background, 255 = foreground). Platforms. iOS bridges to MLKitSegmentationSelfie. Android bridges to com.google.mlkit:segmentation-selfie. Use cases. Background replacement for video calls, sticker / portrait-mode effects, blur-the-background privacy filters. Java SelfieSegmenter.segment(jpeg).onResult((mask, err) -> { if (err == null) applyBackgroundReplacement(mask); }); cn1-ai-mlkit-docscan: Document Scanner TL;DR. Detects a rectangular document in a photo, perspective-corrects it, and writes the cropped JPEG to a temporary file. Returns the file path. Platforms. iOS uses Apple's VisionKit + Core Image rectangle detection (no extra pod). Android uses com.google.android.gms:play-services-mlkit-document-scanner. Use cases. "Scan to PDF" flows, expense apps that capture receipts, contract signing flows, ID-document capture. Java DocumentScanner.scanToFile(jpeg).onResult((path, err) -> { if (err == null) uploadDocument(path); }); cn1-ai-tflite: TensorFlow Lite Interpreter TL;DR. A general-purpose on-device inference engine. Bring your own .tflite model and run it against a float32 input tensor. Platforms. iOS uses TensorFlowLiteSwift (Pods or Swift Package). Android uses org.tensorflow:tensorflow-lite + tensorflow-lite-support. Use cases. Any custom on-device ML model your team trains or pulls from TF Hub. Image classification, simple regression, recommendation pre-filters. 
Java byte[] modelBytes = Util.readFully(Display.getInstance().getResourceAsStream(null, "/model.tflite")); float[] input = featureVector(); Interpreter.run(modelBytes, input).onResult((output, err) -> { if (err == null) Log.p("model returned " + output.length + " values"); }); cn1-ai-whisper: Speech-to-Text via whisper.cpp TL;DR. On-device transcription of a 16 kHz mono WAV file using a ggml-format Whisper model. The cn1lib bundles libwhisper.a. Platforms. iOS uses the Accelerate framework; Android uses a JNI build of the same whisper.cpp core. Models (e.g. ggml-base.bin) are not bundled; ship the one your app expects under the app's resources or download on first launch. Use cases. Voice notes, accessibility transcription, offline dictation, podcast indexing. Java String modelPath = SecureStorage.getFilePath("ggml-base.bin"); String audioPath = recordWavToFile(); WhisperRecognizer.transcribe(modelPath, audioPath) .onResult((text, err) -> { if (err == null) Log.p("heard: " + text); }); cn1-ai-stablediffusion: On-Device Image Generation TL;DR. Generates a JPEG from a text prompt using a bundled Stable Diffusion model. Multi-gigabyte payload, local build only. Platforms. iOS uses Core ML pipelines compiled from the bundled model. Android uses ONNX Runtime. Both configurations exceed the cloud build server's 2 GB upload limit, so this cn1lib triggers the cn1.ai.requiresBigUpload guard and the cloud build aborts with a "build this one locally" message. Add it to a project you build via mvn cn1:buildAndroid / mvn cn1:buildIosXcodeProject on the developer machine. Use cases. Avatar generation in apps where shipping to a cloud API is undesirable (offline-first apps, regulated industries, privacy-sensitive products). 
Java StableDiffusion.generate("a teal hot-air balloon over Lisbon, watercolour", 512, 512, /* steps */ 25) .onResult((jpeg, err) -> { if (err == null) display(Image.createImage(jpeg, 0, jpeg.length)); }); Why These Are cn1libs and Not Part of the Core The core gets the AI plumbing every app that adopts AI at all wants: the LLM client, streaming, the chat UI, the secure storage primitive for credentials, the simulator Ollama redirect for offline iteration. The cn1libs above are specialized verticals. Barcode scanning, document scanning, face detection, smart reply, pose detection, on-device translation, transcription, and on-device image generation are genuinely useful, but only for some apps. They also each bring a non-trivial native dependency. The Google ML Kit Android frameworks are large; the iOS pods carry their own weight; the bundled libwhisper.a and the Stable Diffusion model are big. Pulling all of them into the core would tax every app, whether the feature is used or not. The Stable Diffusion cn1lib in particular is large enough that the cloud build server cannot accept the upload at all (it trips the 2 GB pre-upload guard). That kind of opt-in does not belong in a dependency every app inherits. The corresponding chapter, including the full LlmClient API table, the ChatView reference, the SecureStorage overloads, the simulator Ollama redirect, and the full cn1lib coverage, is at AI, Chat UI, and Speech in the developer guide. OAuth and OIDC: The Modern Identity Stack The in-app-WebView Oauth2 flow that Codename One has shipped since approximately forever was the way every cross-platform mobile framework solved "sign in with Google / Facebook / Microsoft" in the 2010s. It is also the way every one of those identity providers stopped wanting you to solve it. Google has been blocking embedded user agents for years. Apple does not want third-party apps wrapping the Apple ID flow in a WKWebView. Microsoft and Facebook joined the chorus. 
The right answer is the system browser: ASWebAuthenticationSession on iOS, Custom Tabs on Android, with PKCE on the wire. That is what PR #5018 lands. PR #5039 adds a portable WebAuthn / passkey client on top. Sign In With Google (or Any OIDC Provider) com.codename1.io.oidc.OidcClient is the entry point. Point it at the discovery URL of an OIDC provider, hand it the client id and the redirect URI you registered with the provider, and ask for tokens: Java OidcConfiguration cfg = OidcConfiguration.discover("https://accounts.google.com"); OidcClient client = OidcClient.builder() .configuration(cfg) .clientId("123-abc.apps.googleusercontent.com") .redirectUri("com.example.myapp:/oauthredirect") .scopes("openid", "email", "profile") .build(); client.signIn().onResult((tokens, err) -> { if (err != null) { OidcException oe = (OidcException) err; if (oe.getCode() == OidcException.USER_CANCELLED) return; Log.e(oe); return; } String idToken = tokens.getIdToken().raw(); String email = tokens.getIdToken().getClaim("email").asString(); proceed(email, idToken); }); Discovery JSON parsed and cached. PKCE S256 challenge generated and verified. State and nonce checked on the callback. ID-token claims decoded for you (we deliberately do not verify the signature client-side; the dev guide is explicit about why and points at the "re-validate on your backend" remedy). Refresh and revoke are first-class. The token store is pluggable via TokenStore; the default is Storage-backed, but a Keychain-backed or in-memory variant is a small class. On iOS the system-browser piece routes through ASWebAuthenticationSession. On Android it routes through androidx.browser.customtabs, with a plain ACTION_VIEW fallback for the rare device with no Custom Tabs provider. AuthenticationServices.framework and androidx.browser:browser are auto-linked when the classpath scanner sees OidcClient in use. 
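For readers who have not looked inside PKCE, the S256 step mentioned above is small enough to sketch. This is plain Java SE written against RFC 7636, not the framework's own code; the class name PkceSketch and both method names are made up for illustration. The client keeps the verifier private, sends only the challenge with the authorization request, and later proves possession by sending the verifier with the token request.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.security.SecureRandom;
import java.util.Base64;

// Illustrative sketch of PKCE (RFC 7636) with the S256 method; names are hypothetical.
public class PkceSketch {

    /** Creates a random code verifier: 32 random bytes, base64url-encoded without padding. */
    public static String newVerifier() {
        byte[] bytes = new byte[32];
        new SecureRandom().nextBytes(bytes);
        return Base64.getUrlEncoder().withoutPadding().encodeToString(bytes);
    }

    /** Computes the S256 challenge: BASE64URL(SHA-256(ASCII(verifier))), without padding. */
    public static String challengeFor(String verifier) {
        try {
            MessageDigest sha256 = MessageDigest.getInstance("SHA-256");
            byte[] digest = sha256.digest(verifier.getBytes(StandardCharsets.US_ASCII));
            return Base64.getUrlEncoder().withoutPadding().encodeToString(digest);
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is a mandatory algorithm on Java SE, so this should not happen.
            throw new IllegalStateException("SHA-256 unavailable", e);
        }
    }
}
```

The authorization server recomputes the challenge from the verifier it receives at the token endpoint and rejects the exchange if the two do not match, which is what stops an intercepted authorization code from being redeemed by another app.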
Provider Wrappers: Google, Apple, Microsoft, Facebook, Auth0, Firebase If you would rather not configure OIDC by hand, the existing social classes get a signIn(...) method that drives the same stack with the provider's issuer URL pre-wired: Java GoogleConnect.signIn(googleClientId, "com.example.myapp:/oauthredirect", "openid", "email", "profile") .onResult((tokens, err) -> { /* ... */ }); MicrosoftConnect.signIn(entraClientId, "msauth.com.example.myapp://auth", "User.Read") .onResult((tokens, err) -> { /* ... */ }); Auth0Connect.signIn("tenant.auth0.com", clientId, redirectUri, "openid profile email") .onResult((tokens, err) -> { /* ... */ }); FacebookConnect.signIn(...) follows the same shape against the Facebook OIDC endpoint. FirebaseAuth covers the REST-based Firebase auth surface (email/password, IdP token exchange, refresh) which sits underneath any provider hand-off you might want to drive from app code. Sign In With Apple Sign in with Apple is required on iOS for apps that offer any other social login, and on Android it must fall through to a web flow. com.codename1.social.AppleSignIn handles both transparently: Java AppleSignIn.signIn() .onResult((result, err) -> { if (err != null) return; String idToken = result.getIdToken(); String code = result.getAuthorizationCode(); proceedToBackend(idToken, code); }); On iOS 13 and later this drops directly into the native Apple sheet via ASAuthorizationAppleIDProvider. On non-iOS platforms it falls through to the same OIDC web flow as everything else, so a single line of app code does the right thing on every port. The Maven plugin injects the com.apple.developer.applesignin entitlement on iOS when it sees AppleSignIn in use; Android does not see it because it is not there. Migration From the Legacy Oauth2 com.codename1.io.Oauth2 is now deprecated. 
Existing code still compiles, but the migration is short and almost always shorter than what it replaces: Java // Before Oauth2 oauth = new Oauth2("https://accounts.google.com/o/oauth2/auth", clientId, redirectUri); oauth.setClientSecret(clientSecret); oauth.setScope("openid email profile"); oauth.setBrowserComponent(myBrowserComponent); // tied to a WKWebView String token = oauth.authenticate(); // blocks, opens the web view Java // After OidcClient.builder() .configuration(OidcConfiguration.discover("https://accounts.google.com")) .clientId(clientId) .redirectUri(redirectUri) .scopes("openid", "email", "profile") .build() .signIn() .onResult((tokens, err) -> proceed(tokens.getIdToken().raw())); You stop owning the browser. The OS owns it. The cookies live in the platform's authentication session. The user gets the same login experience they have everywhere else on their device. WebAuthn/Passkeys PR #5039 layers a portable WebAuthn client on top: Java WebAuthnClient client = WebAuthnClient.getInstance(); if (!client.isAvailable()) { fallbackToPassword(); return; } PublicKeyCredentialCreationOptions opts = PublicKeyCredentialCreationOptions.fromServerJson(serverJson); client.create(opts).onResult((cred, err) -> { if (err == null) postToRelyingParty(cred.toJson()); }); W3C JSON wire format in both directions, so the response can be POSTed verbatim to any standard server-side WebAuthn library. iOS 16+ routes through ASAuthorizationPlatformPublicKeyCredentialProvider; Android API 28+ through androidx.credentials.CredentialManager. Provider helpers: Auth0Connect.signInWithPasskey(...) / .registerPasskey(...) and FirebaseAuth.signInWithPasskey(...) / .registerPasskey(...). One thing worth pulling out before you reach for it: if you sign in via OIDC against Google, Apple, Microsoft, Auth0, or Firebase, you usually already get passkeys for free. The identity provider runs the WebAuthn ceremony inside the system browser; OIDC just hands you the resulting tokens. 
So you do not need WebAuthnClient for that case. You need it for apps that run their own relying-party backend, and for apps driving the Auth0 or Firebase passkey grants directly. Full chapter: Authentication and Identity. Connectivity: WiFi, Bonjour, USB, network-type listeners PR #5021 lands four packages for apps that need to do more with the network than open an HTTP socket. The shape: Java WiFi wifi = WiFi.getInstance(); String ssid = wifi.getCurrentSSID(); String bssid = wifi.getBSSID(); String gateway = wifi.getGateway(); String ip = wifi.getIp(); wifi.scan(new ScanOptions().setTimeoutMillis(5000)) .onResult((results, err) -> { /* ... */ }); wifi.connect("MyNetwork", "hunter2", Security.WPA2_PSK) .onResult((success, err) -> { /* ... */ }); com.codename1.io.wifi for WiFi info, scan, and connect. com.codename1.io.wifi.WiFiDirect for peer-to-peer (Android only by platform reality). com.codename1.io.bonjour for mDNS / Zeroconf via BonjourBrowser and BonjourPublisher. com.codename1.io.usb for USB host (Android only). And NetworkManager.addNetworkTypeListener(...) plus NETWORK_TYPE_* constants so an app can react to a transition between cellular, WiFi, ethernet, or "none": Java NetworkManager.getInstance().addNetworkTypeListener(evt -> { int type = evt.getNetworkType(); if (type == NetworkManager.NETWORK_TYPE_NONE) showOfflineBanner(); else if (type == NetworkManager.NETWORK_TYPE_CELLULAR) suppressLargeBackgroundDownloads(); else clearOfflineBanner(); }); iOS does not expose programmatic WiFi scanning to third-party apps; scan() throws UnsupportedOperationException on iOS. iOS also does not expose WiFi Direct or general USB host. None of those are Codename One limitations; they are Apple's. The dev guide is explicit about each platform's limits. Three new compile-time defines (CN1_INCLUDE_WIFI_INFO, CN1_INCLUDE_HOTSPOT, CN1_INCLUDE_BONJOUR) wrap the iOS native code, set only when the classpath scanner sees the matching Java API in use. 
Apps that do not use these APIs do not pay for them at App Store review time. Same pattern as the NFC gating from the previous release. Full reference: Network Connectivity. Share-Sheet Result Callbacks PR #5036 closes a small but persistent gap: Display.share(...) and ShareButton finally tell you what the user did with the share sheet: Java ShareButton btn = new ShareButton(); btn.setTextToShare("Look at this fox"); btn.setImageToShare("/fox.jpg"); btn.setShareResultListener(result -> { switch (result.getStatus()) { case SHARED_TO: track("share_completed", result.getTargetPackage()); break; case DISMISSED: track("share_dismissed"); break; case FAILED: track("share_failed", result.getError()); break; } }); iOS routes through UIActivityViewController.completionWithItemsHandler; Android through Intent.createChooser with an IntentSender callback (API 22+). The framework normalizes the platform values into SHARED_TO(packageName), DISMISSED, or FAILED. Appearing in Other Apps' Share Menus The other half of sharing is the inverse direction: not "let the user share from your app", but "let your app receive content other apps share". If a user is in Safari, Photos, or Mail and taps the share icon, your app should be able to appear as a target there alongside Messages, WhatsApp, and Instagram. On iOS that requires a separate Share Extension target inside the .ipa, with its own bundle, its own Info.plist, an App Group string that links it to the host app, and a ShareViewController that handles the incoming payload. Historically the recommendation was to bootstrap that target by hand in Xcode, copy the resulting files into the Codename One project under ios/app_extensions/, and let the build server's extractor consume them. It worked, but it was a workflow most teams put off because the setup is fiddly. The same PR ships an IOSShareExtensionBuilder Mojo that does all of that for you. 
A typical setup is one Maven command and a one-time configuration block: XML <plugin> <groupId>com.codenameone</groupId> <artifactId>codenameone-maven-plugin</artifactId> <configuration> <iosShareExtension> <bundleIdentifier>com.example.myapp.share</bundleIdentifier> <displayName>MyApp</displayName> <appGroup>group.com.example.myapp</appGroup> <acceptedContent> <content>PUBLIC_URL</content> <content>PUBLIC_IMAGE</content> <content>PUBLIC_TEXT</content> </acceptedContent> </iosShareExtension> </configuration> </plugin> Run mvn cn1:generate-ios-share-extension and the Mojo writes a complete .ios.appext bundle into ios/app_extensions/: the Info.plist with the right NSExtension activation rules for the content types you declared, the App Group entitlement, a minimal ShareViewController.swift that lands the payload in the App Group's UserDefaults(suiteName:), and the matching buildSettings.properties. The result feeds straight into the existing IPhoneBuilder.extractAppExtensions pipeline, so apps that already have a hand-rolled extension keep working unchanged. On the host-app side, you read the payload on launch: Java // Anywhere after Display.init has run String shared = (String) Storage.getInstance() .readObject("ios.shareExtension.lastPayload"); if (shared != null) { handleSharedPayload(shared); } After the next cloud or local build, your app appears in the iOS share sheet for the content types you declared. No Xcode work, no hand-rolled plist, no App Group string typed in three places. The build-time tooling owns it. Wrapping Up Tomorrow's post covers the architectural change in this release: a build-time bytecode annotation framework, the declarative router that is its first consumer, the SQLite ORM and JSON / XML mappers and component binder built on the same SPI, and the build-time SVG / Lottie transcoder that ships in the same release for related reasons.
In a microservices system, that tight coupling turns a small hiccup into a cascading slowdown. Thread pools fill, retries amplify traffic, and suddenly your simple request is blocked on half the fleet. My executive summary: asynchronous messaging with Kafka helps systems keep moving when individual components inevitably slow down or fail. It does this by decoupling producers from consumers, absorbing traffic spikes, and allowing services to evolve without tying their availability directly to one another. Code Patterns in Spring Boot With Kafka Spring for Apache Kafka gives me two primitives that feel pleasantly "old Spring": KafkaTemplate for sending and @KafkaListener for receiving. That template/listener model is intentionally similar to other Spring integration tech, which keeps application code focused on domain logic instead of raw client plumbing. Below is a compact (but production-shaped) pattern: externalized config via @ConfigurationProperties, a service port for publishing, a REST command endpoint, a consumer with a real error strategy (DLT), and a REST error advice. 
Java // === Messaging config (externalized, type-safe) === @ConfigurationProperties(prefix = "messaging.orders") @Validated record OrdersMessagingProps( @NotBlank String topic, @NotBlank String dltTopic ) {} // === DTO (event contract) === public record OrderCreatedEvent(UUID orderId, UUID userId, BigDecimal total, Instant createdAt) {} // === Service port (keeps domain testable, Kafka swappable) === public interface OrderEventPublisher { void publishOrderCreated(OrderCreatedEvent event); } // === Adapter: Kafka producer === @Component class KafkaOrderEventPublisher implements OrderEventPublisher { private final KafkaTemplate<String, OrderCreatedEvent> template; private final OrdersMessagingProps props; KafkaOrderEventPublisher(KafkaTemplate<String, OrderCreatedEvent> template, OrdersMessagingProps props) { this.template = template; this.props = props; } @Override public void publishOrderCreated(OrderCreatedEvent event) { // Keying by orderId keeps per-order ordering and drives partitioning decisions. template.send(props.topic(), event.orderId().toString(), event); } } // === REST command API (synchronous edge, async core) === @RestController @RequestMapping("/v1/orders") class OrdersController { private final OrderService orderService; // domain port OrdersController(OrderService orderService) { this.orderService = orderService; } @PostMapping public ResponseEntity<Map<String, Object>> create(@Valid @RequestBody CreateOrderRequest req) { UUID orderId = orderService.create(req.userId(), req.total()); // persists + publishes event return ResponseEntity.accepted().body(Map.of("orderId", orderId, "status", "ACCEPTED")); } record CreateOrderRequest(@NotNull UUID userId, @NotNull @Positive BigDecimal total) {} } // === Domain service port (implementation can use outbox, transactions, etc.) 
=== public interface OrderService { UUID create(UUID userId, BigDecimal total); } // === Consumer: downstream service reacts to events === @Component class BillingListener { @KafkaListener(topics = "${messaging.orders.topic}", groupId = "${spring.kafka.consumer.group-id}") void onOrderCreated(OrderCreatedEvent event) { // Idempotency belongs here: process-by-key + store processed eventId/orderId to avoid duplicates. // Do work (charge card, create invoice, etc.) } } // === Kafka consumer error handling: retries + DLT === @Configuration class KafkaErrorHandlingConfig { @Bean DefaultErrorHandler defaultErrorHandler(KafkaTemplate<Object, Object> template, OrdersMessagingProps props) { var recoverer = new DeadLetterPublishingRecoverer(template, (rec, ex) -> new TopicPartition(props.dltTopic(), rec.partition())); // Backoff and retry policy are configurable; keep it finite to avoid poison-pill loops. return new DefaultErrorHandler(recoverer, new FixedBackOff(1000L, 3)); } } // === REST error handling (ProblemDetail) === @RestControllerAdvice class ApiErrors { @ExceptionHandler(IllegalArgumentException.class) @ResponseStatus(HttpStatus.BAD_REQUEST) ProblemDetail badRequest(IllegalArgumentException ex) { var pd = ProblemDetail.forStatusAndDetail(HttpStatus.BAD_REQUEST, ex.getMessage()); pd.setTitle("Invalid request"); return pd; } } A few been-burned-before notes on the code above. Spring Kafka’s reference docs are explicit that KafkaTemplate is the convenience wrapper for producing, and DefaultErrorHandler + DeadLetterPublishingRecoverer is a first-class way to route failed records to dead-letter topics after retries. If we want non-blocking retries, Spring Kafka also provides @RetryableTopic, which orchestrates retry topics and a DLT automatically. That is useful when transient failures are common and you want predictable retry delay semantics. 
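The listener comment about idempotency deserves a concrete shape, because retries and redelivery mean the same event can arrive more than once. Here is a minimal sketch in plain Java, with no Spring or Kafka types involved. The class name IdempotentHandler and its handleOnce method are hypothetical, and the in-memory set stands in for what would normally be a durable store such as a database table with a unique constraint on the event or order id.

```java
import java.util.Set;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Consumer;

// Hypothetical illustration of an idempotency guard; not part of Spring Kafka.
public class IdempotentHandler {
    // Assumption: an in-memory set replaces a durable "processed events" table.
    private final Set<UUID> processed = ConcurrentHashMap.newKeySet();

    /**
     * Runs the action at most once per orderId.
     * Returns false when the event was already handled and the action was skipped.
     */
    public boolean handleOnce(UUID orderId, Consumer<UUID> action) {
        // add() returns false if the id was already present, so duplicates exit here.
        if (!processed.add(orderId)) {
            return false;
        }
        try {
            action.accept(orderId);
            return true;
        } catch (RuntimeException e) {
            // Forget the id so a later retry can attempt the work again.
            processed.remove(orderId);
            throw e;
        }
    }
}
```

In a real consumer the "mark as processed" write and the business side effect would ideally commit in the same transaction, so that a crash between the two cannot leave them out of sync.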
Containers and Local Dev With Docker Compose

When I’m chasing down event flow bugs, I like local environments that feel like the old days: one command, deterministic startup order, and no mystery dependencies. Docker Compose is still the quickest way to stand up Kafka alongside your services, and Confluent publishes straightforward Docker-based tutorials and compose examples for running Kafka locally.

For the service image itself, multi-stage builds are the modern classic: compile in a builder stage, then copy the artifact into a slimmer runtime stage. Docker documents multi-stage builds as a way to reduce the final image contents and keep build dependencies out of production.

Dockerfile
# Multi-stage Dockerfile for a Spring Boot service (orders-service)
FROM eclipse-temurin:21-jdk AS build
WORKDIR /workspace
# Copy the build descriptor first so the dependency layer is cached across source changes.
COPY mvnw pom.xml ./
COPY .mvn .mvn
RUN ./mvnw -q -DskipTests dependency:go-offline
COPY src src
RUN ./mvnw -q -DskipTests package

# Runtime stage: JRE only, no build tooling.
FROM eclipse-temurin:21-jre
WORKDIR /app
COPY --from=build /workspace/target/*.jar app.jar
EXPOSE 8080
ENTRYPOINT ["java","-jar","/app/app.jar"]

And here’s a Compose file that wires up Kafka and Schema Registry, plus an example Spring Boot service. The image choices are illustrative; production choices should reflect your own standards and security posture.
YAML
# compose.yaml (local/dev)
services:
  zookeeper:
    image: confluentinc/cp-zookeeper:7.6.0
    environment:
      ZOOKEEPER_CLIENT_PORT: 2181

  kafka:
    image: confluentinc/cp-kafka:7.6.0
    depends_on: [zookeeper]
    ports: ["9092:9092"]
    environment:
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
      # Each listener needs its own port: 29092 for containers on the Compose network,
      # 9092 for clients running on the host.
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka:29092,PLAINTEXT_HOST://localhost:9092
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT
      KAFKA_INTER_BROKER_LISTENER_NAME: PLAINTEXT
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1

  schema-registry:
    image: confluentinc/cp-schema-registry:7.6.0
    depends_on: [kafka]
    ports: ["8081:8081"]
    environment:
      SCHEMA_REGISTRY_HOST_NAME: schema-registry
      SCHEMA_REGISTRY_KAFKASTORE_BOOTSTRAP_SERVERS: PLAINTEXT://kafka:29092

  orders:
    build: ./orders-service
    depends_on: [kafka]
    ports: ["8080:8080"]
    environment:
      SPRING_KAFKA_BOOTSTRAP_SERVERS: kafka:29092
      MESSAGING_ORDERS_TOPIC: orders.events
      MESSAGING_ORDERS_DLTTOPIC: orders.events.dlt
      SCHEMA_REGISTRY_URL: http://schema-registry:8081

Deploying on Kubernetes or AWS

On AWS, the Kafka decision is usually managed versus self-managed. If you choose Amazon MSK, the cluster lives in your VPC, you pick subnets across distinct Availability Zones, and clients connect using the cluster’s bootstrap brokers. That’s the networking baseline, and it’s not optional. MSK is VPC-first by design.

For authentication/authorization, MSK supports IAM access control, and AWS documents the client configuration for the IAM mechanism. In EKS, I typically pair MSK IAM with IRSA so pods can obtain AWS credentials the AWS way, while ECS services would use task roles instead. Both patterns are documented by AWS; which one applies depends on where your workloads run.

Kubernetes service discovery is usually the easy part. Services and Pods get DNS names so workloads can call each other by name rather than IP. Kafka itself is reached via bootstrap broker endpoints or via internal Services, but either way, you want the strings in externalized config, not hardcoded.
Here’s a minimal Kubernetes Deployment/Service for a Kafka client service. Values like region, account IDs, and MSK endpoints are unspecified placeholders.

YAML
apiVersion: apps/v1
kind: Deployment
metadata:
  name: orders
  namespace: apps
spec:
  replicas: 2
  selector:
    matchLabels: { app: orders }
  template:
    metadata:
      labels: { app: orders }
    spec:
      serviceAccountName: orders-sa # IRSA-bound (role ARN unspecified)
      containers:
        - name: orders
          image: <UNSPECIFIED_AWS_ACCOUNT_ID>.dkr.ecr.<UNSPECIFIED_REGION>.amazonaws.com/orders:<TAG>
          ports: [{ containerPort: 8080 }]
          env:
            - name: SPRING_KAFKA_BOOTSTRAP_SERVERS
              value: "<UNSPECIFIED_MSK_BOOTSTRAP_BROKERS>"
            - name: MESSAGING_ORDERS_TOPIC
              value: "orders.events"
            - name: MESSAGING_ORDERS_DLTTOPIC
              value: "orders.events.dlt"
          readinessProbe:
            httpGet: { path: /actuator/health/readiness, port: 8080 }
            initialDelaySeconds: 10
---
apiVersion: v1
kind: Service
metadata:
  name: orders
  namespace: apps
spec:
  selector: { app: orders }
  ports:
    - port: 80
      targetPort: 8080

Operationally, MSK exposes metrics into CloudWatch (AWS/Kafka), and broker logs can be delivered to CloudWatch Logs (or S3/Firehose). That combination gives you the classic visibility loop: throughput, lag, under-replicated partitions, and error logs without running your own monitoring plane.

For distributed tracing in async flows, OpenTelemetry is my default vocabulary now. Spring Boot supports OpenTelemetry export via OTLP, and OpenTelemetry defines Kafka semantic conventions so your producer/consumer spans and attributes stay consistent across tools.

CI/CD and the Hard-Earned Field Notes

For CI/CD, I keep it boring: build once, push an immutable image, deploy via a declarative mechanism. AWS Prescriptive Guidance provides a clear GitHub Actions pattern for building Docker images and pushing to Amazon ECR, which is a solid baseline. The region and account values below stay placeholders until you configure them.
YAML
# .github/workflows/orders.yml
name: orders
on:
  push:
    branches: ["main"]
jobs:
  build_push_deploy:
    runs-on: ubuntu-latest
    permissions:
      id-token: write
      contents: read
    env:
      # Job-level env so the push and deploy steps both see the same image reference.
      IMAGE: <UNSPECIFIED_AWS_ACCOUNT_ID>.dkr.ecr.<UNSPECIFIED_REGION>.amazonaws.com/orders:${{ github.sha }}
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-java@v4
        with:
          distribution: temurin
          java-version: "21"
      - name: Build & test
        run: ./mvnw -q test package
      - name: Configure AWS credentials (OIDC)
        uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: arn:aws:iam::<UNSPECIFIED_AWS_ACCOUNT_ID>:role/<UNSPECIFIED_GHA_ROLE>
          aws-region: <UNSPECIFIED_REGION>
      - name: Login to ECR
        run: |
          aws ecr get-login-password --region <UNSPECIFIED_REGION> \
            | docker login --username AWS --password-stdin <UNSPECIFIED_AWS_ACCOUNT_ID>.dkr.ecr.<UNSPECIFIED_REGION>.amazonaws.com
      - name: Build & push image
        run: |
          docker build -t "$IMAGE" ./orders-service
          docker push "$IMAGE"
      - name: Deploy to EKS (example)
        run: |
          aws eks update-kubeconfig --name <UNSPECIFIED_EKS_CLUSTER> --region <UNSPECIFIED_REGION>
          kubectl -n apps set image deploy/orders orders="$IMAGE"

Now, the part I wish someone had handed me in 2016: Kafka gives you strong tools, but it does not remove distributed-systems truths. You still need safeguards on the consumer side: idempotent processing, disciplined schema management, and clearly defined retry and dead-letter topic behavior.

Kafka’s documentation is careful about the limits of “exactly once” guarantees. Idempotent producers and transactions can strengthen delivery semantics, but achieving true end-to-end exactly-once behavior, especially when external side effects are involved, still depends on deliberate system design.

For schema governance, Kafka itself doesn’t ship a schema registry, but acknowledges third-party registries; in practice, Confluent Schema Registry and Apicurio Registry are common choices.
Both store schemas out-of-band, so messages carry only a schema identifier, and both support evolvable contracts across Avro/JSON Schema/Protobuf depending on your ecosystem. Conclusion and Best Practices If you take one lesson from my legacy brain into modern event-driven systems, let it be this: asynchrony is a reliability feature, not a performance trick. Kafka’s durable log and consumer group model decouples uptime and absorbs spikes, but you only get the real benefit when you treat schemas as contracts, consumers as idempotent processors, and failure handling as first-class application behavior. On AWS, the operational baseline is non-negotiable. MSK lives in your VPC across AZ subnets, clients connect via bootstrap brokers, IAM auth is configured explicitly, and observability lives in CloudWatch. Do those fundamentals early, and Kafka stops feeling like a mysterious black box and starts feeling like the dependable workhorse it was built to be.
Editor’s Note: The following is an article written for and published in DZone’s 2026 Trend Report, Cognitive Databases, Intelligent Data: Unified Infrastructure for Vector Search, AI-Optimized Queries, and Hybrid Workloads.

For years, some of us have argued that the data stack is part of the product and should be engineered like the application layer: as code and as a service. The market matured toward it, and the data mesh has been the clearest recent expression. AI has eclipsed those debates and settled the matter. The data stack is now product-facing, shaping what users see, what AI answers, and which automated decisions and workflows fire.

That makes one question unavoidable: When an answer depends on data across many systems and teams, who is accountable for accuracy? An AI answer is assembled at request time from corporate data. The data stack is inside the response.

AI Turns Data Infrastructure Into Product Behavior

AI makes the data stack part of product behavior, but raw infrastructure should not leak into the product. The goal is to abstract the stack behind durable, governed interfaces. An AI feature should consume meaning, relationships, permissions, and context.

Following data mesh and data contracts, the API layer has to evolve from returning data to exposing capabilities. A consumer, including an AI model, should depend on a contract that carries:

- Metadata – origin, lineage, meaning
- Quality – freshness, completeness, confidence
- Relationships – how entities compose and traverse
- Security – authorization applied consistently across operational, analytical, and vector stores

When meaning lives in the contract, infrastructure becomes interchangeable, and a misbehaving AI feature is no longer an opaque failure; it’s a question with an owner.

Where Ownership Breaks First

Ownership does not break at the edges of systems but much earlier, in how the organization is designed.
Most technology organizations still distribute teams around components and technical specialization: applications, databases, pipelines, governance, indexing, and analytics. Each team owns its layer, though no one owns the end-to-end meaning of the data. That worked when data only fed analytics. It fails in AI-native products, where data is product behavior, and the two lifecycles are inseparable. AI composes its behavior across every layer at once, inheriting each inconsistency in semantics, freshness, permissions, and relationships. So this is not a handoff problem; it is a Conway’s Law problem. Architecture mirrors the organization, and AI makes the organizational seams visible to the user. Platform teams remain essential for shared abstractions, governance primitives, and standards. But product teams need to own both their features and APIs and their data end to end: its lifecycle, meaning, quality, and governance. Splitting teams by technical layer scatters one business entity across many disconnected owners, and AI inherits that fragmentation. AI-native organizations give product teams end-to-end ownership of the data, with platform teams providing shared standards. Accountability Follows Product Behavior When data only fed dashboards, accountability could stay narrow: Did the pipeline run, and did the report match? AI moves that boundary. Once retrieval, copilots, and agents start making decisions and generating answers from data, a correct pipeline, a healthy index, and a valid access policy still don’t guarantee a correct user-facing result. Accountability can’t be pinned to technical layers. It has to follow the behavior the user experiences. The product team that owns an AI capability is responsible for the end-to-end correctness, freshness, explainability, and safety of the data behind it. Its job is to own the contract that defines what the AI may know and retrieve. 
Platform teams provide the standardized primitives that make this accountability structure possible: semantic contracts, lineage, quality signals, access enforcement, observability, and governance-aware retrieval. The question shifts from “which team owns this layer?” to “which product team owns this behavior, and which platform capabilities guarantee it?” In AI-native systems, accountability rests with the team that owns the behavior, not the system that happened to fail.

Table: Accountability Differences Between the Layer-by-Layer and AI-Native Models

| Area | Layer-by-layer | AI-native |
| --- | --- | --- |
| Source of truth | Each system decides locally | The product team owns the authoritative semantic contract |
| Quality | The data team checks pipelines | The product team owns user-facing correctness; the platform provides quality signals |
| Retrieval | The platform team owns indexes as infrastructure | A governed product capability with explicit SLOs |
| Access | The security team owns policies separately | Enforced consistently across product, data, and AI layers |
| Incidents | Routed to whichever layer failed | The product team leads; the platform, data, and security teams support as capability owners |

Architecture Choices Are Also Operating Model Choices

Architecture decisions also decide how an organization governs and evolves meaning. AI-native systems raise the stakes here because copilots and agents consume meaning (entities, relationships, metrics, and permissions) rather than tables. Semantic consistency becomes part of how the product behaves.

No central team can own the meaning of every domain, so meaning has to live close to the domain that owns the capability. But decentralization alone backfires: Without platform-enforced standards, the old central bottleneck just turns into semantic fragmentation, with every domain exposing its own definitions and contracts.
The fix is to split ownership cleanly:

- Domains own the meaning
- Platform teams own the contracts that keep it consistent

Underneath, storage and processing keep churning. What actually lasts is whether stable abstractions (e.g., “employee,” “payroll,” “entitlement”) survive above them. The principle is simple: Infrastructure should be replaceable, and meaning should not. So the real operating-model choice comes down to who owns meaning, and who keeps it consistent.

Shared Data Contracts Make Accountability Concrete

If organizational fragmentation is the root problem, contracts make ownership explicit. A classic data contract is necessary but insufficient. Schema validation catches a renamed column, but it misses semantic drift, stale meaning, or a changed business definition. Those failures don’t break a build. They break behavior.

The contract has to grow from schema into semantics, carrying meaning, lineage, quality, and authorization. Crucially, it abstracts the capability and meaning a domain exposes, not the storage format underneath, so it behaves the same whether the source is a table, a document, an event, or an embedding.

That makes the data contract both a producer-to-consumer check and a runtime semantic interface that retrieval, copilots, and agents all consume. Its real value is relocating accountability to the source so drift surfaces in the producing domain while context stays local, which accelerates interoperability rather than centralizing control.

Governance Has to Travel With the Data

Traditional governance sat beside the data in the form of periodic reviews, approvals, and access checks. AI breaks that model. Data now moves continuously through pipelines, caches, embeddings, indexes, and agents, recomposed at runtime faster than any review can observe. Governance must be part of the execution model itself. Governance travels with meaning, not storage.
An embedding holds no raw rows yet reveals sensitive meaning, so policy must follow the semantic classification. The gap is sharpest in authorization. Identity systems stop at the API boundary, and AI doesn’t preserve security boundaries on its own, which turns every embedding, cache, and retrieval step into a new boundary to defend.

Governance therefore becomes a runtime capability that decides what AI may retrieve, infer, expose, and act on. Solving that calls for composable, declarative governance primitives embedded in the platform so auditability becomes a property of the system rather than the outcome of a project.

Accountability Gaps That Slow AI Data Work

The real cost of fragmented accountability is the constant drag on every data-powered capability. Friction is never neutral, so when teams can’t trust the platform’s freshness, semantics, or governance, they route around it and build their own, resulting in shadow pipelines, local indexes, and duplicated transformations.

Each workaround makes sense locally even as it corrodes the whole, fragmenting governance and eroding trust in the very platform it was meant to replace. And piling on more central control only hides the problem: the fragmentation just migrates into those shadow systems. So the deeper gap was missing platform contracts.

What Clear Ownership Looks Like

Instead of adding more teams on more layers, clear ownership means aligning accountability with the single product experience the user meets. What you’re really investing in is the stable semantic abstractions that outlast whatever infrastructure comes and goes. And the hardest problem is how to make the organization understandable to its own AI systems.
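To make the contract idea a little more concrete, here is a minimal sketch of a semantic data contract as Java records, covering the four parts named earlier (metadata, quality, relationships, security). Every type and field name here is hypothetical, chosen for illustration; the article does not prescribe a schema, and a production contract would more likely be declared in a spec format than in application code.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.List;
import java.util.Set;

// All names below are illustrative assumptions, not part of any standard.
record Metadata(String origin, List<String> lineage, String meaning) {}
record Quality(Instant lastRefreshed, Duration maxStaleness, double completeness) {}
record Relationship(String targetEntity, String kind) {}

record SemanticContract(String entity, String ownerTeam, Metadata metadata, Quality quality,
                        List<Relationship> relationships, Set<String> allowedRoles) {

    // Fresh if the last refresh is still within the staleness budget the owner declared.
    boolean isFresh(Instant now) {
        return !quality.lastRefreshed().plus(quality.maxStaleness()).isBefore(now);
    }

    // A consumer (including an AI retrieval step) may be served only if authorized and fresh.
    boolean mayServe(Set<String> callerRoles, Instant now) {
        boolean authorized = callerRoles.stream().anyMatch(allowedRoles::contains);
        return authorized && isFresh(now);
    }
}
```

The point of the sketch is the shape, not the fields: the owner team, the freshness budget, and the access rule travel together with the entity, so a retrieval step can ask one question of one owner instead of reconstructing the answer from several layers.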
Additional resources:

- DAMA-DMBOK: Data Management Body of Knowledge, DAMA International – foundational guidance on data ownership, stewardship, and governance roles
- Open Data Contract Standard (ODCS) – an open spec for declaring schema, semantics, quality, and service levels between data producers and consumers
- OpenLineage – an open standard for collecting data lineage across pipelines and services, useful for tracing what AI features consume
- NIST AI Risk Management Framework (AI RMF) – a vendor-neutral framework for accountability and governance of AI systems
- Coral – exposes diverse data sources to agents through one declared SQL and semantic layer; an example of meaning being owned per source rather than centrally
- Getting Started With Data Quality, DZone Refcard by Miguel García Lorenzo
- Data Pipeline Essentials, DZone Refcard by Sudip Sengupta
- Open-Source Data Management Practices and Patterns, DZone Refcard by Abhishek Gupta
- Real-Time Data Architecture Patterns, DZone Refcard by Miguel García Lorenzo
- “Building Trusted, Performant, and Scalable Databases: A Practitioner’s Checklist” by Saurabh Dashora

This is an excerpt from DZone’s 2026 Trend Report, Cognitive Databases, Intelligent Data: Unified Infrastructure for Vector Search, AI-Optimized Queries, and Hybrid Workloads.
Enterprise perimeter defenses are fundamentally built on an obsolete assumption that the developer’s workstation is a secure, trusted anchor point. The massive security breach executed by the threat group TeamPCP, resulting in the exfiltration of 3,800 internal GitHub source code repositories, completely shattered this illusion. This was not a standalone exploit. It was a multi-vector convergence where vulnerabilities in the Node/NPM ecosystem, the systemic ungoverned architecture of the Visual Studio Code Marketplace, and the tactical “fog of war” caused by a period of historic GitHub infrastructure instability came together to create the perfect attack. Phase 1: The Root Exploitation (Node/NPM and the TanStack Supply Chain Pivot) The kill chain did not begin at GitHub; it originated deep within the modern JavaScript developer tool-chain. TeamPCP executed a localized supply chain compromise targeting upstream open-source utilities, specifically targeting contributors to TanStack npm packages (a widely relied-upon suite for state management and routing). [TanStack NPM Compromise] -> [Stolen ‘gh’ CLI Tokens] -> [Nx Console Pipeline Hijack (No Multi-Admin Approval)] -> [Malicious Extension Version 18.95.0 Published] By injecting malicious code into these highly trusted downstream dependencies, the attackers performed targeted local credential harvesting. Their primary target was not production application code, but the development environments of the maintainers themselves. The exploit successfully extracted long-lived GitHub CLI (`gh`) authentication tokens from a legitimate core developer who maintained both TanStack and the Nx Console ecosystem. Because these developer access tokens lacked granular scoping restrictions, they provided direct administrative write access to the main release pipelines of secondary repositories. 
Phase 2: VS Code Extension Poisoning (The Nx Console Triage (CVE-2026-48027))

Armed with the stolen gh tokens, TeamPCP bypassed standard perimeter security by pivoting to the Visual Studio Marketplace. On May 18, version 18.95.0 of Nx Console (a heavily utilized monorepo orchestration extension with over 2.2 million installs) was maliciously built and uploaded. The deployment revealed two fatal flaws within modern developer workflows:

1. The Single-Factor Release Pipeline

The malicious version was uploaded directly to both the Visual Studio Marketplace and the open-source OpenVSX registry. Because the Nx Console publishing architecture lacked a “two-admin manual validation mandate” for automated releases, the publishing pipeline accepted the stolen developer token at face value without triggering a secondary verification gate.

2. The “Silent Killer”: Marketplace Metrics vs. Background Sync

Microsoft’s public marketplace logs initially registered a negligible 28 manual downloads before the package was identified and yanked 18 minutes later. However, Nx’s internal telemetry revealed that ~6,000 extension activations occurred in that same window. This massive discrepancy highlights the danger of VS Code’s background auto-update synchronization. Thousands of developer environments pulled down, unzipped, and executed the malicious version automatically while the developers’ IDEs were running in the background.

JSON
// Example of the target parameters within compromised developer workspaces
{
  "extensions.autoUpdate": true, // The default vulnerability exploited by TeamPCP
  "terminal.integrated.profiles.osx": {
    "malicious-hook": {
      "path": "/bin/bash",
      "args": ["-c", "python3 ~/.local/share/kitty/cat.py &"]
    }
  }
}

The Node Execution Layer

Upon extension activation, the poisoned payload immediately dropped an obfuscated Node.js post-install hook.
Operating completely within user space to evade basic Endpoint Detection and Response (EDR) behavioral hooks, it set an environmental marker (`__DAEMONIZED=1`) and spawned a background Python process (`cat.py`). The malware systematically scanned local paths for:

- Infrastructure configuration: Plaintext HashiCorp Vault tokens (`~/.vault-token`), local Kubernetes kubeconfig files, and AWS/Azure IAM metadata endpoint hashes.
- Ecosystem identity: Plaintext `.npmrc` registry tokens and active GitHub tokens (`ghp_`, `gho_`, `ghs_`).
- Active memory subsystems: Contents of 1Password vaults via the `op` CLI by hijacking active, unlocked terminal sessions.

Phase 3: The Climax (The Internal GitHub Breach)

The payload achieved its ultimate goal when an internal GitHub software engineer, who utilized Nx Console for local workflow management, had their workstation pull down the background update. The malware executed on the engineer’s local machine, scraped their active internal enterprise session tokens, and exfiltrated them to a remote command-and-control (C2) server.

TeamPCP then used these highly privileged internal access credentials to bypass GitHub’s corporate identity perimeters entirely. Because internal repository boundaries operate on flat network access structures once an authenticated developer endpoint is cleared, the threat actors systematically cloned and exfiltrated 3,800 proprietary internal GitHub source code repositories before the endpoint could be isolated.

Phase 4: The Tactical Fog of War (GitHub’s Infrastructure Instability)

The velocity and stealth of this attack were significantly aided by an ongoing reliability crisis within GitHub’s core infrastructure. During the 12-month window surrounding the breach, GitHub recorded a massive spike in service degradation.
- Total tracked incidents: 257 distinct technical incidents.
- Major outages: 48 major service shutdowns, totaling 112 hours and 18 minutes of downtime.
- Primary failure vector: GitHub Actions experienced 57 outages, three times the incidence rate of core Git storage operations.

Three of those incidents show how outages turned into security blind spots:

- October 29: An outage in a compute dependency (Microsoft Azure) caused a 90% error rate across Codespaces and Actions. The resulting telemetry gaps meant security monitoring systems failed to track cross-border API token replication.
- February 2: A configuration failure in user settings caching cascaded into failures in the Git HTTPS proxy. The high volume of synchronous cache writes generated a deluge of network errors, masking anomalous Git clone calls.
- February 12: Authorization claim changes in core networking dependencies caused a 90% Codespace provisioning failure. Security alerts failed to populate due to misclassified severity ratings, delaying incident detection by hours.

The Alert Fatigue Vulnerability

This constant infrastructural noise created an ideal tactical environment for the attackers. SecOps and DevSecOps teams were caught in a continuous state of alert fatigue, dealing with broken GitHub Actions pipelines, Elasticsearch cluster degradation, and database timeouts.

When TeamPCP’s automated scripts began running rapid API queries and pulling massive volumes of repository data using the compromised engineer’s token, the unusual spikes in data transfer blended into the background noise of an infrastructure already struggling with systemic capacity and networking failures.

Hard Takeaways: How Developers Must Harden Their Environments

If your environment was active or using automated tools during this period, you must shift your development machine from an implied trust zone to a completely zero-trust environment.

1. Kill IDE Auto-Updates Globally

Never allow your IDE to pull unvetted code in the background.
Explicitly configure your editor to require manual permission before updating any third-party extension. In VS Code’s global settings.json, enforce:

Plain Text
"extensions.autoUpdate": false,
"extensions.autoCheckUpdates": false

2. Isolate Development Runtimes

Stop running compilers, package managers, and complex IDE extensions directly on your bare-metal operating system. Utilize isolated, ephemeral development environments (e.g., Dev Containers or heavily scoped virtual environments) where local file-system access is completely decoupled from your master ~/.ssh/, ~/.aws/, or .npmrc configuration folders.

3. Implement Strict Token Volatility

Eliminate long-lived personal access tokens (`ghp_`). Switch entirely to fine-grained personal access tokens configured with strict, single-repository scope constraints and a maximum 7-day expiration date. Explicitly configure your local password managers and authentication tools to require biometric verification (e.g., TouchID/FaceID) or short timeout windows for every individual call executed via the terminal command line (`op signin timeout`).
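One way to act on the token advice is to check whether plaintext GitHub tokens are sitting in local config files at all. The following is a minimal sketch, not a vetted scanner: TokenScan is a hypothetical class name, the prefixes are the ones listed in the article, and the length bound (20 or more alphanumeric characters after the prefix) is an assumption made for illustration rather than an official token format.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for auditing file contents that have already been read into a string.
final class TokenScan {
    // Prefixes named in the article; the {20,} length bound is an illustrative assumption.
    private static final Pattern TOKEN = Pattern.compile("\\b(ghp|gho|ghs)_[A-Za-z0-9]{20,}");

    // Returns one redacted finding per match, so the audit output never repeats the secret.
    static List<String> findings(String text) {
        List<String> out = new ArrayList<>();
        Matcher m = TOKEN.matcher(text);
        while (m.find()) {
            out.add(m.group(1) + "_<redacted, " + m.group().length() + " chars>");
        }
        return out;
    }
}
```

Feeding it the contents of a file such as .npmrc or a shell profile (for example via `Files.readString`) would give a quick yes/no on whether a long-lived token is exposed to anything running in user space. Redacting the match is deliberate: an audit tool that prints secrets becomes another place for them to leak.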
I've spent the better part of fifteen years staring at API traffic logs for a living, and I can tell you the job has changed twice. The first shift came with microservices, when a handful of monolithic endpoints became thousands of small, chatty interfaces, and nobody could agree on who owned the inventory. The second shift is happening right now, and it's worse because this time the endpoints aren't even being written by people who can explain why they exist. Call them phantom APIs: routes, handlers, and parameters that show up in production but never appear in a spec, a ticket, or a design review. Some get hand-built by a developer in a hurry and are forgotten. Increasingly, though, they're a byproduct of AI code generation — Copilot, Cursor, an internal fine-tuned assistant, whatever your shop has standardized on — quietly scaffolding an admin route, a debug handler, or a permissive query path because that pattern showed up often enough in training data to feel "normal." Nobody asked for it. Nobody reviewed it with fresh eyes, because by the time a human glances at the diff, the suggestion already looks plausible. That's the part that should worry you more than any single CVE: plausibility, not malice, is now the main vector. How a Phantom Gets Born Here's the mechanism, stripped of drama. An engineer asks an AI assistant to "add an endpoint that lets support staff look up account status." The model, trained on millions of internal admin panels, often reaches for the path of least resistance: broad object access, no granular scope check, maybe a debug flag left wired to a query parameter "for testing." It compiles. It passes the smoke tests because the smoke tests check that the feature works, not that it's bounded. It ships. None of that shows up in your OpenAPI document because nobody updated the spec — the AI didn't know one existed, and the human reviewing the pull request was scanning for logic bugs, not authorization boundaries. 
Your API gateway, meanwhile, is busy enforcing policy on the routes it knows about. A path it has never seen just rides along on the same TLS termination and the same network ACLs as everything else, because from the network's point of view, there's nothing unusual happening. The gateway isn't broken. It's just answering a question nobody thought to ask it. I've heard versions of this story from engineers at a logistics platform, a healthcare billing vendor, and a fintech, all in the last year, none of whom wanted their names anywhere near a public postmortem — which is its own data point. Shame keeps these incidents quiet, and quiet incidents are exactly what let the pattern repeat across the industry instead of getting fixed once. The Numbers Stopped Being Theoretical in 2025 If you've been treating "API security" as a slide in next year's budget deck rather than this quarter's incident response calendar, the data from the past twelve months should change your mind. Wallarm's 2026 API ThreatStats Report, which pulled from 67,058 published vulnerabilities and 60 disclosed API breaches across 2025, found that API-related flaws made up 17% of all published vulnerabilities and 43% of the entries CISA added to its Known Exploited Vulnerabilities catalog that year. The technical profile of those flaws is the part that should keep API owners up at night: 97% exploitable with a single request, 99% remotely reachable, and 59% requiring no authentication at all. This isn't an attack surface that rewards patience and tradecraft. It rewards speed, and speed is exactly what AI tooling hands to attackers as readily as it hands to developers. That same report tracked AI-related vulnerabilities jumping from 439 in 2024 to 2,185 in 2025 — a 398% increase — with 315 of those tied specifically to Model Context Protocol implementations, the connective tissue between AI agents and the tools they're allowed to call. MCP didn't exist as a meaningful attack surface two years ago. 
Now it's 14% of all AI-related vulnerability disclosures in a single annual report. I don't think I've watched a category go from nonexistent to material that fast since the early days of container orchestration. IBM's X-Force Threat Intelligence Index 2026 adds the macro view: exploitation of public-facing applications became the single most common initial access vector in 2025, up 44% year over year, and 56% of the roughly 40,000 vulnerabilities X-Force tracked required no authentication to exploit. CybelAngel's own 2025 API threat reporting found that 95% of API attacks that year originated from sessions that were already authenticated — meaning the front door wasn't the problem; what happened after someone walked through it was. Put those two findings side by side, and you get a fairly bleak picture: getting in is easy, and once an attacker is in, the API layer rarely stops them from going sideways. And CrowdStrike's 2026 Global Threat Report puts a number on how little time defenders now have to notice. Average eCrime breakout time — the gap between initial access and lateral movement — fell to 29 minutes in 2025, down from 48 minutes the year before and 98 minutes in 2021. The fastest breakout CrowdStrike observed clocked in at 27 seconds. AI-enabled adversary operations rose 89% year over year, and the company recorded prompt-injection or AI-tool abuse incidents at more than 90 organizations. As Adam Meyers, CrowdStrike's head of counter adversary operations, put it when the report landed, breakout time is now the clearest signal of how intrusions have changed. A phantom API sitting outside your monitoring isn't a slow-burning liability anymore. It's a 27-second one. GraphQL Made This Worse, Not Better GraphQL was supposed to reduce shadow API risk by giving clients one well-documented entry point instead of dozens of REST routes. In practice, it concentrated the risk instead of eliminating it. 
Roughly 70% of organizations now run GraphQL in some form, according to Wallarm's Q2 2025 ThreatStats data, and the same report flagged something that should sound familiar to anyone who's done incident response: zero GraphQL-specific breaches were publicly disclosed that quarter, despite the technology's deep reach into production systems. That's not a sign GraphQL is safe. It's a sign almost nobody is looking closely enough to catch what's happening inside a single, deeply nested query that can touch a dozen resolvers and a dozen authorization decisions in one round trip. A REST endpoint that's missing an authorization check is one bug. A GraphQL resolver tree with the same gap can be a dozen bugs wearing one URL.

Shadow and zombie APIs compound the problem from the other direction. Salt Security's 2025 CISO report found that only 19% of CISOs globally have full visibility into their API inventory (just 27% among large enterprises, and a thin 12% among smaller organizations), despite 73% ranking API security as a high or critical priority. Two-thirds of organizations audit for shadow APIs only monthly or quarterly, which leaves a four-to-twelve-week window every single cycle during which an undocumented route can sit there, fully reachable, before anyone goes looking. Salt Labs' own Q1 2025 data found that 99% of organizations had encountered an API security issue in the prior twelve months, and BOLA and injection flaws together accounted for more than a third of everything reported. None of this is exotic. It's the same handful of failure modes, recurring at a scale that AI-assisted development is now accelerating rather than fixing.

The Failure Chain, Step by Step

Strip away the vendor-report statistics for a second and walk through how this actually plays out on a single team, because the abstraction is where people lose the thread.

A developer asks an AI assistant for a quick internal tool: pull account status for support staff, fast, no fuss.
The assistant generates a working route, and because "working" was the only bar anyone set, it also generates a second, undocumented path the model added on its own initiative: a debug variant that accepts a raw account ID with no scope check, left over from however the model's training data tends to structure admin tooling. The pull request gets reviewed for logic, not for the existence of a route nobody asked for, because nobody is in the habit of reading a diff looking for endpoints that shouldn't exist. It merges. The OpenAPI spec doesn't change because nothing in the toolchain forces it to. The API gateway keeps doing its job (rate limiting, TLS, routing) on every path it's configured to recognize, and the new one simply isn't on that list, so it inherits whatever the underlying framework allows by default rather than anything the security team actually decided.

For months, nothing happens because nobody is sending traffic to a path nobody knows about. Then someone does. Maybe it's a script kiddie running a wordlist against common admin paths, maybe it's a scraper, maybe it's one of the AI-driven reconnaissance tools the CrowdStrike and Wallarm data above describe as increasingly common. The request lands. There's no auth check to fail, so there's no log entry resembling a failed login, the kind of signal most SOC dashboards are tuned to catch. There's just a 200 response and a payload of account data.

Given that CrowdStrike clocked the fastest 2025 breakout at 27 seconds and the average at 29 minutes, the gap between "endpoint found" and "data gone" is no longer a window anyone can rely on noticing in real time. By the time it surfaces (an anomaly report, a customer complaint, a researcher's disclosure email), the honest answer to "how long has this been exposed" is usually some shrug-worthy variant of "the logs only go back so far."
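To make the scope gap in that story concrete, here is a minimal, framework-free sketch of the two handlers side by side. Everything in it is hypothetical and invented for illustration: the `Caller` type, the `support.read` scope, the in-memory `ACCOUNTS` store, and both function names. The point is only that the debug variant works perfectly well, which is why a logic-focused review can wave it through.

```python
from dataclasses import dataclass


class Forbidden(Exception):
    """Raised when the caller is not allowed to see the requested object."""


@dataclass(frozen=True)
class Caller:
    # Hypothetical identity attached to an already-authenticated session.
    user_id: str
    scopes: frozenset
    assigned_accounts: frozenset


# Stand-in for the real datastore (illustrative data only).
ACCOUNTS = {
    "acct-1": {"status": "active"},
    "acct-2": {"status": "frozen"},
}


def get_account_status(caller: Caller, account_id: str) -> dict:
    """The route someone asked for: scope check plus per-object check."""
    if "support.read" not in caller.scopes:
        raise Forbidden("missing scope support.read")
    if account_id not in caller.assigned_accounts:
        raise Forbidden("account not assigned to this caller")
    return ACCOUNTS[account_id]


def debug_get_account(account_id: str) -> dict:
    """The route nobody asked for: takes a raw ID, checks nothing.

    There is no caller argument at all, so nothing here can fail in a way
    that would produce a 'denied' log line.
    """
    return ACCOUNTS[account_id]


agent = Caller("u-7", frozenset({"support.read"}), frozenset({"acct-1"}))
```

With this setup, `get_account_status(agent, "acct-2")` raises `Forbidden`, while `debug_get_account("acct-2")` quietly returns the record. Both functions "work", and only one of them encodes a decision anyone made.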
That's the chain: AI suggestion → unreviewed scope gap → silent spec drift → gateway blind spot → silent exploitation → discovery after the fact. Every link in it is mundane. None of it requires a sophisticated attacker. That's exactly why it keeps happening.

What I'd Actually Build to Catch It

Description is cheap. Here's the shape of a pipeline I'd put in front of a team that wanted to stop shipping phantom routes instead of just talking about the risk:

    CI/CD LAYER (pre-merge, blocking)
      → Generate live OpenAPI spec from the build
      → Diff against the last approved spec
      → Any new route not explicitly annotated/reviewed → FAIL build
      → Flag missing auth decorators, missing rate-limit config, wildcard scopes

    RUNTIME LAYER (continuous, post-deploy)
      → Traffic profiler sits behind the gateway, fingerprints every path actually receiving requests
      → Cross-reference live traffic against the approved spec, on a rolling window (hours, not quarters)
      → Anything serving 200s that isn't in the spec → page on-call, not a quarterly report

    GATEWAY LAYER (enforcement)
      → Default-deny for any path not present in the signed spec
      → Schema validation on request/response shape, not just route existence
      → Auth/scope check enforced at the gateway, independent of what the service itself does

The CI step is the cheapest control here, and the one most teams skip, because it requires someone to decide that an undocumented route is a build failure, not a Slack message for later. The runtime layer catches what gets past CI anyway: config drift, routes added outside the normal deploy path, anything a human forgot to annotate. The gateway layer is the backstop: even if the first two fail, a default-deny policy means an unrecognized path doesn't get served at all, rather than getting served and merely logged. None of these three layers is sufficient alone. Together, they convert "we hope someone notices" into "the system refuses to let this happen quietly," which is the actual point.
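The CI layer above can be sketched in a few dozen lines. This is a rough illustration under stated assumptions, not a finished tool: it assumes both specs are already loaded as Python dicts in OpenAPI 3.x shape, it treats the operation-level and top-level `security` fields as the only signal for "has an auth requirement", and the function names and sample routes are invented for the example.

```python
from typing import List, Set, Tuple

# Keys of an OpenAPI Path Item Object that denote operations.
HTTP_METHODS = {"get", "put", "post", "delete", "options", "head", "patch", "trace"}

Route = Tuple[str, str]  # (METHOD, path template)


def routes_in(spec: dict) -> Set[Route]:
    """Collect every (method, path) pair declared under `paths`."""
    found = set()
    for path, item in spec.get("paths", {}).items():
        for key in item:
            if key.lower() in HTTP_METHODS:
                found.add((key.upper(), path))
    return found


def unapproved_routes(approved: dict, generated: dict) -> List[Route]:
    """Routes the build exposes that the approved spec does not list."""
    return sorted(routes_in(generated) - routes_in(approved))


def routes_without_auth(spec: dict) -> List[Route]:
    """Operations whose effective `security` is absent, empty, or anonymous."""
    default = spec.get("security")
    flagged = []
    for path, item in spec.get("paths", {}).items():
        for key, operation in item.items():
            if key.lower() not in HTTP_METHODS:
                continue
            # Operation-level `security` overrides the top-level default.
            effective = operation.get("security", default)
            if not effective or any(req == {} for req in effective):
                flagged.append((key.upper(), path))
    return sorted(flagged)


def ci_failures(approved: dict, generated: dict) -> List[str]:
    """Human-readable reasons to fail the build; empty list means pass."""
    msgs = [f"unreviewed route: {m} {p}" for m, p in unapproved_routes(approved, generated)]
    msgs += [f"no auth requirement: {m} {p}" for m, p in routes_without_auth(generated)]
    return msgs


# Illustrative specs: the build adds a debug route nobody reviewed.
APPROVED = {"paths": {"/accounts/{id}/status": {
    "get": {"security": [{"oauth2": ["support.read"]}]}}}}
GENERATED = {"paths": {**APPROVED["paths"],
                       "/debug/accounts/{id}": {"get": {"summary": "debug lookup"}}}}

for reason in ci_failures(APPROVED, GENERATED):
    print(reason)
```

In a real pipeline the blocking behavior would come from exiting non-zero whenever `ci_failures` returns anything. How the live spec gets generated from the build depends on the framework and is deliberately left out of this sketch.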
What Actually Works, and What's Mostly Marketing

The vendor response has been predictably fast and not entirely cynical. Akamai's $450 million acquisition of Noname Security, announced in May 2024 and closed that June, folded one of the better-regarded API discovery platforms directly into a CDN-and-edge company's security stack, a clear bet that API visibility belongs as close to the traffic as possible, not bolted on afterward. Salt Security's 1H 2026 report introduced what it calls Agentic Security Posture Management, aimed squarely at mapping the relationships between LLMs, MCP servers, and the APIs underneath them, specifically to catch what the industry has started calling "Shadow MCP." Whether that label sticks or fades in eighteen months, the underlying instinct is correct: you cannot secure an API layer you can't continuously enumerate, and static documentation reviewed once a quarter is no longer a serious control.

The defenses that actually move the needle, based on what I've watched hold up under real incident response, aren't glamorous:

- Runtime discovery over documentation trust. Treat your OpenAPI spec as a claim to be verified against live traffic, not a source of truth. If traffic is hitting a path that isn't in the spec, that's an incident, not a documentation gap.
- Spec-diffing in CI, not just in security review. A pull request that introduces a new route should fail a build if that route doesn't appear in an updated, reviewed spec. This is cheap to automate and catches the AI-generated-endpoint problem at the exact moment it's introduced.
- Authorization checks that don't trust the session. Given that 95% of API attacks in CybelAngel's 2025 dataset started from an authenticated session, the perimeter check matters far less than the per-object, per-field authorization decision happening on every single call.
- AI-assisted review aimed at AI-generated code specifically. Ironically, the same pattern-matching that produces phantom endpoints can be turned around to flag them: diff-aware tooling that specifically interrogates new routes for missing rate limits, missing auth decorators, or unscoped data access, rather than general-purpose linting.
- Treat MCP and agent tool definitions as part of your API attack surface, full stop. They're not a side project. They're API endpoints with extra steps, and the ThreatStats data says they're already 14% of AI-related disclosures.

None of these are silver bullets, and I'd be lying if I said any vendor has fully solved this. What I will say, after watching this category for a year now, is that the organizations doing well are the ones that stopped treating "shadow API discovery" as a once-a-quarter audit and started treating it as a property of the deployment pipeline itself, something that gets checked on every merge, the same way a linter or a test suite does. The ones still relying on a documentation review process built for a world where humans wrote every route are going to keep finding out about their phantom APIs the way most teams still do: during an incident, not before one.

The question worth sitting with isn't whether your API inventory has gaps; every inventory does. It's whether you could currently produce, on demand, a complete list of every endpoint serving production traffic right now, including the ones nobody remembers approving. If the honest answer is no, you don't have an API security posture. You have an API security guess, and AI-generated code is making the guess bigger every sprint.
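One modest way to start turning that guess into a list is to check the spec against what is actually being served. The sketch below is an illustration under assumptions, not a production profiler: it assumes access-log entries have already been parsed into `(method, path, status)` tuples, that the spec is an OpenAPI-shaped dict, and that `{param}` segments never contain a slash. The function names and sample log are invented for the example.

```python
import re
from collections import Counter
from typing import Iterable, List, Tuple

# Keys of an OpenAPI Path Item Object that denote operations.
HTTP_METHODS = {"get", "put", "post", "delete", "options", "head", "patch", "trace"}


def template_to_regex(template: str) -> re.Pattern:
    """Turn '/accounts/{id}/status' into a regex matching concrete paths."""
    # The capture group keeps the '{param}' pieces in the split result.
    parts = re.split(r"(\{[^/{}]+\})", template)
    pattern = "".join(
        "[^/]+" if part.startswith("{") else re.escape(part) for part in parts
    )
    return re.compile(pattern)


def spec_matchers(spec: dict) -> List[Tuple[str, re.Pattern]]:
    """One (METHOD, compiled regex) pair per documented operation."""
    return [
        (key.upper(), template_to_regex(path))
        for path, item in spec.get("paths", {}).items()
        for key in item
        if key.lower() in HTTP_METHODS
    ]


def undocumented_traffic(spec: dict, requests: Iterable[Tuple[str, str, int]]) -> Counter:
    """Count successful requests to (method, path) pairs the spec does not cover."""
    matchers = spec_matchers(spec)
    hits: Counter = Counter()
    for method, raw_path, status in requests:
        if not 200 <= status < 300:
            continue  # only paths that are actually being served
        path = raw_path.split("?", 1)[0]  # ignore the query string
        documented = any(
            m == method.upper() and rx.fullmatch(path) for m, rx in matchers
        )
        if not documented:
            hits[(method.upper(), path)] += 1
    return hits


# Illustrative data only.
SPEC = {"paths": {"/accounts/{id}/status": {"get": {}}}}
LOG = [
    ("GET", "/accounts/42/status?verbose=1", 200),
    ("GET", "/debug/accounts/42", 200),
    ("GET", "/debug/accounts/42", 200),
    ("GET", "/admin", 404),
]

for (method, path), count in undocumented_traffic(SPEC, LOG).items():
    print(f"{method} {path} served {count} successful request(s) outside the spec")
```

Run on a rolling window of recent logs, any non-empty result is the kind of signal the article argues should page someone rather than wait for the next audit.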