The topic of security covers many different facets within the SDLC. From focusing on secure application design to designing systems to protect computers, data, and networks against potential attacks, it is clear that security should be top of mind for all developers. This Zone provides the latest information on application vulnerabilities, how to incorporate security earlier in your SDLC practices, data governance, and more.
AI, OAuth, and Other Platform APIs in the Core
Architectural Collapse: How Extension Poisoning, Node Vulnerabilities, and Infrastructure Fog Enabled the GitHub Repository Breach
Editor’s Note: The following is an article written for and published in DZone’s 2026 Trend Report, Cognitive Databases, Intelligent Data: Unified Infrastructure for Vector Search, AI-Optimized Queries, and Hybrid Workloads.

Many teams find governance gaps only after a retrieval system surfaces stale or unauthorized content in production. Models, agents, and retrieval workflows all depend on enterprise data. Before any of that data reaches an AI system, teams need to know where it originates, how it’s integrated, whether it meets quality expectations, what context enriches it, who can access it, and how it changes over time.

This checklist gives engineering, data, platform, architecture, and governance teams a structured way to check whether enterprise data is ready for AI use. It focuses on data lifecycle readiness, not model selection or prompt engineering. Use it before production, then revisit the checks during recurring reviews.

Table: Data Lifecycle Overview

| Lifecycle stage | What to confirm | Example evidence |
|---|---|---|
| Source readiness | Owned, approved, refreshed, understood data sources | Source catalog entry, owner record |
| Data preparation | Reliable integration, quality, standardization, enrichment | Quality report, transformation test |
| Governance continuity | Classification, access, lineage, change controls | Access policy, lineage record |
| AI-facing assets | Derived assets tied to source rules | Derived asset inventory, retrieval test |
| Production feedback | Monitoring, issue routing, remediation closure | Monitoring alert, remediation log |

Source Inventory and Ownership

AI data governance starts before any source is exposed to an AI system. Teams need to know which sources are in scope, where the data comes from, how often it changes, and who owns its accuracy; being connected to a source is not the same as being approved to use it.
- Catalog every data source connected to AI environments, including whether it is approved for AI use
- Require domain-owner sign-off before approving a connected source for AI workloads; record approval alongside the source entry
- Designate the authoritative source for each business entity before its data is copied or exposed for AI use
- Assign a named domain owner for each source, responsible for accuracy, freshness, and documented limitations
- Record each source’s refresh schedule and acceptable lag; flag sources without a defined schedule
- Document known data gaps, coverage limits, and quality issues at the source level so consuming teams can account for them

Integration, Quality, and Enrichment

Raw data should not feed AI systems until teams have checked its quality, resolved inconsistencies, and added the business context needed to interpret it correctly. A connected source can still be too coarse, narrow in scope, or out of date for the workflow it feeds. Teams should resolve these mismatches before the data is exposed to AI systems.
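As a rough illustration of what a measurable quality check could look like, the sketch below scores a batch of records for completeness and timeliness and compares both with thresholds. The function name, the `updatedAt` field, and every threshold value are hypothetical assumptions chosen for this example, not part of any particular platform.

```javascript
// Hypothetical quality gate. Field names and thresholds are illustrative
// assumptions, not a standard.
function assessBatch(records, rules, now = new Date()) {
  const { requiredFields, maxAgeHours, minCompleteness, minTimeliness } = rules;
  if (records.length === 0) {
    // An empty batch cannot demonstrate quality, so it is not approved.
    return { completeness: 0, timeliness: 0, approved: false };
  }
  // A record is complete when every required field has a non-empty value.
  const isComplete = (r) =>
    requiredFields.every((f) => r[f] !== undefined && r[f] !== null && r[f] !== "");
  // A record is fresh when its age in hours is within the allowed maximum.
  // A missing or invalid timestamp yields NaN and therefore counts as stale.
  const isFresh = (r) => (now - new Date(r.updatedAt)) / 36e5 <= maxAgeHours;

  const completeness = records.filter(isComplete).length / records.length;
  const timeliness = records.filter(isFresh).length / records.length;
  return {
    completeness,
    timeliness,
    approved: completeness >= minCompleteness && timeliness >= minTimeliness,
  };
}
```

A pipeline could call `assessBatch` before publishing to an AI-facing store and quarantine the batch when `approved` is false. The right thresholds depend on the workflow the data feeds.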
- Validate that integration jobs handle schema changes, missing fields, and source outages without dropping data silently
- Define measurable quality thresholds (e.g., completeness, timeliness) before a dataset is approved
- Assign a team that must resolve quality failures before the data is approved
- Standardize formats, naming conventions, and reference values before data enters AI-facing stores, tools, or services
- Enrich records with business context (e.g., department codes, product hierarchies) that downstream systems need to interpret them correctly
- Document the reference datasets and lookups used to enrich AI-facing records so teams can trace added context back to its source
- Test transformations against known inputs and outputs after each change to confirm that business rules still hold
- Reject or quarantine records that fall below quality thresholds before they affect retrieval results or generated responses

Classification, Access, and Use Boundaries

AI systems should follow least privilege, only using data approved for the user, workflow, and output at hand. The same access rules apply at every stage the data passes through, including storage, indexes, embeddings, retrieval results, caches, and logs. Sensitivity enforced at the source must stay enforced after the data is copied, transformed, or indexed.
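One way to picture least privilege at the field level is a filter that runs before a record is handed to an AI service. The sketch below is a simplified assumption: the four sensitivity levels, the `fieldLevels` map, and the `redactForAi` name are all hypothetical, and a real system would take classifications from its catalog rather than from an inline object.

```javascript
// Hypothetical sensitivity levels, ordered from least to most restricted.
const LEVELS = ["public", "internal", "confidential", "restricted"];

// Returns a copy of `record` containing only the fields whose classification
// is at or below `maxLevel`. Fields with no classification are dropped, so
// the filter denies by default.
function redactForAi(record, fieldLevels, maxLevel) {
  const limit = LEVELS.indexOf(maxLevel);
  if (limit === -1) throw new Error(`Unknown sensitivity level: ${maxLevel}`);
  const out = {};
  for (const [field, value] of Object.entries(record)) {
    const level = LEVELS.indexOf(fieldLevels[field]);
    if (level !== -1 && level <= limit) out[field] = value;
  }
  return out;
}
```

Dropping unclassified fields is a deliberate choice in this sketch: a field added upstream without a classification never reaches the AI service until someone classifies it.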
- Classify data assets by sensitivity level and map each level to permitted uses
- Enforce least-privilege access across source systems, pipelines, indexes, retrieval tools, and AI services so downstream AI use doesn’t bypass source permissions
- Document whether each AI-facing data store, index, or retrieval service inherits source access at query time or enforces copied ACLs
- Mask or remove sensitive fields before they reach AI services, tools, or prompts
- Maintain approved and prohibited uses for each sensitivity level
- Separate dev, staging, and prod environments so live data does not leak into experimental systems
- Require explicit approval before adding a new data source or sensitivity category to an AI system

Lineage, Provenance, and Change Traceability

When a model or agent produces an unexpected result, teams need to trace the data from source to output, with enough detail to link a specific AI response to the inputs behind it. The same trail supports audit and regulatory reviews. Without it, a team investigating an issue has to guess whether the cause was a stale source, broken transformation, or out-of-date index.
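To make the idea of tracing an output back to its inputs concrete, here is a minimal in-memory sketch. The store, its method names, and the asset ID format are hypothetical; a production lineage system would be a separate, queryable service rather than a `Map` held in process memory.

```javascript
// Hypothetical in-memory lineage store, for illustration only.
function createLineageStore() {
  const records = new Map(); // assetId -> lineage entry

  return {
    // Saves where an asset came from and which assets it was derived from.
    record(assetId, { sourceSystem, extractedAt, transformVersion, pipelineRunId, parents = [] }) {
      records.set(assetId, {
        assetId, sourceSystem, extractedAt, transformVersion, pipelineRunId, parents,
      });
    },
    // Walks parent links from an asset back to its original sources.
    // `seen` guards against cycles and repeated parents.
    trace(assetId, seen = new Set()) {
      const entry = records.get(assetId);
      if (!entry || seen.has(assetId)) return [];
      seen.add(assetId);
      return [entry, ...entry.parents.flatMap((p) => this.trace(p, seen))];
    },
  };
}
```

With entries recorded for a source record and for an embedding derived from it, `trace` on the embedding's ID returns both entries, which is the trail an investigation would follow from an AI response back to the source.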
- Capture the source system, extraction time, transformation version, and pipeline run ID for each record prepared for AI use
- Track schema changes, business rule updates, and definition/version changes for fields that affect AI interpretation (e.g., “active customer”)
- Maintain provenance metadata for enrichment steps so added business context can be traced to its source
- Link derived assets (e.g., embeddings, indexes, summaries) to the source records and pipeline versions that produced them
- Retain lineage records for the period required by regulatory and audit policies
- Store lineage records in a system queryable by data, platform, and audit teams independently of the pipelines that produced them

Embeddings, Indexes, and Derived Data Assets

Embeddings, indexes, summaries, and caches are copies of source data shaped for retrieval, so ownership, classification, access, and lineage controls must carry forward. When a copy falls out of sync with its source, AI systems may retrieve stale context or keep information that should have been updated or deleted.

- Assign an owner accountable for the accuracy and freshness of each embedding store, vector index, summary cache, or other derived asset
- Define a refresh cadence that keeps each derived asset aligned with source data within a documented latency tolerance
- Version derived assets so teams can roll back after a bad source change or failed update
- Apply the source system's retention, deletion, and access policy rules, and any changes to them, to derived assets
- Validate index, embedding, summary, and cache updates to confirm they return expected results without dropping records
- Log each derived asset creation, update, and deletion with enough detail to link the change to a specific pipeline run

AI-Facing Delivery and Retrieval Reliability

Upstream governance only matters if the right information reaches the model or agent when it is needed.
Retrieval quality problems are usually data quality problems in another form: stale sources and lagging indexes can both produce confidently wrong answers.

- Define retrieval quality expectations, including relevance, freshness, and source attribution, for each AI-facing service or tool; assign a named owner accountable for the spec
- Define when retrieval should return an answer, return search results only, ask for clarification, or return no answer
- Require source attribution for retrieval results that cite internal policies, contracts, customer records, account records, or regulated content so generated responses can be checked against the original data
- Set latency and throughput targets for retrieval services so slow or overloaded systems do not degrade model responses or agent actions
- Configure alerts when retrieval quality, freshness, or latency falls below thresholds that could affect retrieval results, generated responses, or agent actions
- Require human review for AI-generated outputs that authorize actions, commit transactions, or affect regulated decisions
- Test services and tools end to end with representative queries to confirm that responses use the expected sources

Monitoring, Feedback, and Lifecycle Change

Production reviews should catch stale data, delayed refreshes, quality drift, and unusual access patterns before they affect AI behavior. Recurring AI output issues should be traced to a specific data source, pipeline step, or derived asset so teams can fix the underlying cause.
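A staleness check is one of the simpler monitors to automate. The sketch below assumes each dataset entry carries a hypothetical `lastRefreshedAt` timestamp and a `maxLagHours` tolerance; both names, and the shape of the returned objects, are illustrative only.

```javascript
// Returns the datasets whose last refresh is older than their allowed lag.
// Entries with a missing timestamp or no defined tolerance are also flagged,
// because the comparison below is false for NaN and undefined.
function findStaleDatasets(datasets, now = new Date()) {
  return datasets
    .map((d) => ({ ...d, lagHours: (now - new Date(d.lastRefreshedAt)) / 36e5 }))
    .filter((d) => !(d.lagHours <= d.maxLagHours))
    .map((d) => ({ name: d.name, owner: d.owner, lagHours: d.lagHours }));
}
```

Including the owner in each result lets an alert be routed directly to the person accountable for the source or derived asset.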
- Flag datasets that miss the refresh window defined for their source
- Track lag between source updates and derived asset refreshes to detect stale responses
- Configure alerts for unusual access patterns (e.g., unapproved users, services, or tools)
- Assign recurring AI output issues to the responsible data source, pipeline step, or derived asset owner; record the remediation and closure
- Define a deprecation process that identifies which pipelines, services, and derived assets must be updated or retired when a source is removed
- Require rollback procedures for source changes, schema migrations, and derived asset updates that could degrade AI behavior
- Conduct recurring reviews to confirm governance controls still match current use cases and access patterns

Closing

Data readiness for AI is not a one-time launch task. Build these checks into existing data quality and platform reviews, then revisit them when sources, access rules, derived assets, or AI use cases change.

This is an excerpt from DZone’s 2026 Trend Report, Cognitive Databases, Intelligent Data: Unified Infrastructure for Vector Search, AI-Optimized Queries, and Hybrid Workloads.
Large language models (LLMs) can speed up development by producing a substantial amount of web application code in minutes. These models are pattern-based and non-deterministic, however, and research on AI programming assistants shows that AI-generated code often contains security vulnerabilities. One study of GitHub Copilot found that approximately 40% of the generated code was susceptible to security issues, which underscores the need for careful testing and scrutiny.

Established software engineering practice gives developers a straightforward way to handle this: test AI-produced code continually as it is generated and integrated. The speed of AI generation, however, makes it easy to overlook errors. Teams therefore need more rigorous testing to turn what people often call "correct code" into "perfect, functional, and secure code." Every piece of code that is considered complete should pass a series of checks, from static analysis and unit tests to integration tests, end-to-end tests, automated security scans, load tests, and manual code review, to confirm that the delivered software works and meets its security requirements.

This article presents testing methods for LLM-generated web code, using Node.js and React as the example stack. It also includes a pre-merge checklist and recommendations for testing the prompts themselves, so that generated code does not put the system's security at risk once it is merged.
Why AI-Generated Web Code Requires Extra Scrutiny

Traditional bugs are usually attributed to human error. AI-generated bugs are different: the model fills in missing context with plausible guesses, producing code that appears to work under the conditions it was tested in but fails when those conditions change. These gaps in logic show up mostly at the boundaries, including the most sensitive areas: authentication mechanisms, handling of malformed requests, concurrency, reloads, differences between application versions, and vulnerabilities caused by insecure defaults.

Security is not just a legal obligation. One study prompted GitHub Copilot with scenarios designed around high-risk weaknesses and found a non-negligible number of insecure code suggestions, with the wording and context of the prompt playing a significant role. Another study, using a more sophisticated methodology and more recent versions of the software, confirmed these weaknesses in AI-written code.

Teams that use LLMs to build web features therefore need to move well beyond the "it works on my machine" mentality. Verification should focus on the inputs and outputs of the code, running it across test scenarios that simulate real-world conditions, and on giving the model the context it needs.

Testing Layers for LLM-Written Code

The main idea is not to rely on a single kind of test but to use several layers, each targeting a different type of "AI error."
Static checks such as linting and type checking catch fundamental problems early, when they are easiest to detect and quickest to fix. ESLint is a good fit: it flags problematic code patterns in JavaScript and can be configured to match your organization's coding conventions.

    npm init @eslint/config@latest
    npx eslint src/

The ESLint documentation recommends this sequence: run npm init @eslint/config@latest to create a configuration, then run npx eslint on the required files and folders. The ruleset should also include security-focused rules. For example, eslint-plugin-security is a plugin created to flag known security problems in JavaScript and Node.js code. Although some of its findings may be false positives, it gives developers good support.

    npm i -D eslint-plugin-security

Then enable the appropriate rules in your ESLint configuration; the exact steps differ slightly between ESLint setups.

Unit tests should target the more elusive aspects of the program, such as logic, edge cases, and the consistency of generated results. Jest offers an easily understood approach: write tests that use expect() with matchers such as toBe to confirm that results are as intended.
How-to (Node/JS + Jest):

    // utils/sanitizeSlug.js
    // Trim, lowercase, and replace each run of whitespace with a single hyphen.
    export const sanitizeSlug = (s) => s.trim().toLowerCase().replace(/\s+/g, "-");

    // utils/sanitizeSlug.test.js
    import { sanitizeSlug } from "./sanitizeSlug";

    test("sanitizes slugs", () => {
      expect(sanitizeSlug(" Hello World ")).toBe("hello-world");
    });

A helpful habit with LLM-written code is to make the model's assumptions about its input explicit. If the code assumes that slugs are space-separated, state that assumption in the code or its tests; otherwise it is likely to surface as a bug under real-world input.

Component Tests: Test React Like a User, Not Like a Compiler

React Testing Library is known for testing components the way users interact with them, because such tests build confidence that the application actually works. The React documentation likewise recommends React Testing Library for component tests.

How-to (React + React Testing Library + Jest):

    // LoginButton.jsx
    export function LoginButton({ onLogin }) {
      return <button onClick={onLogin}>Log in</button>;
    }

    // LoginButton.test.jsx
    import { render, screen, fireEvent } from "@testing-library/react";
    import { LoginButton } from "./LoginButton";

    test("calls onLogin when clicked", () => {
      const onLogin = jest.fn(); // mock handler that records its calls
      render(<LoginButton onLogin={onLogin} />);
      fireEvent.click(screen.getByText("Log in"));
      expect(onLogin).toHaveBeenCalledTimes(1);
    });

Integration Tests: Verify Contracts Between Modules and Services

AI-generated code often fails at integration boundaries because the model was imprecise about the contract: response structures, status codes, the behavior of authentication middleware, database connection handling, and similar details. For Node.js applications, many developers choose Supertest, which builds on SuperAgent to provide HTTP assertions for testing Node HTTP servers.
How-to (Express + Supertest):

    import request from "supertest";
    import app from "../app";

    test("GET /health returns ok", async () => {
      // Supertest drives the Express app directly; no listening port is needed.
      await request(app)
        .get("/health")
        .expect(200);
    });

E2E Tests: Make the Browser Prove the Feature Works

E2E tests uncover defects that the test types described above miss: navigation, live rendering, data storage, HTTP cookies, access restrictions, and what is colloquially known as “working when a user just does whatever.” Cypress covers more than end-to-end testing, and its documentation contains examples that help you write an end-to-end test from scratch. Playwright tests, in turn, are built around chains of actions and assertions, and Playwright waits for elements automatically, which largely removes the need for manual sleeps before checks.

How-to (install Cypress):

    npm install cypress --save-dev
    npx cypress open

Those commands are straight from the Cypress installation docs.

How-to (Playwright E2E test snippet):

    import { test, expect } from "@playwright/test";

    test("login redirects to dashboard", async ({ page }) => {
      // A relative URL assumes baseURL is set in the Playwright configuration.
      await page.goto("/login");
      await page.getByLabel("Email").fill("[email protected]");
      await page.getByLabel("Password").fill("password123");
      await page.getByRole("button", { name: "Log in" }).click();
      await expect(page).toHaveURL(/dashboard/);
    });

Playwright documents this general “do actions, then assert the state” structure, and notes its auto-waiting behavior.

Security Testing: Treat “Generated Code” as a Risk Multiplier

Use dependency scanning and static analysis to improve the security of web applications. When reviewing the endpoints and UI flows (auth, access control, injection, etc.) that the AI model generated, use the OWASP Top 10 as a checklist.

Dependency Scanning (Snyk + npm audit):

snyk test checks for open-source vulnerabilities and license issues.
npm audit exits non-zero on found vulnerabilities (ideal for CI gates).

How-to (Snyk):

    snyk test

How-to (Snyk Code SAST):

    snyk code test

snyk code test runs Snyk Code, which performs Static Application Security Testing (SAST) against the source code. Although not required, it is also advisable to enable CodeQL code scanning in GitHub Actions.

Performance Testing: “It Works” Is Not “It Survives Traffic”

AI-written features can also degrade performance, for example by adding redundant database calls or N+1 queries, so it is advisable to run smoke load tests against critical routes. k6 has good documentation on how to write and execute such tests.

How-to (k6 smoke test):

    import http from "k6/http";
    import { check, sleep } from "k6";

    export default function () {
      const res = http.get("https://example.com/api/health");
      check(res, { "status is 200": (r) => r.status === 200 });
      sleep(1); // pause between iterations
    }

Both k6 and Artillery document how to formulate HTTP requests and set up tests. Artillery can be installed through npm or run with npx.

Snapshot and Golden Master Testing: Use Sparingly, Review Aggressively

Snapshot tests are useful for output that should never change silently between versions of the app (such as HTML email templates and stable fragments of the user interface). Jest's documentation recommends committing snapshot files alongside code changes and reviewing them as part of code review; Jest compares future runs with the stored snapshots and fails the test if they differ.
How-to (Jest snapshot):

    import renderer from "react-test-renderer";
    import { Banner } from "./Banner";

    test("banner matches snapshot", () => {
      // Render to a serializable tree and compare it with the stored snapshot.
      const tree = renderer.create(<Banner />).toJSON();
      expect(tree).toMatchSnapshot();
    });

The Ultimate Golden Master Hack for LLM Code

Code Review: The Essential Human Layer

Code review remains an important step because it is where people ask questions such as “Is this approach valid?” and “Is this in line with the architecture?” The Secure Software Development Framework (SSDF) by the National Institute of Standards and Technology (NIST) exists because most software development life cycles (SDLCs) do not address security from the start. It promotes building secure practices into the existing development cycle. Code review and other process controls remain significant because they rely on human judgment rather than on machines.

For AI-generated PRs, code review should explicitly check:

- authz/authn boundaries
- input validation and encoding
- error handling and logging
- dependency choices
- “magic” regexes and crypto (danger zone)

Testing the Prompt and Validating LLM Outputs

Many teams overlook the fact that the prompt is itself a code artifact. Because the wording of a prompt shapes the behavior of the generated code, prompts should be tested the way APIs are.

Workflow:

- Define prompt contract: Templates with stack, versions, constraints, and testing requirements.
- Request tests: Generate and run unit/integration tests before trusting features.
- Create regression suite: Store prompts, invariants, and run tests/scripts.
- Use checklists: Keep prompts as review checklists for each PR.
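The regression-suite step can be sketched as a stored prompt paired with invariants that any generated code must satisfy. Everything in this example is a hypothetical illustration: the prompt text, the invariant names, and the regular expressions are assumptions, and pattern checks like these are deliberately crude. They complement running the generated tests; they do not replace them.

```javascript
// Hypothetical prompt regression case: the stored prompt plus invariants
// that generated code must satisfy regardless of how the model words it.
const loginRouteCase = {
  prompt: "Write an Express POST /login handler using bcrypt for password checks.",
  invariants: [
    { name: "no eval", check: (code) => !/\beval\s*\(/.test(code) },
    {
      name: "no hard-coded secret",
      check: (code) => !/(secret|password)\s*[:=]\s*["'][^"']+["']/i.test(code),
    },
    { name: "uses bcrypt compare", check: (code) => /bcrypt\.compare\s*\(/.test(code) },
  ],
};

// Returns the names of the invariants that the generated code violates.
function violatedInvariants(testCase, generatedCode) {
  return testCase.invariants
    .filter((inv) => !inv.check(generatedCode))
    .map((inv) => inv.name);
}
```

A CI job could run `violatedInvariants` against fresh model output whenever a prompt template changes and fail the build if the returned list is not empty.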
Before merging AI-generated code, require:

- lint/type checks pass (ESLint)
- unit + integration pass (Jest + Supertest patterns)
- at least one E2E flow passes (Cypress/Playwright)
- dependency scan passes or is triaged (Snyk / npm audit)

Tool Comparisons and When to Use What

Each tool here has a distinct job: Jest for unit testing; React Testing Library for testing components the way end users experience them; Supertest for HTTP server testing; Cypress and Playwright for E2E testing; Snyk for dependency scanning and SAST; GitHub CodeQL for code scanning; and k6 and Artillery for load testing.

Commonly Used Commands

- ESLint: npm init @eslint/config@latest then npx eslint src/
- Cypress: npm install cypress --save-dev then npx cypress open
- Snyk Dependency Scan: snyk test
- Snyk SAST: snyk code test
- npm Dependency Audit: npm audit
- k6: Write a script, then run with k6
- Artillery: npm install -g artillery@latest (or npx artillery@latest), then artillery run my-test.yml

CI automation with test gates

The testing layers only have value if they are actually run and enforced. GitHub Actions workflows are YAML files that define jobs and steps. GitHub's official guide for Node.js workflows follows three standard steps: setting up Node, installing dependencies, and running tests.

Minimal GitHub Actions workflow example

    name: CI
    on:
      pull_request:
      push:
        branches: [main]
    jobs:
      test:
        runs-on: ubuntu-latest
        steps:
          - uses: actions/checkout@v4
          - uses: actions/setup-node@v4
            with:
              node-version: 22
              cache: npm
          - run: npm ci
          - name: Lint
            run: npx eslint .
          - name: Unit + integration
            run: npm test
          - name: Dependency scan
            # npx fetches the Snyk CLI, which is not preinstalled on the runner.
            run: |
              npm audit
              npx snyk test
            env:
              SNYK_TOKEN: ${{ secrets.SNYK_TOKEN }}
          - name: SAST scan (optional but strong)
            run: npx snyk code test
            env:
              SNYK_TOKEN: ${{ secrets.SNYK_TOKEN }}

The actions/setup-node action installs Node.js and can cache dependencies in workflows. Playwright's documentation provides CI guidance and helps with creating GitHub Actions workflows. GitHub's documentation explains in depth how CodeQL operates within a GitHub Actions workflow.

Before You Merge: Checklist, Pitfalls, and Mitigations

Pre-merge checklist for AI-generated web code

- Lint passes: Ensure lint passes and security lint is reviewed.
- Unit tests: Cover model assumptions (edge cases, input shapes).
- Integration tests: Confirm API contracts (status codes, auth, schema).
- E2E tests: Cover at least one critical user journey (e.g., login).
- Dependency scan: Run Snyk or npm audit and triage findings.
- SAST: Run snyk code test or CodeQL for risky changes.
- Snapshot diffs: Review snapshot diffs like code; no auto-updates.
- CI checks: Require CI checks before merge; no exceptions.

Common Pitfalls (and How to Avoid Them)

- Passing tests: Test real user behavior, as React Testing Library suggests.
- Over-mocking: Avoid mocking everything; use a real test DB.
- Flaky E2E tests: Use Playwright to reduce timing issues.
- Snapshot testing: Don’t auto-update snapshots; review them.
- Skipping security scanning: Include security checks for all PRs, big or small.

Closing thought

The most important task for teams that generate web code with AI is to make validation unobtrusive and routine. People tend to trust processes that are uniform and repeatable. LLMs speed up the writing of code, but it is thorough verification that makes the result trustworthy.
Those who achieve the desired outcomes are the ones who do all the above: set up CI-enforced gates, apply testing at different levels, and treat prompts as part of the design.
I have spent the better part of a decade building data protection products for global enterprises. Cloud DLP, CASB, SSPM, Behavior Threats, AI Access Security, ISPM, etc. The kinds of things that sit between a user, an agent, or an application and the sensitive data nobody wants to see in the wrong place. Every conversation I have had with a customer security architect this year eventually arrives at the same question. The threat landscape has clearly changed. What does that mean for the controls we already own?

This article is the analysis I have been sharing with security architects across industries who are evaluating how their data protection programs need to evolve. It is grounded in what is publicly documented, what it actually changes for enterprise data security, and where I would direct the next dollar of investment based on a decade of building these products at scale.

What Actually Shifted, With Sources

There are three publicly verifiable data points worth understanding before any control conversation makes sense.

Discovery Is Becoming Inexpensive

Mozilla shipped Firefox 150 in April 2026 with two hundred and seventy-one fixes that came out of a single sweep using an early version of Anthropic’s Mythos preview model. That is roughly four times the project’s typical annual baseline, in one pass. Mozilla also added the most honest sentence I have read on this topic all year. They said they have not seen any bug in the set that an elite human researcher could not have found, given enough time. SecurityWeek covered the details: securityweek.com/claude-mythos-finds-271-firefox-vulnerabilities.

Read that caveat carefully. The thing that became automated is not novelty. It is the cost of finding a class of bugs that humans were always capable of finding. When the price of an action drops by an order of magnitude, the action gets done at scale. That is the shift, and it is the shift that matters.
Patching Is Not Getting Cheaper at the Same Rate

HackerOne paused new submissions to its Internet Bug Bounty program on March 27, 2026. The IBB is the oldest crowdsourced vulnerability reward program for open source, dating back to 2013. The pause was not a budget decision. It was an admission that the gap between AI-assisted discovery volume and the ability of volunteer maintainers to ship patches had become unbridgeable on the existing incentive model. Dark Reading’s coverage is here: darkreading.com on the IBB pause. Earlier in the year, the curl project removed bounties from its program for the same reason, after a wave of low-quality AI-generated submissions overwhelmed triage.

If the upstream open source ecosystem is struggling to keep pace with discovery, every enterprise that ships software with open source dependencies is downstream of that struggle. That is most enterprises.

Autonomous Agents Are Already Creating Real Incidents

In April 2026, the Cloud Security Alliance published two surveys that I think every data security team should read. The first study found that fifty-three percent of organizations have had AI agents exceed their intended permissions, and forty-seven percent have already experienced a security incident involving an agent in the past year. The second, published a week later, reported that eighty-two percent of enterprises have discovered previously unknown agents running in their environments, and sixty-five percent have had an agent-related incident. The most common consequence was data exposure. CSA’s findings: Enterprise AI Security Starts with AI Agents and Autonomous but Not Controlled.

Take those three threads together. Bug discovery is industrializing. The patch side is bottlenecked. And inside the enterprise, autonomous agents are already operating in places nobody fully maps. That is the operating reality, not a forecast.
Why This Matters More for Data Security Than for Any Other Function

Most of the AI security conversation is framed around vulnerabilities and exploits. I think that framing misses what is actually changing for enterprises. When a class of vulnerabilities becomes cheaper to discover, the average time between exposure and exploitation shortens. When average exposure time shortens, the probability that any given control fails inside that window goes up. When more controls fail more often, the consequence shows up at the data layer. Data is the asset. Everything else is a path to it.

The CSA finding I keep coming back to is the one that says agent incidents most often produce data exposure, not service outages. That tracks with what I see at customer sites. The blast radius of an agent compromise is determined by the data the agent had access to, the policies that were being watched, and the speed at which someone noticed. None of those three is improving on the timeline that adversaries are improving.

If an agent has access to your sensitive data, the agent is part of your data security perimeter, whether your DLP product knows it or not. That sentence is the part of the conversation that I find most data security teams are not yet having internally. It needs to happen this quarter.

Three Things Data Security Programs Should Rethink Now

1. Stop Treating Non-Human Identities as a Hygiene Problem

CyberArk’s 2025 Identity Security Landscape, surveying 2,600 cybersecurity decision-makers globally, found that machine identities now outnumber human identities by more than 80 to 1 in the typical enterprise, up from roughly 45 to 1 in their 2024 study. GitGuardian’s State of Secrets Sprawl 2025 report found 23.8 million new secrets exposed on public GitHub in 2024 alone, a 25 percent year-over-year increase, with non-human identities flagged as the dominant credential population behind that growth.
The exact ratio in any given environment is a question for the IAM team, but the order of magnitude is consistent across every serious study I have read, and it is rising fast. Most enterprise IAM programs were designed around human users. Periodic access reviews. Quarterly attestation cycles. Manager signoff. None of that infrastructure was built for a population that is now eighty times larger, that provisions itself, and that often outlives its original use case. CSA’s research found that only 21 percent of organizations have a formal decommissioning process for AI agents. Everyone else is accumulating what the report calls retirement debt: agents who completed their task months ago and still hold credentials, tokens, and data access. From a data security standpoint, the practical consequence is that an enterprise’s most overprivileged identity is rarely a person. It is a service account from 2022 that nobody remembers, an OAuth grant that an integration test attached to a real production scope, or a workflow agent that picked up admin-level permissions during deployment because the person setting it up did not want to debug a permission-denied error at 11 p.m. These identities are reachable by adversaries through a single credential compromise, and they often have direct access to the kinds of data that DLP policies were written to protect at the human user layer. The remediation requires a structured non-human identity program with a named owner, a defined lifecycle covering provisioning, rotation, and decommissioning, and quarterly access reviews that apply to bots the way they apply to humans. Workload identity federation rather than long-lived secrets. Scoped service accounts. Logging that captures what each non-human identity touched, not just whether it authenticated successfully. From a tooling perspective, this work sits at the intersection of CASB, IAM, and DLP, and in most enterprises, it has no clear owner across those three functions. 
Establishing that ownership is the precondition for everything else.

2. Refresh Classification and Tagging for an Agentic Environment

In my own work on DLP product strategy, I have come to think of classification and tagging as the foundation that every other data control sits on. If sensitive content is correctly identified at the moment it is created or ingested, downstream policies have a fighting chance. If it is not, no amount of policy authoring downstream will catch up.

Most enterprise tagging programs were designed for documents flowing through email, endpoints, and a manageable list of SaaS applications. The current generation of AI agents and copilots flows through none of those choke points cleanly. An agent reads a corpus, generates a derivative artifact, and writes that artifact somewhere else. The original tag, if there was one, often does not survive the round trip. The derivative may contain sensitive content that was reassembled from sources that were each individually below the policy threshold.

Three practical refreshes are worth funding now.

Treat AI-generated outputs as a first-class data class. Anything produced by an agent or copilot needs provenance metadata that travels with it: which model produced it, against which prompt, derived from which sources, with which level of human review. Most enterprise classification taxonomies do not have a slot for this yet. Add one.

Lower the threshold for tagging at ingestion. The cost of misclassifying a sensitive document used to be that a human eventually emailed it to the wrong person. The cost now includes an agent reading it as part of a larger context and producing a derivative that lands in a SaaS workspace your DLP product does not inspect. Err on the side of more aggressive classification at the source.

Audit your DLP coverage of LLM endpoints and agentic SaaS surfaces.
Most DLP deployments I see in the field have rich coverage of email and endpoints, partial coverage of cloud applications, and almost no coverage of the LLM and agent traffic that has become a meaningful share of how sensitive data now leaves the environment. That is the coverage gap most likely to show up in a 2026 incident report. 3. Put a Model in the Pull Request Path This is one of the few areas where the offensive shift in AI capability cuts directly in defenders’ favor, and most enterprise application security programs are not yet using it. The traditional SAST and DAST queue is where AppSec hours go to die. Thousands of unverified findings, most of them noise, validated entirely by humans on a backlog that never empties. The newer pattern is to put a model-based reviewer in the pull request path itself. Every PR is reviewed by an automated agent for security defects before a human sees it. Findings show up as inline comments. High-confidence findings can block the merge. OpenAI publicly stated in April 2026 that its Codex Security agent has contributed to over 3,000 critical and high-severity vulnerability fixes across the ecosystem since launch, and that its Codex for Open Source program now provides free security scanning to more than 1,000 open-source projects. Anthropic, Semgrep, and several other vendors have shipped comparable capabilities. Whether you build on a commercial offering or assemble an internal pipeline, the workflow is what matters. One nuance worth knowing about. Standard commercial models often refuse legitimate dual-use security queries by policy. Binary reverse engineering, exploit reasoning, malware analysis. If your AppSec team has been telling you that AI tools “do not work for security,” this refusal threshold is usually the reason. 
Both Anthropic’s Glasswing program and OpenAI’s Trusted Access for Cyber, expanded on April 14, 2026, to thousands of verified individual defenders, exist precisely to provide a lower refusal threshold for verified defensive use cases. Enterprise procurement and legal teams should start the verification paperwork now, not after a need arises. The Supply Chain Is the Other Half of the Data Exposure Problem Two recent incidents are worth holding in mind whenever this conversation comes up. On September 8, 2025, eighteen widely used npm packages, including chalk, debug, and ansi-styles, were trojanized after a phishing campaign targeting the maintainer known as qix. Those eighteen packages collectively account for over 2.6 billion weekly downloads. The malicious versions were live for roughly two hours and were written to drain cryptocurrency wallets, but the same access could have been used to exfiltrate environment secrets, build credentials, or sensitive data from any CI pipeline that pulled the bad version during that window. Palo Alto Networks Unit 42 and others published detailed breakdowns: paloaltonetworks.com on the qix incident. A week later, on September 15, 2025, the Shai-Hulud worm became the first self-propagating malware in the npm ecosystem, compromising hundreds of packages in its initial wave and continuing to evolve through follow-on campaigns into late 2025 and early 2026. The malware integrated TruffleHog to scan for secrets in compromised environments, harvested credentials from cloud instance metadata services where available, and weaponized GitHub Actions workflows for persistence. Palo Alto Networks Unit 42, ReversingLabs, Wiz, and others have continued to track variants of the same family. The reason these matter for a data security conversation is that the attacker's objective in both cases was credentials and secrets in build environments. From there, the path to data is short. 
A compromised CI runner with cloud credentials can read whatever those credentials can read. A compromised GitHub token can read whatever the org allows. A compromised npm publish token can introduce a future payload that does both. Treat the build pipeline as a data security boundary, not just an engineering productivity surface. A dependency firewall that validates package provenance before installation (Sonatype Nexus Firewall, JFrog Xray, Socket.dev, or equivalents) is the highest-leverage single control I know of for closing this attack surface. The Shadow Agent Problem Is a DLP Problem in Disguise The single most striking statistic in the April 2026 CSA research, to me, was that eighty-two percent of organizations had discovered previously unknown AI agents in their environment over the past year, and forty-one percent had discovered them more than once. The agents most commonly emerged in internal automation and scripting environments, in custom assistants and plugins built on LLM platforms, in SaaS tools with built-in automation, and in developer-created workflows. This is, structurally, the same problem that shadow IT was a decade ago, and the same problem that shadow SaaS became five years ago. The difference is that the average shadow agent has read access to more sensitive data than the average shadow application ever did, because agents are useful precisely in proportion to how much context they can reach. A finance team’s reconciliation agent, helpfully built in an afternoon, often ends up with broader visibility into financial data than the human who built it. A customer support copilot frequently has a service account with access to the entire ticket database, including PII. None of this is malicious. It is the path of least resistance for getting an agent to do something useful. Three controls help close the gap, and they are mostly extensions of capabilities a mature data security team already owns. 
CASB and SSPM coverage of LLM and agent platforms. The platforms hosting these agents (custom GPTs, Copilot Studio, internal MCP servers, vendor copilots) are SaaS applications. They need posture management, sanctioned application policies, and inline data protection just as much as Salesforce or Workday do. Most CASB and SSPM deployments are still catching up here. Push your vendor.

Inline DLP on prompt and completion traffic. The point at which sensitive data leaves the environment is increasingly the prompt itself. Inline data inspection at the LLM gateway, using the same content matchers (EDM, IDM, OCR, vector ML) you trust for email and endpoints, is the right architectural pattern. The vendors are building this, but few enterprises have it deployed.

An agent registry, even a basic one. Until the agent population is enumerable, no policy applied to it is provable. A spreadsheet is fine to start. The goal is to be able to answer, on any given Monday, three questions: which agents exist in production, what data each one can read, and who is the human owner of each. CSA’s data shows that most enterprises cannot answer those questions today.

What I Would Actually Start on This Week

Comprehensive ninety-day plans tend to lose momentum after the first two weeks of execution. The more effective approach, which I have refined over years of operationalizing data security programs at enterprise scale, is a focused set of starting moves that can ship in two weeks and that compound into a larger program over the quarter.

Run an inventory pass for AI agents and copilots in your environment. A spreadsheet is fine. Capture name, owner, data scope, and approval status. The goal is to convert the CSA shadow agent statistic from an industry survey number into a number you actually have for your own organization.

Review the data scope of every service account and OAuth grant tied to an LLM, agent, or copilot platform.
Most of them were sized for development convenience, not production. Tighten the ones that need tightening. Decommission the ones that are no longer in active use.

Pilot a model-based reviewer in the pull request path of one codebase. Measure the false positive rate and developer satisfaction at week four. If the numbers are reasonable, expand. If they are not, tune and try again.

Add provenance metadata to your data classification taxonomy. Even if the only label you can ship this quarter is “generated by an AI system,” shipping it now is more valuable than waiting for a perfect schema. Tagging at ingestion is the part of the program that compounds, and it has been undersized for the agent era at most enterprises I have seen.

Open the verified access conversation with your AI vendors. Anthropic Glasswing, OpenAI Trusted Access for Cyber, and equivalent programs from other providers offer pathways to models with reduced refusal thresholds for legitimate defensive work. The application process involves coordination with General Counsel and procurement, which is why initiating it before an urgent need is critical. Programs of this kind will become foundational infrastructure for enterprise security teams over the next two years.

These moves represent the structural transition that enterprise data security programs need to make over the next eighteen months. Programs that begin this work now will spend that window refining the controls and integrating them across their existing security stack. Programs that delay will spend the same window writing postmortems that explain why the controls were not in place.

Conclusion

The cybersecurity industry has navigated several genuine inflection points over the past decade, and the current moment qualifies as one of them on a specific structural ground: the cost curve for finding software flaws has bent, while the cost curve for shipping patches has not.
The gap between those two curves is where every enterprise security program now operates, and the consequences land first at the data layer, which is where my work has been concentrated for the past decade. Data security teams that internalize this framing now will spend 2026 building defensible programs around a fundamentally changed threat economy. Teams that wait for a more dramatic signal will spend the same period responding to incidents that the structural shift made predictable.
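The basic agent registry suggested earlier needs very little structure to be useful. The sketch below is an in-memory illustration only, assuming a registry that records a name, an owner, a list of readable data scopes, and a production flag per agent. Every class, field, and method name in it is hypothetical, and a real program would back this with whatever inventory store the team already uses.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: an in-memory stand-in for a basic agent registry.
// Every class, field, and method name here is hypothetical.
final class AgentRegistry {

    static final class Agent {
        final String name;
        final String owner;            // accountable human, or null if unknown
        final List<String> dataScopes; // data the agent can read
        final boolean inProduction;

        Agent(String name, String owner, List<String> dataScopes, boolean inProduction) {
            this.name = name;
            this.owner = owner;
            this.dataScopes = new ArrayList<String>(dataScopes);
            this.inProduction = inProduction;
        }
    }

    private final Map<String, Agent> agents = new LinkedHashMap<String, Agent>();

    void register(Agent agent) {
        agents.put(agent.name, agent);
    }

    // Question 1: which agents exist in production?
    List<String> productionAgents() {
        List<String> names = new ArrayList<String>();
        for (Agent agent : agents.values()) {
            if (agent.inProduction) {
                names.add(agent.name);
            }
        }
        return names;
    }

    // Question 2: what data can a given agent read?
    List<String> dataScopesOf(String name) {
        Agent agent = agents.get(name);
        return agent == null ? new ArrayList<String>() : new ArrayList<String>(agent.dataScopes);
    }

    // Question 3: who is the human owner? Returns null when nobody is recorded.
    String ownerOf(String name) {
        Agent agent = agents.get(name);
        return agent == null ? null : agent.owner;
    }

    public static void main(String[] args) {
        AgentRegistry registry = new AgentRegistry();
        registry.register(new Agent("recon-bot", "j.doe",
                Arrays.asList("finance.ledger"), true));
        registry.register(new Agent("support-copilot", null,
                Arrays.asList("tickets", "tickets.pii"), true));
        System.out.println(registry.productionAgents());
        System.out.println(registry.dataScopesOf("support-copilot"));
        System.out.println(registry.ownerOf("support-copilot"));
    }
}
```

An agent whose owner comes back as null, like the second sample entry, is exactly the kind of record the inventory pass is meant to surface.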
With the increasing number of security threats, organizations have invested heavily in cybersecurity initiatives to protect their applications, infrastructure, and sensitive data. Security vulnerabilities are rarely introduced intentionally. Most of them creep into applications through shortcuts, overlooked edge cases, outdated libraries, or bad coding habits.

Modern Java has significantly improved its security capabilities, but no framework or JVM version can completely protect an application from insecure coding practices. As developers, we still need to understand where vulnerabilities originate and how to prevent them before they reach production.

In this article, I summarize some of the most common Java security vulnerabilities and practical techniques for preventing them. These are the same security best practices and lessons learned that I frequently share with new members of my team. I am sharing them here in the hope that they can serve as a practical handbook for Java developers looking to build more secure applications.

1. SQL Injection

SQL injection remains one of the oldest and most dangerous vulnerabilities. It occurs when user input is directly concatenated into SQL statements. Consider the following example:

Java
// Vulnerable: user input becomes part of the SQL text.
String query = "SELECT * FROM users WHERE username = '" + username + "'";
Statement stmt = connection.createStatement();
ResultSet rs = stmt.executeQuery(query);

If an attacker enters the following value, the query can be manipulated to return unintended results:

SQL
admin' OR '1'='1

Prevention

Always use parameterized queries.

Java
// Safer: the value is bound as data and is never parsed as SQL.
String query = "SELECT * FROM users WHERE username = ?";
PreparedStatement stmt = connection.prepareStatement(query);
stmt.setString(1, username);
ResultSet rs = stmt.executeQuery();

Prepared statements keep bound values separate from executable SQL, which closes off injection through those values. Identifiers such as table or column names cannot be bound as parameters, so validate them against an allow-list if they must be dynamic.

2. Hardcoded Secrets

One of the most common findings during security reviews is hardcoded credentials.
Java
private static final String API_KEY = "abcd123456789";

This may seem harmless during development, but once committed to source control, secrets often remain exposed indefinitely.

Prevention

Store secrets externally.

Java
String apiKey = System.getenv("PAYMENT_API_KEY");

Better still, keep secrets in a dedicated secrets manager such as AWS Secrets Manager, Azure Key Vault, HashiCorp Vault, or Kubernetes Secrets. Secrets should never live inside source code repositories.

3. Insecure Deserialization

Java serialization has been responsible for numerous security incidents. Example:

Java
// Vulnerable: deserializes whatever bytes the client sends.
ObjectInputStream input = new ObjectInputStream(request.getInputStream());
Object obj = input.readObject();

The danger is that attackers can craft malicious serialized objects that execute unexpected code during deserialization.

Prevention

Avoid Java serialization whenever possible. Prefer formats such as JSON, XML (with secure parsing), or Protocol Buffers. Example using Jackson:

Java
ObjectMapper mapper = new ObjectMapper();
User user = mapper.readValue(json, User.class);

Using structured formats reduces the attack surface significantly.

4. Cross-Site Scripting (XSS)

Although often associated with front-end applications, backend services can accidentally enable XSS vulnerabilities when user-generated content is returned without sanitization. Example:

Java
String comment = request.getParameter("comment");
response.getWriter().write(comment);

If the user submits the following, the browser executes the script:

HTML
<script>alert('Hacked')</script>

Prevention

Always encode output. Using Spring:

Java
String safeComment = HtmlUtils.htmlEscape(comment);

Additionally, validate inputs, sanitize rich text, and implement a Content Security Policy (CSP).

5. Path Traversal Attacks

File download functionality often introduces path traversal vulnerabilities. Example:

Java
String file = request.getParameter("file");
Path path = Paths.get("/documents/" + file);

An attacker could submit the following value and potentially access sensitive files:
Shell
../../../etc/passwd

Prevention

Normalize and validate paths.

Java
Path base = Paths.get("/documents");
Path resolved = base.resolve(file).normalize();
// Reject anything that resolves outside the base directory.
if (!resolved.startsWith(base)) {
    throw new SecurityException("Invalid file path");
}

Never trust file names coming directly from user input.

6. Weak Password Storage

Storing passwords improperly remains surprisingly common. Bad practice:

Java
String passwordHash = DigestUtils.md5Hex(password);

MD5 and SHA-1 are no longer considered secure for password storage.

Prevention

Use adaptive hashing algorithms. Example with BCrypt:

Java
BCryptPasswordEncoder encoder = new BCryptPasswordEncoder();
String hash = encoder.encode(password);

BCrypt salts each hash automatically and has a configurable work factor that can be raised as hardware gets faster. Other strong alternatives include Argon2, PBKDF2, and scrypt.

7. Dependency Vulnerabilities

Modern Java applications often contain more third-party code than custom code. A secure application can still become vulnerable because of outdated dependencies.

Prevention

Integrate dependency scanning into CI/CD pipelines. Example Maven plugin:

XML
<plugin>
    <groupId>org.owasp</groupId>
    <artifactId>dependency-check-maven</artifactId>
</plugin>

Additionally, tools such as Snyk can automatically identify known vulnerabilities. We have been using Snyk for the last couple of years, and it has been effective. Regular dependency updates should be part of every release cycle.

8. Improper Logging of Sensitive Data

Developers often log information for troubleshooting without considering security implications. Example:

Java
logger.info("Login request received for user={} password={}", username, password);

This exposes credentials inside log files.

Prevention

Mask or exclude sensitive information.

Java
logger.info("Login request received for user={}", username);

Never log passwords, access tokens, credit card information, protected health information (PHI), or other personally identifiable information (PII).
This is especially important in regulated industries such as healthcare, like ours.

9. Insufficient Authentication and Authorization

Authentication verifies identity, and authorization determines access. Many applications perform authentication correctly but fail to enforce authorization consistently. Example:

Java
@GetMapping("/admin/users")
public List<User> getUsers() {
    return userService.findAll();
}

Without authorization checks, any authenticated user might gain access.

Prevention

Use role-based security.

Java
@PreAuthorize("hasRole('ADMIN')")
@GetMapping("/admin/users")
public List<User> getUsers() {
    return userService.findAll();
}

@PreAuthorize only takes effect when method security is enabled (for example, with @EnableMethodSecurity in recent Spring Security versions). Security should be enforced at every layer, not just the UI.

10. Lack of Input Validation

Many vulnerabilities originate from accepting unexpected input. Example:

Java
String age = request.getParameter("age");
int userAge = Integer.parseInt(age);

A missing or non-numeric value throws NumberFormatException, and an out-of-range value is accepted silently.

Prevention

Validate all external input.

Java
@Min(18)
@Max(120)
private Integer age;

Bean Validation provides a simple and consistent approach for validating request payloads. The constraints are enforced when the request object is validated, for example by marking the controller parameter with @Valid. Never assume user input is safe.

Final Thoughts

Security is not a feature that can be added at the end of a project. It needs to be part of the development process from the very beginning. The vulnerabilities discussed here are not theoretical. They are among the most common findings during security assessments, penetration tests, and production incident investigations.

Fortunately, modern Java provides mature frameworks, libraries, and tools that make secure development significantly easier than it was a decade ago. The key is building security awareness into everyday development practices:

Use parameterized queries
Protect secrets properly
Validate all inputs
Keep dependencies updated
Apply strong authentication and authorization
Log responsibly
Continuously scan for vulnerabilities

Security is ultimately about reducing risk.
Small improvements applied consistently across a codebase can prevent incidents that would otherwise become expensive lessons later.
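The path traversal check from section 5 is easy to try outside a web application. The following is a minimal standalone sketch of that same normalize-then-compare technique. `SafePaths` and `resolveInside` are hypothetical names introduced for illustration, and the `/documents` base directory is the one used in the article's example.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// A standalone version of the containment check from the path traversal
// section. SafePaths and resolveInside are hypothetical names.
final class SafePaths {

    // Resolves a user-supplied name against base and rejects anything that
    // ends up outside it, including "../" sequences and absolute paths.
    static Path resolveInside(Path base, String userSuppliedName) {
        Path normalizedBase = base.toAbsolutePath().normalize();
        Path resolved = normalizedBase.resolve(userSuppliedName).normalize();
        if (!resolved.startsWith(normalizedBase)) {
            throw new SecurityException("Invalid file path");
        }
        return resolved;
    }

    public static void main(String[] args) {
        Path base = Paths.get("/documents");
        System.out.println(resolveInside(base, "reports/q1.pdf"));
        try {
            resolveInside(base, "../../../etc/passwd");
        } catch (SecurityException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```

Note that `normalize()` is purely lexical and does not follow symbolic links. If links inside the base directory are a concern, comparing the result of `toRealPath()` on an existing file is one option to consider.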
This is the third follow-up to Friday's release post. Saturday's was about how you iterate; yesterday's was about new platform APIs in the core; today's is about a run of pieces that change how you write the structural parts of an app.

The pieces are an OpenAPI client generator, a SQLite ORM, JSON and XML mappers, a component binder with validation, build-time SVG and Lottie transcoders, and a declarative router with deep links. All ride on a single build-time codegen pipeline: a Maven-plugin pass that reads annotations or declarative source files at build time and emits typed Java that compiles into your binary. No reflection, no service loader, no Class.forName. The "How it works" section at the end of this post covers the codegen plumbing once you have seen what it powers.

OpenAPI Client Generation

This is the headline of the release for any team that talks to a backend. A new cn1:generate-openapi-client Mojo reads an OpenAPI 3.x JSON spec (a URL or a local file) and writes typed Codename One client code that compiles into your app:

One @Mapped POJO per components.schemas entry.
One <Tag>Api.java class per OpenAPI tag, with one fluent method per operation.
Every method routes through Rest.<verb> + Mappers.toJson + fetchAsMapped / fetchAsMappedList, so the generated surface integrates with the rest of the framework instead of dragging in a separate HTTP stack.

Wire it into the project's pom.xml:

XML
<plugin>
    <groupId>com.codenameone</groupId>
    <artifactId>codenameone-maven-plugin</artifactId>
    <executions>
        <execution>
            <id>petstore-client</id>
            <goals><goal>generate-openapi-client</goal></goals>
            <configuration>
                <specUrl>https://petstore3.swagger.io/api/v3/openapi.json</specUrl>
                <basePackage>com.example.petstore</basePackage>
            </configuration>
        </execution>
    </executions>
</plugin>

mvn generate-sources picks the spec up, downloads it, and writes one file per schema and one per tag under target/generated-sources/.
Exercised end-to-end, the Petstore reference spec produces six model classes (Pet, Order, Customer, Tag, Category, User) and three API classes (PetApi, StoreApi, UserApi), and the nine generated .class files compile cleanly against codenameone-core. It is documented under the OpenAPI codegen Maven goal.

In application code you call the generated Api class the same way you would call any other Java method:

Java
PetApi pets = new PetApi();

// Returns AsyncResource<Pet>; resolves with the deserialised object.
pets.getPetById(42).onResult((pet, err) -> {
    if (err == null) Log.p("Got " + pet.getName());
});

// Returns AsyncResource<List<Pet>>.
pets.findPetsByStatus("available").onResult((list, err) -> {
    if (err == null) {
        for (Pet p : list) Log.p(p.getName());
    }
});

// POST with a request body. addPet takes a Pet, returns a Pet.
Pet candidate = new Pet();
candidate.setName("Mittens");
candidate.setStatus("available");
pets.addPet(candidate).onResult((created, err) -> { /* ... */ });

There is no hand-rolled ConnectionRequest setup, no manual JSON parsing, no string-typed request bodies. The generated client takes a typed Pet, serializes it with Mappers.toJson(...), fires the right HTTP verb, deserializes the response with Mappers.fromJson(...), and surfaces the result through the framework's AsyncResource so your callback fires on the EDT.

For teams who already publish an OpenAPI spec as part of their backend (most modern backend frameworks do this automatically: FastAPI, Spring's springdoc-openapi, NestJS, ASP.NET Core, Go's gnostic), the practical effect is that the mobile client's bindings stay in sync with the backend without anyone hand-writing a single network call. Update the spec, re-run mvn generate-sources, and the new and changed endpoints land in your app as typed Java; the IDE picks them up immediately.
It is the kind of change that is most useful when you do not know you have it: pull a fresh spec, rebuild, and your IDE highlights every place in the codebase that called a renamed endpoint or passed the wrong type to a parameter.

SQLite ORM

@Entity marks the class; @Id and @Column shape the schema; @DbTransient opts a field out:

Java
@Entity
public class TodoItem {
    @Id @Column long id;
    @Column String title;
    @Column(name = "completed_at") Date completedAt;
    @DbTransient Object cachedView;
}

Dao<TodoItem> dao = EntityManager.open("todos.db").dao(TodoItem.class);
dao.createTable();
dao.insert(new TodoItem(0, "Read the post", null));
List<TodoItem> open = dao.find("completed_at IS NULL", new Object[] {});
TodoItem byId = dao.findById(42);
dao.delete(byId);

The generated DAO does the typed work underneath. No reflection in insert; the generated code calls setString(1, e.title) and setLong(2, e.id) directly against the SQLite PreparedStatement. Validation at build time catches missing @Id, fields that look like relationships but are not yet supported, and abstract entity classes; the build fails with a class name and a reason.

For JPA/Hibernate developers, the API is intentionally familiar. @Entity, @Id, @Column, and @Transient (here renamed @DbTransient to avoid colliding with java.beans.Transient) carry the same meaning they do under javax.persistence / jakarta.persistence. The EntityManager name is the same. Dao#findById, Dao#findAll, Dao#find(where, params), Dao#insert, Dao#update, Dao#delete line up with the basic JPA repository contract. The query language is plain SQL (there is no JPQL or Criteria DSL), but the annotation surface, the lifecycle, and the runtime methods will feel like a long-lost friend to anyone with server-side Java persistence experience.

JSON/XML Mapping

@Mapped marks a class as a transferable POJO. @JsonProperty and @XmlElement (plus @XmlRoot, @XmlAttribute, @JsonIgnore, @XmlTransient) shape the wire format.
The runtime entry points are Mappers.toJson(...), Mappers.fromJson(...), Mappers.toXml(...), Mappers.fromXml(...):

Java
@Mapped
public class User {
    @JsonProperty("user_id") long id;
    @JsonProperty String name;
    @JsonProperty("created_at") Date createdAt;
    @JsonIgnore String passwordHash;
}

String json = Mappers.toJson(user);
User back = Mappers.fromJson(json, User.class);

The same @Mapped POJO is the type the typed Rest helpers accept:

Java
Rest.get("https://api.example.com/users/42")
    .fetchAsMapped(User.class)
    .onResult((user, err) -> { /* ... */ });

Rest.get("https://api.example.com/users")
    .fetchAsMappedList(User.class)
    .onResult((users, err) -> { /* ... */ });

Rest.fetchAsJsonList (top-level JSON arrays, no {"root":[...]} envelope trick), JSONWriter (the complement of JSONParser, with fluent builders and streaming variants for Writer and OutputStream), and URLImage.setDefaultBearerToken (auth headers on image fetches) all ship alongside.

For JAXB developers, the XML surface (@XmlRoot, @XmlElement, @XmlAttribute, @XmlTransient) is a direct port of the long-established javax.xml.bind.annotation surface. The same model class can be both @XmlRoot-decorated and @JsonProperty-decorated, which gives you a single source of truth for both wire formats. The JSON surface adopts the Jackson convention (@JsonProperty, @JsonIgnore) that nearly every modern JVM JSON binding (Jackson, Moshi, kotlinx-serialization) inherited.

Component Binding With Validation

The fourth annotation processor on the same pipeline is the component binder. @Bindable marks a model class; @Bind(name = "userField") ties a field to a component on a form by the component's name.
Field-level validation annotations compose with @Bind on the same field:

Java
@Bindable
public class SignupModel {
    @Bind(name = "userField") @Required @Length(min = 3)
    private String user;

    @Bind(name = "emailField") @Required @Email
    private String email;

    @Bind(name = "ageField") @Numeric(min = 13, max = 120)
    private String age;

    @Bind(name = "roleField") @ExistIn({ "admin", "editor", "viewer" })
    private String role;
}

The matching form sets a name on each component so the binder can find them:

Java
TextField user = new TextField();
user.setName("userField");
TextField email = new TextField();
email.setName("emailField");
TextField age = new TextField();
age.setName("ageField");
ComboBox<String> role = new ComboBox<>("admin", "editor", "viewer");
role.setName("roleField");
Button submit = new Button("Sign up");

Form form = new Form("Sign Up", BoxLayout.y());
form.add(user).add(email).add(age).add(role).add(submit);
form.show();

SignupModel model = new SignupModel();
Binding binding = Binders.bind(model, form);
binding.getValidator().addSubmitButtons(submit);

Binding is the handle: refresh() re-reads the model into the components, commit() writes the components back, disconnect() tears the listeners down.

Multiple validation annotations on a single field compose via Validator.addConstraint(Component, Constraint...) and GroupConstraint (first failure wins). @Validate(MyClass.class) is the escape hatch for hand-written Constraint implementations. The validation set: @Required, @Length, @Regex, @Email, @Url, @Numeric, @ExistIn, @Validate.

The new BindAttr enum lets @Bind target a specific attribute of the component (TEXT, UIID, SELECTED, ...) when the default ("write a String field into the component's text") is not what you want.
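The "first failure wins" behavior is easy to picture without the framework. The sketch below models it in plain Java under stated assumptions: `Rule`, `Rules`, and `firstFailure` are hypothetical names invented for this illustration and are not the Codename One Constraint or GroupConstraint types, whose actual signatures are not shown in this post.

```java
import java.util.Arrays;
import java.util.List;

// Illustrative only: a plain-Java model of "first failure wins" composition.
// Rule, Rules and firstFailure are hypothetical names, not Codename One types.
interface Rule {
    // Returns null when the value passes, or an error message when it fails.
    String check(String value);
}

final class Rules {
    static Rule required() {
        return v -> (v == null || v.isEmpty()) ? "Value is required" : null;
    }

    static Rule minLength(int min) {
        return v -> (v != null && v.length() >= min)
                ? null
                : "Must be at least " + min + " characters";
    }

    // Runs the rules in declaration order and reports only the first failure.
    static Rule firstFailure(Rule... rules) {
        List<Rule> ordered = Arrays.asList(rules);
        return v -> {
            for (Rule rule : ordered) {
                String error = rule.check(v);
                if (error != null) {
                    return error;
                }
            }
            return null;
        };
    }

    public static void main(String[] args) {
        Rule user = firstFailure(required(), minLength(3));
        System.out.println(user.check(""));    // fails the first rule
        System.out.println(user.check("ab"));  // passes the first, fails the second
        System.out.println(user.check("abc")); // passes both rules
    }
}
```

Reporting a single message per field keeps the UI from stacking several errors on one component, which is presumably why annotation order matters when several constraints share a field.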
SVG at Build Time Drop an SVG into src/main/css/, alongside theme.css: Shell src/main/css/ theme.css star.svg gradient_circle.svg path_arrow.svg rounded_button.svg wave.svg pro_badge.svg After the next build, every SVG is a regular Codename One Image. An SVG handled by the transcoder is a vector image, but it is still an Image. Everywhere a raster Image works (Label.setIcon, Button.setIcon, BorderLayout.NORTH, the toolbar, a MultiButton's leading icon, a CSS background: url(...) rule), the SVG works too. The difference is that it stays crisp at any size: the same source file is sharp at a 16-point list-row icon, a 64-point hero header, and a 256-point launch screen, on every DPI bucket. A grid of the static SVGs from the hellocodenameone fixture, rendered through the new pipeline: Sizing in Millimeters The SVG transcoder's most useful feature is also the one most easily missed: size every SVG in millimeters from CSS. SVGs in the wild routinely declare odd width / height attributes (a 1024×1024 export of a 24×24 icon, no dimensions at all, design-pixel values from one specific framework). Pinning the rendered size in millimeters sidesteps all of that. CSS HomeIcon { background: url(home.svg); cn1-svg-width: 6mm; cn1-svg-height: 6mm; bg-type: image_scaled_fit; } LogoBanner { background: url(logo.svg); cn1-svg-width: 32mm; cn1-svg-height: 12mm; } A 6 mm icon is 6 mm tall on a 1× desktop, 6 mm on a high-DPI handset, and 6 mm on a 4K tablet. The transcoder routes both values through Display.convertToPixels() at install time, the same way font-size: 3mm already behaves elsewhere in Codename One CSS. No design-pixel guesswork, no DPI bucket to choose, no scaling surprise when the artist re-exports the source SVG at a different resolution. If a project does not use CSS for theming, the two-float constructor on the generated class takes millimeters directly: new com.codename1.generated.svg.Home(6f, 6f). 
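The arithmetic behind a millimeter-pinned size is short enough to sketch. The snippet below is an illustration only, not the framework's implementation: `MmSizing`, `mmToPixels`, and the sample densities are hypothetical names and numbers chosen for the example, and on a device the conversion goes through Display.convertToPixels() as described above. It assumes the standard 25.4 mm per inch.

```java
// Illustrative only: shows why one millimeter size maps to different pixel
// counts on different screens. MmSizing and mmToPixels are hypothetical
// names, not part of Codename One.
final class MmSizing {
    private static final float MM_PER_INCH = 25.4f;

    // Converts a physical length in millimeters to device pixels for a
    // screen with the given density in pixels per inch.
    static int mmToPixels(float millimeters, float pixelsPerInch) {
        return Math.round(millimeters * pixelsPerInch / MM_PER_INCH);
    }

    public static void main(String[] args) {
        float[] sampleDensities = {96f, 160f, 480f}; // assumed example values
        for (float ppi : sampleDensities) {
            System.out.println("6 mm at " + ppi + " ppi = "
                    + mmToPixels(6f, ppi) + " px");
        }
    }
}
```

The pixel count grows with density while the physical size stays fixed, which is the property the CSS cn1-svg-width and cn1-svg-height values rely on.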
Coverage and What We Still Want Feedback On The transcoder is a maven/svg-transcoder/ module that parses SVG with javax.xml StAX. No Batik, no Flamingo, no external dependencies. Coverage targets what real-world icon SVGs use: rect (rounded corners included), circle, ellipse, line, polyline, polygon, the full path grammar (M / L / H / V / C / S / Q / T / A / Z plus relative-coordinate and smooth-curve reflection), groups with affine transforms (translate, scale, rotate, skew, matrix), linear gradients via LinearGradientPaint, fill, stroke, stroke-width, linecap, linejoin, opacity. SMIL animations are supported in the same pipeline: <animate>, <animateTransform> (translate, scale, rotate), and <set>. Time values interpolate against wall-clock time on every paint, with from / to / values / begin / dur / repeatCount / fill="freeze" honored. Text and clip-path landed in the follow-up PR for the static SVG fixtures, and both are visible in the screenshot above (the "Codename One / build-time SVG" wordmark in the rounded button, the "PRO" badge text, and the clip-path-shaped rounded-corner badge underneath). <text> and <tspan> work with single-style fills and transforms; <clipPath> referenced via clip-path="url(#id)" works against rect, circle, and path clip shapes (nested clip refs are ignored). What is still not supported: SVG filter primitives, <mask> (treated as a clip, so alpha masking falls back to opaque), <radialGradient> (falls back to the first-stop color), and CSS-in-SVG (style rules inside the SVG document; the transcoder reads presentation attributes and the inline style="..." attribute, but a <style> element with selectors is not parsed). If you hit an SVG that does not transcode the way you expect, please open an issue at github.com/codenameone/CodenameOne/issues and attach the source file. The fastest way to extend the coverage is for us to run the failing case through the test fixtures and watch the output. 
Every SVG we ship test goldens for started as somebody else's "this doesn't render right" report. Caveat on iOS: The transcoded SVGs use the framework's shape API (fillShape, drawShape, LinearGradientPaint). The full surface is implemented on the Metal renderer. The deprecated GL ES 2 pipeline does not have parity on every operation, so an SVG drawn under ios.metal=false will often render with visible artifacts (missing gradients, clipped fills, distorted paths) rather than the placeholder you might expect. Now that Metal is the default for new iOS builds as of last Friday, this is a non-issue on most apps; if you have explicitly pinned ios.metal=false, expect some visual regressions on SVG content and let us know which. The coverage matrix and troubleshooting are in the SVG Transcoder in the developer guide. Lottie at Build Time The same pipeline carries Lottie. Drop a Bodymovin export into the same src/main/css/: JSON src/main/css/ theme.css pulse.json spinner.json After the next build, both are real Image instances on every platform that exposes the shape API. The same vector-everywhere story as SVG: a Lottie animation renders crisply at any size and slots into any Image slot in the framework. Java Image pulse = Resources.getGlobalResources().getImage("pulse"); Image spinner = Resources.getGlobalResources().getImage("spinner"); Animation runs against wall-clock time on every paint, with no Timer and no allocation in the hot path. A capture of the hellocodenameone Lottie fixture in motion: The Lottie transcoder lives in maven/lottie-transcoder/. It parses Bodymovin JSON with no external dependencies (the framework's built-in JSON parser carries the load) and lowers each file into the same SVGDocument model the SVG path uses. The same JavaCodeGenerator emits the same GeneratedSVGImage subclass, and the same SVGRegistry registers it under the source filename. 
No new Image base class, no new registry, no per-port wiring, since the SVG path's JavaSE reflective load and iOS / Android Stub weaving already cover the new format. Coverage in v1: shape layers (rc / el / sh) with solid fills and strokes; layer transforms (anchor, position, scale, rotation, opacity); animated rotation, position, and scale collapsed to a two-keyframe loop; solid-color layers as filled rects. Most icon-grade Bodymovin exports lower cleanly. Complex character animations from After Effects with image references, masks, and effects do not, and the transcoder logs which layers it dropped so the source of any blank output is obvious. Same ask as for SVG: if a Lottie / Bodymovin file does not transcode the way you expect, please open an issue at github.com/codenameone/CodenameOne/issues and attach the source .json. The transcoder grows one shape family at a time from the cases the community reports. The same iOS caveat applies: the renderer leans on the shape API, so the deprecated GL ES 2 pipeline shows artifacts on the more elaborate Lottie animations. Use the Metal default (now on by default for new iOS builds). Deep Links and Routing Two pieces of plumbing for apps that handle URLs from outside themselves (notification taps, marketing links, share targets, Universal Links from Safari and the equivalent App Links from Chrome on Android). Deep Links Codename One has had deep-link support for a long time through Display.setProperty("AppArg", url). The platform plumbing already writes the incoming URL into that property on cold launch, and an app-resume sets it again on warm launch; reading it back from start() works fine for a small number of patterns. Where the AppArg-only approach gets fragile is consistency. 
The cold and warm paths execute different lifecycle code, the value is a flat string with no parsing, and the trickiest case is the one where a user lands in the middle of the app via a link and then continues to interact: their next navigation needs to compose with the entry point, the back-stack needs to make sense as if they had arrived through the usual flow, and "fall off the edge of the app" on back is a common bug. With a hand-rolled AppArg reader it is easy to miss one of these and ship a half-working flow. This release introduces a typed DeepLink and a single handler that fires for both cold and warm launches:

Java
Display.getInstance().setDeepLinkHandler(link -> {
    // link is a normalised DeepLink: scheme, host, path,
    // segments, query map, fragment. Same shape cold or warm.
    if ("/users".equals(link.path()) && link.segments().size() == 2) {
        showUserDetailForm(link.segments().get(1));
        return true;
    }
    return false;
});

AppArg still works for projects that depend on it, but the new handler is what we recommend going forward. The handler runs on a consistent lifecycle path on both cold and warm starts, and the parsed DeepLink value carries the scheme, host, path segments, query map, and fragment, so app code does not need to roll its own URL parser.

Routing
For projects that handle more than a handful of URL patterns, the second piece is the declarative router in com.codename1.router. We built it on the same build-time codegen pipeline as the ORM and the mappers (the router was actually the first concrete consumer of the new preprocessor), so the two surfaces compose: a deep-link handler that delegates to the router becomes a one-liner. Each form declares its own path with a @Route annotation: Java @Route("/") public class HomeForm extends Form { /* ... 
*/ } @Route("/users/:id") public class UserDetailForm extends Form { public UserDetailForm(RouteMatch match) { String userId = match.param("id"); // build UI for user `userId` } } @Route("/about") Router.navigate("/users/42") resolves the path, instantiates UserDetailForm, and shows it. The deep-link handler now collapses to: Java Display.getInstance().setDeepLinkHandler(link -> Router.navigate(link.toString())); Each form owns its own routing rule. Adding or moving a screen is a one-class change. The "what screens does this app have, and at what paths?" question is answered by an IDE search for @Route, not by reading every form constructor in the project. For Spring developers, the shape is familiar by design. @Route plays the same role as Spring MVC's @RequestMapping: a class-level declaration that announces "this controller handles URLs of this shape". The :id parameter syntax mirrors Spring's {id} path-variable syntax; RouteMatch.param("id") is the same kind of accessor as Spring's @PathVariable. The mental model carries over from server-side Java with almost no friction. The same recognition is available to anyone with React Router, Vue Router, or Angular Router experience; the :param convention is the cross-framework default. The build-time processor validates that each annotated class extends Form, that the path starts with /, that the constructor is accessible, and that there are no duplicate patterns. Any rule violation fails the build with a class name and a reason, not at runtime with a stack trace. 
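To make the `:id` convention and the fail-early checks concrete, here is a small language-neutral sketch in Python. It is not the Codename One router or its build-time processor: `RouteTable`, `register`, and `resolve` are hypothetical names, and the validation mirrors only two of the rules listed above (the path starts with `/`, and there are no duplicate patterns).

```python
import re
from typing import Dict, Optional, Tuple


class RouteTable:
    """Toy route registry: ':name' segments capture path parameters."""

    def __init__(self) -> None:
        # Maps the declared pattern to (compiled regex, target name).
        self._routes = {}

    def register(self, pattern: str, target: str) -> None:
        # Fail at registration time, the way a build-time check would,
        # instead of waiting for a lookup to go wrong.
        if not pattern.startswith("/"):
            raise ValueError(f"{target}: path must start with '/': {pattern!r}")
        if pattern in self._routes:
            raise ValueError(f"{target}: duplicate pattern {pattern!r}")
        parts = []
        for segment in pattern.split("/"):
            if segment.startswith(":"):
                # ':id' becomes a named group matching one path segment.
                parts.append(f"(?P<{segment[1:]}>[^/]+)")
            else:
                parts.append(re.escape(segment))
        self._routes[pattern] = (re.compile("/".join(parts) + "$"), target)

    def resolve(self, path: str) -> Optional[Tuple[str, Dict[str, str]]]:
        for regex, target in self._routes.values():
            match = regex.match(path)
            if match:
                return target, match.groupdict()
        return None


routes = RouteTable()
routes.register("/", "HomeForm")
routes.register("/users/:id", "UserDetailForm")
print(routes.resolve("/users/42"))  # ('UserDetailForm', {'id': '42'})
```

Registering `/users/:id` a second time raises a `ValueError` naming the offending target, which is the runtime analogue of a build that fails with a class name and a reason.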
The rest of the router surface covers the kind of thing that has become table stakes in modern client routing:

- Route guards run before navigation completes and can cancel or redirect.
- Per-tab navigation stacks via TabsForm, where each tab keeps its own back stack.
- Location listeners so anything in the app can subscribe to "the route changed".
- Form.setPopGuard(PopGuard) intercepts hardware back, toolbar back, or Router.pop() with a chance to ask "are you sure?".
- Sheet.showForResult() returns an AsyncResource<T> that auto-cancels with null if the user dismisses the sheet.

The API is opt-in. Apps that prefer the existing Form.show() / Form.showBack() flow keep using that; nothing changes. For the link-publishing side, an AasaBuilder emits the iOS apple-app-site-association JSON and an AssetLinksBuilder emits the Android assetlinks.json. The full setup walk-through (entitlements, the Android intent-filter, the .well-known/ upload on your origin server) is at Routing and Deep Links in the developer guide.

The JavaScript port bridges the router into window.history so navigating the in-app router pushes a real entry into the browser's session history. Back and forward in the browser drive the router; reloading the page lands at the deep-link URL; sharing the URL out of the address bar takes a colleague to the same in-app location.

How It Works: The Build-Time Codegen Pipeline
Everything above sits on a single Maven-plugin pass. The plugin has an AnnotationProcessor SPI and two new Mojos: cn1:generate-annotation-stubs (in generate-sources) and cn1:process-annotations (in process-classes). The orchestrator ASM-scans target/classes, dispatches to every registered processor, validates the annotated classes, and emits a typed runtime artifact next to each one plus a tiny Index class that registers everything with a public runtime registry. Adding a new processor later is a matter of dropping it into META-INF/services with no orchestrator changes. 
The reason this runs against bytecode rather than against source text is that the source-regex prototype was scrapped early. The bytecode pass sees the JVM's view of the project (extends Form is a thing the JVM actually knows, not a pattern we have to hope the user wrote a specific way), rule violations come back with class names and reasons, and the build fails fast before any generated .class lands on disk. The infrastructure shares the ASM passes that the BytecodeComplianceMojo's existing String rewrites already use. A small stub source is emitted under target/generated-sources/cn1-annotations/ during generate-sources so application code that references the generated registry resolves at compile time. The real .class overwrites the stub later in process-classes. Standard "compile against a stub, link against the real thing" pattern; it just works inside a single Maven build instead of needing a multi-module split. cn1-core ships a no-op stub of each generated index (RoutesIndex, MappersIndex, BindersIndex, DaosIndex), so application code compiles even when the project has no annotated classes. The build-time processor shadows each stub with the real implementation before packaging. The SVG and Lottie transcoders sit on a parallel pipeline (declarative graphics files in place of annotations), but they emit the same shape of code and obey the same constraints. The practical effect is that the kind of code that historically required reflection at runtime (with all the obfuscation hazards and surprise allocations that come with that) now happens once at build time and produces direct, dead-code-eliminable, rename-safe symbol references. Wrapping Up That closes this release's post series. We already have some pretty big features lined up for this Friday's release post; the headline pieces are the most substantial things to land in months and are worth checking back for. Back to the weekly index.
Blockchain is a data-driven technology: its primary function is to store, verify, and coordinate independent records across a secure, distributed network. Without valid data, no transaction, smart contract execution, or other network activity can be trusted, and the integrity of everything built on the ledger is at risk. The quality of the data entering a blockchain determines the accuracy of the whole system, so for transparency, immutability, and sound decisions, data is the backbone of blockchain.

Together, blockchain and data streaming bring strong security, transparency, and real-time data movement to distributed applications. Blockchain builds a tamper-evident chain of trust by keeping decentralized records, while data streaming delivers insights as information flows. Together they underpin next-generation applications that need scalability and better decision-making. Each is a powerful technology on its own. Combined, data streaming can amplify the impact of a blockchain solution.

Real-Time Data Integration
Data streaming platforms, such as Apache Kafka and Apache Flink, continuously process and deliver real-time data. When they are integrated with a blockchain, transactions can be submitted to the ledger as events arrive, smart contracts can react to live data feeds, and delays are lower than with batch processing. For example, IoT sensors streaming temperature data can trigger a blockchain-based smart contract in real time.

Improved Scalability
One major limitation of blockchain systems such as Ethereum has been scalability. By leveraging data streaming, we can pre-process and filter large volumes of data before sending it to the blockchain. 
This reduces the number of unnecessary transactions stored on-chain and moves heavy computation off-chain to the stream processing engine that the data streaming platform provides. The result is faster and more efficient blockchain performance.

Enhanced Data Integrity and Trust
Blockchain ensures immutability and transparency, while data streaming ensures continuous data flow. Because stream processing continuously validates, filters, and analyzes data elements before they reach the ledger, it strengthens data integrity and trust in the blockchain. Real-time processing helps identify anomalies in the data, prevent tampering, and ensure that only accurate, high-quality data enters the blockchain. The combination provides a trusted, secure, and transparent ecosystem in which information can be verified instantly and with confidence. One use case is supply chain tracking, where real-time shipment data is streamed and permanently recorded.

Better Event-Driven Architectures
Blockchain systems become more dynamic when combined with an event-driven streaming platform such as Confluent, Amazon Kinesis, or the open-source Apache Kafka. Smart contracts can act as automated responders to streamed events and drive automation across distributed systems, which in turn reduces manual intervention. For example, a payment is automatically released when a delivery event is streamed and confirmed.

Efficient Data Storage Strategy
Not all data needs to be stored on-chain, which is expensive and slow. By leveraging streaming platforms, we can store and process high-volume data off-chain. Streaming platforms can be integrated with streaming databases to store data already processed by stream engines. The blockchain then stores only critical summaries, hashes, or proofs, which keeps it efficient while still allowing verification. 
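One way to picture the "store only hashes on-chain" strategy is to batch streamed records off-chain and anchor a single digest per batch. The sketch below uses only Python's standard library. The `ledger` list is a stand-in for a real chain, and `batch_digest`, `anchor_batch`, and `verify_batch` are hypothetical helper names, not part of Kafka, Flink, or any blockchain SDK.

```python
import hashlib
import json
from typing import Dict, List


def batch_digest(records: List[Dict]) -> str:
    """Deterministic SHA-256 digest of a batch of records."""
    # sort_keys gives a stable byte representation for equal records.
    payload = json.dumps(records, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def anchor_batch(ledger: List[str], records: List[Dict]) -> str:
    """Keep the full records off-chain; append only the digest 'on-chain'."""
    digest = batch_digest(records)
    ledger.append(digest)
    return digest


def verify_batch(ledger: List[str], records: List[Dict]) -> bool:
    """Check that an off-chain batch still matches an anchored digest."""
    return batch_digest(records) in ledger


ledger: List[str] = []      # stand-in for the blockchain
shipment_events = [         # stand-in for a streamed, pre-filtered batch
    {"shipment": "A1", "temp_c": 4.2},
    {"shipment": "A1", "temp_c": 4.6},
]
anchor_batch(ledger, shipment_events)
print(verify_batch(ledger, shipment_events))  # True

shipment_events[1]["temp_c"] = 9.9            # tamper with the off-chain copy
print(verify_batch(ledger, shipment_events))  # False
```

The ledger grows by one 64-character digest per batch regardless of how many records the batch holds, while any later change to the off-chain copy is detectable.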
Real-Time Analytics and Monitoring
Data stream processing enables real-time analytics and monitoring for blockchain by analyzing transaction data as it flows over the network. By analyzing transaction patterns, organizations can detect suspicious activity, monitor system performance, and track blockchain activity as it happens. Integrating a real-time stream processing platform converts raw data into actionable intelligence, which improves transparency, responsiveness, and operational efficiency across blockchain ecosystems.

Wrapping Up
Combining data stream processing and blockchain creates an ecosystem that blends real-time intelligence with secure, immutable record-keeping. Blockchain provides transparency, trust, and data integrity, while stream processing powers instant analysis, continuous monitoring, and real-time decision-making based on that data. Together, they improve efficiency, enhance security, and enable scalable, data-driven applications. These technologies play an instrumental role in building smarter systems that respond in real time and increase confidence among the organizations that rely on that information.
Most SaaS breaches do not happen through failure. They happen through valid authentication being trusted too far, for too long, across systems that were never designed to question each other. That distinction is worth sitting with. Because if authentication failed, you'd know. You'd see it in the logs. The SIEM would fire. The investigation would start in an obvious place. When authentication succeeds — and authorization is simply absent, or context has shifted since the token was issued — the system looks healthy right up until it isn't. The logs show normal traffic. The requests look legitimate. The damage accumulates silently. This is the actual threat model for modern SaaS, and it is not adequately reflected in how most teams design, audit, or respond to their systems. The Cloudflare Case Is the Template In February 2024, Cloudflare published one of the more technically honest post-mortems the industry has seen. Their internal Atlassian environment — 14,099 Confluence wiki pages, 2 million Jira tickets, 11,904 Bitbucket repositories — had been accessed by a suspected nation-state actor. The intrusion ran for nine days before detection. The entry point was not an exploit. During the Okta breach on October 18, 2023, attackers stole one service token and three service account credentials belonging to Cloudflare. These credentials were not rotated because, mistakenly, they were believed to be unused. That is the full story of the breach. Credentials issued during one incident. Not rotated. Still valid. Still honored by Cloudflare's systems months later. A JWT created for the Moveworks Gateway was forwarding authenticated HTTP requests directly to the private, self-hosted Atlassian server. Incoming HTTP requests that attached the JWT were forwarded without further challenge. The token was valid at issuance. The system never re-evaluated whether the holder still had legitimate standing to use it. 
Most SaaS breaches are not authentication failures — they are trust relationships that were never designed to expire. That line is not a platitude. It is a precise description of how Cloudflare, Microsoft, BeyondTrust, and dozens of less-publicized organizations were breached in the past eighteen months — not because their authentication systems failed, but because token validity was treated as a continuous proxy for authorization correctness. It is not. What Stacking Trust Layers Actually Produces Modern SaaS architectures are composites. A single user action might pass through an API gateway, traverse a microservice boundary, call an identity provider, issue a token validated by a third-party integration, and write to a data layer with its own access model. Each component was built by different teams, under different threat models, in different years. Each layer assumes the previous one enforced constraints correctly. This assumption is not verified at runtime. It is inherited from the original design — which means it degrades silently as the design evolves. JWTs remove central control points, which also removes real-time revocation visibility. OAuth delegation enables fast integration, which also means trust propagates across service boundaries that nobody charted when the original token was issued. API gateways handle routing and coarse-grained access control, which services downstream interpret as authorization clearance they did not themselves perform. The result is not insecurity in any one component. It is trust drift across the composite — a gradual divergence between what the system was designed to permit and what it actually permits, with no mechanism to detect the gap until something external forces the question. IAM Drift: The Slow Accumulation Nobody Audits By the time a breach is discovered in a SaaS environment, the permissions that made it possible have typically been accumulating for months. Sometimes years. 
Through entirely routine, well-intentioned decisions. A role gets created for a project and is never sunset. A contractor is provisioned at an elevated scope to expedite an integration, then forgotten during offboarding. An OAuth application receives administrative permissions during testing, and nobody downgrades it before the production cutover. A CISA warning from early 2024 highlighted how Russian-affiliated APT29 was targeting dormant cloud accounts belonging to former employees of government agencies — accounts with standing permissions that outlasted the people they were created for. Dormant accounts with live permissions are not an edge case. They are a near-universal condition in organizations running SaaS stacks for more than three years. Russian attackers known as Midnight Blizzard gained access to Microsoft's internal systems, exploiting compromised credentials through a legacy OAuth application, which enabled the exfiltration of senior executives' emails. The phrase "legacy OAuth application" deserves more attention than it usually gets in the incident coverage. Legacy here does not mean ancient. It means provisioned before the current access model, never audited for scope creep, and still fully honored by every downstream service that inherited trust from the original identity provider. In modern SaaS, trust is not broken — it is inherited too broadly, and then never re-examined. Organizations that treat IAM as a provisioning function rather than a continuous enforcement function will produce permission surfaces that nobody at the organization can fully account for. That surface is exactly what sophisticated attackers map before they move. The Authorization Gap Nobody Wants to Instrument Authentication got the industry's attention first because it is legible. Failed authentication produces clear signals. 
Broken authorization, by contrast, is architecturally subtle and operationally expensive to detect — which is why it remains the more reliable attack surface. The production pattern looks like this: a user authenticates correctly, receiving a valid, properly signed token from a trusted provider. They make an API call. The gateway routes it because authentication passed. The downstream service validates the token signature and executes the operation — without independently evaluating whether the scope in that token is appropriate for this specific operation, or whether the tenant context in the request header was derived server-side from verified identity, or provided by the client. In August 2025, threat actor UNC6395 used stolen OAuth tokens from Drift's Salesforce integration to access customer environments across more than 700 organizations. The attacker needed no exploit and no phishing. The activity looked legitimate because it came from a trusted SaaS connection rather than a compromised user account. 700 customer environments. No exploit. No phishing. Just a token accepted by systems built to honor tokens — with no service in the chain asking whether this token should be trusted to make these calls on behalf of those customers. The authorization logic that would have caught it was simply not there. One integration became a doorway into everything connected to it. That is not an accident of implementation. It is the predictable consequence of treating third-party integrations as trusted extensions of the platform rather than as external parties with scoped, audited, time-limited access. Multi-Tenant Isolation: Where the Shortcut Becomes the Attack Vector Multi-tenant isolation is architecturally expensive. The pressure to shortcut it is real, and I say that without judgment — I have talked to enough platform engineers to understand the sprint calculus. 
The common shortcut is this: tenant context flows as a client-supplied parameter — a header, a query field, a value in the request body — which the server accepts and processes as valid context. The reasoning is that only authenticated clients can reach the endpoint, so the tenant ID they provide can be treated as ground truth. This reasoning holds until a token is stolen, a scope is broader than intended, or authorization checks are inconsistent across services. At that point, tenant boundary enforcement becomes entirely dependent on client honesty — and attackers are not honest. When tenant identity is client-provided rather than server-derived from verified credentials, cross-tenant data exposure is not a vulnerability. It is a design property. The only questions are timing and who finds it first. SaaS breaches surged 300% in 2024, with attackers able to compromise core systems in as little as nine minutes. Nine minutes is not reconnaissance time. It is the execution time of someone who already understood the gap, because architectural gaps are consistent and therefore mappable in advance. What Secure Systems Actually Do The teams I have observed building more durable SaaS security postures are not necessarily running more tools. They are enforcing different constraints at the design layer. Authorization is evaluated independently at every layer. Not "the gateway checked, so the service trusts." The service evaluates the request. The data layer enforces row-level policies. Each component performs its own authorization decision in context, at request time. This is operationally expensive. It is also the only architecture that fails safely when one layer is compromised. Identity is bound to the runtime context, not the login state. A token issued at login does not carry indefinite authorization for sensitive operations. Context — session recency, request origin, device posture — is re-evaluated at privilege boundaries. Escalation patterns trigger reauthentication. 
The cached token is not sufficient. Tenant isolation is a server-side invariant, not a client-side convention. Tenant ID is derived from verified identity. It is never accepted as input. Non-human identity receives the same lifecycle discipline as human identity. In December 2024, BeyondTrust identified a security incident in which a BeyondTrust infrastructure API key for Remote Support SaaS had been compromised and used to enable access to certain Remote Support SaaS instances by resetting local application passwords. API keys, service account tokens, and integration credentials are identity. They accumulate permissions. They outlast the contexts that justified them. Organizations that audit human identities quarterly and review machine credentials annually will find that the gap between those schedules is exactly where attackers operate. The Real Gap Is Not Knowledge There is a version of this analysis that ends with a list of OWASP API Security Top 10 items and a recommendation to evaluate SSPM vendors. That version is accurate. It is also not the reason any of this keeps happening. The issue is not just credentials or misconfigurations; it is the lack of visibility, real-time threat detection, and the inability to block threats before damage occurs. But even that framing undersells the structural problem. Engineers know what broken object-level authorization looks like. Security architects understand token scope. Post-mortems from Okta, Cloudflare, and Microsoft have been widely read. The gap is enforcement under velocity pressure. Authorization models do not get updated when features ship. Integrations get added without full accounting of the trust they inherit. Scopes get provisioned broadly because narrow provisioning takes time that the sprint cannot absorb. The system keeps working — correctly, from its own perspective — until someone external points out what it has been silently permitting. 
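The server-side invariants described under "What Secure Systems Actually Do" can be sketched in a few lines. This is a minimal illustration in Python, assuming the token signature has already been verified by a real library. The claim names (`tenant_id`, `scopes`, `issued_at`), the `REQUIRED_SCOPE` table, and the 15-minute freshness window are assumptions made for the example, not a standard.

```python
import time
from typing import Dict, Optional

# Hypothetical policy: scope needed per operation, and which ones are sensitive.
REQUIRED_SCOPE = {
    "read_invoice": "invoices:read",
    "delete_invoice": "invoices:admin",
}
SENSITIVE = {"delete_invoice"}
MAX_TOKEN_AGE_SECONDS = 15 * 60  # assumed freshness window for sensitive calls


class Denied(Exception):
    """Raised when the service-level authorization check fails."""


def authorize(claims: Dict, operation: str, resource_tenant: str,
              client_tenant_header: Optional[str] = None,
              now: Optional[float] = None) -> str:
    """Service-level check run on every request, regardless of the gateway."""
    now = time.time() if now is None else now

    # Tenant comes from verified claims; client_tenant_header is
    # deliberately never read.
    tenant = claims["tenant_id"]
    if resource_tenant != tenant:
        raise Denied("resource belongs to another tenant")

    # Scope is evaluated per operation, not inferred from a valid signature.
    required = REQUIRED_SCOPE.get(operation)
    if required is None or required not in claims.get("scopes", ()):
        raise Denied(f"missing scope for {operation}")

    # Sensitive operations need a recent token, which forces reauthentication.
    if operation in SENSITIVE and now - claims["issued_at"] > MAX_TOKEN_AGE_SECONDS:
        raise Denied("token too old for sensitive operation")

    return tenant
```

A request that carries another tenant's ID in a header still resolves to the tenant in the verified claims, so a stolen or over-scoped token cannot pivot across tenants by changing a parameter. An old but validly signed token is also refused for the sensitive operation.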
Brian Soby, CTO of AppOmni, framed the organizational consequence clearly: "In 2024, business was disrupted by costly SaaS 'bypass' breaches that circumvented IAM and zero-trust controls. 2025 will bring awareness to end-to-end controls needed for SaaS, with tight interdependencies between zero trust, identity, SaaS posture, and detection and response capabilities." End-to-end. Not perimeter. Not gateway. Not identity provider in isolation. Every integration point. Every inherited trust relationship. The threat model has to be continuous, or the gaps accumulate exactly where the coverage stops. The Question That Catches the Failure Verizon's 2025 Data Breach Investigations Report examined more than 22,000 security incidents; 30% originated from a third party, including SaaS applications and software vulnerabilities. Third-party integrations are now a primary attack surface — not because they are inherently insecure, but because they are the points at which one system extends trust to another system it did not design, does not control, and often does not monitor. The engineers who consistently build more defensible systems are not necessarily the ones with the most security certifications. They are the ones who read an architecture diagram and ask the productive question before anything ships: what does this component assume the other layer is enforcing — and what happens when that assumption is wrong? That question, applied systematically, catches most of the failure modes described above. Not all of them. Systems are complex, and attackers are patient. But it catches the predictable ones — the inherited trust that was never re-examined, the token that outlived its context, the tenant boundary that depended on client honesty. The question your systems need to be able to answer is not whether they are secure at the edge. It is whether your trust relationships are still valid after they were first created — and whether you have any mechanism to know if they are not. 
Most production systems do not. They will continue operating correctly — until correctness is no longer the same thing as safety. The author covers cybersecurity architecture, DevSecOps, and identity systems engineering. Pushback, corrections, and firsthand incident accounts are welcome.
Why Long Chats Need Session-Level Guardrails (CRA)

Who this is for: Anyone building chat features, support bots, internal Q&A, coaching tools, RAG assistants.

The Usual Setup (and What It Misses)

A typical flow:

1. User sends a message.
2. You run moderation, rules, or a small model on that message (sometimes the reply too).
3. If it passes, the big model answers.

That is per message. It does not really “remember” the story of the chat.

In a long chat:

- Message 5 looks normal.
- Message 12 still passes your keyword list.
- By message 20, something is wrong only if you compare it to how the chat started.

So you can pass every single check and still end up with a bad session. That gap is what we call CRA: risk that adds up across turns, not in one obvious line.

Figure 1: Each turn can look “green” while the overall thread is not.

CRA in Plain English

CRA = Conversational Risk Accumulation

Idea: Each turn might look okay on its own, but together they break the purpose of the chat or what your company is okay with.

What to build: Keep a little session memory (not the full transcript in logs; think IDs, hashes, and scores). After each assistant reply, update a few numbers that describe “how this session feels right now.” Those numbers are hints for dashboards, alerts, and gentle UI, not a courtroom verdict.

Three Simple Scores + One Total (Example)

We use a small, fixed set of scores and one combined score. Version tag in code: cra_telemetry_v1.

Figure 2: Three inputs, one combined CRA score.

Score | Plain meaning | How you might compute it (conceptually)
S1 | Topic drift | Compare the user’s recent text to how the chat started (or a stated goal). If they wander far from that, S1 goes up.
S2 | Sensitive-looking replies | The assistant’s answer looks like it contains patterns you care about (fake email shapes, “API key” wording, etc.). This means “flag for review,” not “we proved a leak.”
S3 | Refusal tone shifting | Track refusal-style phrases in the assistant’s answers over time. If refusals seem to soften late in the thread, S3 captures that shape.
CRA | Overall session risk | A weighted sum of S1, S2, and S3, plus a small extra bump if the user or assistant text looks like prompt injection playbooks.

Example weights we used: 35% S1, 45% S2, 20% S3.

Rule of thumb: If you cannot explain a score in one short sentence to a product manager, do not use it to auto-block users.

Hard Guardrails = Simple, Fast, “No”

Hard guardrails are rules, not vibes. They should be cheap and run before you waste tokens.

Examples:

- Max request size: reject giant payloads (HTTP 413).
- Rate limits: cap requests per IP so one client cannot drain your budget (429).
- Known-bad phrases: block obvious “ignore all previous instructions” junk (400).
- “Don’t paste secrets”: block prompts that look like “here is my SSN” (400) with a clear error.
- Lock down outputs: if your product only allows certain actions, check model output and tool calls against an allowlist before anything runs.

These are not CRA. They are basics. CRA sits beside them.

Figure 3: Hard = block or validate. Soft = warn, log, nudge.

Soft Guardrails = CRA-Friendly, “Heads Up”

Soft means: warn, log, maybe show a banner, not silent blocking.

After a response, the API can add fields such as:

- cra_soft_notices: short text for humans (“high drift”, “sensitive-looking wording”, …).
- cra_signals: numbers for debugging (S1, S2, S3, CRA, turn count).

Why start soft: Rules and heuristics misfire. A user might ask for fake email examples for a demo; S2 might spike on purpose. That is why the score is a signal, not proof.

Bonus: Cache Duplicate Questions (Save Money)

If someone double-clicks Send or retries the same text, do not call the model twice.

Cache key idea (Python): normalize(question) + mode + endpoint

Cache the JSON answer for a few minutes.
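As a rough illustration of the cache key idea above, the sketch below hashes the normalized question together with the mode and endpoint, and keeps answers in an in-memory dictionary with a time-to-live. The names make_cache_key and AnswerCache, the normalization rules, and the 300-second TTL are assumptions made for this example; a multi-process deployment would more likely use a shared cache.

```python
import hashlib
import time


def normalize(question: str) -> str:
    """Lowercase and collapse whitespace so trivial retries map to one key."""
    return " ".join(question.lower().split())


def make_cache_key(question: str, mode: str, endpoint: str) -> str:
    """Hash normalize(question) + mode + endpoint into a fixed-length key."""
    # A separator character keeps ("ab", "c") distinct from ("a", "bc").
    raw = "\x1f".join([normalize(question), mode, endpoint])
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


class AnswerCache:
    """Tiny in-memory TTL cache (hypothetical helper, single process only)."""

    def __init__(self, ttl_seconds: float = 300.0):
        self.ttl_seconds = ttl_seconds
        self._store = {}  # key -> (expires_at, answer dict)

    def get(self, key, now=None):
        now = time.monotonic() if now is None else now
        entry = self._store.get(key)
        if entry is None or entry[0] <= now:
            self._store.pop(key, None)  # drop expired entries lazily
            return None
        # Return a copy flagged as cached so the UI can say "from cache".
        return {**entry[1], "cached": True}

    def put(self, key, answer, now=None):
        now = time.monotonic() if now is None else now
        self._store[key] = (now + self.ttl_seconds, dict(answer))
```

A request handler would call get first and only invoke the model, followed by put, on a miss. Hashing the key also keeps raw user text out of cache keys and logs.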
Mark responses with something like cached: true so the UI can say “from cache.”

Browser Tip: Don’t Mix Up “New Chat” and Old Intent

If S1 uses “first message of this session” as the anchor, browser storage can fool you: a new tab can look like a new thread while an old “first message” is still stored.

Fixes:

- Store the anchor per session_id, not one global value.
- Expire or rotate the browser session after idle time so deploys and stale tabs do not reuse the wrong anchor.

Telemetry vs. Guardrails (Two Different Jobs)

 | Telemetry | Guardrail
Job | Measure and learn | Block or change behavior
When it hurts you | Too many logs, privacy | False positives, angry users
CRA | Good fit | Use soft first; hard only after review

In logs, avoid raw secrets. Prefer hashes, lengths, and labels (channel, product area).

Three Lines for Your Security Reviewer

- CRA is about conversation behavior over time, not a replacement for database security or tool-permission design.
- Labels for “bad session” are rare in the real world; use CRA to prioritize review, not as automatic guilt.
- If weights are public, people might game them; keep basic hard rules and spot checks anyway.

Rollout Order (Keep It Boring)

1. Ship hard limits (size, rate, obvious injection, output checks).
2. Add session logging with safe IDs.
3. Show soft notices only inside internal tools first.
4. Tune thresholds on real traffic.
5. Only then add hard session actions (pause tools, re-auth, etc.).

Takeaway

One-message checks are not enough for long chats. CRA gives you a simple story and a small set of session scores. Hard rules stop obvious abuse; soft CRA helps you see drift before it becomes an incident. Start with telemetry. Add blocking only when you understand the false positives.

About the author: Sanjay Mishra is the author of two books, The SQL Universe and Oracle Database Performance Tuning: A Checklist Approach. His research spans RAG architectures, NL2SQL, LLM safety, and enterprise AI governance, with work published in IEEE Access, Springer LNNS, and SSRN.
He speaks regularly at universities and industry events on applied AI and data engineering. Tags / topics: #LLM #Security #Guardrails #Observability #OpenAI #Architecture #Chatbots
Between December 22, 2025 and January 15, 2026, an attacker spent 24 consecutive days inside Navia Benefit Solutions' systems. They quietly and methodically pulled Social Security numbers, dates of birth, health plan enrollment details, and COBRA records belonging to 2,697,540 Americans. These include teachers, state workers, and school administrators: people who signed up for employer benefits through HR software and had no idea which third-party company held their data.

Navia did not detect the intrusion until eight days after the attacker had already stopped. The company published a breach notice on March 13, 2026. Individual notification letters went out on March 18, eighty-six days after the intrusion began.

The technical cause was not sophisticated. A BOLA vulnerability in Navia's API allowed an authenticated user to manipulate request identifiers and retrieve records belonging to other participants. Change a number in the API parameter, return a different person's record. The attack required no zero-day exploit. No social engineering. No supply chain compromise. Just an API that checked whether you were logged in and never asked whether the record you were requesting was yours.

That's the breach that cost 2.7 million Americans their healthcare data and personal identifiers in early 2026. And it's not an outlier.

I've spent the last eighteen months studying API breaches in depth: formal postmortems, SEC disclosure filings, state attorney general notification records, security research writeups, and direct conversations with incident responders who cleaned up the aftermath. The sample spans healthcare, fintech, retail, SaaS platforms, government infrastructure, and consumer applications. More than fifty incidents, analyzed at structural depth.

The technologies differ. The industries differ. The victim organizations range from county governments to billion-dollar enterprises. The mistakes are, with remarkable consistency, the same five.
This is not a vulnerability catalog. It is a pattern analysis. And the pattern points to something the industry has been reluctant to say plainly: most API breaches are not caused by sophisticated attackers. They are caused by undisciplined defenders repeating failures the field already knows how to prevent. The Infrastructure That Cannot Afford to Fail Quietly Before the patterns, the scale of the problem requires a precise frame — not as context-setting, but because the numbers explain why discipline failures at this layer are so consequential. API incidents now account for over 30% of all data breaches, up from less than 20% two years ago. API breaches expose an average of more than 2.5 million records per incident, significantly higher than traditional breaches. 38% of organizations discovered API breaches only after external reporting, not internal detection. That last figure is the one that should stop readers cold. More than a third of organizations learn about API breaches from someone other than their own security team. From a reporter. From a researcher submitting a bug bounty report. From a law enforcement notification. From a dark web listing of their customers' data, already sold. The Navia incident was consistent with the 38%: the company discovered the intrusion eight days after the attacker had already stopped accessing systems. By the time Navia detected anything, the data was gone, and the window for limiting exposure had closed. APIs have become the operational substrate of modern software. A mobile banking application's backend is a collection of APIs. A SaaS platform's data sharing is API-mediated. An AI agent answering customer queries calls APIs that call other services that query databases through yet more APIs. The attack surface isn't just large — for most organizations, it's partially unmapped. Endpoints built by contractors and never formally decommissioned. 
APIs generated by AI coding tools without the security review human-written code receives. Internal service APIs that were never intended to face external traffic and ended up there anyway. 56% of enterprises admit they lack full visibility into their API data flows. The thing they can't see is the thing that's being exploited. Pattern One: Authentication and Authorization Are Not the Same Concept — The Industry Keeps Treating Them as If They Are The Navia breach has a precise technical name: Broken Object Level Authorization. It has been the number-one entry on the OWASP API Security Top 10 since 2019. It accounted for a Parler breach that exposed 70 terabytes of user data. It drove the USPS vulnerability that sat unpatched for over a year after a researcher reported it, and was only fixed after journalist Brian Krebs published the story. It accounts for over 40% of API vulnerabilities today. Seven years. Number one. Still responsible for 40% of incidents. The reason BOLA persists is structural, not ignorance. Engineering teams understand the distinction intellectually. The failure is in the architectural gap between understanding it and enforcing it consistently across every endpoint, every integration, and every API built under deadline pressure by developers who know they should implement the ownership check and don't always do it. Authentication verifies: Who is making this request? Authorization verifies: Does this specific identity have permission to access this specific object? These are different questions. Authentication is typically enforced at a framework or middleware layer — configured once, centrally, applied everywhere. Object-level authorization is implemented per-endpoint, by the individual engineer who wrote that endpoint, with whatever understanding of the ownership model they had on the day they wrote the code. 
The structural asymmetry produces an architectural guarantee: authentication will be applied consistently because it's centralized; authorization will be applied inconsistently because it isn't.

The attack is elementary:

WHAT THE API DOES:
  GET /api/v1/benefits/participant/883441 → 200 OK
  { ssn: "XXX-XX-4291", dob: "1979-03-14", plan: "FSA" }
  (your record: you're authenticated, you can see this)

WHAT BOLA ALLOWS:
  GET /api/v1/benefits/participant/883442 → 200 OK
  { ssn: "XXX-XX-7738", dob: "1984-11-02", plan: "COBRA" }
  (someone else's record: you're authenticated, but this isn't yours)
  GET /api/v1/benefits/participant/883443 → 200 OK ← and again
  GET /api/v1/benefits/participant/883444 → 200 OK ← and again
  ... × 2,697,540

WHAT SHOULD HAPPEN:
  GET /api/v1/benefits/participant/883442 → 403 Forbidden
  (request fails ownership check: token owner ≠ record owner)

The fix is a single check, applied at the data access layer before the record is returned: does the authenticated identity own or hold explicit permission for the requested object?

That check is architecturally simple. It takes minutes to write for a given endpoint. Applied to every endpoint, consistently, across a codebase that spans dozens of services and years of development history, it requires organizational discipline that companies apparently find harder to sustain than it sounds.

Authorization checks for individual resources are usually too fine-grained to offload to centralized platforms like API gateways or IAM products. The responsibility sits with API developers to implement the proper checks at the API endpoint.

That sentence explains why BOLA is still happening in 2026. There is no platform that catches it automatically. No gateway configuration that prevents it. No WAF rule that blocks it.
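A minimal sketch of that ownership check, placed inside the data access function rather than in a route handler. ParticipantRepository, ParticipantRecord, the in-memory record store, and the Forbidden exception are hypothetical names chosen for this example; explicit permission grants (delegates, administrators) are left out for brevity, and a real service would translate Forbidden into a 403 response.

```python
from dataclasses import dataclass


class Forbidden(Exception):
    """Raised when the caller may not read the object (maps to HTTP 403)."""


@dataclass(frozen=True)
class ParticipantRecord:
    participant_id: int
    owner_user_id: str  # identity that owns this record
    plan: str


class ParticipantRepository:
    """Hypothetical data access layer: every read passes the ownership check."""

    def __init__(self, records):
        self._records = {r.participant_id: r for r in records}

    def get_for_user(self, participant_id: int,
                     authenticated_user_id: str) -> ParticipantRecord:
        record = self._records.get(participant_id)
        # Design choice in this sketch: "missing" and "not yours" fail the
        # same way, so responses do not reveal which IDs exist.
        if record is None or record.owner_user_id != authenticated_user_id:
            raise Forbidden(
                f"user {authenticated_user_id} may not read {participant_id}"
            )
        return record


repo = ParticipantRepository([
    ParticipantRecord(883441, "alice", "FSA"),
    ParticipantRecord(883442, "bob", "COBRA"),
])
print(repo.get_for_user(883441, "alice").plan)  # alice reads her own record
try:
    repo.get_for_user(883442, "alice")          # alice requests bob's record
except Forbidden as exc:
    print("denied:", exc)
```

Because the repository offers no read method that skips the user argument, any handler or batch job built on it inherits the check instead of having to remember it.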
The check has to be written by engineers who know what correct authorization looks like for this specific system, tested by security engineers who know how to probe for its absence, and validated adversarially in CI/CD rather than assumed to exist because someone believes they wrote it.

BOLA sits at the top of the OWASP API Security Top 10. It's been the most common API vulnerability for years. Every API security guide warns about it. The organizations still producing these breaches aren't unaware of BOLA. They're applying the authorization check inconsistently, without testing it, and without the adversarial test suite that would catch a missing check before an attacker does.

Pattern Two: Trust Relationships Accumulate Silently While Security Visibility Stays Static

The 700Credit breach, disclosed in early 2026 and subject to consolidated federal litigation by February of that year, traced to a compromise through a third-party integration partner. An exposed API enabled the extraction of consumer data, including Social Security numbers and credit information, belonging to approximately 5.6 million individuals. The API existed because a third-party integration required it. The third party was compromised. The access chain from the compromised partner to the sensitive consumer records was shorter than anyone had documented.

Third-party APIs exposed millions of records at 700Credit, while weak airline API authentication fueled mass access at Qantas.

Third-party integrations now represent the initial access vector in more than a quarter of API breaches. The mechanism isn't exotic: every integration creates a trust relationship, and trust relationships accumulate faster than the security reviews that should accompany them.

Consider what happens to an organization's integration landscape over two years of normal product development. A partner API is connected for a feature that shipped and drove modest adoption. The API integration remains active; the feature is no longer actively developed.
A contractor builds an internal service integration for a project that was completed and handed off. The service account credential used by that integration was never revoked. A third-party data enrichment vendor is added to the user onboarding flow with read access to customer records. Six months later, the enrichment vendor updates its API client library, and an engineer upgrades the dependency without reviewing the new permission scope. None of these represents malicious action or negligent individual decisions. They represent the natural accumulation of a complex integration landscape under continuous development, without the organizational process to maintain security visibility at pace with that development. Machine identities — credentials that authenticate services, workloads, and devices — outnumber human identities by more than 45 to 1, according to CyberArk. The proliferation of static keys, long-lived tokens, and embedded credentials has led to uncontrolled secrets sprawl across codebases, repositories, and collaboration tools. Machine identities don't appear in quarterly access reviews. They don't get deprovisioned when a project ends or when the engineer who created them changes roles. They don't trigger MFA prompts. When a machine identity is compromised — whether through a leaked credential or a supply chain attack on the service using it — the blast radius is often substantially larger than any individual's human identity would have been, because the service account may have been provisioned with elevated permissions for a project requirement that no longer exists. 
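One way to pull machine identities into a review process is a periodic staleness report. The sketch below flags credentials with no documented purpose, no recent use, or an overdue review. The MachineIdentity fields and the 90-day limits are assumptions made for illustration; a real inventory would be exported from a secrets manager or IAM system rather than built by hand.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import List, Optional


@dataclass
class MachineIdentity:
    name: str
    business_purpose: Optional[str]    # None: nobody documented why it exists
    last_used: Optional[datetime]      # None: never observed in use
    last_reviewed: Optional[datetime]  # None: never reviewed


def staleness_findings(identity: MachineIdentity,
                       now: datetime,
                       max_idle: timedelta = timedelta(days=90),
                       max_review_age: timedelta = timedelta(days=90)) -> List[str]:
    """Return the reasons this identity should be reviewed or revoked."""
    findings = []
    if not identity.business_purpose:
        findings.append("no documented business purpose")
    if identity.last_used is None or now - identity.last_used > max_idle:
        findings.append("unused beyond idle limit")
    if identity.last_reviewed is None or now - identity.last_reviewed > max_review_age:
        findings.append("review overdue")
    return findings


# Example: a service account left behind after a contractor project ended.
now = datetime(2026, 4, 1, tzinfo=timezone.utc)
leftover = MachineIdentity("contractor-export-svc", None,
                           now - timedelta(days=400), None)
print(staleness_findings(leftover, now))
```

An empty list means the identity passed all three checks; anything else is a candidate for the revocation queue.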
The structural fix requires treating machine identity governance with the same rigor as human identity governance: defined business purpose at provisioning, periodic review against defined staleness criteria, automated detection of credentials operating outside their documented scope, and revocation procedures that can be executed without requiring the engineer who originally created the credential to be in the loop. Most organizations are three to five years behind on this. The incident record reflects it. Pattern Three: Secrets Leak Into Every Surface, and Almost Nobody Rotates Them 28.65 million new hardcoded secrets were added to public GitHub commits in 2025 alone — a 34% increase year over year and the largest single-year jump GitGuardian has recorded. That number deserves a full stop. Secret leak rates in AI-assisted code were, on average across the year, roughly double the GitHub-wide baseline. AI service credential leaks increased 81% year over year, to 1,275,105. Claude Code-assisted commits leaked secrets at approximately 3.2%, twice the baseline. The acceleration has a specific mechanism. AI coding tools have lowered the barrier to building API integrations, which is mostly good. They've simultaneously created a new class of developer — experienced in product and logic, less experienced in security conventions — who builds quickly and may not know that the API key they copied from the project documentation should go into a secrets manager rather than the .env file committed alongside the rest of the project. Across 6,943 systems, GitGuardian identified 294,842 secret occurrences corresponding to 33,185 unique secrets. On average, each live secret appeared in eight different locations on the same machine, spread across .env files, shell history, IDE configs, cached tokens, and build artifacts. 59% of compromised machines were CI/CD runners, not personal laptops. 
The CI/CD figure is where the pattern becomes structurally dangerous rather than merely careless. A secret on a developer's laptop is an individual exposure. A secret on a CI/CD runner is accessible to every process that executes in that environment — including processes introduced through supply chain attacks. The LiteLLM supply chain attack demonstrated this pattern concretely: compromised packages harvested SSH keys, cloud credentials, and API tokens from developer machines where AI development tooling had concentrated credentials. MCP configuration files are a new and largely unmonitored leak surface. In 2025, 24,008 unique secrets were exposed in MCP-related configs on public GitHub — 8.8% confirmed valid at the time of detection. The remediation gap transforms bad leak rates into chronic exposure. Nearly 70% of credentials confirmed as valid in 2022 were still valid in January 2025. When retested in January 2026, the validity rate was still above 64%. Three years of known exposure. More than six in ten credentials still live. The detection is working; the remediation isn't. Organizations that deploy secret scanning without building the organizational process to act on findings — to rotate credentials on a defined timeline, to identify every system using a given credential before revoking it, to treat found secrets as an urgent remediation item rather than an informational alert — are doing the technical equivalent of installing smoke detectors and then watching the building burn. Pattern Four: Monitoring Was Built to Watch the Infrastructure, Not the Behavior In 2025, the global median attacker dwell time after initial compromise was 14 days — up from 11 days in 2024, according to Mandiant's M-Trends 2026 report. The interval between initial compromise and lateral movement fell to 29 minutes — a 65% acceleration from the previous year. In at least one case, data exfiltration began within four minutes of entry. Fourteen days median dwell time. 
Four minutes to exfiltration in the fastest case. The attacker's operational tempo in 2025 was faster than any previous year on record; the detection tempo moved in the wrong direction.

The Navia breach ran for 24 days without triggering any internal detection. That's not exceptional; it is ten days above the median. 34% of incidents had an unknown or undetermined initial vector, indicating significant gaps in logging and detection capabilities. The unknown-vector incidents are, by definition, the ones where the monitoring infrastructure failed to capture the access path entirely.

The reason BOLA exploitation goes undetected for weeks is that it produces none of the signals that infrastructure monitoring was built to catch. The requests are correctly formed. The authentication succeeds. The responses return 200. The rate may be elevated, but elevated API request rates are also the signature of legitimate mobile applications, legitimate batch processing, and legitimate partner integrations under load. The only distinguishing characteristic, that the object IDs being queried belong to other users, requires business logic context that standard monitoring infrastructure doesn't have.

You cannot investigate data you never collected. The more consequential version of that principle is: you cannot detect anomalies against a baseline you never defined.

Application-layer attacks (exploits targeting web applications, APIs, and software supply chains) often fly under the radar because traditional security tools were not designed to see them, especially at runtime.

API behavioral monitoring requires two things that most organizations have not built. First, a behavioral baseline per endpoint: what does legitimate usage look like for this specific API, this specific authentication context, this specific integration? What's the expected distribution of object IDs accessed per session?
What rate of data retrieval is consistent with the documented business purpose of each authenticated identity? Second, anomaly definitions calibrated to those baselines: what specific patterns constitute evidence of enumeration or exfiltration rather than legitimate high-volume operation? Baselines cannot be automatically inferred from traffic data without business logic context. They require human authorship — people who understand what the API is supposed to do, defining what legitimate usage looks like in operational terms. That work is unglamorous. It doesn't ship a feature. It doesn't close a compliance checkbox. It is the difference between detecting a breach in hour four and detecting it after the attacker has been gone for eight days. Pattern Five: Security Is Defined as a Project With an End Date The three major French retailers — Boulanger, Cultura, and Truffaut — experienced a coordinated API attack through their shared e-commerce backend in 2024. The breach stemmed from poorly configured API security rules. One misconfiguration. Three companies compromised. Millions of customer records stolen. Shared infrastructure meant one vulnerability cascaded across all platforms. The shared infrastructure attack surface is an example of what happens when security review occurs at deployment and isn't revisited as the integration architecture evolves. Each retailer's security posture changed when the shared backend was modified, when new partners connected, and when access control configurations were updated for a new feature. The review that approved the original configuration didn't cover those subsequent changes. This is the fundamental failure of treating security as a project: projects have end dates. Security exposure doesn't. A penetration test produces a snapshot of a system as it existed during the two-week engagement window. 
That snapshot is accurate when it's produced and becomes less accurate with each subsequent code deployment, configuration change, and new integration. Organizations that treat the pen test result as ongoing assurance, that consider security "done" until the next compliance cycle, are operating on a security posture that no longer accurately describes their actual attack surface.

Attackers don't operate on project timelines. They use automated scanning tools that find newly deployed endpoints, and start probing them for vulnerabilities, within minutes of deployment. The enterprise security review cycle typically runs quarterly or annually. The gap between "API deployed" and "API found by automated scanner" is measured in minutes. The gap between "API deployed" and "API reviewed by security team" is measured in months.

68% of organizations experienced an API security breach resulting in costs exceeding $1 million. The organizations accumulating that exposure are largely not the ones that skipped security entirely. They're the ones that did security once (at the right moment, with the right tools, producing the right findings) and then moved on.

The API Security Lifecycle: What Continuous Practice Actually Looks Like

The pattern analysis above points to a consistent structural need: security disciplines that operate continuously across the full API lifecycle, not at discrete compliance milestones.
The following framework, the API Security Lifecycle, organizes those disciplines into a model where security is a property the system continuously maintains, not a state the organization periodically verifies:

Stage | What happens here | Breach pattern closed
Design | Define the object ownership model before the first line of code is written. | Pattern 1: BOLA. Prevents broken object-level authorization by design, not just testing.
Design | Document machine identity scope at provisioning. | Pattern 2: Trust boundaries. Defines access limits before integrations go live.
Threat modeling | Map the BOLA surface by reviewing every endpoint that returns objects and assessing ownership enforcement. | Pattern 1: BOLA. Forces teams to identify authorization gaps before shipping.
Threat modeling | Audit trust boundaries by documenting every integration and its scope. | Pattern 2: Trust boundaries. Makes third-party attack surfaces visible before they become blind spots.
Development | Enforce BOLA checks at the data layer, not just the controller. | Pattern 1: BOLA. Makes ownership checks harder to bypass.
Development | Use secrets from a vault starting with the first commit, with enforcement during code review. | Pattern 3: Hardcoded secrets. Keeps credentials out of the repository.
Testing | Run an adversarial BOLA test suite for each endpoint in CI/CD on every push. | Pattern 1: BOLA. Validates every endpoint before it ships.
Testing | Add secret scanning to CI with a defined remediation SLA. | Pattern 3: Leaked secrets. Ensures leaks are rotated, not just detected.
Monitoring | Build behavioral baselines per endpoint with input from people who understand the API. | Pattern 4: Weak detection. Makes Navia-type enumeration detectable in hours, not weeks.
Monitoring | Tie anomaly definitions to ownership context, not just rate thresholds. | Pattern 4: Weak detection. Triggers alerts on enumeration behavior, not only traffic spikes.
Continuous validation | Automate API inventory so every live endpoint is known, documented, and reviewed. | Pattern 5: Unknown endpoints. Finds new endpoints before attackers do.
Continuous validation | Review trust relationships every 90 days with defined revocation criteria. | Pattern 2: Stale trust. Removes unnecessary integrations before they become attack paths.
Continuous validation | Enforce credential rotation automatically with documented rotation SLAs. | Pattern 3: Stale secrets. Reduces the risk of old or exposed credentials remaining valid.

The framework's structure is intentional: every stage maps to a specific failure pattern, and every failure pattern is addressed at the stage where prevention is cheapest. BOLA is cheapest to address at design and development; catastrophically expensive to address after 2.7 million Social Security numbers have been exfiltrated. Secret exposure is cheapest to address at development, with vault-first discipline and code review enforcement; expensive to address after a compromised CI/CD runner has propagated credentials across build infrastructure.

At Design

The object ownership model gets written before the first endpoint is coded. Not as an afterthought, but as a specification that the authorization implementation must satisfy. The authorization model names every object type in the system, defines the ownership structure, and specifies the access control rules governing cross-user access. That specification becomes the adversarial test suite's source of truth.

At Threat Modeling

The BOLA surface gets mapped: every endpoint that returns an object, every parameter that could be manipulated, every authorization assumption that isn't yet validated. This doesn't need to be a multi-week engagement. For a new API, a focused 90-minute session with the engineering team produces a complete BOLA surface map and surfaces the authorization assumptions that need explicit testing.

At Development

The ownership check lives at the data access layer, not at the controller layer, where a bypass path might exist.
A controller-layer check can be bypassed if there's a second code path to the same data. A data layer check cannot. This architectural discipline requires a conversation during design, not during code review.

At Testing

The adversarial BOLA suite runs in CI/CD on every push. Not once a quarter during a security review: on every push. The suite consists of tests written to fail if authorization is absent: authenticated requests for objects the test user doesn't own, verifying that the response is 403 rather than 200. These tests are not generated by scanners. They are written by engineers who know the ownership model, because ownership model knowledge is not accessible to automated scanning tools.

At Monitoring

Behavioral baselines per endpoint are authored, not inferred. For the Navia breach scenario, a baseline that defined expected participant record access as "1-3 records per authenticated session, with alert threshold at 15 distinct participant IDs in a 60-minute window" would have triggered an anomaly detection response within the first hour of the 24-day access window. The attacker would not have had weeks of silent operation; they would have triggered a human investigation while the breach was still recoverable.

At Continuous Validation

Security review becomes a property that the system maintains continuously, not a milestone that occurs at fixed intervals. API inventory automation catches new endpoints before they go through a full quarter unreviewed. Trust relationship reviews on a defined cadence (90 days is a reasonable default) ensure that stale integrations and credentials don't survive long enough to be exploited. Credential rotation with automated enforcement ensures that the 2022 leaked secrets that are still valid in 2026 don't remain valid in 2027.

What the Next Three Years of API Security Look Like

The five patterns described above operate against the current API attack surface.
The emerging surface stresses those patterns further and creates new failure modes that the field is only beginning to grapple with. AI-generated APIs are the newest expansion of the BOLA surface. AI coding tools that scaffold endpoint logic do so quickly and efficiently, and at double the baseline secret leak rate. Whether those endpoints enforce object-level authorization correctly is a function of the prompts used to generate them, the review those prompts received, and the adversarial test coverage applied afterward. Organizations that have embedded security requirements into their AI coding tool configurations — ownership check as a required component of every endpoint scaffold, secrets-in-vault as a non-negotiable default — are addressing this. Organizations that are using AI coding tools as productivity accelerators without corresponding security configuration adjustments are building the BOLA surface of 2027. Agent-to-agent APIs are creating authorization chains that most API security practices weren't designed to evaluate. When an AI agent makes a tool call that calls an API that calls another service, the authorization context propagates through multiple hops. Whether each hop enforces the ownership model correctly, and whether the aggregate chain produces authorized outcomes even when individual hops appear compliant, requires analysis at the orchestration boundary that current API security tooling doesn't perform. This is not a solved problem. The breach categories it will produce are already structurally predictable. Machine identity sprawl will continue to grow faster than machine identity governance. Since 2021, secrets have been growing roughly 1.6 times faster than the active developer population. Every AI agent deployment creates non-human identities with scoped permissions. Those identities accumulate. 
The credential management failure that produced the current breach record will produce a larger breach record when the number of machine identities per organization doubles again. Real-time risk assessment — dynamically adjusting API access based on behavioral context, identity posture, and request risk signals — represents where the field needs to move. Continuous authorization rather than static permission grants. Access decisions that incorporate session history, anomaly signals, and behavioral baseline deviation. This is architecturally ambitious and requires the behavioral monitoring foundation that Pattern Four identifies as currently absent from most deployments. The prerequisite for all of these advanced capabilities is getting the five fundamentals right first. Zero-trust architectures built on top of authorization logic that doesn't enforce ownership checks are security theater. Advanced anomaly detection built on top of monitoring that has no behavioral baselines is expensive noise generation. The advanced work only creates value if the foundational discipline exists. The Pattern Is the Point The Navia breach didn't require a sophisticated attacker. It required an enumerable resource identifier and the absence of an ownership check. The same technique that worked against Parler in 2021, against USPS before that, against Spoutible, against Optus. The technique hasn't changed because the foundational failure it exploits hasn't been corrected at the organizational level. The five 2025 API security incidents are not the result of exotic exploits, but of fundamental security omissions. From forgotten legacy endpoints and broken authorization to excessive data exposure, they prove that the greatest threats lie in what is unmanaged, untested, and untracked. The industry has a framing problem. Every major breach gets treated as a novel incident requiring a novel analysis. 
The technical specifics differ; the structural failures underneath them are the same five patterns, in different combinations, producing different consequences. Treating each incident as sui generis means the field never builds the pattern recognition that would let organizations address the root cause rather than the surface symptom. Security maturity begins when organizations stop analyzing each breach individually and start recognizing the structural failures that keep producing them. The five patterns here are not predictions about where the next breach will come from. They are descriptions of the conditions present in most production API environments right now, conditions that produce predictable consequences when an attacker decides to look. The Navia breach affected 2.7 million people. It was discovered eight days after it ended. The notification went out eighty-six days after it began. The vulnerability that enabled it has been the industry's number-one documented API risk for seven years. The next one is already running. In an organization with excellent infrastructure monitoring, clean logs, and a security team that reviewed the codebase at launch. In a system where nobody wrote the adversarial authorization test that would have caught it. The data will be there in the logs. The pattern will be familiar. The prevention was always available.
References
Navia Benefit Solutions breach disclosure (Maine AG filing, March 2026)
700Credit breach federal litigation records (February 2026)
GitGuardian State of Secrets Sprawl 2025 and 2026
Mandiant M-Trends 2026
OWASP API Security Top 10 (2023 and 2025 editions)
Equixly 2025 API Incident Analysis
APIsecurity.io Top 5 API Vulnerabilities 2025
CyberArk Machine Identity Management Report 2025
SQ Magazine API Security Breach Statistics 2026
Corelight Attacker Dwell Time Analysis (2026)
SecurityWeek Navia breach reporting (March 2026)
When you are triaging an incident at 2 AM that was caused by something your agent did, the only thing that matters at that moment is whether you can understand why the agent did what it did. Eighteen months into the agentic AI wave, the gap between what an agent logs and what a human needs is the bottleneck most teams are facing. It’s easy to answer “what the agent did,” but not “why the agent did it.” An AI agent will not be fully autonomous unless it can explain its reasoning, at the right granularity, to a variety of stakeholders, from an engineering manager to a customer to an audit reviewer. Whether an agent ends up running mission-critical workflows or stays parked on low-stakes tasks boils down to one question: can a human understand what the agent is doing?
Observability and Explainability Are Not the Same Thing
The terms observability and explainability are borrowed from DevOps taxonomy, but in the context of AI agents, they don’t mean the same thing. And that origin matters for how we use them here. Observability is about what happened. It is a mechanical, deterministic record of tool calls, inputs, outputs, and branching paths. This is a structured logging problem, and it's largely solved. The remaining challenge at this stage is making these logs useful at scale. Explainability is about why it happened. This is the agent's reasoning behind its actions, the alternatives it considered, and how confident it was. This is a harder and partly unsolved problem. Consider a real-world example: you are sitting at home one afternoon, and your dog comes home covered in mud. Observability is the tracker on your dog that shows he went to the park, the creek, and the neighbor’s yard. You know where your dog was, but you lack the context as to why. That’s where explainability comes in: your dog would tell you why he jumped into the creek (if he could talk), which, as a pet parent, is the part you care about the most.
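To make the distinction concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not a reference schema: the names `ToolCallRecord`, `DecisionRecord`, `AgentStep`, and `missing_why` are hypothetical, and the fields simply mirror what the text lists (tool calls, inputs, and outputs for observability; reasoning, rejected alternatives, and confidence for explainability).

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Optional


@dataclass
class ToolCallRecord:
    """Observability: what happened (hypothetical schema)."""
    tool: str
    inputs: dict
    output: str


@dataclass
class DecisionRecord:
    """Explainability: why it happened (hypothetical schema)."""
    chosen: str
    reason: str
    # Maps each rejected alternative to the reason it was rejected.
    rejected: dict = field(default_factory=dict)
    # Self-reported by the agent; assumed here, not a calibrated probability.
    confidence: float = 0.0


@dataclass
class AgentStep:
    call: ToolCallRecord
    decision: Optional[DecisionRecord] = None

    def to_json(self) -> str:
        # One structured log line per step.
        return json.dumps(asdict(self), sort_keys=True)


def missing_why(steps):
    """Return the tools of steps that logged what happened but not why."""
    return [s.call.tool for s in steps if s.decision is None]


steps = [
    AgentStep(
        ToolCallRecord("edit_file", {"path": "auth.py"}, "ok"),
        DecisionRecord(
            chosen="edit auth.py",
            reason="the task asked for an authentication change",
            rejected={"edit schema.sql": "outside the task scope"},
            confidence=0.8,
        ),
    ),
    # The mud on the dog: we know where it went, but not why.
    AgentStep(ToolCallRecord("run_migration", {"target": "schema.sql"}, "ok")),
]

for step in steps:
    print(step.to_json())
print("steps with no recorded reasoning:", missing_why(steps))
```

In this sketch a step without a `DecisionRecord` is still fully observable, which is exactly the gap the section describes: the log is complete, and the "why" is absent.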
When Do You Need Explainability?
Consider a scenario where you are triaging a tier-1 severity incident. As you navigate the codebase and recent pull requests for the root cause, you discover that the error lies in the agent modifying, for example, both the authentication logic and the database schema when it was only tasked with updating the authentication logic. When you look at that code, you have no idea why the agent took that specific approach. Extrapolate that to all the developers in the company, and you will see a macro pattern emerge where developers become less willing to rely on AI agents for critical workflows. Or worse, they add manual steps throughout the workflow, eroding the productivity gains AI promises and slowing adoption over time. Product managers and analysts may erroneously chalk that up to a novelty effect, but it’s really a “trust tax” that your agent incurred. It failed to build trust with its users and has now been relegated to non-critical sidekick tasks, such as clustering the tickets in your issue tracking system. Three conditions push an agent into explanation-required territory:
Acting on behalf: When an agent has write access to production systems, is communicating with people on the user's behalf, or is making decisions a human will be held responsible for.
Cost of being wrong: When errors are expensive or irreversible. For example, agents writing public-facing social media posts, signing contracts, moving money, or issuing refunds.
Sensitive contexts: When agents are operating in regulated environments, working with PII or financial data, or generating output that feeds other agents downstream that operate in a regulated environment.
If there is no explainability in such situations, errors can compound through automation chains, with each downstream step building on an upstream mistake nobody can trace.
The Explainability Stack: 8 Layers of "Why"
Explainability is not one feature; it's a layered architecture, and each layer is the right answer for a different user in a different context.
Layer 0 – Outcome: Did it work? Yes/no. What most users want most of the time.
Layer 1 – Narrative: A plain-language summary. "Created the PR, flagged three issues, posted inline comments on lines 42, 87, 203." Expedition report: the agent went out, came back, and here is what it found.
Layer 2 – Decision trace: Why did it choose what it chose? What did it consider and reject? Reasoning made visible, not just actions.
Layer 3 – Tool and branch log: What tools were called with what parameters, what was returned, what paths were explored, what dead ends were hit. This is where engineers live when something breaks.
Layer 4 – Model reasoning: Chain-of-thought at inference time. Critical for evals, fine-tuning pipelines, and production debugging. Caveat: CoT may be confabulation, not true introspection.
Layers 5–7 – The deep stack: Attention patterns, neuron activations, sub-symbolic feature detection. Territory of mechanistic interpretability research, and not a product surface (yet).
The closer someone sits to the implementation, the deeper they want to go. A solutions engineer reviewing a Monday digest lives at Layer 1. A developer debugging an unexpected tool call lives at Layer 3. A researcher studying emergent model behavior lives at Layer 5. Explainability is not one-size-fits-all; it is defined by where your user actually sits.
Layered Disclosure Beats "Show Logs"
Most teams collapse this entire stack into a single "show logs" toggle. That over-shows to non-technical users, under-shows to engineers, and ends up losing the trust of both. The fix is layered disclosure tied to specific surfaces:
Layer 0 in the headline UI. Green check on the PR. "3 tickets resolved" badge.
Layer 1 in asynchronous recaps. Monday digest in Slack. Weekly email summary.
Layer 2 behind a one-click "why?" on any decision the user might disagree with.
Layers 3 and 4 gated behind a developer console or audit export.
The payoff shows up clearly in support across any enterprise deploying agents.
When a customer complains and the solutions engineer sees that the agent did the wrong thing, they can walk down the explainability stack with the customer, starting with the outcome and going deeper only as needed. Explainability, in other words, isn’t just an internal tool. It’s how customers build trust with you, and that has a dollar value attached to it.
The Goldilocks Constraint
There's a calibration problem at the center of all this:
Too little explainability: Users can't verify the agent's reasoning, so they won't hand it anything that matters.
Too much explainability: Users hit decision fatigue. They stop reading and start rubber-stamping. Engagement becomes performative.
The first failure mode is described above. The second is more insidious: it produces the appearance of oversight without the substance. In a regulated environment, that gap can become a compliance liability faster than it looks. This is Goodhart's Law showing up in a new domain. When "volume of explanation" becomes the proxy for "quality of oversight," products optimize the proxy and lose the thing it was meant to measure. More logs, more traces, more reasoning text, all consumed by a reader who has stopped engaging. The reference point I keep returning to: what does a skilled human collaborator tell you after working on something independently? They don't narrate every search query or share their browser history. They say: "I looked at X and Y. X was a dead end for this reason. Y is the path forward, here is why, and here is what I am not certain about." That is the goal.
Trust Is the Whole Game
Foundation models are heading toward commoditization. The weights are commoditizing. The homework is not. A few years from now, the products with better explainability will be the ones running mission-critical workflows, and the ones without it will still be sidekicks. Trust is the foundation of any bond, for humans and for products.
It is also the part of the stack you cannot ship in a model upgrade. At CodeRabbit, we are building explainability across all of our products. Our vision is to show developers what happened and why it happened without burying them in output. More on what that looks like soon.
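As a closing illustration, the layered disclosure idea described earlier can be sketched in a few lines of Python. This is a hedged sketch and not a description of any product's implementation: the surface names (`headline_ui`, `async_recap`, `why_button`, `developer_console`) and the `disclose` function are hypothetical, and the fallback to Layer 0 for an unknown surface is an assumption made here, not something the article specifies.

```python
from enum import IntEnum


class Layer(IntEnum):
    """Layers 0 to 4 of the explainability stack (Layers 5-7 omitted)."""
    OUTCOME = 0
    NARRATIVE = 1
    DECISION_TRACE = 2
    TOOL_LOG = 3
    MODEL_REASONING = 4


# Hypothetical surface names, mapped to the layers the article assigns them.
SURFACE_LAYERS = {
    "headline_ui": {Layer.OUTCOME},
    "async_recap": {Layer.NARRATIVE},
    "why_button": {Layer.DECISION_TRACE},
    "developer_console": {Layer.TOOL_LOG, Layer.MODEL_REASONING},
}


def disclose(explanation: dict, surface: str) -> dict:
    """Return only the layers of an explanation that a surface should show.

    Assumption: an unrecognized surface falls back to the outcome only,
    so nothing deeper leaks to a surface nobody has reviewed.
    """
    allowed = SURFACE_LAYERS.get(surface, {Layer.OUTCOME})
    return {layer: text for layer, text in explanation.items() if layer in allowed}


explanation = {
    Layer.OUTCOME: "3 tickets resolved",
    Layer.NARRATIVE: "Created the PR, flagged three issues, posted inline comments.",
    Layer.DECISION_TRACE: "Considered X and Y; X was a dead end, Y is the path forward.",
    Layer.TOOL_LOG: "tool calls, parameters, and return values",
    Layer.MODEL_REASONING: "chain-of-thought captured at inference time",
}

for surface in ("headline_ui", "why_button", "developer_console"):
    shown = disclose(explanation, surface)
    print(surface, "->", [layer.name for layer in shown])
```

The design choice worth noting is that the mapping lives in data rather than in a single "show logs" flag, so each surface can be tuned for its audience without touching how explanations are recorded.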
Apostolos Giannakidis
Product Security,
Microsoft
Kellyn Gorman
Advocate and Engineer,
Redgate
Josephine Eskaline Joyce
Chief Architect,
IBM
Siri Varma Vegiraju
Senior Software Engineer,
Microsoft