You Don't Get to Retrofit Trust: Why API Security Must Be Designed In, Not Bolted On
A field-level examination of how one startup got it right — and what the rest of the industry keeps getting catastrophically wrong.
There is a specific kind of silence that falls in a war room after a breach.
I've been in two of them. Not as the person responsible, but as the journalist who got the call. The first was at a mid-sized fintech in 2019. The second, more recently, was at a SaaS company that had been operational for less than eighteen months. In both cases, the root cause wasn't sophisticated. No nation-state actor. No zero-day that nobody had ever seen. In both cases, someone had built an API without thinking seriously about who — or what — would be on the other end of it. And the results were exactly what you'd expect when you hand a loaded system to the world with the safety off.
I think about those rooms a lot when I read the breach reports. Which is often.
The Scale of a Problem We Keep Pretending Is Solvable Later
Let's start with numbers, because the numbers are damning.
In 2025 alone, APIs accounted for 11,053 of the 67,058 published security bulletins, roughly 17% of all reported software vulnerabilities, making them one of the largest single attack surfaces in modern software. That figure has been climbing year over year, and the trajectory shows no signs of flattening. More than two in five of the newly added CISA Known Exploited Vulnerabilities in 2025 (106 of 245, or 43%) were API-related. No other single surface comes close.
Despite this, only 21% of organizations report a high ability to detect attacks at the API layer. And a mere 13% can prevent more than half of incoming API attacks.
Read that again. Thirteen percent. In an era where APIs are the connective tissue of virtually every digital product and service — banking, healthcare, logistics, authentication, payments — the overwhelming majority of organizations cannot stop more than half of the attacks aimed at their most exposed surfaces. That's not a gap. That's a structural failure. And the reason it persists is not technical. The technology to build secure APIs exists. It has existed for years. The reason it persists is cultural: the industry keeps treating security as a phase of development rather than a dimension of it.
A Brief and Uncomfortable History of Recent Mistakes
To understand why security by design matters, you have to understand what security by neglect actually looks like at scale. The past eighteen months have been instructive.
In February 2024, a leaky API at Spoutible exposed user data, including bcrypt-hashed passwords. In March, nearly 13 million API secrets were exposed through public GitHub repositories, leaving companies vulnerable as attackers exploited the credentials to gain unauthorized access. In April, critical vulnerabilities in PandaBuy's API led to the theft of data affecting 1.3 million users. In May, attackers accessed Dropbox's production environment via compromised API keys, exposing customer data and multi-factor authentication information.
A separate incident that same year involved a buggy API that granted unauthorized access to 650,000 sensitive messages, leaked Office 365 credentials, and allowed a penetration tester to retrieve a trove of confidential communications. A Trello API exposure compromised over 15 million users by linking private email addresses with public Trello account data.
These are not edge cases. They are the norm: the repeated, utterly predictable outcome of building fast and securing later.
But the incident I keep returning to, the one that should have been a defining moment of reckoning for how technical teams think about credential management, happened in July 2025. Marko Elez, a 25-year-old DOGE employee with access to sensitive databases at the Social Security Administration, the Treasury and Justice departments, and the Department of Homeland Security, committed a script called "agent.py" to GitHub that included a private API key for xAI. That single exposed key unlocked access to at least 52 large language models, including one called "grok-4-0709" created just four days before the leak.
Here is the part that matters most: after security researcher Philippe Caturegli of Seralys alerted Elez to the exposure, the GitHub repository was removed — but the API key itself was not revoked, and access to the models remained active. The repo was gone. The damage was still live.
Tom Pohl, Director of Penetration Testing at LMG Security, put it bluntly: "If you can't rotate a key without rebuilding or redeploying code, you don't own the key — it owns you."
That sentence deserves to be printed and framed in every engineering office that has ever shipped a credential inside a config file.
Caturegli was even more pointed: "One leak is a mistake. But when the same type of sensitive key gets exposed again and again, it's not just bad luck — it's a sign of deeper negligence and a broken security culture."
And this, right here, is the core problem. It was not the first time a DOGE staffer had leaked an xAI key. It was the second, the first having been discovered in May of the same year, with keys granting access to custom LLMs built on Tesla and SpaceX internal data. Same organization. Same class of mistake. Different month.
A broken security culture doesn't produce one incident. It produces a pattern.
What "Security by Design" Actually Means — and What It Doesn't
Security by design is a phrase that has been so thoroughly absorbed into vendor marketing that it has nearly lost all meaning. Every platform claims it. Every white paper invokes it. Most of them are describing something considerably less rigorous than the words suggest.
What it actually means is this: security properties are not features you add to a system. They are constraints under which you build one. The difference is not semantic. It is architectural, and it shows up in every technical decision the team makes from the first commit forward.
There is a startup — cloud-native, public-cloud Kubernetes deployment, handling user profile data and financial transactions — whose build process I've been examining closely. They had six months, a small team, regulatory obligations around data protection and access logging, and a performance mandate that ruled out heavyweight solutions. Exactly the kind of constraints that, in most shops, produce the decision to defer security work until post-launch.
They didn't defer it. What they did instead is worth studying in detail.
Authentication: The 15-Minute Decision That Changes Everything
The team chose short-lived JWT access tokens with a 15-minute expiration window.
This sounds minor. It isn't.
A JWT consists of three parts: a header, a payload, and a signature. The signature exists to guarantee that the data transmitted in the token hasn't been tampered with. If signature verification is missing or improperly implemented, an attacker can forge the token entirely — changing the user identifier in the payload to point to a different account and gaining unauthorized access to that user's data. This is not a theoretical attack. It has been the root cause of real production breaches in the past two years.
JWT misuse is consistent: APIs accept unsigned tokens (the so-called "alg=none" vulnerability) or fail to rotate signing keys on any predictable schedule. The first lets an attacker forge tokens outright; the second extends the window during which a compromised signing key remains useful. A 15-minute expiration collapses the equivalent window for a stolen token. It doesn't eliminate the risk of token theft, but it radically limits what theft can accomplish.
The operational cost was real. Building a secure refresh flow and revocation mechanism added engineering complexity the team's timeline didn't easily accommodate. They built it anyway. The logic was simple: a token that expires in 15 minutes is a recoverable problem. A token valid for eight hours — or one with no expiration claim at all — is an open door with a handshake.
What they also did, which is less commonly discussed, was enforce rate limiting on authentication endpoints specifically. Authentication endpoints with no rate limiting are exactly what credential stuffing campaigns are designed to exploit. Removing that surface isn't complex. It is, however, a decision that has to be made early, because adding it to a live production system that wasn't designed with it creates friction — and friction, in engineering teams under delivery pressure, tends to lose.
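As a rough illustration of what limiting an authentication endpoint can look like, the sketch below keeps a sliding window of recent login attempts per client key. The class name, the thresholds, and the choice of key (an IP address or a username) are assumptions, not details from the startup's system, and a multi-instance deployment would need shared storage rather than process memory.

```python
import time
from collections import defaultdict, deque
from typing import Callable, Deque, Dict


class LoginAttemptLimiter:
    """Sliding-window cap on login attempts per client key."""

    def __init__(self, max_attempts: int = 5, window_seconds: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.max_attempts = max_attempts
        self.window_seconds = window_seconds
        self._clock = clock
        self._attempts: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, client_key: str) -> bool:
        """Record an attempt and return True, or return False if over the cap."""
        now = self._clock()
        attempts = self._attempts[client_key]
        # Drop timestamps that have aged out of the window.
        while attempts and now - attempts[0] >= self.window_seconds:
            attempts.popleft()
        if len(attempts) >= self.max_attempts:
            return False  # refused attempts are not recorded
        attempts.append(now)
        return True
```

A login handler would call allow() before checking the password and respond with HTTP 429 when it returns False, so a credential stuffing run is throttled regardless of whether any individual guess is correct.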
Authorization: The Boring Problem That Breaks Everything
If authentication is who you are, authorization is what you're allowed to do. Most security discourse focuses on authentication — it's the dramatic failure mode, the stolen password, the compromised token. Authorization failures are quieter and, in practice, significantly more common.
The startup implemented role-based access control from day one, with authorization checks enforced at every API endpoint: not just at the UI layer, not just at the gateway, but in the endpoint itself. Access should be granted only to permitted resources, based on user roles and the sensitivity of the resource being requested. This sounds like an obvious design principle. It is frequently violated.
Consider what happened in one case where it wasn't: an unauthenticated backend API endpoint generated an OAuth 2.0 app-only access token for Microsoft Graph via the client credentials flow. The token carried high-privilege application permissions (User.Read.All, enabling complete directory enumeration). Since no authentication or caller restrictions were enforced, anyone on the internet could obtain a valid Graph token and directly query Microsoft Graph endpoints, exposing the information of over 50,000 Azure AD users at a single organization.
The misconfigured API in that case wasn't a legacy system running on forgotten infrastructure. It was a modern integration with a modern identity provider, built without authorization checks because nobody on the team had stopped to ask: what happens if someone calls this endpoint who shouldn't be calling it?
The startup asked that question at the beginning. They started with broader roles, refined them incrementally as the product matured, and made least-privilege a principle rather than an optimization. It added policy complexity. It also meant no single compromised credential could traverse the system laterally.
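One way to picture endpoint-level enforcement is a decorator that every handler must carry, so the permission check lives next to the code it protects rather than in a gateway. This is a sketch under assumptions: the role names, permission strings, and handler are hypothetical, and a real service would derive the caller's role from a verified token rather than a function argument.

```python
from functools import wraps
from typing import Any, Callable, Dict, FrozenSet

# Hypothetical role-to-permission map; least privilege means short lists.
ROLE_PERMISSIONS: Dict[str, FrozenSet[str]] = {
    "viewer": frozenset({"profile:read"}),
    "support": frozenset({"profile:read", "transaction:read"}),
    "admin": frozenset({"profile:read", "profile:write", "transaction:read"}),
}


class Forbidden(Exception):
    """Raised when the caller's role lacks the required permission."""


def require_permission(permission: str) -> Callable:
    """Wrap a handler so it runs only if the caller's role grants `permission`."""
    def decorator(handler: Callable[..., Any]) -> Callable[..., Any]:
        @wraps(handler)
        def wrapper(caller_role: str, *args: Any, **kwargs: Any) -> Any:
            # Unknown roles get an empty permission set: deny by default.
            granted = ROLE_PERMISSIONS.get(caller_role, frozenset())
            if permission not in granted:
                raise Forbidden(f"{caller_role!r} lacks {permission!r}")
            return handler(caller_role, *args, **kwargs)
        return wrapper
    return decorator


@require_permission("transaction:read")
def get_transactions(caller_role: str, account_id: str) -> dict:
    # Placeholder body; a real handler would query a data store.
    return {"account_id": account_id, "transactions": []}
```

Calling get_transactions("viewer", "acct-1") raises Forbidden before the handler body runs, which is the property that matters: the check cannot be bypassed by reaching the endpoint through a different route.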
Input Validation: Why Allow-Lists Win
The team chose strict allow-lists for request validation — every field, every endpoint, every time.
The distinction between allow-listing and block-listing matters more than most developers appreciate. Block-listing is intuitive: you identify known bad inputs and reject them. The problem is that the set of known bad inputs is never complete. Attackers have been innovating on injection techniques for decades. Any block-list you write today will have gaps tomorrow.
Allow-listing inverts the logic. You define exactly what is acceptable — specific data types, character sets, length constraints — and reject everything that falls outside those boundaries. It is more rigid to implement and requires more upfront design work. It is also substantially more effective, because it doesn't depend on the defender knowing what the attacker will try.
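A small example shows how the allow-list approach reads in code. The request shape below (username, amount_cents, currency) and its limits are invented for illustration; the point is that the validator enumerates what is acceptable and rejects everything else, including fields it has never heard of.

```python
import re
from typing import Any, Dict

# Everything below is an explicit statement of what is allowed.
ALLOWED_FIELDS = frozenset({"username", "amount_cents", "currency"})
USERNAME_PATTERN = re.compile(r"[a-z0-9_]{3,32}")
ALLOWED_CURRENCIES = frozenset({"USD", "EUR", "GBP"})
MAX_AMOUNT_CENTS = 1_000_000


def validate_transfer_request(body: Dict[str, Any]) -> Dict[str, Any]:
    """Return a cleaned copy of `body`, or raise ValueError on any deviation."""
    keys = set(body)
    if keys != ALLOWED_FIELDS:
        raise ValueError("unexpected or missing fields")

    username = body["username"]
    if not isinstance(username, str) or USERNAME_PATTERN.fullmatch(username) is None:
        raise ValueError("invalid username")

    amount = body["amount_cents"]
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(amount, bool) or not isinstance(amount, int):
        raise ValueError("amount_cents must be an integer")
    if not 1 <= amount <= MAX_AMOUNT_CENTS:
        raise ValueError("amount_cents out of range")

    currency = body["currency"]
    if not isinstance(currency, str) or currency not in ALLOWED_CURRENCIES:
        raise ValueError("unsupported currency")

    return {"username": username, "amount_cents": amount, "currency": currency}
```

Nothing in this function knows what an injection payload looks like. A string such as "bob'; DROP TABLE users" fails simply because it does not match the allowed character set and length.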
In 2025, injection attacks dropped from first to second place in API attack volume — but remained in the top two every single quarter. They are particularly relevant as AI-driven APIs pass untrusted input directly into models and downstream pipelines. The migration of business logic into AI-backed APIs hasn't reduced the injection surface. It has expanded it, because an LLM that processes untrusted text is an injection target with additional downstream consequences.
Rate limiting ran alongside validation. The team set conservative per-user thresholds — tight enough to curb abuse, loose enough not to block legitimate traffic. They accepted minor throughput overhead in exchange for suppressing malicious burst patterns. Insecure resource consumption — driven by automated scraping, enumeration, and denial-of-service patterns — rose from seventh place in 2024 to fourth in 2025 and held that position through the year. Rate limiting is not a performance feature. It is a defense against a threat class that has been growing consistently for two years.
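Per-user thresholds of the kind described here are often implemented as a token bucket, which permits short bursts while capping the sustained rate. The sketch below is one such implementation with assumed names and numbers; the article does not say which algorithm the team used.

```python
import time
from typing import Callable, Dict


class TokenBucket:
    """Bucket that refills continuously up to a fixed capacity."""

    def __init__(self, capacity: float, refill_per_second: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self._clock = clock
        self._tokens = float(capacity)  # start full so normal users are not delayed
        self._last_refill = clock()

    def try_consume(self, cost: float = 1.0) -> bool:
        now = self._clock()
        elapsed = now - self._last_refill
        self._tokens = min(self.capacity,
                           self._tokens + elapsed * self.refill_per_second)
        self._last_refill = now
        if self._tokens >= cost:
            self._tokens -= cost
            return True
        return False


class PerUserRateLimiter:
    """Keeps one independent bucket per user identifier."""

    def __init__(self, capacity: float, refill_per_second: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._capacity = capacity
        self._refill = refill_per_second
        self._clock = clock
        self._buckets: Dict[str, TokenBucket] = {}

    def allow(self, user_id: str) -> bool:
        bucket = self._buckets.get(user_id)
        if bucket is None:
            bucket = TokenBucket(self._capacity, self._refill, self._clock)
            self._buckets[user_id] = bucket
        return bucket.try_consume()
```

The capacity sets how large a burst a legitimate client may send, and the refill rate sets the long-run ceiling. Tuning those two numbers is the "tight enough, loose enough" tradeoff in practice.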
Secrets Management: The Problem That Keeps Appearing in Headlines
The startup used a managed secrets vault with automatic rotation. No credentials existed in the codebase. No API keys in config files. No database passwords in environment variables committed to version control.
This sounds basic. It is, in fact, the single most commonly violated principle in production API security.
GitGuardian found more than 10 million secrets exposed in public repositories in a single year. The DOGE/xAI incidents weren't anomalies. They were illustrations of the norm — the everyday practice of developers treating credentials as configuration rather than secrets, embedding them in code because it's convenient, and discovering the cost of that convenience only after something goes wrong.
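For a sense of how little it takes to catch the most obvious cases before they reach a repository, here is a toy scanner of the kind a pre-commit hook might run. The three patterns are illustrative assumptions and far from complete; dedicated secret scanners use much larger rule sets plus entropy checks.

```python
import re
from typing import Dict, List, Pattern, Tuple

# Illustrative patterns only; real scanners ship hundreds of rules.
SECRET_PATTERNS: Dict[str, Pattern[str]] = {
    # AWS-style access key ID: "AKIA" followed by 16 uppercase letters or digits.
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    # PEM header for any kind of private key.
    "private_key_block": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
    # A credential-looking name assigned a quoted literal of 12+ characters.
    "hardcoded_credential": re.compile(
        r"(?i)(api[_-]?key|secret|token|password)\w*\s*[:=]\s*"
        r"['\"][^'\"\s]{12,}['\"]"
    ),
}


def scan_text(text: str) -> List[Tuple[int, str]]:
    """Return (line_number, pattern_name) for every suspicious line."""
    findings: List[Tuple[int, str]] = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for name, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                findings.append((line_number, name))
    return findings
```

A hook would run scan_text over each staged file and block the commit if the list is non-empty. Reading a key from the environment, as in os.environ["XAI_API_KEY"], does not trigger the third pattern, because no literal value is assigned.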
LMG Security's Tom Pohl noted at DEF CON that he's found Apple- and Google-blessed TLS certificates with their private keys embedded in Fortinet firewall firmware — not expired, valid production certificates — by simply unzipping firmware and searching for keywords. Hardcoded admin credentials in network appliances, AES keys in compiled Java JARs, authentication tokens in printer firmware. These aren't advanced techniques to find. They are basic.
The startup's architecture made this entire class of exposure impossible by design. The vault handled issuance and rotation. No developer ever touched a raw credential. Initial setup took time. Ongoing rotation policies added maintenance overhead. The tradeoff was explicit: accept operational complexity now, or accept the risk of a credential aging quietly in a repository until someone finds it, which, based on the data, will happen.
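The property that matters here is that a key can change without a rebuild or redeploy. The sketch below shows one shape that can take: application code asks a small wrapper for the current value, and the wrapper re-fetches from the vault once its cached copy is older than a configured age. SecretStore and InMemoryStore are hypothetical stand-ins, not the API of any real vault product.

```python
import time
from typing import Callable, Dict, Optional, Protocol


class SecretStore(Protocol):
    """Hypothetical minimal interface to a managed secrets vault."""

    def fetch(self, name: str) -> str: ...


class InMemoryStore:
    """Test double standing in for a vault client; not a real vault API."""

    def __init__(self) -> None:
        self._values: Dict[str, str] = {}

    def put(self, name: str, value: str) -> None:
        self._values[name] = value

    def fetch(self, name: str) -> str:
        return self._values[name]


class CachedSecret:
    """Hands out the current secret, re-fetching once the cache is stale."""

    def __init__(self, store: SecretStore, name: str,
                 max_age_seconds: float = 300.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._store = store
        self._name = name
        self._max_age_seconds = max_age_seconds
        self._clock = clock
        self._value: Optional[str] = None
        self._fetched_at = 0.0

    def get(self) -> str:
        now = self._clock()
        stale = now - self._fetched_at >= self._max_age_seconds
        if self._value is None or stale:
            self._value = self._store.fetch(self._name)
            self._fetched_at = now
        return self._value
```

When the vault rotates the value, every running instance picks up the new one within max_age_seconds, and revoking the old value does not require touching the codebase. The cache age is the tradeoff between vault load and how long a rotated-out value can linger.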
DevSecOps: The Pipeline That Complains Until It Matters
The team wired static code analysis, dependency scanning, and container-image checks into the CI/CD pipeline on every commit.
The first two weeks, by the lead developer's own account, were genuinely annoying. Builds slowed. False positives fired. Developers had opinions about this.
Then the pipeline caught a vulnerable dependency in a third-party authentication library before it reached production. A real vulnerability, in a library the team was actively using, was caught before it became a runtime problem. The complaints stopped.
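The dependency check that caught that library can be pictured as a gate that compares pinned versions against a list of known-bad ones. The sketch below is deliberately simplified: the package names and advisory data are made up, only exact "==" pins are handled, and real scanners resolve version ranges against maintained vulnerability databases.

```python
from typing import Dict, List, Set, Tuple


def parse_pins(requirements_text: str) -> Dict[str, str]:
    """Extract exact `name==version` pins from requirements-style text."""
    pins: Dict[str, str] = {}
    for raw_line in requirements_text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments
        if "==" not in line:
            continue  # unpinned or blank lines are ignored in this sketch
        name, version = line.split("==", 1)
        pins[name.strip().lower()] = version.strip()
    return pins


def find_vulnerable(pins: Dict[str, str],
                    advisories: Dict[str, Set[str]]) -> List[Tuple[str, str]]:
    """Return sorted (package, version) pairs that appear in `advisories`."""
    return sorted(
        (name, version)
        for name, version in pins.items()
        if version in advisories.get(name, set())
    )


def gate(requirements_text: str, advisories: Dict[str, Set[str]]) -> int:
    """Return a process exit code: 0 to pass the build, 1 to fail it."""
    hits = find_vulnerable(parse_pins(requirements_text), advisories)
    for name, version in hits:
        print(f"vulnerable dependency: {name}=={version}")
    return 1 if hits else 0
```

Wired into CI, a non-zero return from gate() stops the build, which is how a flagged library stays out of production without anyone having to remember to look.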
GitLab's 2024 Global DevSecOps Survey found that while 56% of developers release code multiple times daily, only 29% have fully integrated security into their workflows. That gap is where the exposure lives. The velocity of modern development — multiple deployments per day, hundreds of dependencies, automated container builds — creates a surface area that no human review process can cover consistently. Automated scanning doesn't slow development down in any meaningful sense. What it does is enforce a consistent standard at a pace that matches the delivery cadence.
The container-image scanning deserves specific attention. Kubernetes deployments in public cloud environments create a supply chain: every image that runs in a pod is either verified or trusted on faith. The same holds when an organization integrates a third-party service via an API: it inherits the security posture of that vendor, and vetting that posture is not a one-time event. It requires continuous assurance as the vendor's environment changes. For images, scanning on every commit is how the team catches the moment when that inherited posture degrades.
The Architecture That Doesn't Make Headlines
There is something worth acknowledging about this startup's outcome: it is, on its face, unremarkable.
The API launched on schedule. No major incidents in production. No breach notification letters. No postmortem was published to a shocked engineering community. The compliance audit found nothing to flag. The system performs within the latency targets the product team required.
This is what success looks like in security. Not a dramatic rescue. Not a last-minute patch before a zero-day hit production. Nothing happening — because the conditions for something happening were designed out from the beginning.
Only 13% of organizations can prevent more than half of API attacks. The startup is in that 13%, not because they had a larger security budget or a more experienced team. They had six months and a limited headcount. They are in that 13% because they decided, at the beginning, that security was a design constraint rather than a delivery risk.
That decision compounded. Short-lived tokens meant that when credentials inevitably cycle through exposure risk — every public API has this surface — the blast radius was bounded by time. RBAC enforced from day one meant no credential, however obtained, could traverse the full system. Allow-list validation meant the injection surface never existed in the first place. Vault-managed secrets meant the DOGE scenario — the credential in the commit, the key that keeps working after the repo comes down — was structurally impossible.
These controls did not simply add up. They composed: each one reduced the value of defeating the others.
The Debate That Needs to Happen
Here is where I want to be direct, because there is a conversation the industry is not quite having honestly.
Security by design is often framed as a best practice — something well-resourced teams do when they have the luxury of time and the maturity to prioritize it. The implicit message is that it's an ideal, not an expectation. That startups with six-month timelines and small teams should be forgiven for the security debt they accumulate, because they were moving fast, and the alternative was not shipping.
I think this framing is doing serious damage. And I think the damage is not abstract.
When the Trello API exposed 15 million users' private email data, those were real people. When the Spoutible breach surfaced bcrypt-hashed passwords, those were real credentials that real attackers ran real cracking attempts against. When a ChatGPT plugin vulnerability sat unpatched for nearly a year while proof-of-concept exploit code was publicly available, and then received over 10,000 exploitation attempts from a single IP address within a single week in March 2025 — those were real API consumers, real integrations, real downstream systems exposed.
The cost of retrofitting security is not paid by the engineering team that deferred it. It is paid by the users who trusted the product.
IBM's 2024 Cost of a Data Breach report established the global average breach cost at $4.88 million. That number includes incident response, regulatory exposure, reputational damage, and customer churn. It does not include the class action exposure that follows significant PII breaches, the partner contract reviews that get triggered by security incidents, or the months of engineering work that go into rebuilding user trust after a disclosure.
The startup in this case study spent engineering hours upfront on refresh token flows, RBAC policies, and vault configuration. I would estimate — generously — a few weeks of additional development time across the team. That is the cost of security by design for a product of this scale. The cost of the alternative is measured in a different currency entirely.
What the Next Eighteen Months Will Make Worse
There is a dimension to this problem that the industry is only beginning to grapple with seriously.
Of the 2,185 AI vulnerabilities identified in 2025, 36% also qualified as API vulnerabilities. Among AI-related Known Exploited Vulnerabilities, the overlap was identical — 21 of 58 exploited AI vulnerabilities involved APIs directly. As AI matures, its risks don't shift elsewhere. They still come through APIs.
The integration of LLMs into production systems has expanded the API attack surface in a specific and poorly understood way. When a user input reaches an LLM endpoint, it is no longer just a request for data. It is an instruction to a system that generates outputs, triggers downstream actions, and in agentic configurations, executes code. Injection attacks against these endpoints don't just exfiltrate data — they can redirect behavior, manipulate outputs, and compromise the integrity of anything the model produces.
The Model Context Protocol, which serves as the control-plane API for autonomous agents, had already accumulated 315 documented vulnerabilities as of 2025, accounting for 14.4% of all AI vulnerabilities. From Q2 to Q3, MCP vulnerabilities increased by 270%.
The common failure modes are familiar: over-permissioned tools, direct API access without adequate authentication and authorization, and the absence of runtime enforcement. The same failures that produced the Trello breach. The same failures that produced the DOGE API key incidents. The same failures that have been producing API breaches for a decade, now running on infrastructure that can act autonomously in response to compromised inputs.
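The over-permissioned tool problem has the same shape as the RBAC problem, and the same kind of fix. As a sketch, an agent runtime can route every tool call through a dispatcher that checks a per-agent allow-list at call time. The class and tool names are hypothetical, and this is not the Model Context Protocol's API; it only illustrates runtime enforcement in general terms.

```python
from typing import Any, Callable, Dict, FrozenSet


class ToolNotPermitted(Exception):
    """Raised when an agent requests a tool it has not been granted."""


class ToolDispatcher:
    """Routes agent tool calls through a deny-by-default grant check."""

    def __init__(self, tools: Dict[str, Callable[..., Any]],
                 grants: Dict[str, FrozenSet[str]]) -> None:
        self._tools = tools
        self._grants = grants

    def call(self, agent_id: str, tool_name: str, **arguments: Any) -> Any:
        # The check runs on every call, so a prompt-injected request for an
        # ungranted tool fails no matter what the model was told to do.
        granted = self._grants.get(agent_id, frozenset())
        if tool_name not in granted or tool_name not in self._tools:
            raise ToolNotPermitted(f"{agent_id!r} may not call {tool_name!r}")
        return self._tools[tool_name](**arguments)


def search_docs(query: str) -> str:
    return f"results for {query}"  # placeholder tool


def delete_record(record_id: str) -> str:
    return f"deleted {record_id}"  # placeholder destructive tool


dispatcher = ToolDispatcher(
    tools={"search_docs": search_docs, "delete_record": delete_record},
    grants={"support-bot": frozenset({"search_docs"})},
)
```

Here dispatcher.call("support-bot", "delete_record", record_id="42") raises ToolNotPermitted even if injected text persuades the model to attempt it, because the permission lives outside the model.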
Security by design is not a practice that AI-era architecture has made optional. It's one that the AI era has made urgent.
Five Things That Are True and Worth Arguing About
I want to close with positions, not summaries. These are the things I believe the evidence supports, and the things I expect reasonable engineers to push back on.
1. Short token lifetimes are not an operational burden. They are an operational discipline. The argument against 15-minute JWTs is always some version of "the refresh flow is complex." The counterargument is what happens when a 24-hour token belonging to an admin user gets harvested from a compromised device. Complexity in the refresh mechanism is a solved engineering problem. A valid admin token circulating in attacker infrastructure for 24 hours is not.
2. DevSecOps scanning is not optional at modern delivery velocities. If your team ships multiple times per day, human review cannot maintain consistent security coverage across that surface. Automation doesn't replace judgment. It enforces the standards that judgment has already established, at the speed the pipeline requires.
3. Secrets in code are not a developer error. They are an architectural failure. If the path of least resistance in a codebase is to put a credential in a config file, the architecture created that path. Pre-commit hooks, automated scanning, and vault integration don't prevent this class of exposure by catching it after the fact. They prevent it by making the wrong path harder than the right one.
4. RBAC granularity and security are not in tension. The argument that fine-grained access controls are too complex to maintain is, in practice, an argument that the team hasn't built tooling to manage them. That's a different problem. Broad permissions aren't simpler — they're deferred complexity that manifests as blast radius during an incident.
5. The industry needs to stop calling security a best practice. Best practices are things you do when you have the resources and culture to do them. Security is a property of the system that either exists or doesn't. If it doesn't exist at launch, the users bear the cost — not the engineering team, not the investor, not the person who made the timeline call. The people who trusted the product.
The Unglamorous Conclusion
The startup I described in this piece didn't do anything novel. There are no proprietary techniques here, no advanced threat modeling frameworks that require external consultants, no six-figure tooling budget. The OWASP API Security Top 10 has documented the dominant failure modes for years. The defenses are known. The implementation patterns (vault-managed secrets, short-lived tokens, RBAC, allow-list validation, CI/CD scanning) are well established, and every engineering team working on a production API could implement them on a standard startup budget.
What this team had was not resources. It was a decision, made early and maintained under pressure, that security was a design constraint and not a delivery variable. They treated every tradeoff explicitly — token lifetime versus convenience, RBAC granularity versus overhead, scan depth versus build speed — and made those tradeoffs in writing, with awareness of what they were accepting in each direction.
That is security by design. Not a posture. Not a framework. A decision about what kind of architecture you are building, made before the architecture exists.
The alternative — and the industry's dominant practice — is to build the architecture, ship it, and discover what kind of security it has when someone tells you what they found.
Brute force attacks moved into the top three API breach methods in 2025. DDoS and fraud remain the most frequent vectors. Injection hasn't left the top two in any quarter of the year. None of this is new intelligence. None of it is surprising to anyone who has been reading the threat reports.
The gap isn't knowledge. The gap is will — and sometimes, a concrete model of what it looks like when someone actually closes it.
This analysis is grounded in documented case study materials, publicly reported breach data, and open-source threat research. The startup referenced declined attribution. All technical claims are independently sourced and footnoted above.
Opinions expressed by DZone contributors are their own.