One Stolen Key, One Stolen Token: Why Machine Identity Is Cloud-Native's Quietest Crisis — and the Only Fix That Actually Holds

Learn how stolen machine credentials fuel major cloud breaches and how policy-as-code and short-lived identities help stop modern attacks.

Igboanugo David Ugochukwu

CORE ·

Jul. 01, 26 · Analysis

Likes (0)

Comment

Save

160 Views

On December 2, 2024, a security vendor called BeyondTrust noticed something wrong inside its own AWS account. By the time the investigation closed, the story that emerged was almost absurdly simple for something with this much fallout: an attacker — later attributed to the Chinese state-sponsored group Silk Typhoon — had used a software flaw to reach into a BeyondTrust cloud account and pull out an API key. Not a password. Not a phishing victim's login. A string of characters that a piece of software used to talk to another piece of software.

With that one key, the attacker walked straight into the U.S. Department of the Treasury, reset internal passwords, accessed workstations inside the Office of Foreign Assets Control, and read unclassified documents before anyone noticed. The Treasury disclosed it to Congress on December 30. The Department of Justice indicted the alleged operators in March 2025.

If you've never worked in security, here's the plain-English version of what happened: somewhere inside the machinery that runs modern software, there's almost always a "key" — a credential one computer program shows another to prove it's allowed to be there. Humans log in with passwords and, increasingly, a second factor on their phone. Software mostly doesn't. It just holds a key, often for months or years at a time, and whoever holds that key gets treated as trustworthy, no questions asked. The Treasury breach happened because one of those keys ended up in the wrong hands and nothing else stood between that key and a federal agency's internal documents.

Two months later, a different flavor of the same problem produced the largest theft of digital assets in history.

$1.5 Billion, One Developer's Laptop

In February 2025, the cryptocurrency exchange Bybit lost approximately $1.5 billion in Ethereum in a single operation. Palo Alto Networks' Unit 42 threat research team later tied the attack to Slow Pisces, a North Korean state-linked group also known as Lazarus or TraderTraitor, and traced the entry point back to a developer at a third-party vendor that managed Bybit's multi-signature wallet infrastructure. The attackers didn't break Ethereum's cryptography. They stole that developer's AWS session tokens — another form of machine credential — and used them to gain administrative access to cloud infrastructure that could authorize transactions, then quietly altered what a routine-looking transaction actually did before it executed.

Unit 42 then found the same pattern at a second cryptocurrency exchange later in 2025, this time running through Kubernetes, the orchestration system that now runs much of the cloud-native world. The attackers phished a developer, used the access on the developer's machine to drop a malicious workload directly into the exchange's production Kubernetes cluster, and had that workload expose its own service account token — a credential Kubernetes automatically hands to every running pod so it can talk to the cluster's control plane. The stolen token happened to belong to a CI/CD management identity with sweeping permissions. From there, the intruders queried secrets across namespaces, planted a backdoor, and pivoted into the exchange's cloud-hosted backend, reaching the financial systems behind it. Unit 42's broader research found suspicious activity consistent with service-account-token theft in 22 percent of cloud environments analyzed in 2025, and recorded a 282 percent year-over-year jump in Kubernetes-directed attacks overall.

Different industries, different attackers, same root cause: a non-human credential that was both long-lived and broader in scope than the task in front of it ever needed.

Why This Keeps Happening

Identity and access management, as a discipline, was built for people. People have managers, onboarding dates, performance reviews, and an HR system that flags them the day they leave. A workload has none of that. A microservice can spin up, do its job, and disappear thousands of times a day; a service account, by contrast, often gets created once and never revisited again. CyberArk's research has been blunt about the resulting imbalance: machine identities now outnumber human ones by more than 80 to 1 in the average enterprise, and the security architecture protecting most of them still assumes the old, human-shaped world — an org chart, not a fleet of ephemeral containers.

That mismatch is exactly why static secrets sprawl the way they do. A developer hardcodes a key during a deadline crunch, intending to externalize it "later." A Terraform state file ends up holding plaintext cloud credentials because nobody flagged it in review. A default Kubernetes service account token, more permissive than anyone realized, gets mounted into a pod by default because turning that off requires deliberate configuration most teams never get around to. None of these are exotic mistakes. They're the ordinary residue of moving fast, and they accumulate the way unpaid debt does — quietly, until the day someone calls it in.

The structural fix has a name by now, even if adoption is uneven: frameworks like SPIFFE and its production runtime SPIRE replace the static key with a short-lived, cryptographically attested identity — something closer to a backstage pass that's reissued before every single show rather than a master key cut once and handed out forever. A workload proves what it actually is — which Kubernetes service account launched it, which container image it's running — and receives an identity document valid for minutes, not months. Steal that, and an attacker is racing a clock that resets automatically rather than one that only resets when a human notices something is wrong. Cloud providers offer narrower versions of the same idea for their own platforms — AWS's IAM Roles for Service Accounts, Google's Workload Identity Federation — letting a workload trade a short-lived token for cloud access instead of carrying a standing key in the first place.

But identity alone doesn't close the loop, and this is the part most "zero trust" conversations skip past. None of it matters if nothing in your pipeline actually enforces it.

Security By Design Is a Promise. CI/CD Is Where You Find Out If It's Kept.

Plenty of organizations will tell you, with complete sincerity, that they practice "security by design." Most of them mean it stopped at an architecture review months before the first line of code shipped. That's not a fix, it's a memory of one. Code that deploys daily — sometimes hourly — doesn't wait for an annual audit to catch a misconfigured token or an over-privileged service account, and by the time a quarterly review would have caught the BeyondTrust-style key or the Bybit-style session token, the damage in both real cases was already done.

The only version of "security by design" that survives contact with a real production pipeline is the one written as code and enforced automatically, at every stage, by something that can actually say no.

Picture the pipeline this way:

    Plain Text
   
 

   Developer commits code
        |
        v
CI build triggers
        |
        +--> SAST (code flaws) + SCA (dependency CVEs) + secrets scan
        |          |
        |       fail? -----> build blocked, developer notified
        |          |
        |         pass
        v
Generate SBOM + sign artifact (Cosign) + build provenance (SLSA)
        |
        v
Policy-as-code gate (OPA / Kyverno)
        |
        +--> checks: image from approved registry? running as non-root?
        |             signature valid? provenance matches expected builder?
        |             service account scoped to least privilege?
        |
        |       fail? -----> deployment rejected, logged, alert raised
        |       pass
        v
Deploy to production
        |
        v
Runtime monitoring + short-lived workload identity (SPIFFE/SPIRE, IRSA)
        |
        v
Continuous re-verification — nothing trusted indefinitely
  

Every box in that chain is a place where the Treasury breach or the Bybit breach could have stopped instead of escalating. A policy-as-code rule using Open Policy Agent's Rego language, or Kyverno's Kubernetes-native YAML equivalent, can flatly refuse to schedule a pod requesting broader RBAC permissions than its declared task needs — which would have directly undercut the over-privileged CI/CD identity that the crypto-exchange attackers rode into the cluster. A signing and attestation step using Cosign, tied to SLSA provenance, means a deployed artifact has to prove which build system actually produced it before it runs at all — closing exactly the kind of trust gap that let a single compromised AWS asset cascade into a stolen infrastructure API key at BeyondTrust. None of this is theoretical tooling. Red Hat's own Enterprise Contract documentation describes signing as tying an image to a specific builder identity precisely so an attacker can't substitute a malicious binary without the signature itself breaking and announcing the tampering.

The Uncomfortable Bottom Line

I don't think either of this year's headline breaches happened because anyone involved was careless in some obvious, fireable way. They happened because the credential — not the firewall, not the encryption, not the cleverness of the malware — was the actual asset under attack the entire time, and almost nothing downstream of "the key worked" was built to ask a second question. Gartner named non-human identity management a top strategic security trend for exactly this reason in 2025, and OWASP followed with a dedicated Non-Human Identity Top 10 the same year, an overdue acknowledgment that the tooling built for human logins was never going to be enough.

My honest prediction, watching this pattern repeat across a federal agency and two of the largest crypto exchanges on earth within twelve months of each other: the organizations that treat policy-as-code enforcement and short-lived machine identity as default infrastructure — not optional hardening bolted on after an incident — are the ones that won't end up writing the next version of this story. Everyone else is currently running on borrowed time, secured by a key that, statistically, is already older than it should be.

Kubernetes Cloud security

Opinions expressed by DZone contributors are their own.

Related

Trending