Cloud Sprawl Is a Given; Cloud Complexity Doesn’t Have to Be

Cloud sprawl is inevitable — but complexity isn’t. IaC alone isn’t enough. Real control comes from visibility, automation, and enforcing every change through code.

Aharon Twizer

Aug. 06, 25 · Opinion

Less than a decade ago, most teams ran dev, staging, and production in a single cloud account.
Today, that seems unimaginable.

Now, you start your cloud journey with at least 10 AWS accounts: one for each environment, one for networking, one for logging, one for security, one for… you get the idea. And if you have multiple business units or products? Multiply all that by at least three.

When we started ControlMonkey, I spun up infrastructure (infra), of course. A week later, one of our developers asked: “Why do we already have so many cloud accounts?” He was joking, kind of.

But the number of accounts isn’t the real problem. The real problem is how cloud leaders manage them.

The Problem Isn’t Multi-Account — It’s Multi-Everything.

The challenge isn’t just scale; it’s volatility. Engineers change roles, requirements shift, and new tools get introduced without much documentation. Before long, you’ve got fragile systems no one fully understands. And when something breaks, your team burns hours playing detective instead of delivering. The stakes aren’t just technical; they’re organizational, operational, and financial.

Once you’re operating across dozens of cloud accounts, here’s what gets harder almost instantly:

Visibility: You’re jumping between tabs, dashboards, and logs just to find what’s running where.
Security and compliance: Every account becomes another attack surface, another audit trail, and another backlog.
Knowledge retention: The engineer who set up that “legacy” account is gone — and so is the context.
Engineering toil: Manual tickets, console clickops, drift investigations. Everyone’s firefighting.

And this isn’t just about clouds. Throw in SaaS tools, observability platforms, CI/CD systems, and version control, too. Now, you’re managing dozens of systems that impact your infra footprint.

The trajectory is clear: more accounts, more infra, and more requirements; less institutional knowledge as team members leave; and no slowdown in delivery expectations.

If your operating model can’t keep up, all this complexity is a recipe for chaos.

Everyone Knows the Answer: Infrastructure as Code

First, let’s be honest — just having Terraform or OpenTofu doesn’t mean you’re using it at scale, consistently, or safely. Infrastructure as Code (IaC) coverage by itself isn’t a panacea.

I suggest asking yourself (as I ask myself):

Are all infra changes going through code?
Can anyone bypass the pipeline with a manual change?
Are you constantly triaging alerts and rolling things back?

In my experience, most cloud teams don’t need to be convinced IaC is the answer. The real problem is enforcement and scale. Unless you can guarantee every change goes through code, you’re flying blind and paying for it in accumulating tickets, toil, and risk.
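
One way to approach the "can anyone bypass the pipeline" question is to run a read-only plan on a schedule and treat any pending change as a signal. Here is a minimal Python sketch, assuming the `terraform` (or `tofu`) binary is on the PATH and the working directory is already initialized. It relies on the `-detailed-exitcode` flag of `plan`, which exits with 0 when there is no diff, 1 on error, and 2 when changes are pending. The names `PlanStatus`, `classify_exit_code`, and `check_drift` are hypothetical, invented for this illustration.

```python
import subprocess
from enum import Enum


class PlanStatus(Enum):
    IN_SYNC = "in_sync"  # exit code 0: live infra matches the code
    DRIFTED = "drifted"  # exit code 2: the plan found pending changes
    ERROR = "error"      # exit code 1 (or anything else): the plan failed


def classify_exit_code(code: int) -> PlanStatus:
    """Map the exit codes of `plan -detailed-exitcode` to a status."""
    if code == 0:
        return PlanStatus.IN_SYNC
    if code == 2:
        return PlanStatus.DRIFTED
    return PlanStatus.ERROR


def check_drift(workdir: str, binary: str = "terraform") -> PlanStatus:
    """Run a read-only plan in `workdir` and report whether it found a diff.

    Assumption: `workdir` has already been initialized with `init`.
    """
    result = subprocess.run(
        [binary, "plan", "-detailed-exitcode", "-input=false",
         "-lock=false", "-no-color"],
        cwd=workdir,
        capture_output=True,
        text=True,
    )
    return classify_exit_code(result.returncode)
```

A `DRIFTED` result on a branch that has already been applied suggests something changed outside the pipeline, although it can also mean the code was merged and not yet applied. Note the limit of this approach: a plan only compares resources the code already manages, so anything created entirely by hand in the console will not show up here.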

Wear a Seatbelt… And Drive Faster.

Here’s how I think about it (with another transportation metaphor).

Cloud teams today are driving 100 mph in a dense fog. We’re accelerating delivery, shipping faster, deploying AI workloads, and expanding globally. But without visibility and control, you’re speeding without a seatbelt.

A resilient (read: seat-belted) multi-account strategy starts with three things:

Total visibility: You can’t govern what you can’t see. Every account, every resource, every pipeline — visible in one place. Dashboards, not detective work.
Total automation: Infrastructure should only be delivered one way: through code. No manual shortcuts and no one-off pipelines; just one path to production, by design.
Total resilience: All configuration is backed up, and every change is validated and policy aligned before it reaches production. That’s what lets your team sleep at night and build during the day.

And if you don’t have these things? You’ll feel it everywhere: security issues, compliance audits, attrition, and endless toil. Meanwhile, the rest of the business isn’t waiting. AI, product velocity, global expansion — these don’t pause while you figure out how to regain control.
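
To make the "validated and policy aligned" part of total resilience concrete, here is a small Python sketch of a pre-merge check. It assumes the plan has been exported with `terraform show -json` (or the OpenTofu equivalent), whose output contains a `resource_changes` list where each entry has an `address` and a `change` object with `actions` and `after`. The policy itself (every taggable resource needs `owner` and `environment` tags) and the function name `find_tag_violations` are hypothetical examples, not a recommendation of specific rules.

```python
REQUIRED_TAGS = frozenset({"owner", "environment"})  # hypothetical policy


def find_tag_violations(plan: dict, required=REQUIRED_TAGS) -> list:
    """Return one message per created or updated resource missing required tags.

    `plan` is the parsed JSON plan. Resources whose planned state has no
    `tags` attribute at all are assumed not to support tagging and are skipped.
    """
    violations = []
    for rc in plan.get("resource_changes", []):
        change = rc.get("change", {})
        actions = change.get("actions", [])
        if "create" not in actions and "update" not in actions:
            continue  # deletes and no-ops cannot violate a tagging policy
        after = change.get("after") or {}
        if "tags" not in after:
            continue  # assumption: this resource type is not taggable
        tags = after["tags"] or {}  # tags can be null when unset
        missing = set(required) - set(tags)
        if missing:
            violations.append(f"{rc['address']}: missing {sorted(missing)}")
    return violations
```

In a pipeline, a non-empty result would fail the job before the change reaches production. Tags whose values are only known after apply are not handled in this sketch.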

So, what to do?

Where to Start: 5 Questions in 10 Minutes

What’s your real IaC coverage by environment?
Can you detect if someone bypasses your Terraform or OpenTofu pipeline?
How much time does your team lose to infra pull request (PR) back-and-forths and manual reviews?
Are you going to manage more infra in the next 12–24 months?
Can you prove your production infra is compliant right now without manual digging?

If any of those made you pause, it’s probably already costing you.
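
The first question, IaC coverage by environment, is simple arithmetic once you have an inventory. The sketch below assumes you can already export a list of resources with an environment label and a flag saying whether each one is tracked in Terraform or OpenTofu state; producing that inventory is the hard part and is out of scope here. The `Resource` type and `coverage_by_environment` function are hypothetical names for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    resource_id: str
    environment: str      # e.g. "prod", "staging"
    managed_by_iac: bool  # True if the resource is tracked in IaC state


def coverage_by_environment(resources) -> dict:
    """Return the fraction of resources managed by IaC, per environment."""
    totals = defaultdict(int)
    managed = defaultdict(int)
    for r in resources:
        totals[r.environment] += 1
        if r.managed_by_iac:
            managed[r.environment] += 1
    return {env: managed[env] / totals[env] for env in totals}
```

For example, an inventory with four production resources, three of them in state, yields a production coverage of 0.75. Tracking that number per environment over time tells you whether coverage is improving or eroding.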

Whether you build your own framework or adopt a platform is up to you, and so is whether you run multi-cloud, on OpenTofu or on Terraform. What matters is that you stop accepting cloud complexity as inevitable and start governing all those accounts.

Total visibility. Total automation. Total resilience. That’s how you stay in control, no matter how fast you’re going.
