AI Broke Your Definition of Done

When a machine writes most of the code, "the code shipped" stops being a finish line. The work that's left is the work your definition of done was already skipping.

Matt Watson

Jun. 24, 26 · Analysis


Ask a Scrum team what "done" means, and you'll get a clean answer. The Scrum Guide calls the definition of done "a formal description of the state of the Increment when it meets the quality measures required for the product." Reviewed, tested, merged, deployable. Check the boxes, and you're done.

I've shipped software for more than 20 years, across three companies I founded and sold, and I think that definition is quietly wrecking products. It was always incomplete. What's new is that AI just made it useless, because the one thing it actually measured is now the one thing machines do best.

Let me make the case in order, because the AI part only lands if you see what was already broken.

"Code Shipped" Was Always the Wrong Finish Line

To be fair to the idea, a definition of done does real work. It's a shared checklist a team agrees on so that "done" means the same thing to everyone, instead of one engineer's "done" being another's "barely started." A typical one looks like this:

Code is written and peer-reviewed
Unit and integration tests pass
The feature meets its acceptance criteria
Documentation is updated
It's merged and deployed to production

That's useful, and I'm not arguing against having one. People sometimes confuse the definition of done with acceptance criteria, so it's worth being precise: acceptance criteria are specific to one story and ask, "Did we build the right thing?" The definition of done applies to every item the team delivers and asks, "Did we build it to our standard?" Atlassian and most agile coaches draw the same line.

Here's the problem. Both of them stop at the same place, and both treat "done" as a property of the code. Neither asks the only question that actually matters: did the customer's problem go away, and did anyone check?

For most teams, the honest answer is no, because the process gives them no reason to ask. The sprint ends, the ticket moves to Done, the next sprint starts, and there's no time built in to see whether the thing you shipped helped a single human being.

So you get the result the whole industry quietly lives with. Pendo studied feature usage across hundreds of products and found that 80% of features are rarely or never used. Most of what your team marks "done" is code no customer ever touches.

Sit with that. The majority of the work an engineering org ships, work that passed code review, passed tests, met its acceptance criteria, and was correctly marked done, delivered nothing. By the standard definition, all of it succeeded. That's not a quality problem. Your tests passed. It's a definition problem.

And it isn't free. Every unused feature is more code to maintain, more surface area for bugs, and more weight the product carries forever. It's technical debt that doesn't even have the decency to support something a customer uses.

I lived this at VinSolutions, the auto-dealer software company I co-founded and later sold. We built features for car dealers that we were proud of and that, when I actually looked at the usage data, dealers never opened. We had shipped them. We had marked them done. The dealer's problem was sitting right there, untouched, because nobody on the team owned the gap between "we deployed it" and "it changed something for the customer." The definition of done told us to stop the moment the code went out.

The Old Definition Built an Assembly Line

There's a structural reason this keeps happening, and it isn't a people problem.

The way we built software teams turned them into an assembly line. A typical feature passes through a row of stations: a project manager schedules it, a product owner writes the ticket, a front-end developer builds the screen, a back-end developer builds the logic, someone wires the API, a QA engineer tests it. Each person does their small part and passes the work to the next station.

On an assembly line, your job is to do your piece and hand it off. So that is exactly what everyone's definition of done became. The product owner's done is "I wrote the ticket." The back-end dev's done is "the API works, I handed it to the front end." QA's done is "it passed my tests, I moved it along." Everybody on the line hits their own definition of done. The product still fails.

Nobody on the line was responsible for whether any of it mattered.

That's the real defect. When you split the work into stations, you also split ownership into pieces so small that no one is left holding the big picture. Each person becomes an order taker doing their part, and the only thing anyone measures is movement: did the work get to the next station? Whether the customer's problem was solved was never anyone's line on the board.

This is also why more process never fixes it. More handoffs, more ceremonies, more sign-off gates just make the line longer. The thing that's broken is that the line exists at all. Teams that ship products that matter tear the stations down and put product, design, and engineering in the same room, where the people building the thing also understand the problem and care how it lands.

Real Done Is When the Customer Succeeds

The fix is to expand what you call "start" and what you call "done." I wrote a whole chapter on this in my book, Product Driven, because it's the single shift that turns coders into product engineers. The line I keep coming back to: starting when the ticket arrives and finishing when the code ships isn't ownership, it's task completion.

The start isn't the moment a ticket lands in a queue. It's the moment your team sees a problem worth solving, which means engineers have to be close enough to the customer to see it. And done isn't the deploy. It's the point where the user can succeed with what you built, and you've confirmed they did.

When you redefine done as a solved problem, the work that was always invisible suddenly becomes the job. You read the support tickets and sit in on the customer call. You watch a session recording of someone fumbling through the feature, then check the dashboard a week later to see if the fix actually moved anything. I call this the hidden work, and on most teams, it falls on whoever cares most, which is how it ends up trapped in two or three senior people who can't possibly scale it.

If this sounds like "outcomes over outputs," it is. The idea isn't new. Marty Cagan has spent years pointing out that most teams still ship feature roadmaps instead of outcome roadmaps. I'm not claiming to have invented it.

So why hasn't it stuck? For twenty years, this was a philosophy. A nice idea you could nod along to and then ignore, because the code still had to get written either way, and writing it was most of the job. That's the part that just changed.

AI Made the Incomplete Definition a Useless One

Here's why this matters more in 2026 than when I started writing about it. If "done" means the code is checked in, then AI is already done with everything.

AI now writes much of the new code on most teams. Google's CEO said 75% of the company's new code is now AI-generated, up from 25% two years earlier, and that the work has turned agentic: engineers supervise the AI instead of typing the lines themselves. Across the industry, the Stack Overflow 2025 developer survey found 84% of developers use or plan to use AI tools. The typing, the boilerplate, the glue between systems, the part that the old definition of done was secretly measuring, is the part being automated fastest. If your finish line is "the code exists, and it's merged," you've pegged the job to exactly what machines now do fastest.

When code is cheap to produce, the value moves entirely to the two ends that the standard definition ignored: deciding what to build, and verifying that it matters. The hard part of software development was never writing the code. It's understanding the problem you're trying to solve, and that's the half of the work AI can't do for you.

There's a second reason verification matters more now, not less. Cheap code isn't the same as correct code. Veracode tested AI-generated code and found 45% of it carried a known security flaw, and the bigger, newer models didn't meaningfully fix it. When a machine can produce plausible code in seconds, confirming that the thing actually works and actually helped is the part that needs a human most. AI inflates the volume of "looks done" while doing nothing for "is done."

So the new definition of done is the inverse of the old one. Done isn't "we wrote the code." A machine does that in seconds. Done is "we figured out the right thing to build, we shipped it, we confirmed it delivered value, and we learned something from how it got used." That arc is the job now. The code in the middle is the easy part.

The old definition of done versus the new one. The old: code reviewed, tests passing, acceptance criteria met, merged to production, with the finish line at "the code shipped" and nobody owning whether it mattered. The new: everything in the old quality bar, plus picking the right thing to build, confirming it delivers value, and learning from real usage, where done means the problem is solved and verified.

What to Actually Do About It

You don't change a definition of done with a memo. You change it by changing what the team is responsible for and what you reward. Four things that move it:

Expand the boundaries on paper. Rewrite your definition of done so the last line isn't "deployed to production." Make it "confirmed the customer outcome we intended." Keep every quality check you already have. You're not lowering the bar; you're adding the part that was missing. If you can't confirm the outcome, the work isn't done; it's shipped. Those are different words for a reason.
Put engineers near the customer. They can't own an outcome they never see. Get them into support tickets, customer calls, and usage data. If they genuinely can't reach end users because you're in a regulated space or a deep platform team, give them the next best proxy: the usage data, the support queue, and a product manager who carries the customer's voice into the room. The moment an engineer watches a real user struggle, the definition of done changes itself.
Build follow-through into the schedule. Leave room after a launch to check whether it worked. This only works once the team believes done means solved, so add the time once that belief has changed, not as a substitute for changing it. A feature nobody validates is a coin flip you chose not to look at.
Reward outcomes, not output. As long as you celebrate "shipped 40 tickets this sprint," you're paying people to hit the old definition of done. Celebrate the fix that dropped support volume, or the feature that moved a real number. In an AI-assisted team, this is existential: ticket throughput is now a measure of how fast your tools autocomplete, not whether anyone built the right thing.
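
To make the "shipped is not done" distinction concrete, here is a minimal Python sketch of how a tracker could encode it. Everything in it is hypothetical and for illustration only: the WorkItem record, its field names, and the three statuses are assumptions, not part of Scrum, Jira, or any real tool. The only rule it implements is the one above: an item that passed every quality check and was deployed is merely shipped until someone confirms the intended customer outcome.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    IN_PROGRESS = "in progress"
    SHIPPED = "shipped"   # deployed, but nobody has checked the outcome yet
    DONE = "done"         # deployed and the intended outcome is confirmed


@dataclass
class WorkItem:
    """Hypothetical work item; field names are illustrative assumptions."""
    title: str
    intended_outcome: str              # what should change for the customer
    quality_checks_passed: bool = False  # review, tests, acceptance criteria
    deployed: bool = False
    outcome_confirmed: bool = False    # set only after checking real usage

    def status(self) -> Status:
        # The old quality bar still applies: nothing ships without it.
        if not (self.quality_checks_passed and self.deployed):
            return Status.IN_PROGRESS
        # Deployed but unverified work is shipped, not done.
        if not self.outcome_confirmed:
            return Status.SHIPPED
        return Status.DONE


item = WorkItem(
    title="Bulk lead import",
    intended_outcome="Dealers import leads without filing a support ticket",
    quality_checks_passed=True,
    deployed=True,
)
print(item.status().value)  # shipped

item.outcome_confirmed = True  # someone looked at usage and support volume
print(item.status().value)  # done
```

The design choice worth noticing is that the intended outcome is a required field written before the work starts. If nobody can say what should change for the customer, there is nothing to confirm later, and the item can never honestly reach done.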

None of this requires a new framework. It requires deciding that done means the customer succeeded, and then building the team and the process around that one sentence.

The Bottom Line

The old definition of done assumed a human author, human intent, and that writing the code was most of the job. The last assumption just collapsed. When the machine writes the code, "the code is finished" measures the cheapest, most automated part of the work, and stays silent on the only parts that were ever hard: deciding what to build, and proving it mattered.

Most teams will read this and keep their old definition anyway, because the old one is comfortable, and it lets everyone go home when the code merges. The ones that change it will build products people actually use. That was always the game. AI just stopped letting us pretend otherwise.

Opinions expressed by DZone contributors are their own.
