AI Code Generation: The Productivity Paradox in Software Development

AI boosts coding speed short-term, but long-term gains need human oversight, reuse, and focus on code quality beyond cycle time.

Ammar Husain

Nov. 05, 25 · Analysis

Measuring and improving developer productivity has long been a complex and contentious topic in software engineering. With the rapid rise of AI across nearly every domain, it's only natural that the impact of AI tooling on developer productivity has become a focal point of renewed debate.

A widely held belief suggests that AI could either render developers obsolete or dramatically boost their productivity — depending on whom you ask. Numerous claims from organizations linking layoffs directly to AI adoption have further intensified this perception, casting AI as both a disruptor and a catalyst.

In this article, we'll examine the current landscape and delve into recent studies and surveys that investigate how AI is truly influencing developer productivity.

Studies

Let's explore the findings from the studies below, which assess the impact of AI tooling on developer productivity.

Study #1: Experienced Open-Source Developer Productivity

To evaluate the impact of AI coding assistants on the productivity of experienced open-source developers, a randomized controlled trial (RCT) was conducted from February to June 2025.

A total of 16 developers with an average of 5 years of experience were chosen to complete 246 tasks in mature projects. Each task was randomly assigned to one of two conditions: AI tools allowed or AI tools disallowed.

Before starting, developers forecast that AI would reduce task completion time by 24%. After completing the tasks, they estimated that AI had reduced completion time by 20%. The study found the opposite: allowing AI increased task completion time by 19%. These results also contradict experts' predictions of a reduction of up to ~39%.

Below is the summary of the prediction and findings mismatch:

Experts and study participants misjudged the speedup of AI tooling. Image courtesy of respective research.
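
To make the mismatch concrete, the percentages quoted above can be applied to a single task. The sketch below is illustrative only: the 100-minute baseline is a made-up value, and the helper names are not from the study.

```python
# Percent change in task completion time relative to working without AI,
# as quoted in the text above. Negative means faster, positive means slower.
TIME_CHANGE_PCT = {
    "expert forecast (up to)": -39,
    "developer forecast, before tasks": -24,
    "developer estimate, after tasks": -20,
    "observed in the trial": 19,
}

# Hypothetical baseline, chosen only to make the percentages concrete.
BASELINE_MINUTES = 100


def minutes_with_ai(baseline_minutes, change_pct):
    """Apply a percent change in completion time to a baseline duration."""
    return baseline_minutes * (1 + change_pct / 100)


def gap_in_points(estimated_pct, observed_pct):
    """Percentage-point gap between an estimate and the observed change."""
    return observed_pct - estimated_pct


observed = TIME_CHANGE_PCT["observed in the trial"]
for label, pct in TIME_CHANGE_PCT.items():
    minutes = minutes_with_ai(BASELINE_MINUTES, pct)
    gap = gap_in_points(pct, observed)
    print(f"{label:34s} {minutes:6.1f} min  (off by {gap} points)")
```

Even after finishing their tasks, developers' own estimate sits 39 percentage points away from the measured result, which is the perception gap the study highlights.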

Although the study concludes that AI tooling slowed developers down, the slowdown could stem from a variety of factors. The five key factors are listed below:

Over-optimism about AI usefulness (Direct productivity loss). Developers are free to use AI tools as they see fit, but their belief that AI boosts productivity is often overly optimistic. They estimate a 20–24% time reduction from AI, even when the actual impact may be neutral or negative, potentially leading to overuse.
High developer familiarity with repositories (Raises developer performance). AI assistance tends to be less helpful, and may even slow developers down, on tasks where they have high prior experience and need fewer external resources. Developers report AI as more beneficial for unfamiliar tasks, suggesting its value lies in bridging knowledge gaps rather than enhancing expert workflows.
Large and complex repositories (Limits AI performance). Developers report that LLM tools struggle in complex environments, often introducing errors during large-scale edits. This aligns with findings that AI performs worse in mature, large codebases compared to simpler, greenfield projects.
Low AI reliability (Limits AI performance). Developers accept less than 44% of AI-generated code and often spend significant time reviewing, editing, or discarding it. Even accepted output requires cleanup: ~75% of developers read every line and ~56% make major changes, leading to notable productivity loss.
Implicit repository context (Limits AI performance, raises developer performance). AI tools often struggle to assist effectively in mature codebases due to a lack of developers' tacit, undocumented knowledge. This gap leads to less relevant suggestions, especially in nuanced cases like backward compatibility or context-specific edits.

These factors considerably offset the gains from automatic code generation, which explains the sharp contrast between perceived or forecast productivity and the actual results.

With AI tooling, developers must also spend additional time prompting, reviewing AI-generated suggestions, and integrating the output into complex codebases, all of which adds to the overall completion time. See below for average time spent per activity, with and without AI tooling.

Average time spent per activity. Image courtesy of respective research.
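
One way to reason about this overhead is a simple expected-time model. The sketch below is not the study's methodology: the model itself and every input (manual time, prompting, review, and cleanup minutes) are hypothetical, and only the 44% acceptance rate comes from the figures quoted earlier.

```python
def expected_minutes_with_ai(manual, prompting, review, cleanup, accept_rate):
    """Expected task time when every attempt starts with an AI suggestion.

    Prompting and review are always paid. An accepted suggestion still needs
    cleanup; a rejected one means falling back to writing the code by hand.
    """
    if not 0 <= accept_rate <= 1:
        raise ValueError("accept_rate must be between 0 and 1")
    accepted_path = accept_rate * cleanup
    rejected_path = (1 - accept_rate) * manual
    return prompting + review + accepted_path + rejected_path


def break_even_accept_rate(manual, prompting, review, cleanup):
    """Acceptance rate at which AI-assisted time equals manual time."""
    if manual <= cleanup:
        raise ValueError("cleanup must be cheaper than writing by hand")
    return (prompting + review) / (manual - cleanup)


# Hypothetical minutes per task; only accept_rate is taken from the text.
inputs = dict(manual=30, prompting=4, review=6, cleanup=8)
with_ai = expected_minutes_with_ai(accept_rate=0.44, **inputs)
threshold = break_even_accept_rate(**inputs)
print(f"manual: {inputs['manual']} min, with AI: {with_ai:.2f} min")
print(f"break-even acceptance rate: {threshold:.1%}")
```

With these made-up inputs the break-even acceptance rate is about 45%, so a 44% acceptance rate leaves the AI-assisted path marginally slower. Different inputs would move the threshold; the point is only that fast generation does not guarantee a net saving once the fixed overhead is counted.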

Takeaway: The study reveals a perception gap in which AI usage hampers productivity even though users believe otherwise. While the findings show a slowdown in large, complex codebases, the researchers caution against broad conclusions and emphasize the need for rigorous evaluation as AI tools and techniques continue to evolve. The study should therefore be treated as one data point in an evaluation, not as a verdict.

Study #2: GitClear

The GitClear study analyzed ~211 million structured code changes from 2020 to 2024 to assess how AI-assisted coding impacts developer productivity. It categorized changes — like added, moved, copied/pasted, and churned lines — using GitClear's Diff Delta model to track short-term velocity versus long-term maintainability. Duplicate block detection was introduced to measure how often AI-generated code repeats existing logic. The methodology links rising output metrics to declining code reuse, revealing hidden costs in perceived productivity gains.

Below is the trend of code operations and code churn by year as cited in the report.

GitClear AI Code Quality Research — Code operations and code churn by year. Image courtesy of respective research.

The following points can be inferred from the study:

Increased code output: AI-assisted development led to a significant rise in the number of lines added, up 9.2% YoY in 2024. This could be perceived as an increase in developer productivity due to faster code generation and higher task (ticket) completion throughput. However, the key question remains — are the added lines of code required in the first place?
Decline in refactoring (“moved” code): “Moved” lines — an indicator of refactoring — dropped nearly 40% YoY in 2024, falling below 10% for the first time. This can be attributed to the developer accepting the AI-generated code as-is and skipping the effort to refactor (to save time). Moreover, AI tools rarely suggest refactoring due to limited context windows, and thus fuel the overall drop.
Surge in copy-and-pasted and duplicated code: Copy/pasted lines exceeded moved lines in 2024, with a 17.1% increase YoY. Commits with duplicated blocks (≥5 lines) rose 8x in 2024 compared to 2022, and 6.66% of commits now contain such blocks. This, too, can be attributed to developers accepting AI-generated code as-is without much effort to keep the code DRY.
Increased churn in newly added code: Churn, meaning code revised within 2–4 weeks of being written, increased 20–25% in 2024, so developers are revisiting new code more frequently. Although code output surged with AI tooling, lower quality means the code is being revised sooner than it was when AI tooling was limited or absent.
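
To illustrate what detecting a duplicated block of five or more lines can look like, here is a minimal sliding-window sketch. It is not GitClear's implementation; the function names, the whitespace normalization, and the sample code are all assumptions made for illustration.

```python
from collections import defaultdict

MIN_BLOCK_LINES = 5  # threshold quoted in the text: blocks of 5 or more lines


def normalize(line):
    """Collapse whitespace so indentation changes do not hide a duplicate."""
    return " ".join(line.split())


def find_duplicate_blocks(source, min_lines=MIN_BLOCK_LINES):
    """Return windows of `min_lines` consecutive lines that occur more than once.

    The result maps each repeated window (a tuple of normalized lines) to the
    1-based line numbers where it starts. Blank lines are skipped.
    """
    lines = [
        (number, normalize(text))
        for number, text in enumerate(source.splitlines(), start=1)
        if text.strip()
    ]
    seen = defaultdict(list)
    for i in range(len(lines) - min_lines + 1):
        window = lines[i:i + min_lines]
        key = tuple(text for _, text in window)
        seen[key].append(window[0][0])
    return {key: starts for key, starts in seen.items() if len(starts) > 1}


# Hypothetical sample: two functions whose bodies are identical.
SAMPLE = """
def total_a(items):
    total = 0
    for item in items:
        if item.active:
            total += item.price
    return total

def total_b(items):
    total = 0
    for item in items:
        if item.active:
            total += item.price
    return total
"""

for block, starts in find_duplicate_blocks(SAMPLE).items():
    print(f"{len(block)}-line block repeated at lines {starts}")
```

A production tool would work across files and commits and would likely tolerate renamed identifiers; this sketch only catches exact repeats within one string, which is enough to show why duplication is cheap to measure yet easy to overlook in review.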

Takeaway: The rise in AI-generated code has led to a parallel increase in copy-pasted fragments, duplication, and churn — while refactoring efforts have notably declined. This trend signals a deterioration in overall code quality.

Many organizations still gauge developer productivity by metrics like lines of code added or tasks completed. However, these indicators can be easily inflated by AI, often at the expense of long-term maintainability. The result is bloated codebases with higher duplication, reduced clarity, and an expanded surface area for bugs.
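
The gap between an output metric and a quality metric can be shown with a small calculation. The sketch below is an assumption-laden illustration: the `LineRecord` type, the 14-day window (the text cites 2–4 weeks), and both sample periods are hypothetical and not taken from the GitClear data.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional

CHURN_WINDOW_DAYS = 14  # assumption; the text cites a 2-4 week window


@dataclass
class LineRecord:
    added_on: date
    revised_on: Optional[date] = None  # None if the line was never touched again


def churn_rate(records, window_days=CHURN_WINDOW_DAYS):
    """Share of added lines revised within `window_days` of being added."""
    if not records:
        return 0.0
    churned = sum(
        1
        for r in records
        if r.revised_on is not None
        and 0 <= (r.revised_on - r.added_on).days <= window_days
    )
    return churned / len(records)


def make_period(start, total_lines, churned_lines):
    """Build hypothetical records: the first `churned_lines` are revised after 7 days."""
    return [
        LineRecord(start, start + timedelta(days=7) if i < churned_lines else None)
        for i in range(total_lines)
    ]


# Two made-up periods: more lines are added in the second, but more churn too.
before = make_period(date(2023, 1, 2), total_lines=100, churned_lines=5)
after = make_period(date(2024, 1, 1), total_lines=130, churned_lines=20)
for label, period in (("before", before), ("after", after)):
    print(f"{label}: {len(period)} lines added, churn rate {churn_rate(period):.1%}")
```

Judged by lines added alone, the second period looks 30% more productive; the churn rate shows that a larger share of that output had to be reworked almost immediately.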

While AI may boost short-term development velocity, the trade-off is accumulating technical debt and diminished code quality — costs that will surface over time in the form of increased maintenance burden and reduced agility.

Surveys

While studies often rely on data-driven methodologies, these approaches can sometimes be questioned for their assumptions or limitations. Surveys, on the other hand, offer direct insight into developer sentiment and can help bridge gaps that traditional studies might overlook. In the sections below, we explore findings from independent surveys that assess the impact of AI tools on developer productivity.

Survey #1: Stack Overflow

In its 2025 annual developer survey, Stack Overflow received over 49k responses, covering various aspects, including AI tooling and its related impact. Do note that I, too, was one of the respondents.

Among respondents, overall AI tool usage surged to ~84% from ~76% the previous year. Positive sentiment toward AI tools, however, dropped by ~10 percentage points, signaling a trust deficit among developers (more on this later).

AI tools usage and sentiment. Image courtesy of respective survey results.

Trust in AI tools' accuracy and ability to handle complex tasks. Image courtesy of respective survey results.

AI agents and impact on work productivity. Image courtesy of respective survey results.

Takeaway: The survey revealed a sharp rise in AI tool adoption accompanied by a notable drop in positive sentiment, highlighting a growing trust deficit. A majority of respondents expressed active distrust in AI tool accuracy because of subpar solutions, suggesting that AI-generated code often demands extra effort to refine and validate. This offsets the productivity gain from faster code generation.

Interestingly, trust in AI tools' ability to handle complex tasks rose, reflecting cautious optimism rather than full confidence. Developers still see themselves as the ultimate judges of code quality, reinforcing the need for human oversight. Meanwhile, AI agents — though not yet widely adopted — show early promise. Their use of contextual information positions them as a potentially more reliable and efficient evolution of current AI tooling.

Survey #2: Harness

Harness surveyed 500 engineering leaders and practitioners to assess various parameters, including the impact of AI on developer productivity.

Although the surveyed participants showed overall positive sentiment towards AI tooling and its adoption, 92% also highlighted the associated risks. An independent, related observation corroborates these risks.

AI Missteps and Impact Radius. Image courtesy: https://martinfowler.com/articles/exploring-gen-ai/13-role-of-developer-skills.html

Almost two-thirds of respondents mentioned that they spend more time debugging AI-generated code and/or resolving security vulnerabilities.

AI tooling may also generate code that includes outdated dependencies or insecure coding patterns, requiring developers to spend time updating and patching these vulnerabilities.

This significantly increases the developer overhead and potentially offsets a considerable part of the productivity gains with AI tooling.

Two-thirds of respondents require more time debugging AI-generated code and/or resolving security vulnerabilities. Image courtesy of respective survey results.

About 59% of developers experience deployment problems when AI tooling is involved, and the resulting rework or additional effort offsets part of the gains.

59% of developers experience deployment problems with AI tooling involved. Image courtesy of respective survey results.

Because 60% of respondents don't evaluate the effectiveness of their AI tools, it is difficult to tie tool usage to developer productivity at all.

60% of respondents don't evaluate the effectiveness of AI tooling. Image courtesy of respective survey results.

Takeaway: The survey reveals a nuanced picture of AI's impact on developer productivity. Most respondents expressed optimism about AI tooling but also flagged significant risks. Notably, the majority reported spending more time debugging AI-generated code and addressing security vulnerabilities, contradicting the assumption that AI always boosts efficiency.

Deployment issues further compound the overhead, with many encountering frequent rework. The lack of tool effectiveness evaluation by many respondents underscores the challenge of accurately measuring productivity gains. Overall, the findings highlight that AI adoption demands careful oversight to avoid offsetting its intended benefits.

Conclusion

The studies and surveys analyzed paint a complex picture of AI's role in software development, revealing that perceived productivity gains often mask deeper issues. While AI tools may accelerate coding tasks, they also introduce duplication, churn, and technical debt — especially in large codebases — undermining long-term maintainability.

Trust in AI-generated code remains fragile, with developers frequently needing to debug and refine outputs. This erodes efficiency, offsets gains from faster code generation, and highlights the importance of human oversight.

Crucially, coding represents only a fraction of the overall software delivery cycle. Improvements in cycle time don't necessarily translate to gains in lead time. Sustainable productivity demands more than speed — it requires thoughtful architecture, strategic reuse, and vigilant monitoring of maintainability metrics.
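
The relationship between faster coding and overall lead time can be sketched with an Amdahl's-law-style calculation. This is an illustration under stated assumptions: the function name and the 25% coding share are hypothetical, and real delivery pipelines involve queues and handoffs that this simple proportion ignores.

```python
def lead_time_reduction(coding_share, coding_speedup):
    """Fractional reduction in total lead time when only coding gets faster.

    coding_share:   fraction of lead time spent writing code (0 to 1)
    coding_speedup: factor by which coding time shrinks (2.0 means twice as
                    fast; values below 1.0 model a slowdown)
    """
    if not 0 <= coding_share <= 1:
        raise ValueError("coding_share must be between 0 and 1")
    if coding_speedup <= 0:
        raise ValueError("coding_speedup must be positive")
    new_lead_time = (1 - coding_share) + coding_share / coding_speedup
    return 1 - new_lead_time


# Hypothetical: coding is a quarter of the delivery cycle.
CODING_SHARE = 0.25
for speedup in (1.25, 2.0, 10.0):
    saved = lead_time_reduction(CODING_SHARE, speedup)
    print(f"coding {speedup:>5.2f}x faster -> lead time shrinks by {saved:.1%}")
```

Under this assumption, even a tenfold coding speedup cannot shrink lead time by more than the 25% that coding occupies, which is why review, testing, and deployment stages deserve as much attention as generation speed.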

In essence, AI can be a powerful accelerator, but without deliberate human intervention, its benefits risk being short-lived.

References and Further Reads

Measuring the Impact of Early-2025 AI on Experienced Open-Source Developer Productivity
GitClear Code Quality Study — 2024 | 2025
Harness — State of Software Delivery
Stack Overflow Developer Survey 2025
Role of Developer Skills in Agentic Coding

Published at DZone with permission of Ammar Husain. See the original article here.
