Echoes of AI: Investigating the Downstream Effects of AI Assistants on Software Maintainability
Markus Borg, Dave Hewett, Nadim Hagatulah, Noric Couderc, Emma Söderberg, Donald Graham, Uttam Kini, Dave Farley
TL;DR
The paper investigates whether co-developing code with AI assistants affects software maintainability, measured by how easily new developers, working without AI, can evolve code created in Phase 1. Using a preregistered two-phase design with 151 participants, it measures completion time, CodeHealth, test coverage, and perceived productivity, and analyzes them with frequentist and Bayesian methods alongside qualitative data. Phase 2 shows no robust evidence that AI-assisted code is faster to evolve or yields higher code quality, although habitual AI users show small, uncertain CodeHealth gains. Phase 1 results show notable speedups: a 30.7% median reduction in completion time with an AI assistant. The study highlights risks such as code volume growth and cognitive debt, and calls for future work on long-term, agentic-AI impacts and knowledge-management strategies in AI-enabled software teams.
Abstract
[Context] AI assistants, like GitHub Copilot and Cursor, are transforming software engineering. While several studies highlight productivity improvements, their impact on maintainability requires further investigation. [Objective] This study investigates whether co-development with AI assistants affects software maintainability, specifically how easily other developers can evolve the resulting source code. [Method] We conducted a two-phase controlled experiment involving 151 participants, 95% of whom were professional developers. In Phase 1, participants added a new feature to a Java web application, with or without AI assistance. In Phase 2, a randomized controlled trial, new participants evolved these solutions without AI assistance. [Results] Phase 2 revealed no significant differences in subsequent evolution with respect to completion time or code quality. Bayesian analysis suggests that any speed or quality improvements from AI use were at most small and highly uncertain. Observational results from Phase 1 corroborate prior research: using an AI assistant yielded a 30.7% median reduction in completion time, and habitual AI users showed an estimated 55.9% speedup. [Conclusions] Overall, we did not detect systematic maintainability advantages or disadvantages when other developers evolved code co-developed with AI assistants. Within the scope of our tasks and measures, we observed no consistent warning signs of degraded code-level maintainability. Future work should examine risks such as code bloat from excessive code generation and cognitive debt as developers offload more mental effort to assistants.
