Does AI-Assisted Coding Deliver? A Difference-in-Differences Study of Cursor's Impact on Software Projects
Hao He, Courtney Miller, Shyam Agarwal, Christian Kästner, Bogdan Vasilescu
TL;DR
This study asks whether an AI coding agent (Cursor) delivers sustained project-level gains in productivity and code quality. It uses a staggered-adoption difference-in-differences design with propensity-score matching to compare 807 Cursor-adopting GitHub repositories with 1,380 matched controls on velocity (commits, lines added) and quality (static analysis warnings, code complexity, duplicate lines), followed by panel generalized method of moments (GMM) estimation to probe how velocity and quality affect each other. The authors find substantial but transient velocity gains, coupled with persistent increases in static analysis warnings and code complexity, and the panel GMM analysis indicates that the accumulated technical debt dampens future velocity. These results highlight a velocity–quality trade-off in AI-driven development and call for deliberate integration of quality assurance into workflows, and for design improvements in future AI coding tools, so that the gains can be sustained.
Abstract
Large language models (LLMs) have shown promise to revolutionize the field of software engineering. LLM agents in particular are rapidly gaining momentum in software development, with practitioners claiming multifold productivity increases after adoption. Yet empirical evidence for these claims is lacking. In this paper, we estimate the causal effect of adopting a widely popular LLM agent assistant, Cursor, on development velocity and software quality. The estimation is enabled by a state-of-the-art difference-in-differences design comparing Cursor-adopting GitHub projects with a matched control group of similar GitHub projects that do not use Cursor. We find that the adoption of Cursor leads to a significant, large, but transient increase in project-level development velocity, along with a significant and persistent increase in static analysis warnings and code complexity. Further panel generalized method of moments estimation reveals that the increase in static analysis warnings and code complexity is a major factor causing long-term velocity slowdown. Our study carries implications for software engineering practitioners, LLM agent assistant designers, and researchers.
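
For readers unfamiliar with difference-in-differences, the core idea can be illustrated with the canonical two-group, two-period case. The sketch below is a deliberate simplification and not the estimator used in the study, which relies on a staggered-adoption design with matched controls. The commit counts and the names `observations`, `cell_mean`, and `did_estimate` are hypothetical, invented purely for illustration.

```python
from statistics import mean

# Hypothetical monthly commit counts (NOT data from the study).
# Each row: (group, period, commits)
observations = [
    ("treated", "pre", 40), ("treated", "pre", 44),
    ("treated", "post", 70), ("treated", "post", 74),
    ("control", "pre", 38), ("control", "pre", 42),
    ("control", "post", 45), ("control", "post", 47),
]


def cell_mean(rows, group, period):
    """Average outcome for one group in one period."""
    return mean(c for g, p, c in rows if g == group and p == period)


def did_estimate(rows):
    """Canonical 2x2 difference-in-differences estimate.

    The change in the control group stands in for what would have
    happened to the treated group without adoption (the parallel-trends
    assumption); subtracting it isolates the adoption effect.
    """
    treated_change = cell_mean(rows, "treated", "post") - cell_mean(rows, "treated", "pre")
    control_change = cell_mean(rows, "control", "post") - cell_mean(rows, "control", "pre")
    return treated_change - control_change


estimate = did_estimate(observations)
```

With these made-up numbers, the treated group rises by 30 commits per month and the control group by 6, so the estimate is 24. A staggered design generalizes this comparison to projects that adopt at different times, which requires estimators beyond this simple subtraction.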
