Broken Windows: Exploring the Applicability of a Controversial Theory on Code Quality

Diomidis Spinellis, Panos Louridas, Maria Kechagia, Tushar Sharma

TL;DR

The paper investigates whether the broken windows theory, which links environmental order to subsequent behavior, extends to software development by examining how existing code quality affects future changes. In a large-scale longitudinal study of C and Java projects (about two million commits across 122 projects), the authors extract time series of quality metrics and code smells, then apply autocorrelation analyses and Kolmogorov-Smirnov (KS) tests to assess inter-temporal relationships and developer behavior. They find that historical quality does influence future evolution for several internal quality metrics and that developers adjust their committing behavior in relation to file quality for certain metrics, though Java smells do not show a strong contextual effect. The work highlights the signaling role of code quality measures, provides replication data, and discusses practical implications for tooling and process improvements to maintain code hygiene and guide future development.

Abstract

Is the quality of existing code correlated with the quality of subsequent changes? According to the (controversial) broken windows theory, which inspired this study, disorder sets descriptive norms and signals behavior that further increases it. From a large code corpus, we examine whether code history does indeed affect the evolution of code quality. We examine C code quality metrics and Java code smells in specific files, and see whether subsequent commits by developers continue on that path. We check whether developers tailor the quality of their commits based on the quality of the file they commit to. Our results show that history matters, that developers behave differently depending on some aspects of the code quality they encounter, and that programming style inconsistency is not necessarily related to structural qualities. These findings have implications for both software practice and research. Software practitioners can emphasize current quality practices as these influence the code that will be developed in the future. Researchers in the field may replicate and extend the study to improve our understanding of the theory and its practical implications on artifacts, processes, and people.
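
The question of whether developers tailor their commits to the quality of the file they touch can be framed as a comparison of two samples, for example one developer's metric changes in high-quality files versus low-quality files. The following is a minimal sketch of the two-sample Kolmogorov-Smirnov statistic under that assumption; the function name, the grouping of files, and the sample values are hypothetical and are not taken from the paper, whose exact procedure is not described in this section.

```python
def ks_statistic(sample_a, sample_b):
    """Two-sample KS statistic: the largest gap between the two empirical CDFs.

    Hypothetical helper for illustration; it returns only the statistic,
    not a p-value.
    """
    if not sample_a or not sample_b:
        raise ValueError("both samples must be non-empty")
    a, b = sorted(sample_a), sorted(sample_b)
    i = j = 0
    largest_gap = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        # Advance both pointers past every value <= x, so that ties
        # are counted in both empirical CDFs before comparing them.
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        largest_gap = max(largest_gap, abs(i / len(a) - j / len(b)))
    return largest_gap


# Assumed toy data: per-commit metric deltas by one developer.
deltas_in_top_files = [1, 2, 3, 4]
deltas_in_bottom_files = [3, 4, 5, 6]
print(ks_statistic(deltas_in_top_files, deltas_in_bottom_files))
```

A statistic near 0 suggests the developer behaves similarly in both groups of files, while a value near 1 suggests clearly different behavior. In practice, a library routine such as `scipy.stats.ks_2samp` would likely be preferable, since it also reports a p-value.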

Paper Structure

This paper contains 15 sections, 3 equations, 5 figures, 3 tables.

Figures (5)

  • Figure 1: Factors affecting the evolution of internal code quality.
  • Figure 2: Percentage of files with autocorrelation $>0.5$ at each lag for code style metrics.
  • Figure 3: Percentage of files with autocorrelation $>0.5$ at each lag for code structure metrics.
  • Figure 4: Percentage of files with autocorrelation $>0.5$ at each lag for smells.
  • Figure 5: Percentage of developers with different behaviour in top vs bottom C files.
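
The quantity plotted in Figures 2 to 4, the percentage of files whose autocorrelation exceeds 0.5 at a given lag, can be sketched as follows. This is an illustration only: it assumes each file is represented by a list of metric values ordered by commit, uses the standard sample autocorrelation estimator, and treats constant series as having zero autocorrelation. The function names and file names are hypothetical, and the paper's own computation may differ.

```python
from statistics import mean


def autocorrelation(series, lag):
    """Sample autocorrelation of `series` at `lag` (standard estimator)."""
    n = len(series)
    if lag <= 0 or lag >= n:
        raise ValueError("lag must be between 1 and len(series) - 1")
    mu = mean(series)
    denom = sum((x - mu) ** 2 for x in series)
    if denom == 0:
        # Assumption: a constant series carries no measurable autocorrelation.
        return 0.0
    num = sum((series[t] - mu) * (series[t + lag] - mu) for t in range(n - lag))
    return num / denom


def share_above_threshold(file_series, lag, threshold=0.5):
    """Percentage of files whose autocorrelation at `lag` exceeds `threshold`.

    Files with too few observations for the requested lag are skipped.
    """
    eligible = [s for s in file_series.values() if len(s) > lag]
    if not eligible:
        return 0.0
    hits = sum(1 for s in eligible if autocorrelation(s, lag) > threshold)
    return 100.0 * hits / len(eligible)


# Assumed toy data: one metric value per commit for each file.
file_series = {
    "steady.c": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],         # smooth trend
    "erratic.c": [1, -1, 1, -1, 1, -1, 1, -1, 1, -1],    # alternating values
}
print(share_above_threshold(file_series, lag=1))
```

Here the steadily trending file has a lag-1 autocorrelation of 0.7 and counts toward the percentage, while the alternating file has a strongly negative value and does not.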