Source Code Hotspots: A Diagnostic Method for Quality Issues
Saleha Muzammil, Mughees Ur Rehman, Zoe Kotti, Diomidis Spinellis
TL;DR
This paper introduces source code hotspots, lines that change unusually often, as a means of diagnosing maintainability issues in evolving software. It presents a line-level hotspot mining method applied to 91 open-source repositories, yielding 15 hotspot types organized into four categories and revealing that bots account for roughly 74% of hotspot edits. The resulting taxonomy links each hotspot type to concrete refactoring and CI-based mitigations, and the analysis shows that over half of hotspots occur in administrative files and that many changes are mechanical noise. The findings offer actionable guidance for reducing avoidable churn and improving configurability, stability, and changeability, while the accompanying dataset and tooling support future research in software maintenance.
Abstract
Software source code often harbours "hotspots": small portions of the code that change far more often than the rest of the project and thus concentrate maintenance activity. We mine the complete version histories of 91 evolving, actively developed GitHub repositories and identify 15 recurring line-level hotspot patterns that explain why these hotspots emerge. The three most prevalent patterns are Pinned Version Bump (26%), revealing brittle release practices; Long Line Change (17%), signalling deficient layout; and Formatting Ping-Pong (9%), indicating missing or inconsistent style automation. Surprisingly, automated accounts generate 74% of all hotspot edits, suggesting that bot activity is a dominant but largely avoidable source of noise in change histories. By mapping each pattern to concrete refactoring guidelines and continuous integration checks, our taxonomy equips practitioners with actionable steps to curb hotspots and systematically improve software quality in terms of configurability, stability, and changeability.
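To make the notion of a line-level hotspot concrete, the following is a minimal sketch and not the paper's mining method. It assumes that line edits have already been extracted from the version history as (path, line_id, author) records with a stable per-line identifier, and it flags lines whose edit count is far above the median. The function names, the `factor` threshold, and the "[bot]" suffix heuristic for automated accounts are all hypothetical choices made for illustration.

```python
from collections import Counter
from statistics import median


def find_hotspots(edits, factor=5.0):
    """Return {(path, line_id): edit_count} for unusually often-edited lines.

    edits:  list of (path, line_id, author) tuples, one per line modification.
            A stable line_id across commits is ASSUMED to exist already.
    factor: hypothetical multiplier over the median per-line edit count.
    """
    counts = Counter((path, line_id) for path, line_id, _ in edits)
    if not counts:
        return {}
    typical = median(counts.values())
    return {key: n for key, n in counts.items() if n >= factor * typical}


def is_bot(author):
    """Hypothetical heuristic: treat accounts ending in '[bot]' as automated."""
    return author.endswith("[bot]")


def bot_share(edits, hotspots):
    """Fraction of edits to hotspot lines that were made by automated accounts."""
    authors = [a for path, line_id, a in edits if (path, line_id) in hotspots]
    if not authors:
        return 0.0
    return sum(1 for a in authors if is_bot(a)) / len(authors)


# Toy history: one pinned-version line edited ten times, four lines edited once.
edits = (
    [("package.json", "dep-version", "dependabot[bot]")] * 8
    + [("package.json", "dep-version", "alice")] * 2
    + [("src/app.py", f"line-{i}", "bob") for i in range(4)]
)
hotspots = find_hotspots(edits)
share = bot_share(edits, hotspots)
```

In this toy history only the pinned dependency line is flagged, and most of its edits come from the automated account, which mirrors the kind of bot-driven churn the abstract describes. A median-based threshold is just one simple way to express "changes far more often than the rest"; other cut-offs would serve equally well in a sketch.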
