Process-based Indicators of Vulnerability Re-Introducing Code Changes: An Exploratory Case Study
Samiha Shimmi, Nicholas M. Synovic, Mona Rahimi, George K. Thiruvathukal
TL;DR
This paper tackles vulnerability reintroduction by shifting from file-level analysis to commit-level evaluation of security fixes, using longitudinal software engineering process metrics. Through a case study on ImageMagick, it combines 76 vulnerability-reintroducing commits with CVEs and computes bus factor, issue density, and issue spoilage to reveal socio-technical patterns around reintroductions. The main contributions include a ground-truth dataset of vulnerability-reintroducing and vulnerability-fixing commits (extending BigVul and DiverseVul), a methodology that augments SZZ with LLM validation, and insights into how process dynamics relate to vulnerability reintroduction, such as stable bus factor, low issue density, and rising issue spoilage during reintroduction windows. The findings underscore the potential of integrating process metrics with code-level data to predict risky fixes and guide security-aware development practices in open-source projects.
Abstract
Software vulnerabilities often persist or re-emerge even after being fixed, revealing the complex interplay between code evolution and socio-technical factors. While source code metrics provide useful indicators of vulnerabilities, software engineering process metrics can uncover patterns that lead to their introduction. Yet few studies have explored whether process metrics can reveal risky development activities over time, even though such insights are essential for anticipating and mitigating software vulnerabilities. This work highlights the critical role of process metrics, alongside code changes, in understanding and mitigating vulnerability reintroduction. We move beyond file-level prediction and instead analyze security fixes at the commit level, focusing not only on whether a single fix introduces a vulnerability but also on the longer sequences of changes through which vulnerabilities evolve and re-emerge. Our approach emphasizes that reintroduction is rarely the result of one isolated action; rather, it emerges from cumulative development activities and socio-technical conditions. To support this analysis, we conducted a case study on the ImageMagick project, correlating longitudinal process metrics such as bus factor, issue density, and issue spoilage with vulnerability reintroduction activities across 76 instances of reintroduced vulnerabilities. Our findings show that reintroductions often align with increased issue spoilage and fluctuating issue density, reflecting short-term inefficiencies in issue management and team responsiveness. These observations provide a foundation for broader studies that combine process and code metrics to predict risky fixes and strengthen software security.
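To make the three process metrics named above concrete, the following is a minimal Python sketch. The definitions are assumptions chosen for illustration and are not taken from this paper: bus factor as the smallest number of authors covering a given share of commits, issue spoilage as the number of issues open on a given day, and issue density as issues opened so far per thousand lines of code. The function names, the 0.5 threshold, and the input shapes are likewise hypothetical.

```python
from collections import Counter
from datetime import date


def bus_factor(commit_authors, threshold=0.5):
    """Smallest number of authors whose commits cover `threshold` of all commits.

    Assumed definition for illustration; other formulations exist.
    """
    counts = Counter(commit_authors)
    total = sum(counts.values())
    covered = 0
    for rank, (_, n) in enumerate(counts.most_common(), start=1):
        covered += n
        if covered >= threshold * total:
            return rank
    return 0  # no commits at all


def issue_spoilage(issues, day):
    """Number of issues still open on `day`.

    `issues` is a list of (opened, closed) dates; `closed` is None if still open.
    """
    return sum(
        1
        for opened, closed in issues
        if opened <= day and (closed is None or closed > day)
    )


def issue_density(issues, day, kloc):
    """Issues opened on or before `day`, per thousand lines of code (KLOC)."""
    opened_so_far = sum(1 for opened, _ in issues if opened <= day)
    return opened_so_far / kloc


# Hypothetical toy data, not drawn from ImageMagick.
authors = ["a", "a", "a", "b", "c"]
issues = [
    (date(2024, 1, 1), date(2024, 1, 10)),
    (date(2024, 1, 5), None),
    (date(2024, 2, 1), None),
]
snapshot = date(2024, 1, 8)

print(bus_factor(authors))
print(issue_spoilage(issues, snapshot))
print(issue_density(issues, snapshot, kloc=4.0))
```

Evaluating these functions at regular intervals across a project's history would yield the kind of longitudinal series that could then be compared against the dates of reintroducing and fixing commits.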
