An Empirical Study of Vulnerability Handling Times in CPython
Jukka Ruohonen
TL;DR
This paper investigates how CPython vulnerabilities are handled along two timelines: the time to fix a vulnerability and the time to publish its CVE. It builds a manually collected dataset of 93 vulnerabilities from the old tracker and applies regression analysis, including an OLS model with a $\ln(x+1)$ transform and a robust $M$-estimator, to identify predictors of these times. The main finding is that the identity of the reporter explains most of the variation in handling times, whereas severity, proof-of-concept (POC) code, commits, references, and comments add no explanatory power. The models reach $R^2$ values of up to $0.923$. The results highlight the importance of reporting quality and coordination in the Python ecosystem and point to possible process improvements in vulnerability disclosure and downstream patch deployment.
Abstract
The paper examines the handling times of software vulnerabilities in CPython, the reference implementation and interpreter for what is likely today's most popular programming language, Python. The background comes from so-called vulnerability life cycle analysis, the literature on bug fixing times, and recent research on the security of Python software. Based on regression analysis, the vulnerability fixing times can be explained very well merely by knowing who reported the vulnerabilities. Severity, proof-of-concept code, commits made to a version control system, comments posted on a bug tracker, and references to other sources do not explain the fixing times. With these results, the paper contributes to the recent effort to better understand the security of the Python ecosystem.
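To make the regression setup concrete, the following is a minimal sketch rather than the paper's actual code or data. It assumes a hypothetical toy dataset of fixing times (in days) for three made-up reporters, applies the $\ln(x+1)$ transform, fits OLS on reporter dummy variables, and then refits with a Huber-type $M$-estimator via iteratively reweighted least squares. The choice of the Huber loss, the tuning constant 1.345, the MAD-based scale, and all names are assumptions made for illustration.

```python
import math
import statistics

# Hypothetical fixing times in days per reporter (NOT the paper's data).
days_by_reporter = {
    "A": [1, 2, 3, 2],
    "B": [30, 45, 38, 40],
    "C": [300, 420, 365, 2],  # the last value acts as an outlier
}

def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [a - f * c for a, c in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

def wls(X, y, w):
    """Weighted least squares via the normal equations (X'WX) beta = X'Wy."""
    k = len(X[0])
    A = [[sum(wi * xi[a] * xi[b] for xi, wi in zip(X, w)) for b in range(k)]
         for a in range(k)]
    c = [sum(wi * xi[a] * yi for xi, yi, wi in zip(X, y, w)) for a in range(k)]
    return solve(A, c)

def predict(X, beta):
    return [sum(b * x for b, x in zip(beta, xi)) for xi in X]

def r_squared(y, fitted):
    mean = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot

def huber_fit(X, y, c=1.345, iters=50):
    """Huber M-estimator via IRLS, with a MAD-based residual scale."""
    beta = wls(X, y, [1.0] * len(y))  # start from the OLS solution
    for _ in range(iters):
        resid = [yi - fi for yi, fi in zip(y, predict(X, beta))]
        med = statistics.median(resid)
        mad = statistics.median(abs(r - med) for r in resid)
        scale = max(mad / 0.6745, 1e-12)
        # Residuals beyond c * scale are downweighted; the rest keep weight 1.
        w = [1.0 if abs(r) <= c * scale else c * scale / abs(r) for r in resid]
        beta = wls(X, y, w)
    return beta

# Design matrix: intercept plus dummies for reporters B and C (A is the baseline).
X, y = [], []
for name, days in days_by_reporter.items():
    for d in days:
        X.append([1.0, float(name == "B"), float(name == "C")])
        y.append(math.log1p(d))  # ln(x + 1) transform of the handling time

ols_beta = wls(X, y, [1.0] * len(y))
r2 = r_squared(y, predict(X, ols_beta))
robust_beta = huber_fit(X, y)

print("OLS coefficients:   ", ols_beta)
print("OLS R^2:            ", r2)
print("Huber coefficients: ", robust_beta)
```

In this toy setup the OLS intercept equals the mean transformed fixing time of the baseline reporter, and each dummy coefficient is the difference from that baseline. The robust fit downweights the single short fixing time for reporter C, so its coefficient moves toward the bulk of that reporter's observations.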
