Chasing One-day Vulnerabilities Across Open Source Forks
Romain Lefeuvre, Charly Reux, Stefano Zacchiroli, Olivier Barais, Benoit Combemale
TL;DR
This work tackles cross-fork vulnerability propagation by introducing a commit-level propagation framework grounded in the global Software Heritage commit graph and OSV vulnerability ranges. It shows how upstream vulnerability information can be propagated to downstream forks to identify one-day vulnerabilities that may persist in popular forks, addressing a blind spot in repository-local analyses. The authors report a large-scale study starting from 7162 upstream repositories, uncovering 2.2 million forks that contain at least one potentially vulnerable commit. After strict filtering, they identify 356 (vulnerability, fork) pairs affecting active and popular GitHub forks; 65 of these pairs were manually evaluated, and 3 high-severity one-day vulnerabilities were confirmed. The study further demonstrates practical integration opportunities in software development workflows, including a prototype tool for Go modules and Git submodules, and discusses implications for fork maintenance, false-positive reduction, and semantic-versioning mappings. Overall, the approach offers a scalable path to strengthening the security of forked ecosystems and informs strategies for automated patch propagation and maintainer notification across the open-source landscape.
Abstract
Tracking vulnerabilities inherited from third-party open-source components is a well-known challenge, often addressed by tracing the threads of dependency information. However, vulnerabilities can also propagate through forking: a repository forked after the introduction of a vulnerability, but before it is patched, may remain vulnerable in the fork well after being fixed in the original project. Current approaches for vulnerability analysis lack the commit-level granularity needed to track vulnerability introductions and fixes across forks, potentially leaving one-day vulnerabilities undetected. This paper presents a novel approach to help developers identify one-day vulnerabilities in forked repositories. Leveraging the global graph of public code, as captured by the Software Heritage archive, the approach propagates vulnerability information at the commit level and performs automated impact analysis. This enables automatic detection of forked projects that have not incorporated fixes, leaving them potentially vulnerable. Starting from 7162 repositories that, according to OSV, include vulnerable commits in their development histories, we identify 2.2 M forks containing at least one vulnerable commit. We then apply strict filtering, which yields 356 (vulnerability, fork) pairs impacting active and popular GitHub forks. We manually evaluate 65 of these pairs and find 3 high-severity vulnerabilities, demonstrating the impact and applicability of this approach.
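To make the commit-level idea concrete, the following is a minimal sketch on a toy commit graph, not the authors' implementation. It assumes an OSV-style range with a single "introduced" commit and a single "fixed" commit, and treats a commit as potentially vulnerable when it descends from the introducing commit but not from the fixing commit. All names here (`descendants`, `vulnerable_commits`, the commit identifiers, and the fork labels) are hypothetical and chosen for illustration only; the real analysis runs over the far larger Software Heritage graph.

```python
from collections import deque


def descendants(children, start):
    """Return every commit reachable from `start` via child edges, including `start`."""
    seen = {start}
    queue = deque([start])
    while queue:
        commit = queue.popleft()
        for child in children.get(commit, ()):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen


def vulnerable_commits(parents, introduced, fixed=None):
    """Commits that descend from `introduced` but not from `fixed` (assumed semantics)."""
    # Invert the parent map so the graph can be walked forward in history.
    children = {}
    for commit, commit_parents in parents.items():
        for parent in commit_parents:
            children.setdefault(parent, []).append(commit)
    affected = descendants(children, introduced)
    patched = descendants(children, fixed) if fixed is not None else set()
    return affected - patched


# Toy history shared by an upstream repository and two forks (hypothetical data).
parents = {
    "a": [],
    "b": ["a"],         # vulnerability introduced upstream
    "c": ["b"],
    "d": ["c"],         # fix committed upstream
    "e": ["d"],         # upstream head
    "f1": ["c"],        # fork created after the introduction, before the fix
    "f2": ["f1"],       # head of a fork that never picked up the fix
    "g": ["f2", "d"],   # head of another fork that merged the upstream fix
}
heads = {"upstream": "e", "stale-fork": "f2", "merged-fork": "g"}

vulnerable = vulnerable_commits(parents, introduced="b", fixed="d")
flagged = sorted(name for name, head in heads.items() if head in vulnerable)
print(flagged)
```

In this toy example only the fork whose head never merged the fix is flagged. Because the check relies purely on graph reachability, a fork that cherry-picked or re-implemented the fix would still be flagged, which is one reason the flagged pairs are described as potentially vulnerable and are followed by filtering and manual evaluation.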
