An LLM-based Quantitative Framework for Evaluating High-Stealthy Backdoor Risks in OSS Supply Chains
Zihe Yan, Kai Luo, Haoyu Yang, Yang Yu, Zhuosheng Zhang, Guancheng Li
TL;DR
This work addresses stealthy backdoor risks in open-source software (OSS) supply chains by introducing an attacker-centric threat model and a multi-dimensional risk evaluation framework that leverages LLM-based semantic analysis. The proposed HSBR framework combines four risk dimensions (Dependency Impact, Payload Concealment (PC), Community Quality (CQ), and Continuous Integration (CI)) into a unified scoring system in which metrics are quantile-normalized and aggregated as $R_{total}$ using weights that emphasize propagation and governance signals. An end-to-end automated tool, HSBR E, collects metadata, applies dimension-specific scoring (including LLM-assisted CQ/PC analysis), and outputs explainable HSBR scores and rationales. Evaluation on 66 Debian packages with public GitHub repositories and a case study resembling the XZ backdoor indicates that Community Quality and CI practices are critical risk surfaces, with strong cross-metric clusters supporting the framework’s multi-dimensional approach. The results offer actionable insights for prioritizing defensive efforts and show robustness to weighting and model variations, which suggests that the methodology is applicable across diverse OSS ecosystems.
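The scoring step described above (quantile-normalize each dimension, then take a weighted sum to obtain $R_{total}$) can be illustrated with a minimal sketch. This is not the authors' implementation: the weight values, the function names `quantile_normalize` and `total_risk`, the dictionary keys, and the convention that a higher raw metric means higher risk are all assumptions made for illustration. The summary states only that the weights emphasize propagation and governance signals.

```python
from bisect import bisect_right

# Hypothetical weights (assumption): the source says only that propagation
# and governance signals are emphasized, not what the values are.
WEIGHTS = {
    "dependency_impact": 0.3,
    "payload_concealment": 0.2,
    "community_quality": 0.3,
    "continuous_integration": 0.2,
}


def quantile_normalize(values):
    """Map each raw value to its empirical quantile in (0, 1].

    A value's quantile is the fraction of observations that are <= it,
    so ties receive the same score.
    """
    ordered = sorted(values)
    n = len(ordered)
    return [bisect_right(ordered, v) / n for v in values]


def total_risk(raw_scores, weights=WEIGHTS):
    """Compute a weighted total per package.

    raw_scores: {package: {dimension: raw metric}}, where a higher raw
    metric is assumed to mean higher risk.
    Returns {package: R_total}.
    """
    packages = list(raw_scores)
    normalized = {p: {} for p in packages}
    for dim in weights:
        # Normalize each dimension across all packages independently.
        column = [raw_scores[p][dim] for p in packages]
        for p, q in zip(packages, quantile_normalize(column)):
            normalized[p][dim] = q
    return {
        p: sum(weights[d] * normalized[p][d] for d in weights)
        for p in packages
    }


if __name__ == "__main__":
    # Made-up raw metrics for two placeholder packages.
    demo = {
        "pkg_a": {"dependency_impact": 5, "payload_concealment": 1,
                  "community_quality": 2, "continuous_integration": 0},
        "pkg_b": {"dependency_impact": 9, "payload_concealment": 4,
                  "community_quality": 7, "continuous_integration": 3},
    }
    for name, score in total_risk(demo).items():
        print(f"{name}: {score:.2f}")
```

Because quantile normalization depends only on rank within the evaluated set, the resulting scores are relative to the packages being compared rather than absolute measures of risk.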
Abstract
In modern software development workflows, the open-source software supply chain contributes significantly to efficient and convenient engineering practices. As system complexity increases, using open-source software as third-party dependencies has become common practice. However, poorly maintained underlying dependencies and insufficient community auditing make it difficult to ensure the security of source code and the legitimacy of repository maintainers, especially under highly stealthy backdoor attacks such as the XZ Utils incident. To address these problems, we propose a fine-grained project evaluation framework for backdoor risk assessment in open-source software. The framework models stealthy backdoor attacks from the attacker's viewpoint and defines targeted metrics for each attack stage. In addition, static analysis is limited in its ability to assess the reliability of repository maintenance activities, such as irregular escalation of committer privileges and limited participation in reviews. To overcome this limitation, the framework uses large language models (LLMs) to evaluate code repositories semantically without relying on manually crafted patterns. The framework is evaluated on 66 high-priority packages in the Debian ecosystem. The experimental results indicate that the current open-source software supply chain is exposed to various security risks.
