Argus: A Multi-Agent Sensitive Information Leakage Detection Framework Based on Hierarchical Reference Relationships

Bin Wang, Hui Li, Liyang Zhang, Qijia Zhuang, Ao Yang, Dong Zhang, Xijun Luo, Bing Lin

TL;DR

This paper addresses the high false-positive rates that affect sensitive-information leak detection in code repositories by introducing Argus, a multi-agent framework that analyzes each candidate at three contextual levels: intrinsic semantics, immediate context, and project-wide references. It combines role specialization (Initial Screening, Basic and Advanced Check Agents) with a three-tier shared memory pool that coordinates evidence gathering and decision making, achieving state-of-the-art performance on two new benchmarks. On CommonLeak, Argus reports $94.86\%$ accuracy, $96.36\%$ precision, and $94.64\%$ recall ($F_1 = 0.955$), at a total cost of \$2.21 across 97 real repositories; ablation shows that each of the three levels contributes substantially. The work also provides TrustedFalseSecrets to assess false-positive filtering, demonstrates robustness across languages and secret types, and offers practical deployment guidance and open-source resources for broader adoption.
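
As a quick consistency check on the reported CommonLeak metrics, and assuming the paper uses the standard definition of $F_1$ as the harmonic mean of precision $P$ and recall $R$, the stated precision and recall reproduce the stated score:

$$
F_1 = \frac{2PR}{P + R} = \frac{2 \times 0.9636 \times 0.9464}{0.9636 + 0.9464} \approx \frac{1.8239}{1.9100} \approx 0.955
$$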

Abstract

Sensitive information leakage in code repositories has emerged as a critical security challenge. Traditional detection methods that rely on regular expressions, fingerprint features, and high-entropy calculations often suffer from high false-positive rates. This not only reduces detection efficiency but also significantly increases the manual screening burden on developers. Recent advances in large language models (LLMs) and multi-agent collaborative architectures have demonstrated remarkable potential for tackling complex tasks, offering a new technical perspective on sensitive information detection. In response to these challenges, we propose Argus, a multi-agent collaborative framework for detecting sensitive information. Argus employs a three-tier detection mechanism that integrates key content, file context, and project reference relationships to reduce false positives and improve overall detection accuracy. To evaluate Argus in real-world repository environments, we developed two new benchmarks: one to assess genuine leak detection capabilities and another to evaluate false-positive filtering performance. Experimental results show that Argus achieves up to 94.86% accuracy in leak detection, with a precision of 96.36%, a recall of 94.64%, and an F1 score of 0.955. Moreover, the analysis of 97 real repositories incurred a total cost of only $2.2. All code implementations and related datasets are publicly available at https://github.com/TheBinKing/Argus-Guard for further research and application.
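
The three-tier mechanism described in the abstract (key content, file context, project reference relationships) can be pictured as an escalation pipeline in which each tier filters candidates before the next one runs. The sketch below is purely illustrative: simple heuristics stand in for Argus's LLM agents, and every name, pattern, and threshold (`Candidate`, `level1_key_content`, `PLACEHOLDER`, the 3.5-bit entropy cutoff, and so on) is hypothetical rather than taken from the Argus code base. It also assumes that a value referenced elsewhere in the project is more likely to be a live secret, which is an assumption of this sketch and not a claim from the paper.

```python
import math
import re
from dataclasses import dataclass


@dataclass
class Candidate:
    """A suspected secret (hypothetical structure, not from the Argus code)."""
    name: str       # identifier the value is assigned to
    value: str      # the suspected secret string
    file_path: str  # path of the file that contains it


def shannon_entropy(text: str) -> float:
    """Shannon entropy of the string, in bits per character."""
    if not text:
        return 0.0
    freqs = [text.count(ch) / len(text) for ch in set(text)]
    return -sum(p * math.log2(p) for p in freqs)


# Assumed placeholder markers; a real system would rely on semantic judgment.
PLACEHOLDER = re.compile(r"example|dummy|changeme|your[_-]?key", re.IGNORECASE)
# Assumed directories whose contents are usually fixtures or documentation.
NON_PRODUCTION_DIR = re.compile(r"(^|/)(tests?|docs?|examples?)/")


def level1_key_content(c: Candidate) -> bool:
    """Tier 1: does the value itself look like a secret?"""
    return (
        len(c.value) >= 16
        and shannon_entropy(c.value) >= 3.5  # illustrative threshold
        and not PLACEHOLDER.search(c.value)
    )


def level2_file_context(c: Candidate) -> bool:
    """Tier 2: does the containing file look like production code?"""
    return not NON_PRODUCTION_DIR.search(c.file_path)


def level3_project_references(c: Candidate, project: dict[str, str]) -> bool:
    """Tier 3: is the identifier used in any other file of the project?"""
    pattern = re.compile(rf"\b{re.escape(c.name)}\b")
    return any(
        pattern.search(text)
        for path, text in project.items()
        if path != c.file_path
    )


def classify(c: Candidate, project: dict[str, str]) -> str:
    """Run the tiers in order and stop at the first one that rejects."""
    if not level1_key_content(c):
        return "filtered at level 1"
    if not level2_file_context(c):
        return "filtered at level 2"
    if not level3_project_references(c, project):
        return "filtered at level 3"
    return "likely leak"


# Usage with made-up data: a random-looking token that another file references.
project = {
    "src/config.py": 'API_TOKEN = "q8Zr2LmX9vT4bN7cKp3W"',
    "src/client.py": "headers = {'Authorization': config.API_TOKEN}",
}
candidate = Candidate("API_TOKEN", "q8Zr2LmX9vT4bN7cKp3W", "src/config.py")
print(classify(candidate, project))  # likely leak
```

Ordering the tiers from cheapest to most expensive means most candidates are resolved before any project-wide lookup is needed, which is one plausible reason a tiered design can keep analysis costs low.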

Paper Structure

This paper contains 30 sections, 3 equations, 6 figures, 9 tables, 1 algorithm.

Figures (6)

  • Figure 2: Overview of the Argus framework and its operational flow
  • Figure 3: Composition of Config and Others
  • Figure 3: Illustration of the Argus workflow: multi-level analysis and agent collaboration
  • Figure 4: Performance comparison of different tools across the overall category and five specific subcategories. The abbreviations used for baseline methods are listed in the baseline tools table.
  • Figure 5: Performance Comparison of Argus and Baseline Methods for Classification by Secret Types
  • ...and 1 more figure