SoK: Towards Effective Automated Vulnerability Repair

Ying Li, Faysal Hossain Shezan, Bomin Wei, Gang Wang, Yuan Tian

TL;DR

The SoK surveys automated vulnerability repair (AVR) as an end-to-end process comprising vulnerability localization, security patch generation (SPG), and patch validation, framing AVR as a specialized subset of automatic program repair driven by security constraints. It classifies patch generation methods into Template-Guided, Search-Based, Constraint-Based, and Learning-Driven categories, and evaluates them across synthetic and real-world benchmarks to reveal strengths, limitations, and the absence of a universally best approach. Key findings show that learning-based AVR excels in some contexts but struggles with complex, real-world program understanding, while non-learning methods remain robust on real vulnerabilities; hybridization and better specifications emerge as promising directions. The work proposes future research on hybrid SPG, domain-specific AVR, richer benchmarks, interpretability, automated specifications, and verifiers to improve practical impact in software security.

Abstract

The increasing prevalence of software vulnerabilities necessitates automated vulnerability repair (AVR) techniques. This Systematization of Knowledge (SoK) provides a comprehensive overview of the AVR landscape, encompassing both synthetic and real-world vulnerabilities. Through a systematic literature review and quantitative benchmarking across diverse datasets, methods, and strategies, we establish a taxonomy of existing AVR methodologies, categorizing them into template-guided, search-based, constraint-based, and learning-driven approaches. We evaluate the strengths and limitations of these approaches, highlighting common challenges and practical implications. Our comprehensive analysis of existing AVR methods reveals a diverse landscape with no single "best" approach. Learning-based methods excel in specific scenarios but lack complete program understanding, and both learning and non-learning methods face challenges with complex vulnerabilities. Additionally, we identify emerging trends and propose future research directions to advance the field of AVR. This SoK serves as a valuable resource for researchers and practitioners, offering a structured understanding of the current state of the art and guiding future research and development in this critical domain.

Paper Structure

This paper contains 24 sections, 1 equation, 4 figures, and 7 tables.

Figures (4)

  • Figure 1: The timeline of vulnerability discovery, patch release, and exploit publication.
  • Figure 2: Taxonomy of Automated Vulnerability Repair approaches. Boxes with blue background present the taxonomy of security patch generation methods. Boxes with green background show the taxonomy of security patch validation approaches.
  • Figure 3: Total vs. (Non-)Learning-Based Repair Counts Across CWE Categories in real-world benchmarks.
  • Figure 4: Inter-procedural processing for $\mathcal{D}_{SARD}$.