Fine-Grained 1-Day Vulnerability Detection in Binaries via Patch Code Localization

Chaopeng Dong, Jingdong Guo, Shouguo Yang, Yang Xiao, Yi Li, Hong Li, Zhi Li, Limin Sun

TL;DR

The paper addresses the detection of 1-day vulnerabilities in stripped binaries despite interference from compilers, optimization levels, irrelevant functions, and patch-similar code blocks. It introduces PLocator, a CFG-based patch localization framework that builds stable signatures from the patch code and its context using anchors and anchor graphs, combined with an irrelevant-function filtering step and a patch-path verification step. Evaluated on real-world datasets (73 CVEs across multiple projects) under three experimental settings (Same, XO, and XC), PLocator achieves an average $TPR$ of 88.2% and an average $FPR$ of 12.9% with a per-case execution time of about 0.14s, outperforming state-of-the-art baselines by 26.7% and 63.5%, respectively. The results indicate that PLocator is practical for real-world vulnerability detection, including cases with patch-similar code and tiny patches, and that it remains robust across compilers and optimization levels. The work contributes an anchor-based methodology, a detailed evaluation, ablation studies, and guidance for deploying patch presence tests in security workflows.

Abstract

1-day vulnerabilities in binaries have become a major threat to software security. Patch presence testing is an effective way to detect them. However, existing patch presence tests perform poorly in practical scenarios because of interference from different compilers and optimizations, patch-similar code blocks, and irrelevant functions in stripped binaries. In this paper, we propose a novel approach named PLocator, which leverages stable values from both the patch code and its context, extracted from the control flow graph, to accurately locate the real patch code in the target function, offering a practical solution for real-world vulnerability detection scenarios. To evaluate the effectiveness of PLocator, we collected 73 CVEs and constructed two comprehensive datasets ($Dataset_{-irr}$ and $Dataset_{+irr}$), comprising 1,090 and 27,250 test cases, respectively, built with two compilers at four optimization levels. We ran three experiments: Same, XO (cross-optimizations), and XC (cross-compilers). The results demonstrate that PLocator achieves an average TPR of 88.2% and FPR of 12.9% in a short amount of time, outperforming state-of-the-art approaches by 26.7% and 63.5%, respectively, indicating that PLocator is more practical for the 1-day vulnerability detection task.

Paper Structure

This paper contains 52 sections, 5 equations, 14 figures, 4 tables, 1 algorithm.

Figures (14)

  • Figure 1: Workflow of the 1-day vulnerability detection task in stripped binaries. The functions in the red, green, and grey rectangles are the vulnerable, fixed, and irrelevant functions, respectively. The yellow code blocks are similar to the green patch blocks.
  • Figure 2: The binary function similarity distribution computed by jTrans. The left two scatter plots show the BCSD similarity distribution for binaries under the same compilation setting and across different optimizations, respectively. The x-axis represents different reference vulnerable functions, while the y-axis represents the similarities of the top-50 most similar functions from a function pool of 6,000 candidates. The right two figures show the TPR and FPR for the two compilation settings under different thresholds. A target function whose similarity exceeds the threshold is identified as "vulnerable"; otherwise it is identified as "irrelevant".
  • Figure 3: A motivating example of CVE-2014-3470. The yellow code blocks have assembly code similar to that of the green patch code blocks. The stable values and instructions are highlighted in blue text.
  • Figure 4: The workflow of PLocator, composed of 7 steps: patch code mapping (❶), anchor graph construction (❷), anchor path extraction (❸), irrelevant function filtering (❹), anchor path matching (❺), patch path verification (❻), and function classification (❼).
  • Figure 5: Example of anchor graph construction. The left and right graphs are the CFG and the generated AG, respectively. The dotted lines include the legends of the graph components. "INF" is a special flag for comparison between two different variables.
  • ...and 9 more figures
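
The thresholding rule described in the Figure 2 caption can be illustrated with a minimal Python sketch. It is not the paper's or jTrans's implementation: the function names `classify` and `tpr_fpr`, the `SAMPLE` scores, and the choice of "vulnerable" as the positive class are all assumptions made for illustration. It labels a candidate "vulnerable" when its similarity exceeds the threshold and then computes TPR and FPR at several thresholds.

```python
from typing import List, Tuple


def classify(similarity: float, threshold: float) -> str:
    """Label a candidate 'vulnerable' if its similarity exceeds the threshold."""
    return "vulnerable" if similarity > threshold else "irrelevant"


def tpr_fpr(scores: List[Tuple[float, bool]], threshold: float) -> Tuple[float, float]:
    """Compute (TPR, FPR) for (similarity, is_truly_vulnerable) pairs.

    Assumption: 'vulnerable' is treated as the positive class.
    """
    tp = fp = fn = tn = 0
    for similarity, truly_vulnerable in scores:
        predicted = classify(similarity, threshold) == "vulnerable"
        if predicted and truly_vulnerable:
            tp += 1
        elif predicted and not truly_vulnerable:
            fp += 1
        elif truly_vulnerable:
            fn += 1
        else:
            tn += 1
    # Guard against empty classes to avoid division by zero.
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return tpr, fpr


# Hypothetical similarity scores, NOT taken from the paper.
SAMPLE = [
    (0.95, True), (0.80, True), (0.60, True),
    (0.90, False), (0.70, False), (0.40, False), (0.30, False),
]

if __name__ == "__main__":
    for threshold in (0.5, 0.75, 0.92):
        tpr, fpr = tpr_fpr(SAMPLE, threshold)
        print(f"threshold={threshold:.2f}  TPR={tpr:.2f}  FPR={fpr:.2f}")
```

With these made-up scores, raising the threshold lowers the FPR but also lowers the TPR, because an irrelevant function can score higher than a truly vulnerable one. This is the kind of trade-off the right-hand panels of Figure 2 plot.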

Theorems & Definitions (1)

  • Definition 1: 1-Day Vulnerability Detection Task