Parf: Adaptive Parameter Refining for Abstract Interpretation

Zhongyi Wang, Linyu Yang, Mingshuai Chen, Yixuan Bu, Zhiyang Li, Qiuye Wang, Shengchao Qin, Xiao Yi, Jianwei Yin

TL;DR

Parf addresses the difficulty of configuring abstract interpretation-based static analyzers with a probabilistic, lattice-based framework that treats external parameters as random variables and adaptively refines their distributions within a time budget. It models the parameters as a joint space $PS$ whose distributions have base and delta components, and it runs a Sample-Analyze-Refine loop that converges on high-accuracy configurations. The approach is demonstrated on Frama-C/Eva and Mopsa using the OSCS and SV-COMP benchmarks; on OSCS with Frama-C/Eva, Parf attains the lowest number of false positives on 34 of 37 program repositories. The key contributions are the latticed parameter space formulation, the base+delta distribution design, and a refinement strategy that is both incremental and adaptive. In practice, Parf automates parameter tuning, reducing false positives when analyzing large real-world programs.

Abstract

The core challenge in applying abstract interpretation lies in the configuration of abstraction and analysis strategies encoded by a large number of external parameters of static analysis tools. To attain low false-positive rates (i.e., high accuracy) while preserving analysis efficiency, tuning the parameters heavily relies on expert knowledge and is thus difficult to automate. In this paper, we present a fully automated framework called Parf to adaptively tune the external parameters of abstract interpretation-based static analyzers. Parf models various types of parameters as random variables subject to probability distributions over latticed parameter spaces. It incrementally refines the probability distributions based on accumulated intermediate results generated by repeatedly sampling and analyzing, thereby ultimately yielding a set of highly accurate parameter settings within a given time budget. We have implemented Parf on top of Frama-C/Eva, an off-the-shelf open-source static analyzer for C programs, and compared it against the expert refinement strategy and Frama-C/Eva's official configurations over the Frama-C OSCS benchmark. Experimental results indicate that Parf achieves the lowest number of false positives on 34/37 (91.9%) program repositories, and on 12/37 (32.4%) of them it is the only approach to reach that best result. In particular, Parf exhibits promising performance for analyzing complex, large-scale real-world programs.
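
The sample, analyze, and refine steps described in the abstract can be illustrated with a minimal Python sketch. It is not Parf's implementation: the parameter names (`precision`, `widen_hints`), the `toy_analyzer` scoring function, and the blending rule in `refine` are all hypothetical stand-ins chosen only to make the loop runnable, and the real framework works over latticed parameter spaces rather than flat categorical distributions.

```python
import random
import time
from collections import Counter


def toy_analyzer(setting):
    """Hypothetical stand-in for one analyzer run; returns a fake alarm count."""
    bonus = 4 if setting["widen_hints"] else 0
    return max(0, 20 - 3 * setting["precision"] - bonus)


def sample(dist, rng):
    """Draw one parameter setting from per-parameter categorical distributions."""
    return {
        name: rng.choices(list(probs), weights=list(probs.values()))[0]
        for name, probs in dist.items()
    }


def refine(dist, results, keep=0.5):
    """Shift probability mass toward values seen in the lowest-alarm samples."""
    ranked = sorted(results, key=lambda r: r[1])
    best = [s for s, _ in ranked[: max(1, int(len(ranked) * keep))]]
    new_dist = {}
    for name, probs in dist.items():
        counts = Counter(s[name] for s in best)
        # Blend the old probability with the empirical frequency among the best
        # samples, so that no value's probability ever drops to zero.
        mixed = {v: 0.5 * p + 0.5 * counts[v] / len(best) for v, p in probs.items()}
        total = sum(mixed.values())
        new_dist[name] = {v: w / total for v, w in mixed.items()}
    return new_dist


def tune(dist, analyzer, budget_s, samples_per_round=8, seed=0, max_rounds=None):
    """Repeat sample -> analyze -> refine until the time budget is exhausted.

    Returns the results of the final round and the final distribution.
    max_rounds is an extra cap added here only to keep demos short.
    """
    rng = random.Random(seed)
    deadline = time.monotonic() + budget_s
    results, rounds = [], 0
    while time.monotonic() < deadline and (max_rounds is None or rounds < max_rounds):
        settings = [sample(dist, rng) for _ in range(samples_per_round)]
        results = [(s, analyzer(s)) for s in settings]
        dist = refine(dist, results)
        rounds += 1
    return results, dist


initial = {
    "precision": {0: 0.25, 1: 0.25, 2: 0.25, 3: 0.25},
    "widen_hints": {False: 0.5, True: 0.5},
}
last_round, final_dist = tune(initial, toy_analyzer, budget_s=5.0, max_rounds=5)
print(final_dist)
```

In a real setup, `toy_analyzer` would be replaced by an invocation of the static analyzer on the target program, and the alarm count it returns would drive the refinement.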

Paper Structure (21 sections, 14 equations, 7 figures, 5 tables, 2 algorithms)

Figures (7)

  • Figure 1: A typical parameter setting (under precision 3) of Frama-C/Eva [eva_user_manual] with different parameter types: integer, Boolean, string, and set-of-strings.
  • Figure 2: Identifying potential runtime errors in a C program via the abstract interpretation-based static analyzer Frama-C/Eva.
  • Figure 3: Architecture of the Parf framework. Parf adopts a multi-round iterative mechanism: In each iteration, Parf (i) repeatedly samples parameter settings based on the initial or refined probability distribution of parameters, then (ii) uses these parameter settings as inputs to the static analyzer to analyze the program, and finally (iii) utilizes the analysis results to refine the probability distribution of parameters. Parf continues this process until the prescribed time budget is exhausted, upon which it returns the analysis results of the final round together with the final probability distribution of parameters.
  • Figure 4: Constructing $P^i$ from $P^{i}_{\textnormal{base}}$ and $P^{i}_{\textnormal{delta}}$ via $\oplus$. The three columns, from left to right, are $P^{i}_{\textnormal{base}}$, $P^{i}_{\textnormal{delta}}$, and $P^i$. The four rows, from top to bottom, correspond to an integer parameter, a Boolean parameter with $Pr[P^{i}_{\textnormal{base}}=0]=1$, a Boolean parameter with $Pr[P^{i}_{\textnormal{base}}=1]=1$, and a set-of-strings parameter $P^\textnormal{domains}$ over $\mathcal{P}\{d_1, d_2, d_3, d_4, d_5\}$.
  • Figure 5: Incremental refinement of $P_{\textnormal{base}}$.
  • ...and 2 more figures
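
To make the base/delta construction named in the Figure 4 caption more concrete, the following Python sketch combines two discrete distributions into one. The caption does not define $\oplus$, so everything specific here is an assumption made for illustration only: the two components are treated as independent, and the combining operator is taken to be addition for integers, logical OR for Booleans, and union for sets of strings. The probabilities and the helper name `combine` are likewise hypothetical.

```python
from itertools import product


def combine(base, delta, op):
    """Distribution of op(b, d) for independent b ~ base and d ~ delta."""
    out = {}
    for (b, pb), (d, pd) in product(base.items(), delta.items()):
        value = op(b, d)
        out[value] = out.get(value, 0.0) + pb * pd
    return out


# Integer parameter (assumed operator: addition).
int_dist = combine({2: 1.0}, {0: 0.5, 1: 0.25, 2: 0.25}, lambda b, d: b + d)

# Boolean parameter with base fixed at 0, then at 1 (assumed operator: OR).
bool_delta = {0: 0.5, 1: 0.5}
bool_off = combine({0: 1.0}, bool_delta, lambda b, d: b | d)
bool_on = combine({1: 1.0}, bool_delta, lambda b, d: b | d)

# Set-of-strings parameter (assumed operator: union). frozenset is used
# because dictionary keys must be hashable.
set_base = {frozenset({"d1"}): 1.0}
set_delta = {
    frozenset(): 0.5,
    frozenset({"d2"}): 0.25,
    frozenset({"d1", "d3"}): 0.25,
}
set_dist = combine(set_base, set_delta, lambda b, d: b | d)

for name, dist in [("int", int_dist), ("bool/base=0", bool_off),
                   ("bool/base=1", bool_on), ("set", set_dist)]:
    print(name, dist)
```

Under these assumed operators, a delta component can only move a value upward from its base (a larger integer, a Boolean switched on, a larger set), which is one way a lattice order on the parameter space could be respected; the paper's actual definition of $\oplus$ may differ.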

Theorems & Definitions (4)

  • Remark 1
  • Remark 2
  • Remark 3
  • Remark 4