Table of Contents
Fetching ...

Bit-Vector CHC Solving for Binary Analysis and Binary Analysis for Bit-Vector CHC Solving

Aaron Bembenek, Toby Murray

Abstract

For high-assurance software, source-level reasoning is insufficient: we need binary-level guarantees. Despite constrained Horn clause (CHC) solving being one of the most popular forms of automated verification, prior work has not evaluated the viability of CHC solving for binary analysis. To fill this gap, we assemble a pipeline that encodes binary analysis problems as CHCs in the SMT logic of quantifier-free bit vectors, and show that off-the-shelf CHC solvers achieve reasonable success on binaries compiled from 983 C invariant inference benchmarks: a portfolio solves 59.5% and 66.1% of the problems derived from the unoptimized and optimized binaries, respectively -- roughly equal to the success rate of a leading C verifier on the source code (60.1%). Moreover, we show that binary analysis provides a valuable source of bit-vector CHC benchmarks (which are in short supply): binary-derived problems differ from existing benchmarks both structurally and in solver success rates and rankings. Augmenting CHC solving competitions with binary-derived benchmarks will encourage solver developers to improve bit-vector reasoning, in turn making CHC solving a more effective tool for binary analysis.

Bit-Vector CHC Solving for Binary Analysis and Binary Analysis for Bit-Vector CHC Solving

Abstract

For high-assurance software, source-level reasoning is insufficient: we need binary-level guarantees. Despite constrained Horn clause (CHC) solving being one of the most popular forms of automated verification, prior work has not evaluated the viability of CHC solving for binary analysis. To fill this gap, we assemble a pipeline that encodes binary analysis problems as CHCs in the SMT logic of quantifier-free bit vectors, and show that off-the-shelf CHC solvers achieve reasonable success on binaries compiled from 983 C invariant inference benchmarks: a portfolio solves 59.5% and 66.1% of the problems derived from the unoptimized and optimized binaries, respectively -- roughly equal to the success rate of a leading C verifier on the source code (60.1%). Moreover, we show that binary analysis provides a valuable source of bit-vector CHC benchmarks (which are in short supply): binary-derived problems differ from existing benchmarks both structurally and in solver success rates and rankings. Augmenting CHC solving competitions with binary-derived benchmarks will encourage solver developers to improve bit-vector reasoning, in turn making CHC solving a more effective tool for binary analysis.

Paper Structure

This paper contains 27 sections, 4 equations, 6 figures, 4 tables.

Figures (6)

  • Figure 1: We assembled a pipeline for deriving bit-vector CHC problems from binaries: the binary is lifted to a Boogie program, the Boogie program is encoded as CHCs, and then the CHCs are simplified. The first and third stages use off-the-shelf components (blue); the second stage is our own code (orange).
  • Figure 2: The lifted binaries we encode as CHCs are Boogie programs described by this (stylized) grammar. Boolean values are treated as bit vectors of size one.
  • Figure 3: The three judgments given here define how we encode Boogie expressions, statements, and blocks as bit-vector CHCs.
  • Figure 4: Eldarica and Spacer solve almost the same number of benchmarks, while a portfolio of the two solvers solves 9 more than either individual solver.
  • Figure 5: Each category of bit-vector CHC problems has a unique "fingerprint". Features are ranked by their discriminative power. Positive coefficients indicate the positive association of a feature with a category (the opposite for negative).
  • ...and 1 more figures