Table of Contents
Fetching ...

Optimization of the Context-Free Language Reachability Matrix-Based Algorithm

Ilia Muravev

TL;DR

The paper tackles the scalability of Context-Free Language Reachability (CFL-r) analysis on large interprocedural graphs by reformulating CFL-r as a sparse linear-algebra problem and introducing five optimizations to a matrix-based solver that leverages GraphBLAS:GraphBLAS. Key contributions include a refined update scheme that uses $M_{old}$ and $\Delta M$, dual row/column storage with hyper-sparse multiplication, a tiered set of matrices $\widetilde{M}$ to reduce full reconstructions, exploitation of indexed non-terminals for large grammars, and CFG transformations tailored to specific analyses. Empirical results demonstrate orders-of-magnitude speedups and superior performance over state-of-the-art CFL-r solvers (POCR, Graspan, Gigascale) across four static analyses, albeit with moderate RAM growth. The work has practical impact for scalable static analysis on large graphs by enabling faster CFL-r computations on modern sparse-matrix hardware.

Abstract

Various static analysis problems are reformulated as instances of the Context-Free Language Reachability (CFL-r) problem. One promising way to make solving CFL-r more practical for large-scale interprocedural graphs is to reduce CFL-r to linear algebra operations on sparse matrices, as they are efficiently executed on modern hardware. In this work, we present five optimizations for a matrix-based CFL-r algorithm that utilize the specific properties of both the underlying semiring and the widely-used linear algebra library SuiteSparse:GraphBlas. Our experimental results show that these optimizations result in orders of magnitude speedup, with the optimized matrix-based CFL-r algorithm consistently outperforming state-of-the-art CFL-r solvers across four considered static analyses.

Optimization of the Context-Free Language Reachability Matrix-Based Algorithm

TL;DR

The paper tackles the scalability of Context-Free Language Reachability (CFL-r) analysis on large interprocedural graphs by reformulating CFL-r as a sparse linear-algebra problem and introducing five optimizations to a matrix-based solver that leverages GraphBLAS:GraphBLAS. Key contributions include a refined update scheme that uses and , dual row/column storage with hyper-sparse multiplication, a tiered set of matrices to reduce full reconstructions, exploitation of indexed non-terminals for large grammars, and CFG transformations tailored to specific analyses. Empirical results demonstrate orders-of-magnitude speedups and superior performance over state-of-the-art CFL-r solvers (POCR, Graspan, Gigascale) across four static analyses, albeit with moderate RAM growth. The work has practical impact for scalable static analysis on large graphs by enabling faster CFL-r computations on modern sparse-matrix hardware.

Abstract

Various static analysis problems are reformulated as instances of the Context-Free Language Reachability (CFL-r) problem. One promising way to make solving CFL-r more practical for large-scale interprocedural graphs is to reduce CFL-r to linear algebra operations on sparse matrices, as they are efficiently executed on modern hardware. In this work, we present five optimizations for a matrix-based CFL-r algorithm that utilize the specific properties of both the underlying semiring and the widely-used linear algebra library SuiteSparse:GraphBlas. Our experimental results show that these optimizations result in orders of magnitude speedup, with the optimized matrix-based CFL-r algorithm consistently outperforming state-of-the-art CFL-r solvers across four considered static analyses.
Paper Structure (7 sections, 4 figures, 4 tables, 1 algorithm)

This paper contains 7 sections, 4 figures, 4 tables, 1 algorithm.

Figures (4)

  • Figure 1: CFG for field-sensitive Java points-to analysis
  • Figure 2: CFG for field-insensitive C/C++ alias analysis
  • Figure 3: CFG for field-sensitive C/C++ alias analysis, taken from POCR
  • Figure 4: CFG for context-sensitive C/C++ value-flow analysis, taken from POCR

Theorems & Definitions (2)

  • definition 1: Weak Chomsky Normal Form (WCNF)
  • definition 2: Reachability Semiring of CFG