Table of Contents
Fetching ...

Symbol-Specific Sparsification of Interprocedural Distributive Environment Problems

Kadiray Karakaya, Eric Bodden

TL;DR

This work targets the scalability gap in interprocedural static analysis by generalizing sparsification beyond reachability to value-aware IDE analyses. It introduces Sparse IDE, a symbol-specific on-demand sparsification implemented in SparseHeros, and formalizes its algorithmic foundations, including symbol-specific identity transformers. The approach preserves precision (identical results to the baseline IDE on tested benchmarks) while achieving significant runtime and memory improvements on linear constant propagation over real-world Java libraries. The key finding is that sparsifying by symbols—rather than by values—can yield substantial performance gains even when value domains are infinite, albeit with inherent limits on sparsification with respect to the value domain. Practically, Sparse IDE enables efficient, scalable, and precise interprocedural analyses for a broad class of problems expressible in IDE, with potential synergy when combined with other scalability techniques.

Abstract

Previous work has shown that one can often greatly speed up static analysis by computing data flows not for every edge in the program's control-flow graph but instead only along definition-use chains. This yields a so-called sparse static analysis. Recent work on SparseDroid has shown that specifically taint analysis can be "sparsified" with extraordinary effectiveness because the taint state of one variable does not depend on those of others. This allows one to soundly omit more flow-function computations than in the general case. In this work, we now assess whether this result carries over to the more generic setting of so-called Interprocedural Distributive Environment (IDE) problems. Opposed to taint analysis, IDE comprises distributive problems with large or even infinitely broad domains, such as typestate analysis or linear constant propagation. Specifically, this paper presents Sparse IDE, a framework that realizes sparsification for any static analysis that fits the IDE framework. We implement Sparse IDE in SparseHeros, as an extension to the popular Heros IDE solver, and evaluate its performance on real-world Java libraries by comparing it to the baseline IDE algorithm. To this end, we design, implement and evaluate a linear constant propagation analysis client on top of SparseHeros. Our experiments show that, although IDE analyses can only be sparsified with respect to symbols and not (numeric) values, Sparse IDE can nonetheless yield significantly lower runtimes and often also memory consumptions compared to the original IDE.

Symbol-Specific Sparsification of Interprocedural Distributive Environment Problems

TL;DR

This work targets the scalability gap in interprocedural static analysis by generalizing sparsification beyond reachability to value-aware IDE analyses. It introduces Sparse IDE, a symbol-specific on-demand sparsification implemented in SparseHeros, and formalizes its algorithmic foundations, including symbol-specific identity transformers. The approach preserves precision (identical results to the baseline IDE on tested benchmarks) while achieving significant runtime and memory improvements on linear constant propagation over real-world Java libraries. The key finding is that sparsifying by symbols—rather than by values—can yield substantial performance gains even when value domains are infinite, albeit with inherent limits on sparsification with respect to the value domain. Practically, Sparse IDE enables efficient, scalable, and precise interprocedural analyses for a broad class of problems expressible in IDE, with potential synergy when combined with other scalability techniques.

Abstract

Previous work has shown that one can often greatly speed up static analysis by computing data flows not for every edge in the program's control-flow graph but instead only along definition-use chains. This yields a so-called sparse static analysis. Recent work on SparseDroid has shown that specifically taint analysis can be "sparsified" with extraordinary effectiveness because the taint state of one variable does not depend on those of others. This allows one to soundly omit more flow-function computations than in the general case. In this work, we now assess whether this result carries over to the more generic setting of so-called Interprocedural Distributive Environment (IDE) problems. Opposed to taint analysis, IDE comprises distributive problems with large or even infinitely broad domains, such as typestate analysis or linear constant propagation. Specifically, this paper presents Sparse IDE, a framework that realizes sparsification for any static analysis that fits the IDE framework. We implement Sparse IDE in SparseHeros, as an extension to the popular Heros IDE solver, and evaluate its performance on real-world Java libraries by comparing it to the baseline IDE algorithm. To this end, we design, implement and evaluate a linear constant propagation analysis client on top of SparseHeros. Our experiments show that, although IDE analyses can only be sparsified with respect to symbols and not (numeric) values, Sparse IDE can nonetheless yield significantly lower runtimes and often also memory consumptions compared to the original IDE.
Paper Structure (22 sections, 5 equations, 8 figures, 4 tables, 2 algorithms)

This paper contains 22 sections, 5 equations, 8 figures, 4 tables, 2 algorithms.

Figures (8)

  • Figure 1: Flow functions (reproduced from ifds).
  • Figure 2: Edge functions (reproduced from ide).
  • Figure 3: Original and sparse propagations after applying fact-specific on-demand sparsification.
  • Figure 4: Comparison of the Sparsification Approaches of Sparse IFDS and Sparse IDE
  • Figure 5: Relative runtime of Sparse IDE compared to the baseline original IDE in %, annotated with exact runtimes in seconds, sorted by original IDE's runtime
  • ...and 3 more figures