An Incremental Algorithm for Algebraic Program Analysis
Chenyu Zhou, Yuzhou Fang, Jingbo Wang, Chao Wang
TL;DR
The paper addresses how to update algebraic program analysis (APA) results efficiently after small code changes. It makes two main contributions: a tree-based representation of path expressions that supports delta-aware incremental updates, and an incremental interpretation mechanism that reuses prior results to update program facts. On thirteen Java applications from the DaCapo benchmark suite, the approach achieves speedups of 160X to 4761X over baseline APA and two state-of-the-art APA methods, depending on the analysis, with theoretical guarantees under Kleene-like algebras and star-free variants. Fast re-analysis under frequent changes matters for responsive software engineering tools and continuous development workflows.
Abstract
We propose a method for conducting algebraic program analysis (APA) incrementally in response to changes to the program under analysis. APA is a program analysis paradigm that consists of two distinct steps: computing a path expression that succinctly summarizes the set of program paths of interest, and interpreting the path expression using a properly defined semantic algebra to obtain the program properties of interest. In this context, the goal of an incremental algorithm is to reduce the analysis time by leveraging the intermediate results computed before the program changes. We make two main contributions. First, we propose a data structure for efficiently representing a path expression as a tree, together with a tree-based interpretation method. Second, we propose techniques for efficiently updating the program properties in response to changes to the path expression. We have implemented our method and evaluated it on thirteen Java applications from the DaCapo benchmark suite. The experimental results show that both our method for incrementally computing the path expression and our method for incrementally interpreting it are effective in speeding up the analysis. Compared to the baseline APA and two state-of-the-art APA methods, the speedup of our method ranges from 160X to 4761X, depending on the type of program analysis performed.
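To make the two APA steps concrete, the following minimal Python sketch represents a path expression as a tree, interprets it bottom-up, and then re-interprets it incrementally after one edge changes. It is an illustration under stated assumptions and not the authors' data structure or algorithm: the `Node` class, the `interpret` and `update_edge` functions, and the choice of semantic algebra (a shortest-path algebra in which concatenation adds weights, union takes the minimum, and star evaluates to 0 for non-negative weights) are all hypothetical stand-ins chosen for brevity.

```python
from __future__ import annotations
from dataclasses import dataclass, field
from typing import Optional


@dataclass(eq=False)
class Node:
    """One node of a path-expression tree (hypothetical representation)."""
    op: str                          # "edge", "concat", "union", or "star"
    kids: list = field(default_factory=list)
    weight: float = 0.0              # used only by "edge" leaves
    parent: Optional[Node] = None
    value: float = 0.0               # cached interpretation of this subtree

    def __post_init__(self):
        for k in self.kids:
            k.parent = self


def evaluate(n: Node) -> float:
    """Interpret a single node from its children's cached values."""
    if n.op == "edge":
        return n.weight
    if n.op == "concat":
        return sum(k.value for k in n.kids)      # sequencing adds costs
    if n.op == "union":
        return min(k.value for k in n.kids)      # choice keeps the cheaper path
    if n.op == "star":
        return 0.0   # assumption: non-negative weights, so zero iterations win
    raise ValueError(f"unknown operator: {n.op}")


def interpret(n: Node) -> float:
    """Full bottom-up interpretation; fills the cache at every node."""
    for k in n.kids:
        interpret(k)
    n.value = evaluate(n)
    return n.value


def update_edge(leaf: Node, new_weight: float) -> int:
    """Incremental re-interpretation after one leaf changes.

    Walks from the leaf toward the root, recomputing only ancestors, and
    stops as soon as a cached value is unaffected. Returns the number of
    nodes visited.
    """
    leaf.weight = new_weight
    visited = 0
    n: Optional[Node] = leaf
    while n is not None:
        visited += 1
        new_value = evaluate(n)
        if new_value == n.value:
            break                    # change is absorbed; ancestors stay valid
        n.value = new_value
        n = n.parent
    return visited


# Path expression: a . (b + c) . e* . d
a, b, c, e, d = (Node("edge", weight=w) for w in (1, 5, 2, 3, 4))
root = Node("concat", [a, Node("union", [b, c]), Node("star", [e]), d])

print(interpret(root))       # full interpretation of the whole tree
print(update_edge(b, 1))     # number of nodes revisited after changing b
print(root.value)            # updated result, read from the cache
```

In this sketch, changing edge `b` revisits only the leaf and its two ancestors, while a change under the star node stops early because the star's cached value is unaffected. The cost of an update is therefore bounded by the depth of the tree, not its size, which is the intuition behind reusing intermediate results.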
