Abstract String Domain Defined with Word Equations as a Reduced Product (Extended Version)
Antonina Nepeivoda, Ilya Afanasyev
TL;DR
This work introduces a string-interval abstract domain built from word equations and word disequalities to bound string values, structured as a reduced product over a string-property lattice. Abstract string objects leverage length-non-increasing morphisms to capture value, length, and additional structural properties (prefixes, suffixes, factors, and forbidden factors), enabling JavaScript string analysis through a unified, lattice-based framework. The paper develops multiple reduction strategies—standalone, cross-reduction, and unary-bound refinements—to guarantee lattice stability while maintaining practical precision, and it demonstrates lightweight abstractions for common string operations such as concatenation, indexing, substring, and replacement. It also provides illustrative examples (split with guards, a looping sanitizer, and letter-preservation effects) and discusses dense-language considerations, complexity (noting NP-hardness for perfect reductions in general), and related work, offering a foundation for precise static analysis of string-manipulating programs and avenues for future work in dynamic languages and solver integration.
Abstract
We introduce a string-interval abstract domain, where string intervals are characterized by systems of word equations (encoding lower bounds on string values) and word disequalities (encoding upper bounds). Building upon the lattice structure of string intervals, we define an abstract string object as a reduced product on a string property semilattice, determined by length-non-increasing morphisms. We consider several reduction strategies for abstract string objects and show that upon these strategies the string object domain forms a lattice. We define basic abstract string operations on the domain, aiming to minimize computational overheads on the reduction, and show how the domain can be used to analyse properties of JavaScript string manipulating programs.
