Semantic foundations of equality saturation
Dan Suciu, Yisu Remy Wang, Yihong Zhang
TL;DR
This paper provides a rigorous semantic foundation for equality saturation by treating E-graphs as reachable, deterministic tree automata and defining EqSat as the least fixpoint of an immediate consequence operator. It uncovers a tight relationship between EqSat and the chase, establishing reductions in both directions and introducing the concept of EGD-fair chase sequences to characterize termination. The work yields precise termination results across single-instance, all-term-instance, and all-E-graph-instance scenarios (including undecidability and Pi2-completeness results) and introduces a practical termination criterion, weak term acyclicity. By connecting EqSat to database theory, it opens avenues for porting chase results to EqSat and guiding the design of more scalable and expressive EqSat tools for program and query optimization.
Abstract
Equality saturation is an emerging technique for program and query optimization developed in the programming language community. It performs term rewriting over an E-graph, a data structure that compactly represents a program space. Despite its popularity, the theory of equality saturation lags behind the practice. In this paper, we define a fixpoint semantics of equality saturation based on tree automata and uncover deep connections between equality saturation and the chase. We characterize the class of chase sequences that correspond to equality saturation. We study the complexities of terminations of equality saturation in three cases: single-instance, all-term-instance, and all-E-graph-instance. Finally, we define a syntactic criterion based on acyclicity that implies equality saturation termination.
