E-Graphs as a Persistent Compiler Abstraction

Jules Merckx, Alexandre Lopoukhine, Samuel Coward, Jianyi Cheng, Bjorn De Sutter, Tobias Grosser

TL;DR

This work builds on xDSL, a Python-based MLIR framework, and introduces eqsat, a new MLIR dialect that represents e-graphs in MLIR code. This representation expands the scope of equality saturation in the compiler, allowing pattern rewriting to be interleaved with other compiler transformations.

Abstract

Recent algorithmic advances have made equality saturation an appealing approach to program optimization because it avoids the phase-ordering problem. Existing work uses external equality saturation libraries, or custom implementations that are deeply tied to the specific application. However, these works only apply equality saturation at a single level of abstraction, or discard the discovered equalities when code is transformed by other compiler passes. We propose an alternative approach that represents an e-graph natively in the compiler's intermediate representation, facilitating the application of constructive compiler passes that maintain the e-graph state throughout the compilation flow. We build on a Python-based MLIR framework, xDSL, and introduce a new MLIR dialect, eqsat, that represents e-graphs in MLIR code. We show that this representation expands the scope of equality saturation in the compiler, allowing us to interleave pattern rewriting with other compiler transformations. The eqsat dialect provides a unified abstraction for compilers to utilize equality saturation across various levels of intermediate representations concurrently within the same MLIR flow.
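To make the central data structure concrete, here is a minimal sketch of an e-graph in plain Python. It is not the eqsat dialect or any xDSL API: the class `EGraph` and its methods `add`, `merge`, `rebuild`, and `ops_in` are hypothetical names chosen for illustration, and the rebuild step is a deliberately naive fixpoint loop rather than an optimized algorithm. The sketch only shows the idea that an equivalence class can hold several equivalent operations at once, and that recorded equalities propagate to the users of merged values.

```python
class EGraph:
    """Toy e-graph (illustrative only; not the eqsat dialect)."""

    def __init__(self):
        self.parent = []    # union-find parent pointer, one entry per e-class id
        self.hashcons = {}  # canonical e-node (op, *child_classes) -> e-class id

    def find(self, a):
        # Union-find lookup with path halving.
        while self.parent[a] != a:
            self.parent[a] = self.parent[self.parent[a]]
            a = self.parent[a]
        return a

    def canonicalize(self, node):
        op, *children = node
        return (op, *(self.find(c) for c in children))

    def add(self, op, *children):
        # Insert an e-node, reusing its e-class if an identical node exists.
        node = self.canonicalize((op, *children))
        if node not in self.hashcons:
            self.parent.append(len(self.parent))
            self.hashcons[node] = len(self.parent) - 1
        return self.find(self.hashcons[node])

    def merge(self, a, b):
        # Record the equality a == b, then restore congruence.
        a, b = self.find(a), self.find(b)
        if a != b:
            self.parent[b] = a
            self.rebuild()
        return self.find(a)

    def rebuild(self):
        # Naive congruence closure: re-canonicalize all nodes until nothing changes.
        changed = True
        while changed:
            changed = False
            new = {}
            for node, cls in self.hashcons.items():
                node, cls = self.canonicalize(node), self.find(cls)
                if node in new and self.find(new[node]) != cls:
                    # Two identical nodes in different classes: merge the classes.
                    self.parent[cls] = self.find(new[node])
                    changed = True
                new.setdefault(node, cls)
            self.hashcons = new

    def ops_in(self, cls):
        # Operation names of all e-nodes currently in the given e-class.
        cls = self.find(cls)
        return {n[0] for n, c in self.hashcons.items() if self.find(c) == cls}


g = EGraph()
a, one, two = g.add("a"), g.add("1"), g.add("2")
mul = g.add("*", a, two)    # a * 2
shl = g.add("<<", a, one)   # a << 1
g.merge(mul, shl)           # record a * 2 == a << 1 without discarding either form

b = g.add("b")
fa, fb = g.add("f", a), g.add("f", b)
g.merge(a, b)               # a == b implies f(a) == f(b) by congruence
```

After these calls, the class of `mul` contains both the multiplication and the shift, so neither form has been destroyed by the rewrite; a later extraction step (not shown) would pick one representative per class.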

Paper Structure (43 sections, 7 equations, 10 figures, 2 tables)

Figures (10)

  • Figure 1: Existing work targets a single abstraction, either: (b) orchestrating external equality saturation libraries, or (c) natively supporting e-graphs for a specialized use-case. By contrast, our persistent equality saturation approach (d) can reuse existing compiler transformations and retain equalities across abstraction levels.
  • Figure 2: An IR that incorporates user-extensibility as a core design feature, combined with properties such as SSA, provides a productive environment for exploring novel compiler techniques.
  • Figure 3: (left) In MLIR's default, destructive pattern rewriting framework, the pdl_interp.get_result and pdl_interp.get_defining_op operations are each other's inverse. (right) In eqsat, this is not the case because each value comes from one of multiple equivalent operations.
  • Figure 4: A comparison between pdl_interp matching code without and with eager semantic checks. The interpreter inserts backtracking points ($\bullet$) at each pdl_interp.get_defining_op. By carrying out semantic checks as soon as possible, an exponential blowup in the number of backtracking points can be avoided.
  • Figure 5: By introducing a choose operation in the matching code, backtracking points from different patterns can be separated, avoiding exponential growth.
  • ...and 5 more figures