Verifying Peephole Rewriting In SSA Compiler IRs
Siddharth Bhat, Alex Keizer, Chris Hughes, Andrés Goens, Tobias Grosser
TL;DR
The paper addresses the verification of peephole rewrites across domain-specific SSA-based IRs. It introduces a core calculus that is generic over the IR and supports regions, and mechanizes it in Lean as LeanMLIR(X) with an embedding of MLIR syntax. On top of this calculus it provides a verified peephole rewriter and two classical SSA optimizations, dead code elimination (DCE) and common subexpression elimination (CSE), together with tactics that strip away the abstraction overhead of the calculus so that proof goals stay manageable. Three MLIR-based case studies (bitvectors, structured control flow, and fully homomorphic encryption) demonstrate that the approach extends to diverse domains, including a QuotRing IR modeling $R = (\mathbb{Z}/q\mathbb{Z})[X]/(X^{2^n}+1)$. The work enables formally verified rewrites on new domain-specific IRs, bridging push-button automation (SMT-based tools) and interactive theorem proving for compiler reasoning over regions and SSA def-use chains.
Abstract
There is an increasing need for domain-specific reasoning in modern compilers. This has fueled the use of tailored intermediate representations (IRs) based on static single assignment (SSA), as in the MLIR compiler framework. Interactive theorem provers (ITPs) provide strong guarantees for the end-to-end verification of compilers (e.g., CompCert). However, modern compilers and their IRs evolve at a rate that makes proof engineering alongside them prohibitively expensive. Nevertheless, well-scoped push-button automated verification tools such as the Alive peephole verifier for LLVM-IR have gained recognition in domains where SMT solvers offer efficient (semi-)decision procedures. In this paper, we aim to combine the convenience of automation with the versatility of ITPs for verifying peephole rewrites across domain-specific IRs. We formalize a core calculus for SSA-based IRs that is generic over the IR and covers so-called regions (nested scoping used by many domain-specific IRs in the MLIR ecosystem). Our mechanization in the Lean proof assistant provides a user-friendly frontend for translating MLIR syntax into our calculus. We provide scaffolding for defining and verifying peephole rewrites, offering tactics to eliminate the abstraction overhead of our SSA calculus. We prove correctness theorems about peephole rewriting, as well as two classical program transformations. To evaluate our framework, we consider three use cases from the MLIR ecosystem that cover different levels of abstraction: (1) bitvector rewrites from LLVM, (2) structured control flow, and (3) fully homomorphic encryption. We envision that our mechanization provides a foundation for formally verified rewrites on new domain-specific IRs.
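To make concrete what "verifying a peephole rewrite" asks for in the bitvector case, the following is a minimal Python sketch. It is not the paper's method: the paper proves rewrites in Lean, whereas this sketch merely enumerates every input at one small, fixed bit width. The width, the two example rewrites, and all function names are assumptions chosen for illustration.

```python
from itertools import product

WIDTH = 4                 # assumed small width so all inputs can be enumerated
MASK = (1 << WIDTH) - 1   # keeps results inside WIDTH bits


def sub_of_add(x, y):
    """Pattern before rewriting: (x + y) - y, with wrap-around arithmetic."""
    return (((x + y) & MASK) - y) & MASK


def just_x(x, y):
    """Proposed replacement for sub_of_add: x."""
    return x


def shr_of_shl(x):
    """Pattern before rewriting: (x << 1) >> 1, using logical shifts."""
    return ((x << 1) & MASK) >> 1


def identity(x):
    """Proposed (incorrect) replacement for shr_of_shl: x."""
    return x


def rewrite_is_sound(before, after, arity):
    """Return True if both sides agree on every WIDTH-bit input tuple."""
    inputs = product(range(1 << WIDTH), repeat=arity)
    return all(before(*args) == after(*args) for args in inputs)


print(rewrite_is_sound(sub_of_add, just_x, 2))    # True
print(rewrite_is_sound(shr_of_shl, identity, 1))  # False: the top bit is lost
```

The second rewrite fails because shifting left discards the most significant bit (for example, 8 becomes 0 at width 4). Exhaustive enumeration of this kind only covers one width and scales exponentially, which is one reason a proof that holds for all inputs is preferable.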
