Denotation-based Compositional Compiler Verification

Zhang Cheng; Jiyang Wu; Di Wang; Qinxiang Cao

Denotation-based Compositional Compiler Verification

Zhang Cheng, Jiyang Wu, Di Wang, Qinxiang Cao

TL;DR

This paper presents a denotation-based framework for compositional compiler verification that maps syntactic components to semantic domains and defines correctness via behavioral refinement. It introduces unified semantic operators and a novel semantic linking construction based on fixed points to achieve module-level compositionality, addressing limitations of power-domain approaches. The method is applied to CompCert front-end languages, re-proving translation correctness from Clight to Csharpminor and from Csharpminor to Cminor, while supporting open modules and modular linking. A refinement algebra using a gamma operator enables modular, algebraic proofs and horizontal/vertical compositionality, with proofs formalized in Coq. The approach promises improved modularity and scalability for real-world compilers compared with traditional small-step operational semantics approaches.

Abstract

A desired but challenging property of compiler verification is compositionality in the sense that the compilation correctness of a program can be deduced from that of its substructures ranging from statements, functions, and modules incrementally. Previously proposed approaches have devoted extensive effort to module-level compositionality based on small-step semantics and simulation theories. This paper proposes a novel compiler verification framework based on denotational semantics for better compositionality. Specifically, our denotational semantics is defined by semantic functions that map a syntactic component to a semantic domain composed of multiple behavioral \emph{sets}, and compiler correctness is defined by the behavioral refinement between semantic domains of the source and the target programs. Therefore, when proving compiler correctness, we can extensively leverage the algebraic properties of sets. Another important contribution is that our formalization of denotational semantics captures the full meaning of a program and bridges the gap between those based on conventional powerdomains and what realistic compiler verification actually needs. We demonstrate our denotation-based framework viable and practical by applying it to the verification of the front-end of CompCert and showing that the compositionality from the compilation correctness of sub-statements to statements, from functions to modules, and from modules to the whole program (i.e., module-level compositionality) can be achieved similarly.

Denotation-based Compositional Compiler Verification

TL;DR

Abstract

Paper Structure (63 sections, 16 theorems, 78 equations, 8 figures)

This paper contains 63 sections, 16 theorems, 78 equations, 8 figures.

Introduction
How to tackle more program features and still keep good algebraic properties?
How to distinguish different program behaviors precisely?
We aim to demonstrate the power of our denotation-based approach by supporting module-level compositional compiler correctness.
Structure of the paper.
Unified Semantic Operators
A Toy Language and its Denotation
Semantic Operators for WHILE
Semantic Operators for Realistic Languages
Semantics of WHILE Loops
Semantics of Procedure Call
Terminating Behavior of PCALL
Semantics of Modules and Semantic Linking
Semantics of open modules
Semantic linking
...and 48 more sections

Key Result

theorem 1

For any group of source modules $S_1, \dots, S_n$ and target modules $T_1, \dots, T_n$, if $\llbracket T_i \rrbracket \sqsubseteq \llbracket S_i \rrbracket$ for each $i$, then

Figures (8)

Figure 1: Backward Simulation Diagrams Used in CompCert
Figure 2: Clight program 1a and Csharpminor program 2a are respectively translated to Csharpminor program 1b and Cminor program 2b, where ( exit$n$) will prematurely terminate ($n+1$) layers of nested blocks. In program 2b, the exiting number in $s_i$ is properly shifted according to the level of blocks it resides in.
Figure 3: Comparison Between the Semantics of Procedures and that of Modules. Here raising edges denote function calls (abbr. cll.), falling edges denote function returns (abbr. ret.), and the dashed line represents the behavior of external procedure $q$. The shaded areas show that for $\llbracket p_1 \rrbracket_{\chi}.\text{({\ttfamily\footnotesize nrm})}$, the behavior of every function call of $p_1$ is interpreted by $\chi$, while for $\llbracket M \rrbracket_{\chi}.\text{({\ttfamily\footnotesize nrm})}$, only the behavior of external calls are interpreted by $\chi$.
Figure 4: An Intuitive Illustration of Lemma \ref{['lemma_co1']}. The right-arrow denotes one step of iterations of $f$ on the x-field with some fixed $y$, i.e., $(x, y) \mapsto (f(x, y), y)$. Furthermore, $\omega$-steps of $f$ is $(x, y) \mapsto (\mu x_0. f(x_0, y), y)$. The up-arrow denotes one step of iterations of $g$ on the y-field with some fixed $x$, i.e., $(x, y) \mapsto (x, g(x, y))$, and $\omega$-steps of $g$ is $(x, y) \mapsto (x, \mu y_0. g(x, y_0))$.
Figure 5: Behavior Refinement for Original CompCert
...and 3 more figures

Theorems & Definitions (24)

theorem 1: Module-level compositionality
definition 1
theorem 2: Kleene fixed point
definition 2: Complete lattice
theorem 3: Knaster-Tarski theorem for greatest fixed point
theorem 4: Equivalence between semantic and syntactic linking
lemma 1: Coincide theorem 1
definition 3: Refinement algebra
definition 4: Refinement between statement denotations
definition 5: Refinement between function denotations
...and 14 more

Denotation-based Compositional Compiler Verification

TL;DR

Abstract

Denotation-based Compositional Compiler Verification

Authors

TL;DR

Abstract

Table of Contents

Key Result

Figures (8)

Theorems & Definitions (24)