Table of Contents
Fetching ...

Denotation-based Compositional Compiler Verification

Zhang Cheng, Jiyang Wu, Di Wang, Qinxiang Cao

TL;DR

This paper presents a denotation-based framework for compositional compiler verification that maps syntactic components to semantic domains and defines correctness via behavioral refinement. It introduces unified semantic operators and a novel semantic linking construction based on fixed points to achieve module-level compositionality, addressing limitations of power-domain approaches. The method is applied to CompCert front-end languages, re-proving translation correctness from Clight to Csharpminor and from Csharpminor to Cminor, while supporting open modules and modular linking. A refinement algebra using a gamma operator enables modular, algebraic proofs and horizontal/vertical compositionality, with proofs formalized in Coq. The approach promises improved modularity and scalability for real-world compilers compared with traditional small-step operational semantics approaches.

Abstract

A desired but challenging property of compiler verification is compositionality in the sense that the compilation correctness of a program can be deduced from that of its substructures ranging from statements, functions, and modules incrementally. Previously proposed approaches have devoted extensive effort to module-level compositionality based on small-step semantics and simulation theories. This paper proposes a novel compiler verification framework based on denotational semantics for better compositionality. Specifically, our denotational semantics is defined by semantic functions that map a syntactic component to a semantic domain composed of multiple behavioral \emph{sets}, and compiler correctness is defined by the behavioral refinement between semantic domains of the source and the target programs. Therefore, when proving compiler correctness, we can extensively leverage the algebraic properties of sets. Another important contribution is that our formalization of denotational semantics captures the full meaning of a program and bridges the gap between those based on conventional powerdomains and what realistic compiler verification actually needs. We demonstrate our denotation-based framework viable and practical by applying it to the verification of the front-end of CompCert and showing that the compositionality from the compilation correctness of sub-statements to statements, from functions to modules, and from modules to the whole program (i.e., module-level compositionality) can be achieved similarly.

Denotation-based Compositional Compiler Verification

TL;DR

This paper presents a denotation-based framework for compositional compiler verification that maps syntactic components to semantic domains and defines correctness via behavioral refinement. It introduces unified semantic operators and a novel semantic linking construction based on fixed points to achieve module-level compositionality, addressing limitations of power-domain approaches. The method is applied to CompCert front-end languages, re-proving translation correctness from Clight to Csharpminor and from Csharpminor to Cminor, while supporting open modules and modular linking. A refinement algebra using a gamma operator enables modular, algebraic proofs and horizontal/vertical compositionality, with proofs formalized in Coq. The approach promises improved modularity and scalability for real-world compilers compared with traditional small-step operational semantics approaches.

Abstract

A desired but challenging property of compiler verification is compositionality in the sense that the compilation correctness of a program can be deduced from that of its substructures ranging from statements, functions, and modules incrementally. Previously proposed approaches have devoted extensive effort to module-level compositionality based on small-step semantics and simulation theories. This paper proposes a novel compiler verification framework based on denotational semantics for better compositionality. Specifically, our denotational semantics is defined by semantic functions that map a syntactic component to a semantic domain composed of multiple behavioral \emph{sets}, and compiler correctness is defined by the behavioral refinement between semantic domains of the source and the target programs. Therefore, when proving compiler correctness, we can extensively leverage the algebraic properties of sets. Another important contribution is that our formalization of denotational semantics captures the full meaning of a program and bridges the gap between those based on conventional powerdomains and what realistic compiler verification actually needs. We demonstrate our denotation-based framework viable and practical by applying it to the verification of the front-end of CompCert and showing that the compositionality from the compilation correctness of sub-statements to statements, from functions to modules, and from modules to the whole program (i.e., module-level compositionality) can be achieved similarly.
Paper Structure (63 sections, 16 theorems, 78 equations, 8 figures)

This paper contains 63 sections, 16 theorems, 78 equations, 8 figures.

Key Result

theorem 1

For any group of source modules $S_1, \dots, S_n$ and target modules $T_1, \dots, T_n$, if $\llbracket T_i \rrbracket \sqsubseteq \llbracket S_i \rrbracket$ for each $i$, then

Figures (8)

  • Figure 1: Backward Simulation Diagrams Used in CompCert
  • Figure 2: Clight program 1a and Csharpminor program 2a are respectively translated to Csharpminor program 1b and Cminor program 2b, where ( exit$n$) will prematurely terminate ($n+1$) layers of nested blocks. In program 2b, the exiting number in $s_i$ is properly shifted according to the level of blocks it resides in.
  • Figure 3: Comparison Between the Semantics of Procedures and that of Modules. Here raising edges denote function calls (abbr. cll.), falling edges denote function returns (abbr. ret.), and the dashed line represents the behavior of external procedure $q$. The shaded areas show that for $\llbracket p_1 \rrbracket_{\chi}.\text{({\ttfamily\footnotesize nrm})}$, the behavior of every function call of $p_1$ is interpreted by $\chi$, while for $\llbracket M \rrbracket_{\chi}.\text{({\ttfamily\footnotesize nrm})}$, only the behavior of external calls are interpreted by $\chi$.
  • Figure 4: An Intuitive Illustration of Lemma \ref{['lemma_co1']}. The right-arrow denotes one step of iterations of $f$ on the x-field with some fixed $y$, i.e., $(x, y) \mapsto (f(x, y), y)$. Furthermore, $\omega$-steps of $f$ is $(x, y) \mapsto (\mu x_0. f(x_0, y), y)$. The up-arrow denotes one step of iterations of $g$ on the y-field with some fixed $x$, i.e., $(x, y) \mapsto (x, g(x, y))$, and $\omega$-steps of $g$ is $(x, y) \mapsto (x, \mu y_0. g(x, y_0))$.
  • Figure 5: Behavior Refinement for Original CompCert
  • ...and 3 more figures

Theorems & Definitions (24)

  • theorem 1: Module-level compositionality
  • definition 1
  • theorem 2: Kleene fixed point
  • definition 2: Complete lattice
  • theorem 3: Knaster-Tarski theorem for greatest fixed point
  • theorem 4: Equivalence between semantic and syntactic linking
  • lemma 1: Coincide theorem 1
  • definition 3: Refinement algebra
  • definition 4: Refinement between statement denotations
  • definition 5: Refinement between function denotations
  • ...and 14 more