Table of Contents
Fetching ...

PolyVer: A Compositional Approach for Polyglot System Modeling and Verification

Pei-Wei Chen, Shaokai Lin, Adwait Godbole, Ramneet Singh, Elizabeth Polgreen, Edward A. Lee, Sanjit A. Seshia

TL;DR

PolyVer tackles the challenge of verifying polyglot software by introducing a compositional contract-based framework that avoids full translation to a single verification language. It defines an intermediate contract language ${ L}^*$ to express pre/post conditions, and wires language-specific verifiers through contracts using a hybrid CEGIS/CEGAR loop that synthesizes and refines these contracts. Implemented on top of UCLID5 with C and Rust backends, PolyVer verifies Lingua Franca programs and multi-language systems, including inter-language calls and ROS2-like services. The results demonstrate automated, scalable verification with formal guarantees, enabling safe cross-language software in domains such as real-time distributed systems and robotics.

Abstract

Several software systems are polyglot; that is, they comprise programs implemented in a combination of programming languages. Verifiers that directly run on mainstream programming languages are currently customized for single languages. Thus, to verify polyglot systems, one usually translates them into a common verification language or formalism on which the verifier runs. In this paper, we present an alternative approach, PolyVer, which employs abstraction, compositional reasoning, and synthesis to directly perform polyglot verification. PolyVer constructs a formal model of the original polyglot system as a transition system where the update functions associated with transitions are implemented in target languages such as C or Rust. To perform verification, PolyVer then connects a model checker for transition systems with language-specific verifiers (e.g., for C or Rust) using pre/post-condition contracts for the update functions. These contracts are automatically generated by synthesis oracles based on syntax-guided synthesis or large language models (LLMs), and checked by the language-specific verifiers. The contracts form abstractions of the update functions using which the model checker verifies the overall system-level property on the polyglot system model. PolyVer iterates between counterexample-guided abstraction-refinement (CEGAR) and counterexample-guided inductive synthesis (CEGIS) until the property is verified or a true system-level counterexample is found. We demonstrate the utility of PolyVer for verifying programs in the Lingua Franca polyglot language using the UCLID5 model checker connected with the CBMC and Kani verifiers for C and Rust respectively.

PolyVer: A Compositional Approach for Polyglot System Modeling and Verification

TL;DR

PolyVer tackles the challenge of verifying polyglot software by introducing a compositional contract-based framework that avoids full translation to a single verification language. It defines an intermediate contract language to express pre/post conditions, and wires language-specific verifiers through contracts using a hybrid CEGIS/CEGAR loop that synthesizes and refines these contracts. Implemented on top of UCLID5 with C and Rust backends, PolyVer verifies Lingua Franca programs and multi-language systems, including inter-language calls and ROS2-like services. The results demonstrate automated, scalable verification with formal guarantees, enabling safe cross-language software in domains such as real-time distributed systems and robotics.

Abstract

Several software systems are polyglot; that is, they comprise programs implemented in a combination of programming languages. Verifiers that directly run on mainstream programming languages are currently customized for single languages. Thus, to verify polyglot systems, one usually translates them into a common verification language or formalism on which the verifier runs. In this paper, we present an alternative approach, PolyVer, which employs abstraction, compositional reasoning, and synthesis to directly perform polyglot verification. PolyVer constructs a formal model of the original polyglot system as a transition system where the update functions associated with transitions are implemented in target languages such as C or Rust. To perform verification, PolyVer then connects a model checker for transition systems with language-specific verifiers (e.g., for C or Rust) using pre/post-condition contracts for the update functions. These contracts are automatically generated by synthesis oracles based on syntax-guided synthesis or large language models (LLMs), and checked by the language-specific verifiers. The contracts form abstractions of the update functions using which the model checker verifies the overall system-level property on the polyglot system model. PolyVer iterates between counterexample-guided abstraction-refinement (CEGAR) and counterexample-guided inductive synthesis (CEGIS) until the property is verified or a true system-level counterexample is found. We demonstrate the utility of PolyVer for verifying programs in the Lingua Franca polyglot language using the UCLID5 model checker connected with the CBMC and Kani verifiers for C and Rust respectively.

Paper Structure

This paper contains 26 sections, 1 theorem, 8 equations, 6 figures, 2 tables, 3 algorithms.

Key Result

Theorem 1

If $\mathcal{M'} \models \phi$, then ${\mathcal{M}} \models \phi$.

Figures (6)

  • Figure 1: An Extended State Machine (ESM) describing a subway traffic control system implemented in Lingua Franca (LF). $a/b$ means that when $a$ is present, $b$ is run. The procedures in blue and orange are Rust and C procedures respectively. Implementations of two procedures are shown on the right. ctx.set and lf_set are LF API calls assigning values to ports.
  • Figure 2: UCLID5 translation of x.saturating_add(1) in Rust. Notice that the translation depends on the type of x. In UCLID5 and SMT-LIB, bitvectors do not have implicit type interpretation. Instead, they are only interpreted when specific operators are applied. For example, < (bvslt) and <_u (bvult) stands for the signed and unsigned less than comparators in UCLID (SMT-LIB), respectively.
  • Figure 3: An intermediate language ${\mathcal{L}^*}$ act as glue between languages. Synthesis oracles generate expressions in ${\mathcal{L}^*}$, which are translated to semantic equivalent expressions in specific languages used by a model checker or verification oracles. $x$ and $y$ are unsigned bitvectors, hence using <_u in UCLID5.
  • Figure 4: Flow diagram of our approach. The process starts from the synthesis oracle. Contracts are synthesized in the inner (dashed) CEGIS loop and refined in the outer CEGAR loop. $X$ and $\overline{X}$ denotes positive and negative example sets.
  • Figure 5: LF program for a subway control system from halbwachs1992programming.
  • ...and 1 more figures

Theorems & Definitions (9)

  • Definition 1: Extended State Machine
  • Definition 2: Language
  • Definition 3: Procedure
  • Definition 4: Polyglot Model
  • Definition 5: Contract
  • Theorem 1
  • Definition 6: Intermediate Language
  • Example 1: Contract synthesis
  • Example 2: Spuriousness check