Table of Contents
Fetching ...

Synthesis and Verification of Transformer Programs

Hongjian Jiang, Matthew Hague, Philipp Rümmer, Anthony Widjaja Lin

TL;DR

This paper investigates automatic verification and synthesis of transformer-oriented programs by using C-RASP as a structurally interpretable surrogate. It shows how to reduce C-RASP verification to Lustre model checking, enabling efficient use of SMT-enabled solvers for properties like language inclusion and emptiness, and introduces a simulated-annealing based method to learn C-RASP programs from labeled examples. The authors implement a unified framework that jointly performs synthesis and verification, achieving high accuracy on benchmark languages and identifying fundamental expressiveness limits of C-RASP where termination is not guaranteed. The work advances practical interpretability and controllability of transformer-like behavior, with implications for program optimization and constrained learning of transformer programs. Overall, the approach demonstrates a viable path to formal analysis and automatic construction of transformer-inspired programs with verifiable guarantees.

Abstract

C-RASP is a simple programming language that was recently shown to capture concepts expressible by transformers. In this paper, we develop new algorithmic techniques for automatically verifying C-RASPs. To this end, we establish a connection to the verification of synchronous dataflow programs in Lustre, which enables us to exploit state-of-the-art model checkers utilizing highly optimized SMT-solvers. Our second contribution addresses learning a C-RASP program in the first place. To this end, we provide a new algorithm for learning a C-RASP from examples using local search. We demonstrate efficacy of our implementation for benchmarks of C-RASPs in the literature, in particular in connection to the following applications: (1) transformer program optimization, and (2) constrained learning of transformer programs (based on a partial specification).

Synthesis and Verification of Transformer Programs

TL;DR

This paper investigates automatic verification and synthesis of transformer-oriented programs by using C-RASP as a structurally interpretable surrogate. It shows how to reduce C-RASP verification to Lustre model checking, enabling efficient use of SMT-enabled solvers for properties like language inclusion and emptiness, and introduces a simulated-annealing based method to learn C-RASP programs from labeled examples. The authors implement a unified framework that jointly performs synthesis and verification, achieving high accuracy on benchmark languages and identifying fundamental expressiveness limits of C-RASP where termination is not guaranteed. The work advances practical interpretability and controllability of transformer-like behavior, with implications for program optimization and constrained learning of transformer programs. Overall, the approach demonstrates a viable path to formal analysis and automatic construction of transformer-inspired programs with verifiable guarantees.

Abstract

C-RASP is a simple programming language that was recently shown to capture concepts expressible by transformers. In this paper, we develop new algorithmic techniques for automatically verifying C-RASPs. To this end, we establish a connection to the verification of synchronous dataflow programs in Lustre, which enables us to exploit state-of-the-art model checkers utilizing highly optimized SMT-solvers. Our second contribution addresses learning a C-RASP program in the first place. To this end, we provide a new algorithm for learning a C-RASP from examples using local search. We demonstrate efficacy of our implementation for benchmarks of C-RASPs in the literature, in particular in connection to the following applications: (1) transformer program optimization, and (2) constrained learning of transformer programs (based on a partial specification).
Paper Structure (46 sections, 4 theorems, 45 equations, 1 table, 1 algorithm)

This paper contains 46 sections, 4 theorems, 45 equations, 1 table, 1 algorithm.

Key Result

Proposition 3.2

Take C-RASP programs $P_1$ and $P_2$. $P_1$ accepts only words accepted by $P_2$ if $T_\subseteq(P_1, P_2)$ is valid.

Theorems & Definitions (10)

  • Example 1
  • Definition 2.1: C-RASP
  • Definition 2.2: Boolean Expressions
  • Definition 2.3: Counting Expressions
  • Example 2
  • Definition 3.1: Lustre (fragment)
  • Proposition 3.2
  • Proposition 3.3
  • Proposition A.1
  • Proposition A.2