A Logic for Expressing Log-Precision Transformers

William Merrill; Ashish Sabharwal

A Logic for Expressing Log-Precision Transformers

William Merrill, Ashish Sabharwal

TL;DR

The paper addresses how to formally characterize the reasoning of transformer classifiers that operate with log-precision on context lengths, proposing first-order logic augmented with majority quantifiers, $FO(M)$, as a target descriptive framework. It proves that any log-precision transformer can be translated into an $FO(M)$ sentence, establishing the tightest known upper bound and connecting transformer expressiveness to ${ t TC}^0$ circuit classes. By contrasting with finite-precision transformers, it highlights the necessity of increasing precision to capture uniform attention and broad input dependencies, while offering a mechanistic interpretability angle via a simple logical specification. The results pave the way for a logic-based view of transformer computation and suggest avenues for transformer-complete programming languages and systematic interpretability of neural nets.

Abstract

One way to interpret the reasoning power of transformer-based language models is to describe the types of logical rules they can resolve over some input text. Recently, Chiang et al. (2023) showed that finite-precision transformers can be equivalently expressed in a generalization of first-order logic. However, finite-precision transformers are a weak transformer variant because, as we show, a single head can only attend to a constant number of tokens and, in particular, cannot represent uniform attention. Since attending broadly is a core capability for transformers, we ask whether a minimally more expressive model that can attend universally can also be characterized in logic. To this end, we analyze transformers whose forward pass is computed in $\log n$ precision on contexts of length $n$. We prove that any log-precision transformer can be equivalently expressed as a first-order logic sentence that, in addition to standard universal and existential quantifiers, may also contain majority-vote quantifiers. This is the tightest known upper bound and first logical characterization of log-precision transformers.

A Logic for Expressing Log-Precision Transformers

TL;DR

, as a target descriptive framework. It proves that any log-precision transformer can be translated into an

sentence, establishing the tightest known upper bound and connecting transformer expressiveness to

circuit classes. By contrasting with finite-precision transformers, it highlights the necessity of increasing precision to capture uniform attention and broad input dependencies, while offering a mechanistic interpretability angle via a simple logical specification. The results pave the way for a logic-based view of transformer computation and suggest avenues for transformer-complete programming languages and systematic interpretability of neural nets.

Abstract

precision on contexts of length

. We prove that any log-precision transformer can be equivalently expressed as a first-order logic sentence that, in addition to standard universal and existential quantifiers, may also contain majority-vote quantifiers. This is the tightest known upper bound and first logical characterization of log-precision transformers.

Paper Structure (33 sections, 24 theorems, 18 equations, 2 figures, 1 table, 2 algorithms)

This paper contains 33 sections, 24 theorems, 18 equations, 2 figures, 1 table, 2 algorithms.

Introduction
Upper bound.
Mechanistic interpretability.
Contributions.
Preliminaries: Transformers and FO(M)
Transformers
First-Order Logic with Majority
Extensions.
Examples
Finite Precision Transformers Cannot Attend Universally
Main Result: Expressing Log-Precision Transformers in FO(M)
Preliminaries for Proving Theorem \ref{['thm:main-fixed']}
Computation Graphs
Computation Graph Families
Proof of Theorem \ref{['thm:main-fixed']}
...and 18 more sections

Key Result

Theorem 1

For any log-precision transformer $\mathcal{T}$, there exists an $\mathsf{FO(M)}$ sentence $\phi$ that computes the same function as $\mathcal{T}$, i.e., $\phi(x) = \mathcal{T}(x)$ for any input string $x$.

Figures (2)

Figure 1: A first-order logic with majority ($\mathsf{FO(M)}$) sentence for $\texttt{a}^m\texttt{b}^m$. In addition to standard $\forall$ and $\exists$ quantifiers over string indices, $\mathsf{FO(M)}$ allows majority quantifiers ($\mathsf M$) that take a majority-vote across indices. $\texttt{a}(i)$ indicates whether token $i$ is $\texttt{a}$ (and analogously for $\texttt{b}$). We prove $\mathsf{FO(M)}$ can express any function computed by a log-precision transformer.
Figure : $\mathsf{node}_{\mathcal{C}}(n, i)$ Return the type of gate $i$ in circuit $C_n$.

Theorems & Definitions (50)

Theorem 1: Informal version of \ref{['thm:main-fixed']}
Definition 1
Definition 2
Definition 3: $\mathsf{FO(M)}$ index
Definition 4: $\mathsf{FO(M)}$ formula
Example 1: Bigram matching
Example 2: Skip-bigram matching
Example 3: Majority
Example 4: $1$-Dyck
Example 5: Integer Arithmetic
...and 40 more

A Logic for Expressing Log-Precision Transformers

TL;DR

Abstract

A Logic for Expressing Log-Precision Transformers

Authors

TL;DR

Abstract

Table of Contents

Key Result

Figures (2)

Theorems & Definitions (50)