Statistical Guarantees for Reasoning Probes on Looped Boolean Circuits

Anastasis Kratsios; Giulia Livieri; A. Martina Neuman

TL;DR

The paper addresses the problem of statistically evaluating reasoning probes that interrogate looped Boolean circuits with partial observability. It introduces a GCN-based probing framework where probe outputs live in the interior of the $m$-simplex and uncertainty is modeled via the Aitchison geometry, coupled with a hitting-probability metric on a strongly connected digraph derived from looped execution. The main result proves a transductive generalization bound: with $N$ observed nodes, the worst-case generalization error decays at the optimal rate $\mathcal{O}\big(\sqrt{\log(2/\delta)}/\sqrt{N}\big)$ with probability at least $1-\delta$, and this rate is independent of the graph size thanks to a one-dimensional snowflake embedding of the induced graph metric. The work also provides Lipschitz estimates for GCNs on digraphs and develops a metric-embedding-based proof strategy, offering a principled link between circuit structure and statistical efficiency under partial access.
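As a small illustration of the Aitchison geometry mentioned above, the following sketch computes the standard Aitchison distance between two probability vectors through the centered log-ratio (clr) transform. It is not taken from the paper: the function names `clr` and `aitchison_distance` and the example vectors are hypothetical, and the paper's exact use of this geometry is not shown in this summary.

```python
import math

def clr(p):
    """Centered log-ratio transform of a vector with strictly positive entries."""
    if any(x <= 0 for x in p):
        raise ValueError("clr needs strictly positive entries (interior of the simplex)")
    logs = [math.log(x) for x in p]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]

def aitchison_distance(p, q):
    """Euclidean distance between the clr images of p and q."""
    if len(p) != len(q):
        raise ValueError("p and q must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(clr(p), clr(q))))

# Hypothetical probe outputs: two distributions over three admissible gates.
p = [0.7, 0.2, 0.1]
q = [0.1, 0.2, 0.7]
print(aitchison_distance(p, q))
```

The distance only depends on ratios between coordinates, so rescaling `p` by a positive constant leaves it unchanged. It also grows without bound as a coordinate approaches zero, which is why the outputs are kept in the interior of the simplex.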

Abstract

We study the statistical behaviour of reasoning probes in a stylized model of looped reasoning, given by Boolean circuits whose computational graph is a perfect $\nu$-ary tree ($\nu\ge 2$) and whose output is appended to the input and fed back iteratively for subsequent computation rounds. A reasoning probe has access to a sampled subset of internal computation nodes, possibly without covering the entire graph, and seeks to infer which $\nu$-ary Boolean gate is executed at each queried node, representing uncertainty via a probability distribution over a fixed collection of $\mathtt{m}$ admissible $\nu$-ary gates. This partial observability induces a generalization problem, which we analyze in a realizable, transductive setting. We show that, when the reasoning probe is parameterized by a graph convolutional network (GCN)-based hypothesis class and queries $N$ nodes, the worst-case generalization error attains the optimal rate $\mathcal{O}(\sqrt{\log(2/\delta)}/\sqrt{N})$ with probability at least $1-\delta$, for $\delta\in (0,1)$. Our analysis combines snowflake metric embedding techniques with tools from statistical optimal transport. A key insight is that this optimal rate is achievable independently of graph size, owing to the existence of a low-distortion one-dimensional snowflake embedding of the induced graph metric. As a consequence, our results provide a sharp characterization of how structural properties of the computational graph govern the statistical efficiency of reasoning under partial access.
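To make the looped circuit model more concrete, here is a minimal sketch of a Boolean circuit on a perfect $\nu$-ary tree whose output bit is fed back into the input on each round. It is an illustration under stated assumptions, not the paper's construction: the gate set `GATES`, the helper names `eval_tree` and `run_loop`, and in particular the sliding-window rule (append the output, drop the oldest bit so the input width stays fixed) are all hypothetical choices made here.

```python
# Hypothetical collection of admissible gates; each maps a list of bits to one bit.
GATES = {
    "AND": lambda bits: int(all(bits)),
    "OR": lambda bits: int(any(bits)),
    "XOR": lambda bits: sum(bits) % 2,
}

def eval_tree(gates_by_level, leaves, nu):
    """Evaluate a perfect nu-ary tree bottom-up.

    gates_by_level[0] labels the nodes directly above the leaves and the
    last entry holds the single root gate.
    """
    values = list(leaves)
    for level in gates_by_level:
        if len(values) != nu * len(level):
            raise ValueError("level width does not match a perfect nu-ary tree")
        values = [GATES[g](values[i * nu:(i + 1) * nu]) for i, g in enumerate(level)]
    return values[0]

def run_loop(gates_by_level, x0, nu, rounds):
    """Run the circuit repeatedly, feeding each output back into the input."""
    x = list(x0)
    outputs = []
    for _ in range(rounds):
        y = eval_tree(gates_by_level, x, nu)
        outputs.append(y)
        # Assumption: append the output and drop the oldest bit to keep the width fixed.
        x = x[1:] + [y]
    return outputs

# A depth-2 binary tree (nu = 2): two gates above the leaves, one root gate.
hidden_gates = [["AND", "OR"], ["XOR"]]
print(run_loop(hidden_gates, [1, 0, 1, 1], nu=2, rounds=4))  # [1, 1, 0, 0]
```

In this picture, `hidden_gates` plays the role of the unknown gate assignment. A reasoning probe would see only some of the internal nodes and would output, for each queried node, a distribution over the keys of `GATES`.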

Paper Structure (22 sections, 6 theorems, 104 equations, 4 figures)

Introduction
Reasoning probe and probe uncertainty
How looped reasoning is formalized
Contributions
Organization of paper
Preliminaries
Strongly connected digraphs
Strongly connected digraph as metric space
Graph Laplacians
Graph convolutional networks on digraphs
Aitchison geometry
Setup and main result
Interpretation and proof strategy of main theorem
Uniform graph-size-independent generalization rates
Contextual discussion on coverage property of i.i.d. sampling
...and 7 more sections

Key Result

Theorem 3.1

Let $\alpha\in (0,1)$. Let $t,N\in \mathbb{N}$. For every $\delta\in (0,1)$, the following event holds with probability at least $1-\delta$
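The event itself is not reproduced in this summary. Using only the rate quoted in the abstract, $\mathcal{O}(\sqrt{\log(2/\delta)}/\sqrt{N})$, the sketch below shows how the bound scales with the number of queried nodes and how many nodes would be needed for a target error. The constant `c` is a hypothetical placeholder for the unspecified constant hidden in the $\mathcal{O}$, and the function names are illustrative.

```python
import math

def rate(n_nodes, delta, c=1.0):
    """Value of c * sqrt(log(2/delta)) / sqrt(N); c is a placeholder constant."""
    return c * math.sqrt(math.log(2 / delta)) / math.sqrt(n_nodes)

def nodes_needed(eps, delta, c=1.0):
    """Smallest N with c * sqrt(log(2/delta)) / sqrt(N) <= eps."""
    return math.ceil(c ** 2 * math.log(2 / delta) / eps ** 2)

n = nodes_needed(eps=0.1, delta=0.05)
print(n, rate(n, delta=0.05))
```

Two features of the rate are visible here: quadrupling $N$ halves the bound, and the dependence on the confidence level $\delta$ is only through $\sqrt{\log(2/\delta)}$. The graph size does not appear anywhere, which matches the graph-size-independence claim of the abstract.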

Figures (4)

Figure 2: Illustration of a one-dimensional embedding of a finite metric space; the objective is to keep metric distortion small.
Figure : (a) Looped reasoning model
Figure : (a) Looped reasoning model
Figure : (b) Strongly connected digraph $\mathcal{G}^{\mathrm{time}}/\mathbb{N}_{\ge 0}$

Theorems & Definitions (19)

Remark 2.1
Definition 2.1
Theorem 3.1: Main result
Proposition 4.1
Lemma A.1
Proposition A.1: Low distortion snowflake-embedding into $\mathbb{R}$
Proposition A.2
Proof of Proposition \ref{prop:independent embedding}
Proof of Lemma \ref{lem:New_Convergence__SuperAssouad}
Proof of Theorem 3.1 (Main result)
...and 9 more
