Statistical Guarantees for Reasoning Probes on Looped Boolean Circuits
Anastasis Kratsios, Giulia Livieri, A. Martina Neuman
TL;DR
The paper analyzes the statistical behaviour of reasoning probes that interrogate looped Boolean circuits under partial observability. It introduces a GCN-based probing framework in which probe outputs lie in the interior of the $m$-simplex and uncertainty is modeled via the Aitchison geometry, coupled with a hitting-probability metric on the strongly connected digraph derived from looped execution. The main result is a transductive generalization bound: with $N$ queried nodes, the worst-case generalization error decays at the optimal rate $\mathcal{O}\big(\sqrt{\log(2/\delta)}/\sqrt{N}\big)$ with probability at least $1-\delta$. This rate is independent of graph size because the induced graph metric admits a low-distortion one-dimensional snowflake embedding. The paper also provides Lipschitz estimates for GCNs on digraphs and develops a metric-embedding-based proof strategy, linking circuit structure to statistical efficiency under partial access.
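To make the Aitchison geometry concrete, the following minimal sketch computes the standard Aitchison distance between two probability vectors with strictly positive entries, using the centred log-ratio (clr) transform. The function names and the example gate distributions are illustrative assumptions, and the paper's exact construction on the simplex may differ in normalization or detail.

```python
import math

def clr(p):
    """Centred log-ratio transform of a strictly positive composition."""
    if any(x <= 0 for x in p):
        raise ValueError("entries must be strictly positive (simplex interior)")
    logs = [math.log(x) for x in p]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]

def aitchison_distance(p, q):
    """Euclidean distance between the clr images of p and q."""
    if len(p) != len(q):
        raise ValueError("compositions must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(clr(p), clr(q))))

# Hypothetical probe outputs over four admissible gates (illustrative values).
uniform = [0.25, 0.25, 0.25, 0.25]
near_uniform = [0.28, 0.24, 0.24, 0.24]
confident = [0.97, 0.01, 0.01, 0.01]

d_near = aitchison_distance(uniform, near_uniform)
d_far = aitchison_distance(uniform, confident)
```

Because the distance depends only on log-ratios, it is unchanged when a vector is rescaled by a positive constant, and it grows without bound as any coordinate approaches zero. This is why outputs are taken in the interior of the simplex.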
Abstract
We study the statistical behaviour of reasoning probes in a stylized model of looped reasoning, given by Boolean circuits whose computational graph is a perfect $\nu$-ary tree ($\nu\ge 2$) and whose output is appended to the input and fed back iteratively for subsequent computation rounds. A reasoning probe has access to a sampled subset of internal computation nodes, possibly without covering the entire graph, and seeks to infer which $\nu$-ary Boolean gate is executed at each queried node, representing uncertainty via a probability distribution over a fixed collection of $\mathtt{m}$ admissible $\nu$-ary gates. This partial observability induces a generalization problem, which we analyze in a realizable, transductive setting. We show that, when the reasoning probe is parameterized by a graph convolutional network (GCN)-based hypothesis class and queries $N$ nodes, the worst-case generalization error attains the optimal rate $\mathcal{O}(\sqrt{\log(2/\delta)}/\sqrt{N})$ with probability at least $1-\delta$, for $\delta\in (0,1)$. Our analysis combines snowflake metric embedding techniques with tools from statistical optimal transport. A key insight is that this optimal rate is achievable independently of graph size, owing to the existence of a low-distortion one-dimensional snowflake embedding of the induced graph metric. As a consequence, our results provide a sharp characterization of how structural properties of the computational graph govern the statistical efficiency of reasoning under partial access.
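As a rough illustration of how the stated rate scales, the sketch below evaluates the term $\sqrt{\log(2/\delta)/N}$ and inverts it to estimate how many queried nodes a target error would require. The multiplicative constant hidden in the $\mathcal{O}(\cdot)$ is not given in the abstract, so the `constant` parameter and the function names are assumptions made purely for illustration.

```python
import math

def rate_term(n_queried, delta):
    """sqrt(log(2/delta) / N): the shape of the bound, without its constant."""
    if n_queried <= 0 or not (0.0 < delta < 1.0):
        raise ValueError("need N >= 1 and delta in (0, 1)")
    return math.sqrt(math.log(2.0 / delta) / n_queried)

def nodes_needed(epsilon, delta, constant=1.0):
    """Smallest N with constant * rate_term(N, delta) <= epsilon.

    `constant` stands in for the unspecified constant in the O(.) bound.
    """
    if epsilon <= 0 or not (0.0 < delta < 1.0):
        raise ValueError("need epsilon > 0 and delta in (0, 1)")
    return math.ceil(constant ** 2 * math.log(2.0 / delta) / epsilon ** 2)

# Quadrupling the number of queried nodes halves the rate term.
ratio = rate_term(100, 0.05) / rate_term(400, 0.05)

# Number of queried nodes for a target error of 0.05 at confidence 95%,
# under the illustrative choice constant = 1.
n_required = nodes_needed(0.05, 0.05)
```

Neither function takes the graph size as an argument, which mirrors the claim that the rate depends only on $N$ and $\delta$.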
