Self-Improving AI Agents through Self-Play
Przemyslaw Chojecki
TL;DR
This work develops a geometric framework for autonomous self-improvement in AI agents by modeling an agent as a flow on a parameter manifold and introducing the Generator-Verifier-Updater (GVU) operator as the engine of self-improvement. It derives the Variance Inequality, a spectral condition that ties positive capability gain to the alignment between generation and verification, the noise levels of both, the curvature, and the step size. This explains when self-improvement actually occurs and highlights the central role of verification strength. The paper unifies diverse self-improvement paradigms (Language Self-Play, Self-Correction, Synthetic Data bootstrapping) as topological realizations of GVU across moduli fibers, and proposes design levers (ensembles, oracle verifiers, GRPO, diagonal versus ensemble GVU) to widen the stable self-improvement window. It also provides an operational protocol for empirically estimating the self-improvement rate and discusses practical implications, including the risks of Goodhart drift and AI slop, stressing that robust, high-SNR verification is key to ignition across domains. The framework thus offers a principled, geometry-grounded lens for guiding self-improving AI systems toward reliable, cross-domain ignition of autonomous capability growth.
Abstract
We extend the moduli-theoretic framework of psychometric batteries to the domain of dynamical systems. While previous work established the AAI capability score as a static functional on the space of agent representations, this paper formalizes the agent as a flow $ν_r$ parameterized by computational resource $r$, governed by a recursive Generator-Verifier-Updater (GVU) operator. We prove that this operator generates a vector field on the parameter manifold $Θ$, and we identify the coefficient of self-improvement $κ$ as the Lie derivative of the capability functional along this flow. The central contribution of this work is the derivation of the Variance Inequality, a spectral condition that is sufficient (under mild regularity) for the stability of self-improvement. We show that a sufficient condition for $κ > 0$ is that, up to curvature and step-size effects, the combined noise of generation and verification is small enough. We then apply this formalism to unify the recent literature on Language Self-Play (LSP), Self-Correction, and Synthetic Data bootstrapping. We demonstrate that architectures such as STaR, SPIN, Reflexion, GANs, and AlphaZero are specific topological realizations of the GVU operator that satisfy the Variance Inequality through filtration, adversarial discrimination, or grounding in formal systems.
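To make the generate-verify-update loop and the role of verifier noise concrete, the following is a minimal toy sketch, not the paper's construction. It assumes a scalar parameter, a hypothetical capability functional `capability(theta) = -theta**2`, Gaussian generator and verifier noise, and a best-of-n updater. It treats one round as one unit of resource $r$, and it estimates $κ$ by a finite-difference proxy (average capability gain per round) rather than an exact Lie derivative. All function names and default values are illustrative assumptions.

```python
import random


def capability(theta: float) -> float:
    """Hypothetical capability functional F(theta); maximal at theta = 0."""
    return -theta * theta


def gvu_step(theta, rng, ver_noise, gen_noise=0.5, step=0.5, n_candidates=8):
    """One toy Generator-Verifier-Updater round on a scalar parameter."""
    # Generator: propose candidates around the current parameter.
    candidates = [theta + rng.gauss(0.0, gen_noise) for _ in range(n_candidates)]
    # Verifier: score each candidate with the true capability plus noise.
    scores = [capability(c) + rng.gauss(0.0, ver_noise) for c in candidates]
    # Updater: move a fraction `step` toward the best-scored candidate.
    best = candidates[scores.index(max(scores))]
    return theta + step * (best - theta)


def estimate_kappa(ver_noise, rounds=200, trials=20, theta0=3.0):
    """Average capability gain per round, a finite-difference proxy for kappa."""
    total_gain = 0.0
    for seed in range(trials):
        rng = random.Random(seed)  # fixed seeds keep the sketch reproducible
        theta = theta0
        for _ in range(rounds):
            theta = gvu_step(theta, rng, ver_noise)
        total_gain += capability(theta) - capability(theta0)
    return total_gain / (trials * rounds)


if __name__ == "__main__":
    for noise in (0.01, 100.0):
        print(f"verifier noise {noise:>6}: kappa ~ {estimate_kappa(noise):+.4f}")
```

In this toy setting, a low-noise verifier lets the updater select candidates that genuinely improve capability, so the estimated gain per round is positive. When verifier noise swamps the capability signal, selection is close to random and the parameter drifts like a random walk. This mirrors, only qualitatively, the abstract's claim that the combined noise of generation and verification must be small enough for $κ > 0$.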
