Self-Improving AI Agents through Self-Play

Przemyslaw Chojecki

TL;DR

This work develops a geometric framework for autonomous self-improvement in AI agents by modeling an agent as a flow on a parameter manifold and introducing the Generator-Verifier-Updater (GVU) operator as the engine of self-improvement. It derives the Variance Inequality, a spectral condition that links the alignment between generation and verification, noise levels, curvature, and step size to positive capability gain, thereby explaining when self-improvement actually occurs and highlighting the central role of verification strength. The author unifies diverse self-improvement paradigms (Language Self-Play, Self-Correction, Synthetic Data bootstrapping) as topological realizations of GVU across moduli fibers, and proposes design levers (ensembles, oracle verifiers, GRPO, diagonal vs ensemble GVU) to widen the stable self-improvement window. The paper also provides an operational protocol for empirically estimating the self-improvement rate, and discusses practical implications, including the risks of Goodhart drift and AI slop, stressing that robust, high-SNR verification is key to ignition across domains. The framework thus offers a principled, geometry-grounded lens for developing self-improving AI systems that reach reliable, cross-domain ignition of autonomous capability growth.
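The qualitative content of the noise/curvature/step-size trade-off can be illustrated with a toy model. The sketch below is not the paper's Variance Inequality: it assumes a hypothetical quadratic capability $\Phi(\theta) = -\tfrac{1}{2}\lVert\theta\rVert^2$ (curvature 1), an update direction equal to the true gradient plus isotropic Gaussian noise standing in for combined generation and verification noise, and a fixed step size. Under those assumptions the expected one-step gain is $\eta\lVert g\rVert^2 - \tfrac{1}{2}\eta^2(\lVert g\rVert^2 + d\sigma^2)$, which the names `expected_gain` and `simulate_gain` (both introduced here for illustration) compute in closed form and by Monte Carlo.

```python
import random


def expected_gain(grad_sq_norm, noise_var_total, step, curvature=1.0):
    """Closed-form expected one-step change in a quadratic capability.

    Toy assumption: the update is step * (gradient + zero-mean noise),
    where noise_var_total is the summed noise variance over all coordinates.
    """
    signal = step * grad_sq_norm
    penalty = 0.5 * curvature * step ** 2 * (grad_sq_norm + noise_var_total)
    return signal - penalty


def simulate_gain(theta, sigma, step, trials=20000, seed=0):
    """Monte Carlo estimate of the same quantity for Phi(theta) = -0.5 * |theta|^2."""
    rng = random.Random(seed)

    def phi(t):
        return -0.5 * sum(x * x for x in t)

    base = phi(theta)
    total = 0.0
    for _ in range(trials):
        # Gradient of Phi at theta is -theta; add Gaussian noise per coordinate.
        new_theta = [x + step * (-x + rng.gauss(0.0, sigma)) for x in theta]
        total += phi(new_theta) - base
    return total / trials


if __name__ == "__main__":
    theta = [1.0, -2.0]  # squared gradient norm is 5, dimension is 2
    for sigma in (0.5, 10.0):
        closed = expected_gain(5.0, 2 * sigma ** 2, step=0.1)
        sampled = simulate_gain(theta, sigma, step=0.1)
        print(f"sigma={sigma}: closed-form={closed:.4f}, simulated={sampled:.4f}")
```

In this toy setting the gain is positive only while the noise term stays below a threshold set by the step size and curvature, which mirrors the shape of the condition described above without reproducing its spectral form.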

Abstract

We extend the moduli-theoretic framework of psychometric batteries to the domain of dynamical systems. While previous work established the AAI capability score as a static functional on the space of agent representations, this paper formalizes the agent as a flow $\nu_r$ parameterized by computational resource $r$, governed by a recursive Generator-Verifier-Updater (GVU) operator. We prove that this operator generates a vector field on the parameter manifold $\Theta$, and we identify the coefficient of self-improvement $\kappa$ as the Lie derivative of the capability functional along this flow. The central contribution of this work is the derivation of the Variance Inequality, a spectral condition that is sufficient (under mild regularity) for the stability of self-improvement. We show that a sufficient condition for $\kappa > 0$ is that, up to curvature and step-size effects, the combined noise of generation and verification must be small enough. We then apply this formalism to unify the recent literature on Language Self-Play (LSP), Self-Correction, and Synthetic Data bootstrapping. We demonstrate that architectures such as STaR, SPIN, Reflexion, GANs and AlphaZero are specific topological realizations of the GVU operator that satisfy the Variance Inequality through filtration, adversarial discrimination, or grounding in formal systems.
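As a reading aid, recall that the Lie derivative of a scalar functional along a vector field is its directional derivative along the flow. Writing $\Phi$ for the capability functional and $v$ for the vector field generated by the GVU operator (notation assumed here from the definitions listed further down; the paper's exact normalization may differ), the identification of $\kappa$ can be written as:

```latex
\kappa(r) \;=\; (\mathcal{L}_v \Phi)(\theta_r)
\;=\; \frac{d}{dr}\,\Phi(\theta_r)
\;=\; d\Phi_{\theta_r}\!\big(v(\theta_r)\big),
\qquad \dot{\theta}_r = v(\theta_r).
```

Under this reading, $\kappa > 0$ says that capability increases to first order as the resource $r$ grows along the flow.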

Paper Structure

This paper contains 39 sections, 10 theorems, 108 equations.

Key Result

Theorem 3.6

Assume the regularity conditions of Definition def:fisher, so that $G(\theta)$ is positive definite for all $\theta$ in a region of interest. Let $v : \Theta \to T\Theta$ be a smooth vector field, e.g. the velocity $v(\theta_r) = \dot{\theta}_r$ of an autonomous flow $\gamma : r \mapsto \theta_r$ on $\Theta$ such that [...], where $s_\theta$ is the score function from Definition def:fisher. In particular, $v(\th

Theorems & Definitions (53)

  • Definition 2.1: Battery
  • Definition 2.2: Trace and Observables
  • Definition 2.3: Input and Output Spaces
  • Definition 2.4: Parameter Manifold $\Theta$
  • Remark 2.5: Statistical manifold
  • Definition 2.6: Policy Space
  • Definition 2.7: Architecture $\Pi_{\Theta}$
  • Definition 2.8: Representation Map $\rho_{\mathcal{B}}$
  • Definition 2.9: Capability Functional $\Phi_{\mathcal{B}}$ and Commutative Diagram
  • Definition 3.1: External score and internal potential
  • ...and 43 more