An Operational Kardashev-Style Scale for Autonomous AI - Towards AGI and Superintelligence

Przemyslaw Chojecki

TL;DR

The paper proposes a Kardashev-style, operational Autonomous AI (AAI) Scale to quantify progress from fixed automation to AGI and beyond using ten normalized axes and a composite AAI-Index. It introduces a Self-Improvement Coefficient $\kappa$ and two closure properties (maintenance and expansion) to render self-improvement falsifiable, alongside OWA-Bench, an open-world agency benchmark for long-horizon, tool-using, persistent agents. A formal progression framework links AAI-3 to AAI-5 through curvature and link-step progression on a fixed battery, with a CHC-based alignment crosswalk and a non-compensatory battery ensuring breadth and reliability. The approach emphasizes auditable Gates, drift-robust benchmarking, and a dynamics-aware Delegability Frontier to guide governance and deployment decisions, culminating in a Domain Annex for software agents and an Embodiment Annex for robotics. Collectively, the work provides a rigorous, auditable, and scalable framework to evaluate autonomous AI deployment, self-improvement, and progression toward superintelligence while mitigating gaming and enabling reproducibility across domains.

Abstract

We propose a Kardashev-inspired yet operational Autonomous AI (AAI) Scale that measures the progression from fixed robotic process automation (AAI-0) to full artificial general intelligence (AAI-4) and beyond. Unlike narrative ladders, our scale is multi-axis and testable. We define ten capability axes (Autonomy, Generality, Planning, Memory/Persistence, Tool Economy, Self-Revision, Sociality/Coordination, Embodiment, World-Model Fidelity, Economic Throughput) aggregated by a composite AAI-Index (a weighted geometric mean). We introduce a measurable Self-Improvement Coefficient $\kappa$ (capability growth per unit of agent-initiated resources) and two closure properties (maintenance and expansion) that convert "self-improving AI" into falsifiable criteria. We specify OWA-Bench, an open-world agency benchmark suite that evaluates long-horizon, tool-using, persistent agents. We define level gates for AAI-0 through AAI-4 using thresholds on the axes, $\kappa$, and closure proofs. Synthetic experiments illustrate how present-day systems map onto the scale and how the delegability frontier (quality vs. autonomy) advances with self-improvement. We also prove a theorem that an AAI-3 agent becomes AAI-5 over time under sufficient conditions, formalizing the intuition that a "baby AGI" becomes a superintelligence.
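The two quantities named in the abstract can be sketched in a few lines. This is a minimal illustration, not the paper's reference implementation: the function names `aai_index` and `kappa`, the axis keys, the example weights, and the finite-difference form of $\kappa$ are all assumptions made here. Only "weighted geometric mean" and "capability growth per unit of agent-initiated resources" come from the text.

```python
import math


def aai_index(scores, weights):
    """Weighted geometric mean of normalized axis scores.

    Assumption: each score lies in [0, 1] and weights are non-negative.
    Weights are renormalized to sum to 1 before aggregation.
    """
    if set(scores) != set(weights):
        raise ValueError("scores and weights must cover the same axes")
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    # A geometric mean is zero as soon as any axis is zero, so one
    # missing capability cannot be offset by strength elsewhere.
    if any(s <= 0 for s in scores.values()):
        return 0.0
    log_mean = sum(weights[a] * math.log(scores[a]) for a in scores) / total
    return math.exp(log_mean)


def kappa(index_before, index_after, resources_spent):
    """Finite-difference estimate of the Self-Improvement Coefficient.

    Assumption: capability growth is measured as the change in the
    AAI-Index, and resources_spent counts only agent-initiated resources.
    """
    if resources_spent <= 0:
        raise ValueError("resources_spent must be positive")
    return (index_after - index_before) / resources_spent


# Hypothetical two-axis example with equal weights.
example = aai_index({"autonomy": 0.25, "planning": 1.0},
                    {"autonomy": 1.0, "planning": 1.0})
```

With equal weights the example reduces to the plain geometric mean of 0.25 and 1.0. The geometric form is what lets a single weak axis pull the composite down sharply, which an arithmetic mean would mask.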

Paper Structure

This paper contains 85 sections, 1 theorem, 96 equations, 1 figure, 3 tables.

Key Result

Theorem 3.1

Under the assumptions above, there exist finite resource increments $\Delta R_4<\infty$ and $\Delta R_5<\infty$ such that the agent reaches AAI-4 at $R_4:=R_0+\Delta R_4$ and AAI-5 at $R_5:=R_4+\Delta R_5$. Consequently, with $r(t)\ge r_{\min}$, these transitions occur in finite time.
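The step from finite resource increments to finite time can be made explicit. The sketch below assumes that $r(t)$ is the rate at which cumulative resources $R(t)$ accrue, that $r_{\min}>0$, and that $t_0$ is the time at which $R(t_0)=R_0$. The summary does not restate the theorem's assumptions, so these readings are hypotheses introduced here.

```latex
% Assumption: R(t) = R_0 + \int_{t_0}^{t} r(s)\,ds with r(s) \ge r_{\min} > 0.
\begin{align*}
R(t) - R_0 &= \int_{t_0}^{t} r(s)\,ds \;\ge\; r_{\min}\,(t - t_0), \\
t_4 &\le t_0 + \frac{\Delta R_4}{r_{\min}}, \\
t_5 &\le t_4 + \frac{\Delta R_5}{r_{\min}}
     \;\le\; t_0 + \frac{\Delta R_4 + \Delta R_5}{r_{\min}} .
\end{align*}
```

Here $t_4$ and $t_5$ denote the first times at which $R(t)$ reaches $R_4$ and $R_5$. Both bounds are finite because $\Delta R_4$, $\Delta R_5$ are finite and $r_{\min}$ is strictly positive.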

Figures (1)

  • Figure 1: Let $Q^*$ be the target KPI quality threshold. The delegable region at time $\tau$ is $\mathcal{D}_{Q^{*}}(\tau)=\{(a,q): Q^{*}\le q\le q^{\star}(a,\tau)\}$. Progress is observed as the frontier $F_{Q^{*}}(\tau)$ (the graph of $q^{\star}$) at $\tau_1$ dominating that at $\tau_0$: the later frontier at $\tau_{1}$ (red) dominates the earlier frontier at $\tau_{0}$ (blue). The dashed line marks $Q^{*}$. Green shading shows the improvement between the $\tau_{0}$ and $\tau_{1}$ frontiers; red shading highlights the portion above $Q^{*}$ but below the $\tau_{0}$ frontier.
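The delegable region and the dominance relation in the Figure 1 caption can be checked numerically. This is a rough sketch under stated assumptions: the linear frontiers `frontier_tau0` and `frontier_tau1` are made-up illustrations, not data from the paper, and "dominates" is read here as pointwise greater-or-equal over a grid of autonomy levels.

```python
def is_delegable(a, q, frontier, q_target):
    """True if (a, q) lies in the delegable region: Q* <= q <= q_star(a)."""
    return q_target <= q <= frontier(a)


def dominates(frontier_late, frontier_early, autonomy_grid):
    """Pointwise dominance of the later frontier on the given grid.

    Assumption: dominance means q_star(a, tau1) >= q_star(a, tau0)
    at every sampled autonomy level a.
    """
    return all(frontier_late(a) >= frontier_early(a) for a in autonomy_grid)


# Hypothetical frontiers: achievable quality falls as autonomy rises.
def frontier_tau0(a):
    return 0.9 - 0.5 * a


def frontier_tau1(a):
    return 0.95 - 0.3 * a


Q_TARGET = 0.6                       # hypothetical KPI threshold Q*
grid = [i / 10 for i in range(11)]   # autonomy levels 0.0 .. 1.0

progress = dominates(frontier_tau1, frontier_tau0, grid)
newly_delegable = (
    is_delegable(0.5, 0.7, frontier_tau1, Q_TARGET)
    and not is_delegable(0.5, 0.7, frontier_tau0, Q_TARGET)
)
```

In this toy setup the point at autonomy 0.5 and quality 0.7 sits above the earlier frontier and below the later one, so it falls in the band the caption shades green.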

Theorems & Definitions (1)

  • Theorem 3.1: Monotone progression