
The Price equation reveals a universal force-metric-bias law of algorithmic learning and natural selection

Steven A. Frank

TL;DR

This work presents the force-metric-bias (FMB) law, derived from the Price equation, as a universal framework unifying natural selection and a wide range of learning algorithms. Updates take the form $\Delta\bar{\bm{\theta}} = \mathbf{M}\,\mathbf{f} + \mathbf{b} + \bm{\xi}$, where $\mathbf{M}$ encodes inverse-curvature geometry, $\mathbf{f}$ is the slope of performance with respect to parameters, $\mathbf{b}$ collects momentum and frame changes, and $\bm{\xi}$ represents exploration noise. The paper connects this law to Fisher information, KL divergence, and information geometry, and demonstrates how classic methods (Newton, gradient descent, stochastic gradient descent, Adam, CMA-ES, Kalman filtering, Gaussian processes) are special cases within the same underlying structure. It further develops the spectrum from population-based to single-vector updates and extends the framework to hierarchical learning and multilevel selection. Overall, the FMB law provides a principled, geometry-driven foundation for understanding, comparing, and designing learning dynamics across disciplines. The approach highlights the deep connections between information geometry, Bayesian inference, and optimization theory, with practical implications for algorithm design and theoretical insight into learning processes.

Abstract

Diverse learning algorithms, optimization methods, and natural selection share a common mathematical structure, despite their apparent differences. Here I show that a simple notational partitioning of change by the Price equation reveals a universal force-metric-bias (FMB) law: $\Delta\bm{\theta} = \mathbf{M}\,\mathbf{f} + \mathbf{b} + \bm{\xi}$. The force $\mathbf{f}$ drives improvement in parameters, $\Delta\bm{\theta}$, in proportion to the slope of performance with respect to the parameters. The metric $\mathbf{M}$ rescales movement by inverse curvature. The bias $\mathbf{b}$ adds momentum or changes in the frame of reference. The noise $\bm{\xi}$ enables exploration. This framework unifies natural selection, Bayesian updating, Newton's method, stochastic gradient descent, stochastic Langevin dynamics, Adam optimization, and most other algorithms as special cases of the same underlying process. The Price equation also reveals why Fisher information, Kullback-Leibler divergence, and d'Alembert's principle arise naturally in learning dynamics. By exposing this common structure, the FMB law provides a principled foundation for understanding, comparing, and designing learning algorithms across disciplines.
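
As a rough illustration of the update form stated in the abstract, the sketch below applies $\Delta\bm{\theta} = \mathbf{M}\,\mathbf{f} + \mathbf{b} + \bm{\xi}$ to a toy problem and recovers a Newton step, plain gradient ascent, a momentum variant, and a noisy variant by changing only $\mathbf{M}$, $\mathbf{b}$, and $\bm{\xi}$. Everything specific here is an assumption made for illustration and is not taken from the paper: the quadratic performance surface, the names `fmb_step`, `force`, `run`, `CURV`, and `OPT`, the step size 0.1, the momentum coefficient 0.5, and the noise scale 0.01.

```python
import random

def matvec(M, v):
    """Multiply a square matrix (list of rows) by a vector (list)."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def fmb_step(theta, M, f, b=None, xi=None):
    """Return (theta + delta, delta) with delta = M f + b + xi."""
    n = len(theta)
    b = [0.0] * n if b is None else b
    xi = [0.0] * n if xi is None else xi
    Mf = matvec(M, f)
    delta = [Mf[i] + b[i] + xi[i] for i in range(n)]
    return [theta[i] + delta[i] for i in range(n)], delta

# Hypothetical performance surface (an assumption, not from the paper):
# F(theta) = -0.5 * sum_i CURV[i] * (theta[i] - OPT[i])**2
CURV = [4.0, 0.5]   # curvature along each axis
OPT = [1.0, -2.0]   # location of maximum performance

def force(theta):
    """Slope of performance with respect to the parameters."""
    return [-c * (t - o) for c, t, o in zip(CURV, theta, OPT)]

def diag(values):
    """Build a diagonal matrix from a list of diagonal entries."""
    n = len(values)
    return [[values[i] if i == j else 0.0 for j in range(n)] for i in range(n)]

def run(M, steps, beta=0.0, noise_sd=0.0, seed=0):
    """Iterate the update from theta = 0 with momentum bias and Gaussian noise."""
    rng = random.Random(seed)
    theta, delta = [0.0, 0.0], [0.0, 0.0]
    for _ in range(steps):
        b = [beta * d for d in delta]                   # bias: momentum term
        xi = [rng.gauss(0.0, noise_sd) for _ in theta]  # noise: exploration
        theta, delta = fmb_step(theta, M, force(theta), b, xi)
    return theta

# Newton-like step: M is the inverse curvature, no bias, no noise.
theta_newton = run(diag([1.0 / c for c in CURV]), steps=1)
# Gradient ascent: M is a scalar step size times the identity.
theta_gd = run(diag([0.1, 0.1]), steps=200)
# Momentum: same M, bias proportional to the previous change.
theta_mom = run(diag([0.1, 0.1]), steps=200, beta=0.5)
# Noisy ascent: same M, small Gaussian exploration noise.
theta_noisy = run(diag([0.1, 0.1]), steps=200, noise_sd=0.01)

print(theta_newton, theta_gd, theta_mom, theta_noisy)
```

On this quadratic surface the inverse-curvature metric reaches `OPT` in a single step, whereas the scalar metric needs many steps along the low-curvature axis. That contrast is one concrete reading of the statement that the metric rescales movement by inverse curvature.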

Paper Structure

This paper contains 57 sections, 111 equations, 1 figure.

Figures (1)

  • Figure 1: Geometry of change by direct forces, $\Delta_{\bm{\mathrm{f}}}$. (a) Divergence between the initial population, with probabilities $\bm{\mathrm{q}}$, and the altered population, with probabilities $\bm{\mathrm{q}}'$. For discrete changes, the probabilities are normalized by the square root of the probabilities in the initial set. The distance can equivalently be described by the various expressions shown, in which $V_w$ is the variance in fitness from population biology, $\mathcal{J}$ is the Jeffreys divergence from information theory, and $\mathcal{F}$ is the squared Fisher-Rao step length. The symbol "$\rightarrow$" denotes the limit for small changes. (b) When changes are small, the same geometry and distances can be described more elegantly in unitary square root coordinates, $\bm{\mathrm{r}}=\sqrt{\bm{\mathrm{q}}}$, which sets $\lVert\bm{\mathrm{r}}\rVert=1$, and $\dot{\bm{\mathrm{r}}}\equiv\textrm{d}\bm{\mathrm{r}}=\textrm{d}\sqrt{\bm{\mathrm{q}}}=\left(\textrm{d}\bm{\mathrm{q}}\,/\sqrt{\bm{\mathrm{q}}}\right)/2$. From Frank [frank18the-price].
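
The equivalences named in the caption can be checked numerically. The sketch below is an illustration under stated assumptions, not the paper's own calculation. It assumes selection changes frequencies as $q_i' = q_i w_i/\bar{w}$, takes $V_w$ as the variance in relative fitness $w_i/\bar{w}$, and measures the squared Fisher-Rao step by the chord length $4\lVert\Delta\bm{\mathrm{r}}\rVert^2$ with $\bm{\mathrm{r}}=\sqrt{\bm{\mathrm{q}}}$. The function names and the example frequencies and fitnesses are hypothetical.

```python
import math

def selection_step(q, w):
    """Return (q_new, rel) with q_new[i] = q[i] * w[i] / wbar and rel = w / wbar."""
    wbar = sum(qi * wi for qi, wi in zip(q, w))
    rel = [wi / wbar for wi in w]
    return [qi * ri for qi, ri in zip(q, rel)], rel

def distances(q, w):
    """Four descriptions of the size of the change from q to q_new."""
    q_new, rel = selection_step(q, w)
    dq = [a - b for a, b in zip(q_new, q)]
    # Squared length of the change normalized by sqrt of initial probabilities.
    norm_sq = sum(d * d / qi for d, qi in zip(dq, q))
    # Variance in relative fitness (its mean is 1 by construction).
    var_w = sum(qi * (ri - 1.0) ** 2 for qi, ri in zip(q, rel))
    # Jeffreys divergence: sum of (q' - q) * log(q' / q).
    jeffreys = sum(d * math.log(a / b) for d, a, b in zip(dq, q_new, q))
    # Squared step in square-root coordinates, scaled by 4 (assumed convention).
    fisher_rao = 4.0 * sum((math.sqrt(a) - math.sqrt(b)) ** 2
                           for a, b in zip(q_new, q))
    return norm_sq, var_w, jeffreys, fisher_rao

Q = [0.2, 0.3, 0.5]                                       # hypothetical frequencies
weak = [1.0 + 1e-3 * s for s in (1.0, -0.5, 0.2)]         # small change
strong = [2.0, 0.5, 1.0]                                  # large change

small = distances(Q, weak)
large = distances(Q, strong)
print("small change:", small)
print("large change:", large)
```

Under these assumptions the normalized squared length equals $V_w$ for any size of change, while the Jeffreys divergence and the square-root-coordinate length agree with $V_w$ only as the change becomes small, which matches the caption's use of "$\rightarrow$" for the small-change limit.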