Neural Network Perturbation Theory (NNPT): Learning Residual Corrections from Exact Solutions

Zhenhao Chen, Mutian Shen, Boris Fain, Zohar Nussinov

TL;DR

This work presents Neural Network Perturbation Theory (NNPT), a strategy in which a network learns only the residual perturbation that remains after an analytically solvable baseline is subtracted. The method is applied to the planar circular restricted three-body problem to probe how dynamical complexity governs the network capacity required. By constraining the network to model Jupiter’s perturbation on top of the Keplerian baseline, the authors achieve substantial parameter efficiency and demonstrate a sharp capacity transition near chaos onset that scales with the resonance-overlap criterion and precedes geometric chaos signatures. The study also uses autoencoder analyses to quantify intrinsic dimensionality, revealing a torus-dominated structure in integrable regimes that gives way to higher-dimensional chaotic attractors as the dynamics become chaotic, although decoder overhead can temper the latent-space benefits for trajectory prediction. Collectively, NNPT provides a general framework for building physics-informed, capacity-aware surrogates and highlights fundamental limits on fixed-architecture networks in chaotic regimes, with implications across quantum, fluid, and plasma systems.
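
The baseline subtraction described above can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function names, the 1 AU circular baseline, the assumption that normalized time $t \in [0,1]$ spans exactly one orbital period, and the synthetic `toy_perturbation` standing in for integrator output are all hypothetical.

```python
import math

def kepler_baseline(t, radius_au=1.0):
    """Circular two-body orbit; t in [0, 1] is assumed to span one period."""
    angle = 2.0 * math.pi * t
    return radius_au * math.cos(angle), radius_au * math.sin(angle)

def to_residuals(times, positions):
    """Subtract the analytical baseline to obtain the training targets y(t)."""
    residuals = []
    for t, (x, y) in zip(times, positions):
        bx, by = kepler_baseline(t)
        residuals.append((x - bx, y - by))
    return residuals

def from_residuals(times, residuals):
    """Add the baseline back to predicted residuals to recover positions."""
    positions = []
    for t, (dx, dy) in zip(times, residuals):
        bx, by = kepler_baseline(t)
        positions.append((bx + dx, by + dy))
    return positions

def toy_perturbation(t, amplitude=0.01):
    """Hypothetical stand-in for a simulated three-body deviation (in AU)."""
    return (amplitude * math.sin(6.0 * math.pi * t),
            amplitude * math.cos(4.0 * math.pi * t))

times = [i / 200 for i in range(201)]
full = []
for t in times:
    bx, by = kepler_baseline(t)
    px, py = toy_perturbation(t)
    full.append((bx + px, by + py))

residuals = to_residuals(times, full)      # what the network would be trained on
rebuilt = from_residuals(times, residuals) # baseline + residual

max_residual = max(math.hypot(dx, dy) for dx, dy in residuals)
max_roundtrip_error = max(
    math.hypot(a - c, b - d) for (a, b), (c, d) in zip(full, rebuilt)
)
print(f"largest residual: {max_residual:.4f} AU")
print(f"round-trip error: {max_roundtrip_error:.1e} AU")
```

The point of the sketch is the scale separation: the positions are of order 1 AU while the targets are of order 0.01 AU, so a network fitted to the residuals spends none of its capacity on the known orbit.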

Abstract

Many complex physical systems admit a natural decomposition into an exactly solvable component and a perturbative correction. Rather than training neural networks to learn complete trajectories from scratch, we introduce Neural Network Perturbation Theory (NNPT), where networks predict only residual perturbations after analytically subtracting known exact solutions. We validate this framework through systematic comparison: using identical 2$\times$32 architectures, correction learning achieves 28--54$\times$ lower validation error than networks trained on complete trajectories. Using the gravitational three-body problem as a test bed, we investigate capacity transitions in fixed-architecture multilayer perceptrons as the Jovian mass varies from 0.05 to 30 times its physical value. An equalized-accuracy protocol reveals that both minimal network capacity and training time exhibit sharp transitions at $f_c = 15.6 \pm 1.0$, where the system enters a strongly chaotic regime. At this transition, minimal capacity jumps approximately sevenfold from ${\sim}1{,}200$ to ${\sim}8{,}600$ parameters (architectures 2$\times$32 and 3$\times$64, respectively). Preliminary exploration of sequential two-stage corrections suggests that first-stage networks already capture dominant perturbative features. Our symplectic integrator maintains relative energy conservation below $2 \times 10^{-7}$ throughout, confirming that transitions reflect physical complexity rather than numerical error. Our results establish correction learning as a general strategy for parameter-efficient surrogates and demonstrate that physical complexity imposes fundamental capacity barriers on fixed-architecture networks at chaos onset.
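
The energy-conservation check mentioned in the abstract can be illustrated with a velocity-Verlet step, the scheme named in the Figure 2 caption. The sketch below is a toy under stated assumptions: it integrates only the two-body Sun--Earth problem (no Jupiter), uses units of AU and years with $GM_\odot = 4\pi^2$, and picks a hypothetical step size. It is not expected to reproduce the $2 \times 10^{-7}$ bound reported for the actual simulations.

```python
import math

GM_SUN = 4.0 * math.pi ** 2  # AU^3 / yr^2, so a 1 AU circular orbit takes 1 yr

def acceleration(x, y):
    """Point-mass gravitational acceleration toward the origin."""
    r3 = (x * x + y * y) ** 1.5
    return -GM_SUN * x / r3, -GM_SUN * y / r3

def energy(x, y, vx, vy):
    """Specific orbital energy (kinetic plus potential, per unit mass)."""
    return 0.5 * (vx * vx + vy * vy) - GM_SUN / math.hypot(x, y)

def max_relative_energy_drift(n_steps=2000, dt=5e-4):
    """Integrate one orbit with velocity Verlet and track max |dE| / |E0|."""
    x, y = 1.0, 0.0
    vx, vy = 0.0, 2.0 * math.pi  # circular speed at 1 AU
    ax, ay = acceleration(x, y)
    e0 = energy(x, y, vx, vy)
    worst = 0.0
    for _ in range(n_steps):
        # Half kick with the current acceleration.
        vx += 0.5 * dt * ax
        vy += 0.5 * dt * ay
        # Full drift with the half-step velocity.
        x += dt * vx
        y += dt * vy
        # Recompute the acceleration, then apply the second half kick.
        ax, ay = acceleration(x, y)
        vx += 0.5 * dt * ax
        vy += 0.5 * dt * ay
        worst = max(worst, abs(energy(x, y, vx, vy) - e0) / abs(e0))
    return worst

drift = max_relative_energy_drift()
print(f"max |dE|/|E0| over one orbit: {drift:.2e}")
```

Tracking the maximum rather than the final drift is a deliberate choice here: a symplectic scheme keeps the energy error bounded and oscillating, so the end-of-run value alone can understate it.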

Paper Structure

This paper contains 28 sections, 13 equations, 9 figures, 3 tables.

Figures (9)

  • Figure 1: Neural Network Perturbation Theory (NNPT) framework for the three-body problem. (Left) Schematic of the Sun--Earth--Jupiter system showing Earth's perturbed orbit (solid blue curve) deviating from the exact Keplerian circular orbit (dashed gray). The shaded region $\mathbf{y}(t)$ represents the residual perturbation—the difference between the three-body trajectory and the analytical two-body solution. (Right) Multilayer perceptron architecture (simplified visualization: 3 hidden layers with 6--8 neurons per layer) that learns only the residual correction $\mathbf{y}(t)$ from normalized time input $t$. The network receives time $t \in [0,1]$ and outputs the two-dimensional perturbation $\mathbf{y}(t)$, focusing representational capacity exclusively on genuine three-body dynamics rather than redundantly approximating the known Keplerian baseline.
  • Figure 2: Relative energy drift $|\Delta E|/|E_0|$ vs. Jovian mass factor $f$ for the symplectic velocity-Verlet integrator. The drift increases smoothly and remains below $2 \times 10^{-7}$ across all masses, demonstrating excellent numerical stability. The dashed horizontal line marks the tolerance threshold $10^{-6}$. This validates that the observed transitions in network capacity reflect genuine physical complexity rather than numerical artifacts.
  • Figure 3: Baseline network performance using a fixed 10$\times$128 architecture (10 hidden layers, 128 units per layer, 168,322 trainable parameters) learning the correction signal $\mathbf{y}(t)$ across mass factors. Networks predict only the residual after subtracting the analytical Keplerian solution, not the full three-body trajectory. Faint lines show individual seeds (2025, 2026, 2027); the bold line with error bars indicates mean $\pm$ SD ($n=3$). The MSE exhibits a U-shape, reaching its minimum near $f = 8$ before rising sharply for $f > 15$ as chaos emerges in the correction signal. The horizontal dashed line marks the target MSE $\mathcal{E}_\star = 9.64 \times 10^{-3}$ AU$^2$ used for the equalized-accuracy protocol.
  • Figure 4: Sharp capacity and training transitions at chaos onset. (a) Minimal network capacity vs. Jovian mass factor under the equalized-accuracy protocol (orange circles: mean $\pm$ SD over three random seeds; vertical dashed line: breakpoint $\hat{f}_c = 15.6$; gray band: 68% bootstrap CI). A sharp jump from ${\sim}1{,}200$ to ${\sim}8{,}600$ parameters ($\approx 7\times$) occurs at the transition, corresponding to architectures 2$\times$32 (2 hidden layers, 32 units per layer, 1,186 parameters) for $f \leq 15$ and 3$\times$64 (3 hidden layers, 64 units per layer, 8,578 parameters) for $f \geq 20$. The intermediate region ($f = 16$--$18$) exhibits seed-dependent variability with large error bars, itself a signature of chaos onset, where subtle initialization differences lead to divergent capacity requirements. (b) Training effort (first-hit epoch) vs. mass factor (green squares: mean $\pm$ SD). Training time increases approximately threefold beyond the transition. The synchronized transitions indicate that the chaotic regime imposes a double penalty: networks require both more parameters and more training time to achieve fixed accuracy.
  • Figure 5: Validation of the correction learning framework using a fixed 2$\times$32 architecture (1,186 parameters). (Left) Validation MSE for networks trained on complete trajectories (blue) versus correction signals (orange). Direct learning exhibits nearly constant MSE $\approx$ 0.5 AU$^2$, while correction learning achieves MSE $\sim$ 0.01 AU$^2$. (Right) The improvement factor shows a 51--54$\times$ advantage in the integrable regime, declining to 28$\times$ in the chaotic regime as residual complexity increases.
  • ...and 4 more figures
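
The parameter counts quoted in the Figure 4 and Figure 5 captions can be checked with simple arithmetic. The sketch below assumes a plain fully connected network with one scalar input (time), two outputs (the planar perturbation), and a bias on every layer. This convention is an assumption rather than something the source states. It reproduces 1,186 and 8,578, but it does not reproduce the 168,322 quoted for the 10$\times$128 baseline, so that count may follow a different tally. The helper name `mlp_param_count` is hypothetical.

```python
def mlp_param_count(depth, width, n_in=1, n_out=2):
    """Weights plus biases of a dense MLP with `depth` hidden layers of `width` units."""
    first = n_in * width + width                    # input -> first hidden layer
    hidden = (depth - 1) * (width * width + width)  # hidden -> hidden layers
    last = width * n_out + n_out                    # last hidden -> output
    return first + hidden + last

small = mlp_param_count(depth=2, width=32)  # 2x32 architecture
large = mlp_param_count(depth=3, width=64)  # 3x64 architecture
ratio = large / small

print(small, large, round(ratio, 2))  # 1186 8578 7.23
```

The ratio of about 7.2 is the "approximately sevenfold" capacity jump reported at the transition.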