Minimum-Action Learning: Energy-Constrained Symbolic Model Selection for Physical Law Identification from Noisy Data

Martin G. Frasch

Abstract

Identifying physical laws from noisy observational data is a central challenge in scientific machine learning. We present Minimum-Action Learning (MAL), a framework that selects symbolic force laws from a pre-specified basis library by minimizing a Triple-Action functional combining trajectory reconstruction, architectural sparsity, and energy-conservation enforcement. A wide-stencil acceleration-matching technique reduces noise variance by 10,000x, transforming an intractable problem (SNR ~0.02) into a learnable one (SNR ~1.6); this preprocessing is the critical enabler shared by all methods tested, including SINDy variants. On two benchmarks -- Kepler gravity and Hooke's law -- MAL recovers the correct force law with Kepler exponent p = 3.01 +/- 0.01 at ~0.07 kWh (40% reduction vs. prediction-error-only baselines). The raw correct-basis rate is 40% for Kepler and 90% for Hooke; an energy-conservation-based criterion discriminates the true force law in all cases, yielding 100% pipeline-level identification. Basis library sensitivity experiments show that near-confounders degrade selection (20% with added r^{-2.5} and r^{-1.5}), while distant additions are harmless, and the conservation diagnostic remains informative even when the correct basis is absent. Direct comparison with noise-robust SINDy variants, Hamiltonian Neural Networks, and Lagrangian Neural Networks confirms MAL's distinct niche: interpretable, energy-constrained model selection that combines symbolic basis identification with dynamical rollout validation.
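
The variance reduction from a wide stencil can be illustrated with a central second-difference estimate of acceleration. The block below is a minimal sketch under stated assumptions, not the paper's implementation: the function name stencil_acceleration, the half-width s = 10, the time step, and the noise level are all hypothetical choices made for illustration. For independent position noise of variance sigma^2, the noise variance of the estimate is 6 sigma^2 / (s dt)^4, so widening the stencil from 1 to s samples reduces it by a factor of s^4. With s = 10 this gives the same order of magnitude as the 10,000x figure quoted above, although the stencil actually used is not stated in this excerpt.

```python
import numpy as np


def stencil_acceleration(x, dt, s=1):
    """Central second difference of x with half-width s samples.

    Hypothetical helper: estimates x''(t) as
    (x[t+s] - 2 x[t] + x[t-s]) / (s * dt)**2.
    """
    h = s * dt
    return (x[2 * s:] - 2.0 * x[s:-s] + x[:-2 * s]) / h**2


# Assumed illustration values (not taken from the paper).
rng = np.random.default_rng(0)
dt, sigma, n = 0.01, 1e-3, 100_000

# Pure measurement noise on the positions: differencing it shows how much
# noise each stencil lets through into the acceleration estimate.
noise = sigma * rng.standard_normal(n)

var_narrow = stencil_acceleration(noise, dt, s=1).var()
var_wide = stencil_acceleration(noise, dt, s=10).var()
ratio = var_narrow / var_wide  # theory: s**4 = 10_000 for s = 10

print(f"noise variance, s=1 : {var_narrow:.3e}")
print(f"noise variance, s=10: {var_wide:.3e}")
print(f"reduction factor    : {ratio:.0f}")
```

The design trade-off is that a wider stencil also increases the truncation error of the finite difference, which grows as (s dt)^2 for a smooth signal, so the stencil width cannot be increased indefinitely.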

Paper Structure (1 section, 18 equations, 10 figures, 9 tables)

This paper contains 1 section, 18 equations, 10 figures, 9 tables.

Table of Contents

  1. Acknowledgments.

Figures (10)

  • Figure 1: Trajectory reconstruction and Hamiltonian conservation. (A) Comparison of ground-truth Keplerian orbits (blue) vs. MAL reconstruction (red) for a test orbit rolled out over 5 orbital periods from initial conditions using the identified $r^{-2}$ force law. Slight enlargement is attributable to the 6% deficit in recovered $GM$. (B) Energy conservation error $\Delta H$ remains bounded, enforced by the Noether-symmetry term $\mathcal{L}_{\mathrm{Symmetry}}$ in the Triple-Action functional, which constrains the parameter trajectory $\theta(t)$ to satisfy $dH/dt \approx 0$. Implementation: the force is computed via NoetherForceBasis.forward(), which evaluates $f(r) = \sum_i A_i \theta_i \phi_i(r)$ with gates $A_i = \mathrm{softmax}(\texttt{A\_logits}/\tau)$.
  • Figure 2: Soft-to-discrete architectural crystallization. (A) Evolution of gate activation probabilities $A_i$ from uniform (epoch 1, soft state) to one-hot selection of the $r^{-2}$ basis (epoch 200, discrete state). This transition occurs on a soft-to-discrete architecture manifold where A_logits are sharpened via softmax temperature decay ($\tau: 1 \to 0.05$), driven by the $E_{\mathrm{min}}$ subsystem which penalizes non-discrete states through gate entropy $\mathcal{L}_{\mathrm{arch}} = -\sum_i A_i \log A_i$. (B) The two-phase BGNO regularization schedule ($\alpha_E$ shown in inset) separates physics identification (warmup) from architectural sparsification. Shown for a representative seed (seed 0); variability across 10 seeds is reported in SI Table S1.
  • Figure 3: Energy efficiency and training dynamics. (A) Training loss components vs. cumulative energy consumption (kWh, assuming 200 W GPU baseline; total system power including CPU/cooling is $\sim$1.5$\times$ this value). Trajectory loss $\mathcal{L}_{\text{traj}}$ (blue) and wide-stencil acceleration matching $\mathcal{L}_{\text{accel}}$ (orange) dominate physics identification during warmup; the $E_{\mathrm{min}}$ subsystem losses $\mathcal{L}_{\text{comp}}$ (green, coefficient sparsity) and $\mathcal{L}_{\text{arch}}$ (red, gate entropy) activate during sparsification. (B) Temperature schedule $\tau$ (purple) and regularization weight $\alpha_E$ (brown) implement the BGNO protocol. The $E_{\mathrm{min}}$ subsystem penalizes high-entropy architectural configurations, driving the soft-to-discrete transition shown in Fig. 2. Implementation: loss components are computed in minaction_loss(), with $\mathcal{L}_{\text{comp}} = \langle |A_i \theta_i| \rangle$ and $\mathcal{L}_{\text{arch}} = -\sum_i A_i \log A_i$.
  • Figure 4: Energy-conservation-based model selection and schedule geometry. (Left) Phase-space trajectory of $(\alpha_E, \tau)$ during training, color-coded by epoch. Red diamonds mark epochs where the ratio $\alpha_E / \tau$ passes through integer values (3:1, 2:1, 1:1), coinciding with major transitions in gate selectivity (onset, sparsification, crystallization). These coincidences arise from the designed schedule geometry; whether they reflect deeper dynamical principles remains an open question (see Discussion). (Right) Energy conservation $\sigma_H$ by selected basis: correct $r^{-2}$ models conserve $H$ over long-horizon rollouts, while incorrect bases violate Hamiltonian dynamics despite matching short-term trajectories. This provides an energy-conservation-based diagnostic for model selection.
  • Figure S1: Robustness: Seed 137 orbit reconstruction. Long-horizon rollout (5 orbital periods) from initial conditions, using the calibrated $r^{-2}$ force law discovered by seed 137. Model trajectory (red) closely matches ground truth (blue), with slight enlargement attributable to the 6% deficit in recovered $GM$.
  • ...and 5 more figures
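
The gated force and the two sparsification losses quoted in the captions of Figures 1 to 3 can be sketched in a few lines. This is a hedged illustration in NumPy, not the code behind NoetherForceBasis.forward() or minaction_loss(): the function names, the four-element power-law basis library, and the example logits and coefficients are all assumptions made here. Only the formulas $f(r) = \sum_i A_i \theta_i \phi_i(r)$, $A = \mathrm{softmax}(\texttt{A\_logits}/\tau)$, $\mathcal{L}_{\text{comp}} = \langle |A_i \theta_i| \rangle$, and $\mathcal{L}_{\text{arch}} = -\sum_i A_i \log A_i$ come from the captions.

```python
import numpy as np

# Hypothetical basis library: phi_i(r) = r ** EXPONENTS[i].
EXPONENTS = np.array([-3.0, -2.0, -1.0, 1.0])


def gates(a_logits, tau):
    """Gate probabilities A = softmax(a_logits / tau)."""
    z = a_logits / tau
    z = z - z.max()  # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum()


def gated_force(r, a_logits, theta, tau):
    """f(r) = sum_i A_i * theta_i * phi_i(r) for an array of radii r."""
    A = gates(a_logits, tau)
    phi = r[:, None] ** EXPONENTS[None, :]  # shape (n_points, n_basis)
    return phi @ (A * theta)


def gate_entropy(A, eps=1e-12):
    """L_arch = -sum_i A_i log A_i (eps guards against log(0))."""
    return -np.sum(A * np.log(A + eps))


def coeff_sparsity(A, theta):
    """L_comp = mean_i |A_i * theta_i|."""
    return np.mean(np.abs(A * theta))


# Assumed example values: logits that mildly favour the r^-2 basis.
a_logits = np.array([0.1, 0.6, 0.2, 0.0])
theta = np.array([0.3, -1.0, 0.2, 0.1])
r = np.array([1.0, 2.0, 4.0])

# The two ends of the temperature range quoted for Figure 2.
for tau in (1.0, 0.05):
    A = gates(a_logits, tau)
    print(f"tau={tau}: gates={np.round(A, 4)}, "
          f"L_arch={gate_entropy(A):.4f}, "
          f"L_comp={coeff_sparsity(A, theta):.4f}")

A_soft = gates(a_logits, 1.0)
A_cold = gates(a_logits, 0.05)
f_cold = gated_force(r, a_logits, theta, 0.05)
```

At tau = 1 the gates stay close to uniform and the entropy is near log 4, while at tau = 0.05 the same logits produce an almost one-hot selection, so the force reduces to approximately theta_i r^{-2} for the selected basis. This is the soft-to-discrete behaviour the captions describe, reproduced here only for the assumed toy values.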