Sequential-in-time training of nonlinear parametrizations for solving time-dependent partial differential equations

Huan Zhang; Yifan Chen; Eric Vanden-Eijnden; Benjamin Peherstorfer

TL;DR

The paper establishes a unifying framework for sequential-in-time training of nonlinear parametrizations in time-dependent PDEs by separating OtD (optimize-then-discretize) and DtO (discretize-then-optimize) schemes. It provides a posteriori error and stability analyses, highlights the tangent-space collapse phenomenon in OtD, and shows that DtO schemes are robust to this collapse at the cost of solving more challenging, nonconvex optimizations. A key insight is that OtD dynamics project the PDE onto the parametrization manifold, connecting to Dirac-Frenkel variational principles, while DtO treats time discretization first, leading to boundary-value-like optimization steps; under one-step Gauss-Newton, OtD approximates DtO to first order. The authors further relate OtD to gradient flows and natural gradient descent, showing that metric choices influence convergence properties and suggesting directions for designing efficient algorithms that leverage these geometric interpretations. Overall, the work clarifies how these two broad strategies interact, informs practical algorithm design, and opens avenues for integrating OtD and DtO ideas with gradient-flow and information-geometric perspectives.
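The difference between the two strategies can be sketched on a toy problem. The example below is an illustration under stated assumptions, not the authors' implementation: the PDE is linear advection $\partial_t u = -c\,\partial_x u$, the parametrization is a Gaussian bump with parameters $\boldsymbol{\theta} = (a, \mu)$, and the names `otd_step`, `dto_step`, `least_squares_2d`, the sample grid `XS`, and the explicit Euler discretization are all hypothetical choices made for this sketch. The OtD step fits $\dot{\boldsymbol{\theta}}$ by least squares and then steps in time; the DtO step forms the explicit Euler target first and then fits $\boldsymbol{\theta}$ to it with Gauss-Newton.

```python
import math

C = 1.0       # advection speed (illustrative assumption)
WIDTH = 0.5   # fixed width of the Gaussian bump (illustrative assumption)
XS = [-3.0 + 0.1 * i for i in range(61)]  # spatial sample points (assumption)


def u_hat(theta, x):
    """Parametrization: Gaussian bump with amplitude a and center mu."""
    a, mu = theta
    return a * math.exp(-((x - mu) ** 2) / (2 * WIDTH ** 2))


def grad_theta(theta, x):
    """Gradient of u_hat with respect to (a, mu), evaluated at x."""
    a, mu = theta
    g = math.exp(-((x - mu) ** 2) / (2 * WIDTH ** 2))
    return (g, a * g * (x - mu) / WIDTH ** 2)


def rhs(theta, x):
    """PDE right-hand side f(u_hat) = -C * d/dx u_hat for linear advection."""
    _, mu = theta
    return C * u_hat(theta, x) * (x - mu) / WIDTH ** 2


def least_squares_2d(rows, targets):
    """Solve min_eta sum_i (rows[i] . eta - targets[i])^2 via 2x2 normal equations."""
    s11 = sum(r[0] * r[0] for r in rows)
    s12 = sum(r[0] * r[1] for r in rows)
    s22 = sum(r[1] * r[1] for r in rows)
    b1 = sum(r[0] * t for r, t in zip(rows, targets))
    b2 = sum(r[1] * t for r, t in zip(rows, targets))
    det = s11 * s22 - s12 * s12
    return ((s22 * b1 - s12 * b2) / det, (s11 * b2 - s12 * b1) / det)


def otd_step(theta, dt):
    """OtD: project f onto the tangent space, then take an explicit Euler step in theta."""
    rows = [grad_theta(theta, x) for x in XS]
    targets = [rhs(theta, x) for x in XS]
    eta = least_squares_2d(rows, targets)  # approximate time derivative of theta
    return (theta[0] + dt * eta[0], theta[1] + dt * eta[1])


def dto_step(theta, dt, gn_iters=1):
    """DtO: build the explicit Euler target, then fit theta to it with Gauss-Newton."""
    target = [u_hat(theta, x) + dt * rhs(theta, x) for x in XS]
    new = theta
    for _ in range(gn_iters):
        rows = [grad_theta(new, x) for x in XS]
        resid = [t - u_hat(new, x) for t, x in zip(target, XS)]
        delta = least_squares_2d(rows, resid)  # Gauss-Newton update
        new = (new[0] + delta[0], new[1] + delta[1])
    return new


if __name__ == "__main__":
    theta0 = (1.0, -1.0)
    print("OtD step:           ", otd_step(theta0, 0.1))
    print("DtO, 1 GN iteration:", dto_step(theta0, 0.1, gn_iters=1))
    print("DtO, 5 GN iterations:", dto_step(theta0, 0.1, gn_iters=5))
```

In this sketch, a single Gauss-Newton iteration started from the current parameters reproduces the OtD step up to round-off, because both solve the same linear least-squares problem. Running more iterations moves the DtO result toward a minimizer of the nonlinear fitting problem, which in general differs from the OtD step at higher order in the time-step size.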

Abstract

Sequential-in-time methods solve a sequence of training problems to fit nonlinear parametrizations such as neural networks to approximate solution trajectories of partial differential equations over time. This work shows that sequential-in-time training methods can be understood broadly as either optimize-then-discretize (OtD) or discretize-then-optimize (DtO) schemes, which are well-known concepts in numerical analysis. The unifying perspective leads to novel stability and a posteriori error analysis results that provide insights into theoretical and numerical aspects that are inherent to either OtD or DtO schemes, such as the tangent space collapse phenomenon, which is a form of over-fitting. Additionally, the unified perspective facilitates establishing connections between variants of sequential-in-time training methods, which is demonstrated by identifying natural gradient descent methods on energy functionals as OtD schemes applied to the corresponding gradient flows.
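The connection to natural gradient descent can be made concrete with a short derivation. The following is a sketch in assumed notation that is not taken from the source: $E$ denotes an energy functional, $\delta E/\delta u$ its $L^2$ variational derivative, $G(\boldsymbol{\theta})$ the Gram matrix of the parameter gradients (assumed invertible here), and $\delta t$ a time-step size. The $L^2$ metric and the explicit Euler discretization are illustrative choices.

```latex
% Gradient flow of E in the L^2 metric (assumed setting)
\partial_t u = -\frac{\delta E}{\delta u}(u)

% OtD: project the right-hand side onto the tangent space at \hat{u}(\theta,\cdot)
\dot{\boldsymbol{\theta}}(t) \in \arg\min_{\boldsymbol{\eta}}
  \left\| \nabla_{\boldsymbol{\theta}}\hat{u}(\boldsymbol{\theta}(t),\cdot)^{\top}\boldsymbol{\eta}
        + \frac{\delta E}{\delta u}\big(\hat{u}(\boldsymbol{\theta}(t),\cdot)\big) \right\|_{L^2}^{2}

% Normal equations, using the chain rule
% \nabla_\theta E(\hat{u}(\theta,\cdot)) = \int \nabla_\theta\hat{u}(\theta,x)\,\frac{\delta E}{\delta u}(\hat{u})(x)\,dx
G(\boldsymbol{\theta})\,\dot{\boldsymbol{\theta}}
  = -\nabla_{\boldsymbol{\theta}} E\big(\hat{u}(\boldsymbol{\theta},\cdot)\big),
\qquad
G(\boldsymbol{\theta}) = \int \nabla_{\boldsymbol{\theta}}\hat{u}(\boldsymbol{\theta},x)\,
                              \nabla_{\boldsymbol{\theta}}\hat{u}(\boldsymbol{\theta},x)^{\top}\,dx

% Explicit Euler in time: a natural gradient descent step with step size \delta t
\boldsymbol{\theta}_{k+1} = \boldsymbol{\theta}_k
  - \delta t\, G(\boldsymbol{\theta}_k)^{-1}\,
    \nabla_{\boldsymbol{\theta}} E\big(\hat{u}(\boldsymbol{\theta}_k,\cdot)\big)
```

Under these assumptions, the time-step size of the OtD scheme plays the role of the learning rate, and the Gram matrix plays the role of the metric that preconditions the gradient. When $G(\boldsymbol{\theta})$ is singular or nearly so, the inverse above would have to be replaced by a pseudo-inverse or a regularized solve.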

Paper Structure (40 sections, 9 theorems, 70 equations)

Introduction
Simulating time-dependent processes and systems
Limitations of linear parametrizations
Nonlinear parametrizations and global versus sequential-in-time training
Literature review of sequential-in-time training methods for nonlinear parametrizations
OtD and DtO schemes and summary of contributions
Outline of the paper
Nonlinear parametrizations for time-dependent PDEs
Setup
Nonlinear time-dependent parametrizations
Optimize-then-Discretize (OtD) schemes
Description of OtD schemes
Residual function in OtD schemes
Optimality conditions in OtD schemes
OtD schemes and the Dirac-Frenkel variational principle
...and 25 more sections

Key Result

Proposition 1

Consider the time-dependent PDE (eq:Prelim:PDE) and let $\boldsymbol{\theta}(t)$ solve the continuous OtD dynamics (eq:projected_dynamics), so that $\hat{u}(\boldsymbol{\theta}(t),\cdot)$ approximates $u$. Assume that there exists a non-negative constant $C$ such that [...]. Furthermore, assume that there exists a function $\varepsilon: [0,T]\to [0,\infty)$ so that [...]. Then, [...]

Theorems & Definitions (18)

Proposition 1
proof
Proposition 2
proof
Proposition 3
proof
Proposition 4
proof
Proposition 5
proof
...and 8 more
