Stability in Training PINNs for Stiff PDEs: Why Initial Conditions Matter

Baoli Hao; Ulisses Braga-Neto; Chun Liu; Lifan Wang; Ming Zhong

TL;DR

The paper tackles the instability of Physics-Informed Neural Networks (PINNs) on stiff time-dependent PDEs by isolating initial-condition enforcement as the key factor in training stability. It introduces a hard-constraint PINN (HC-PINN) via the transformation $\tilde{u} = \psi + \phi\, u_{nn}$ that exactly embeds initial and boundary data, reducing the multi-objective loss to a single PDE-residual loss and enabling implicit-time-stepping-like training behavior. A Neural Tangent Kernel (NTK) analysis provides a theoretical basis for reduced spectral bias under hard constraints, complemented by well-conditioning conditions on $\psi$ and $\phi$. Empirical results across seven stiff PDEs, including 2D Allen-Cahn, demonstrate substantial improvements in accuracy and stability over baseline PINNs and other variants, supporting the claim that IC enforcement is essential for reliable PINN solvers in stiff regimes. The work also outlines how HC-PINN can be combined with other techniques (e.g., causal PINNs, residual-based attention, time-marching) to further enhance performance.
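The hard-constraint idea can be illustrated with a minimal sketch. Everything concrete below is an assumption chosen for illustration and is not taken from the paper: the choices $\phi(t, x) = t$ and $\psi(t, x) = u_0(x)$, the initial profile `u0`, the tiny `u_nn` stand-in for a network, the Allen-Cahn-type residual with its coefficients, and the use of finite differences in place of automatic differentiation.

```python
import math

def u0(x):
    # Assumed initial condition (illustrative only): u(0, x) = x^2 cos(pi x).
    return x * x * math.cos(math.pi * x)

def psi(t, x):
    # Assumed choice: psi carries the initial data and is constant in time.
    return u0(x)

def phi(t, x):
    # Assumed choice: phi vanishes at t = 0 and has phi_t(0, x) = 1 != 0.
    return t

def u_nn(t, x, w=(0.3, -1.2, 0.7)):
    # Hypothetical stand-in for a trained network; any smooth function works.
    a, b, c = w
    return math.tanh(a * t + b * x + c)

def u_tilde(t, x):
    # Hard-constrained ansatz: u_tilde = psi + phi * u_nn.
    return psi(t, x) + phi(t, x) * u_nn(t, x)

def pde_residual(t, x, eps=1e-4, h=1e-3):
    # Assumed Allen-Cahn-type residual: u_t - eps * u_xx + 5 u^3 - 5 u.
    # Central finite differences replace autodiff to keep the sketch small.
    u = u_tilde(t, x)
    u_t = (u_tilde(t + h, x) - u_tilde(t - h, x)) / (2.0 * h)
    u_xx = (u_tilde(t, x + h) - 2.0 * u + u_tilde(t, x - h)) / (h * h)
    return u_t - eps * u_xx + 5.0 * u ** 3 - 5.0 * u

def residual_loss(points):
    # Single-objective loss: mean squared PDE residual, with no IC term.
    return sum(pde_residual(t, x) ** 2 for t, x in points) / len(points)

# Collocation points on an assumed domain t in (0, 0.5], x in [-1, 1].
points = [(0.1 * i, -1.0 + 0.2 * j) for i in range(1, 6) for j in range(11)]
loss = residual_loss(points)
```

Because `phi(0, x)` is exactly zero, `u_tilde(0, x)` reproduces `u0(x)` for any network weights, so the loss needs no separate initial-condition term or weighting between objectives.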

Abstract

Training Physics-Informed Neural Networks (PINNs) on stiff time-dependent PDEs remains highly unstable. Through rigorous ablation studies, we identify a surprisingly critical factor: the enforcement of initial conditions. We present the first systematic ablation of two core strategies, hard initial-condition constraints and adaptive loss weighting. Across challenging benchmarks (sharp transitions, higher-order derivatives, coupled systems, and high frequency modes), we find that exact enforcement of initial conditions (ICs) is not optional but essential. Our study demonstrates that stability and efficiency in PINN training fundamentally depend on ICs, paving the way toward more reliable PINN solvers in stiff regimes.

Paper Structure (26 sections, 5 theorems, 79 equations, 11 figures, 8 tables)

Introduction
Related Work
Methodology
Hard Constraints
Other Training Enhancements
Theoretical Foundation for Hard Constraints
Well Condition of the Transformation
Why Initial Conditions Matter?
How Hard Constraints Work?
Examples
Conclusion
Methodology: Additional Details
Mini-batching
Self-Adaptive PINNs
Performance Measures
...and 11 more sections

Key Result

Theorem 1

The transformation is well defined when $\phi$ and $\psi$ are $C^1$ in time and have $K^{\text{th}}$-order partial derivatives with respect to $\mathbf{x}$. Moreover, $\phi_t(0, \mathbf{x}) \neq 0$. Then $u_{nn}$ will satisfy a new PDE
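As a worked illustration of these conditions (not the paper's own derivation), consider the assumed choices $\phi(t, \mathbf{x}) = t$ and $\psi(t, \mathbf{x}) = u_0(\mathbf{x})$, together with a generic evolution equation $u_t = \mathcal{N}[u]$. The operator $\mathcal{N}$ and the initial profile $u_0$ are placeholders introduced here.

```latex
\begin{align*}
\tilde{u}(t, \mathbf{x}) &= u_0(\mathbf{x}) + t\, u_{nn}(t, \mathbf{x}),
  & \tilde{u}(0, \mathbf{x}) &= u_0(\mathbf{x}), \\
\phi_t(0, \mathbf{x}) &= 1 \neq 0,
  & \psi_t(t, \mathbf{x}) &= 0, \\
\tilde{u}_t &= u_{nn} + t\, \partial_t u_{nn}
  & &\text{(product rule)}, \\
u_{nn} + t\, \partial_t u_{nn} &= \mathcal{N}\!\left[u_0 + t\, u_{nn}\right]
  & &\text{(substituting into } u_t = \mathcal{N}[u]\text{)}, \\
u_{nn}(0, \mathbf{x}) &= \mathcal{N}[u_0](\mathbf{x})
  & &\text{(evaluating at } t = 0\text{)}.
\end{align*}
```

Under these assumptions the initial condition holds identically, and the requirement $\phi_t(0, \mathbf{x}) \neq 0$ is what lets the transformed equation pin down $u_{nn}$ at $t = 0$. If $\phi_t$ vanished there, the left-hand side would lose its $u_{nn}$ term at the initial time.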

Figures (11)

Figure 1: $7$ Benchmark Results: Truth vs HC-PINN vs Absolute Error.
Figure 2: Reference, HC-PINN predicted solution, and absolute error of the 2D Allen-Cahn equation at different time snapshots: (a) $t = 0$, (b) $t = 0.5$, (c) $t = 1$.
Figure 3: Sensitivity of HC-PINNs to the number of Fourier modes $m$ in the Allen–Cahn equations. The plot shows the relative $L^2$ error as a function of $m$.
Figure 4: Best solution: Allen-Cahn.
Figure 5: Best solution: Kuramoto-Sivashinsky.
...and 6 more figures

Theorems & Definitions (9)

Theorem 1: Well Conditioning
Remark 1
Theorem 2
Remark 2
Theorem : Well Conditioning
proof
Theorem
proof
Theorem 3
