Gradient Alignment in Physics-informed Neural Networks: A Second-Order Optimization Perspective

Sifan Wang, Ananyae Kumar Bhartari, Bowen Li, Paris Perdikaris

TL;DR

This paper exposes directional conflicts among loss terms in physics-informed neural networks and introduces gradient-alignment metrics to quantify them. It shows that quasi-second-order optimization, especially SOAP, implicitly preconditions the loss landscape and dramatically improves gradient alignment, effectively connecting SOAP to Newton’s method. Across 10 PDE benchmarks, including turbulent flows at Reynolds numbers up to $10^4$, SOAP achieves state-of-the-art accuracy with 2–10× improvements over strong baselines, albeit with modestly increased training time. The results argue for wider adoption of gradient-alignment-aware, second-order preconditioning in multi-objective neural PDE solvers and suggest directions for scalable, robust optimizers in physics-informed machine learning.

Abstract

Multi-task learning through composite loss functions is fundamental to modern deep learning, yet optimizing competing objectives remains challenging. We present new theoretical and practical approaches for addressing directional conflicts between loss terms, demonstrating their effectiveness in physics-informed neural networks (PINNs) where such conflicts are particularly challenging to resolve. Through theoretical analysis, we demonstrate how these conflicts limit first-order methods and show that second-order optimization naturally resolves them through implicit gradient alignment. We prove that SOAP, a recently proposed quasi-Newton method, efficiently approximates the Hessian preconditioner, enabling breakthrough performance in PINNs: state-of-the-art results on 10 challenging PDE benchmarks, including the first successful application to turbulent flows with Reynolds numbers up to 10,000, with 2-10x accuracy improvements over existing methods. We also introduce a novel gradient alignment score that generalizes cosine similarity to multiple gradients, providing a practical tool for analyzing optimization dynamics. Our findings establish frameworks for understanding and resolving gradient conflicts, with broad implications for optimization beyond scientific computing.

Paper Structure

This paper contains 50 sections, 12 theorems, 146 equations, 23 figures, 8 tables.

Key Result

Proposition 1

For $n=2$, the alignment score $\mathcal{A}(v_1, v_2)$ equals the cosine similarity between $v_1$ and $v_2$:

$$\mathcal{A}(v_1, v_2) = \frac{v_1 \cdot v_2}{\|v_1\| \, \|v_2\|}$$
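
Proposition 1 can be checked numerically. The sketch below assumes the alignment score of Definition 1 has the form $\mathcal{A}(v_1, \dots, v_n) = 2\,\big\|\tfrac{1}{n}\sum_{i=1}^{n} v_i / \|v_i\|\big\|^2 - 1$. The definition is not reproduced on this page, so that form is an assumption, and the function names and sample vectors (`alignment_score`, `g_res`, `g_bc`) are purely illustrative.

```python
import math


def unit(v):
    """Return v scaled to unit Euclidean norm."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        raise ValueError("alignment score is undefined for a zero vector")
    return [x / norm for x in v]


def alignment_score(*vectors):
    """Assumed form of the score: 2 * ||mean of unit vectors||^2 - 1."""
    units = [unit(v) for v in vectors]
    n = len(units)
    dim = len(units[0])
    mean = [sum(u[k] for u in units) / n for k in range(dim)]
    return 2.0 * sum(m * m for m in mean) - 1.0


def cosine_similarity(a, b):
    """Standard cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical gradients of two loss terms (e.g. PDE residual and boundary loss).
g_res = [3.0, 1.0, -2.0]
g_bc = [-1.0, 4.0, 0.5]

# For n = 2 the two quantities should agree up to floating-point error.
print(alignment_score(g_res, g_bc), cosine_similarity(g_res, g_bc))
```

Under this assumed form, the identity follows from expanding the squared norm: for unit vectors $u_1, u_2$, $2\,\|(u_1 + u_2)/2\|^2 - 1 = \tfrac{1}{2}(2 + 2\,u_1 \cdot u_2) - 1 = u_1 \cdot u_2$. The same function accepts more than two vectors, which is the sense in which the score generalizes cosine similarity.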

Figures (23)

  • Figure 1: Gradient conflicts and their impact on PINNs optimization. The irregular green trajectory illustrates how the optimization struggles when facing two types of gradient conflicts: Type I, where gradients have similar directions but vastly different magnitudes, and Type II, where gradients have similar magnitudes but opposing directions. The red trajectory shows how appropriate preconditioning through second-order information could mitigate these conflicts by aligning gradients both within and between optimization steps, enabling efficient convergence.
  • Figure 2: Gradient alignment scores and test errors during PINN training for solving the Navier-Stokes equations with different optimizers. Additional benchmarks are provided in Figure \ref{fig:grad_align_score}, where we observe the consistent phenomenon that first-order optimizers exhibit poor gradient alignment and slow convergence of test errors.
  • Figure 3: Simulating complex fluid dynamics using PINNs with SOAP optimization. (a) Kolmogorov flow at Re=10,000: comparison between reference solution and PINN predictions demonstrates accurate capture of turbulent structures across multiple time steps. (b) Spectral energy distribution showing the PINN's superior resolution of fine-scale dynamics compared to traditional numerical solutions at various grid resolutions. (c) Lid-driven cavity flow at Re=5,000: streamlines and centerline velocity profiles show excellent agreement with benchmark data from Ghia et al. (1982). (d) Kuramoto-Sivashinsky equation: PINNs accurately predict complex spatiotemporal patterns and chaotic dynamics. (e) Rayleigh-Taylor instability (Pr=0.71, Ra=$10^6$): evolution of the temperature field shows precise capture of interface dynamics and the mushroom-shaped structures characteristic of this flow.
  • Figure 4: Optimizer performance comparison and ablation studies. Top: Relative $L^2$ error across PDE benchmarks using different optimizers. Bottom left: Relative $L^2$ error for varying preconditioner update frequencies in SOAP optimizer. Bottom right: Relative $L^2$ error with different momentum values in SOAP optimizer.
  • Figure 5: Gradient alignment scores and test errors obtained by training PINNs with different optimizers across different PDEs. From left to right: ground truth PDE solution, intra-step gradient alignment scores (Eq. \ref{eq: intra_align}), inter-step gradient alignment scores (Eq. \ref{eq: inter_align}), and test error convergence during training.
  • ...and 18 more figures

Theorems & Definitions (23)

  • Definition 1
  • Proposition 1
  • Definition 2
  • Proposition 2
  • Proposition 3
  • Proposition 4
  • Lemma 1
  • proof
  • proof
  • proof
  • ...and 13 more