Self-adaptive weights based on balanced residual decay rate for physics-informed neural networks and deep operator networks
Wenqian Chen, Amanda A. Howard, Panos Stinis
TL;DR
This work shows that plain PINNs and PIDeepONets can fail when residual convergence rates differ dramatically across training points, because the slowest rate dominates the convergence of the overall solution. It introduces the inverse residual decay rate (irdr) and a balanced residual decay rate (BRDR) weighting scheme that assigns larger weights to slowly converging residuals, in proportion to their irdr, while enforcing a mean weight of 1. An adaptive scaling factor keeps the effective learning rate near its maximum. BRDR is extended to mini-batch training and is shown to yield bounded, smooth weights, faster convergence, lower final errors, and reduced training uncertainty across PINN and PIDeepONet benchmarks, including 2D Helmholtz, 1D Allen–Cahn, 1D Burgers, and operator-learning tasks. The approach matches or outperforms state-of-the-art adaptive weighting methods (SA, RBA, NTK, CK) while adding modest computational overhead and requiring less hyperparameter tuning. Code is available for reproducibility.
Abstract
Physics-informed deep learning has emerged as a promising alternative for solving partial differential equations. However, for complex problems, training these networks can still be challenging, often resulting in unsatisfactory accuracy and efficiency. In this work, we demonstrate that the failure of plain physics-informed neural networks arises from the significant discrepancy in the convergence rate of residuals at different training points, where the slowest convergence rate dominates the overall solution convergence. Based on these observations, we propose a pointwise adaptive weighting method that balances the residual decay rate across different training points. The performance of our proposed adaptive weighting method is compared with current state-of-the-art adaptive weighting methods on benchmark problems for both physics-informed neural networks and physics-informed deep operator networks. Through extensive numerical results, we demonstrate that our proposed approach of balanced residual decay rates offers several advantages, including bounded weights, high prediction accuracy, fast convergence rate, low training uncertainty, low computational cost, and ease of hyperparameter tuning.
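The idea of weighting points in proportion to how slowly their residuals decay, while keeping the mean weight at 1, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The exact definition of irdr is not given in this section, so the `slowness_indicator` function below is an assumed proxy (the current squared residual relative to the square root of an exponential moving average of its fourth power), and the names `slowness_indicator`, `mean_one_weights`, `beta`, and `eps` are hypothetical. The residual histories are synthetic toy data, not results.

```python
import numpy as np

def slowness_indicator(residuals, ema_r4, beta=0.9, eps=1e-12):
    """Assumed proxy for an inverse decay rate (not the paper's definition).

    Compares the current squared residual with its running scale. A residual
    that has barely decayed gives a value near 1; a residual that has decayed
    quickly gives a value near 0.
    """
    # Exponential moving average of the fourth power of the residuals.
    ema_r4 = beta * ema_r4 + (1.0 - beta) * residuals**4
    indicator = residuals**2 / (np.sqrt(ema_r4) + eps)
    return indicator, ema_r4

def mean_one_weights(indicator):
    """Weights proportional to the indicator, rescaled so their mean is 1."""
    return indicator / np.mean(indicator)

# Synthetic residual histories for two training points (illustration only):
# point 0 shrinks by a factor 0.6 per step, point 1 by a factor 0.99 per step.
decay = np.array([0.6, 0.99])
ema_r4 = np.ones_like(decay)  # both residuals start at 1, so r**4 == 1

for step in range(1, 31):
    residuals = decay**step
    indicator, ema_r4 = slowness_indicator(residuals, ema_r4)

weights = mean_one_weights(indicator)
print(weights)  # the slowly decaying point receives the larger weight
```

Under this normalization the weights are nonnegative and average to 1, so no single weight can exceed the number of training points. That gives one simple sense in which mean-one weights are bounded; whether it matches the bound the authors have in mind is not stated in this section.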
