Variational operator learning: A unified paradigm marrying training neural operators and solving partial differential equations
Tengfei Xu, Dachuan Liu, Peng Hao, Bo Wang
TL;DR
Variational operator learning (VOL) is a unified framework that couples neural operator training with solving PDEs through the variational (weak) form. Ritz and Galerkin approaches with finite element discretization yield matrix-free approximations of the system functional and residual, which are then optimized with one of two strategies: direct minimization or iterative update. Training uses a label-free training set plus a shift set containing only 5 labels. On variable heat source, Darcy flow, and variable stiffness elasticity benchmarks, the method shows data efficiency, resolution robustness, and generalization benefits, with test errors decreasing as a power law in the amount of unlabeled data. The approach points toward physics-informed neural operators that use classical variational principles to reduce data requirements and integrate more closely with numerical solvers in scientific computing.
Abstract
Neural operators, a novel class of neural architectures for rapidly approximating the solution operators of partial differential equations (PDEs), have shown considerable promise for future scientific computing. However, mainstream training of neural operators is still data-driven: in addition to the cost of the training stage, it requires an expensive ground-truth dataset gathered from various sources (e.g., solving sampled PDE instances with conventional solvers, or real-world experiments). From a computational perspective, marrying operator learning with domain-specific knowledge for solving PDEs is an essential step toward reducing dataset costs and enabling label-free learning. We propose a novel paradigm that provides a unified framework for training neural operators and solving PDEs with the variational form, which we refer to as variational operator learning (VOL). Ritz and Galerkin approaches with finite element discretization are developed for VOL to achieve matrix-free approximation of the system functional and residual; direct minimization and iterative update are then proposed as two optimization strategies for VOL. Experiments on benchmarks covering variable heat source, Darcy flow, and variable stiffness elasticity are conducted to demonstrate the effectiveness of VOL. With a label-free training set and a shift set containing only 5 labels, VOL learns solution operators whose test errors decrease as a power law with respect to the amount of unlabeled data. To the best of the authors' knowledge, this is the first study that integrates the perspectives of the weak form and of efficient iterative methods for solving sparse linear systems into the end-to-end operator learning task.
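To make the terms "matrix-free", "system functional", and "residual" concrete, the following is a minimal NumPy sketch for a 1D Poisson problem, -u'' = f with zero Dirichlet boundary values, discretized with linear finite elements on a uniform mesh. It is an illustration under stated assumptions, not the authors' implementation: the function names, the lumped load vector, and the use of a single steepest-descent step as a stand-in for an "iterative update" are all hypothetical choices, since the abstract does not specify them. In VOL the vector `u` would be the output of a neural operator; here it is a plain array.

```python
import numpy as np

def apply_stiffness(u, h):
    """Matrix-free action K @ u of the 1D linear-FE stiffness matrix.

    u holds interior nodal values; zero Dirichlet values are padded on
    both ends, so the tridiagonal matrix K is never assembled.
    """
    up = np.pad(u, 1)  # constant (zero) padding = boundary values
    return (2.0 * up[1:-1] - up[:-2] - up[2:]) / h

def energy(u, b, h):
    """Discrete system functional (Ritz): J(u) = 1/2 u^T K u - b^T u."""
    return 0.5 * u @ apply_stiffness(u, h) - b @ u

def residual(u, b, h):
    """Discrete weak-form residual (Galerkin): r = b - K u."""
    return b - apply_stiffness(u, h)

def steepest_descent_step(u, b, h):
    """One steepest-descent update with exact line search.

    Assumption: used here only as a simple example of an iterative
    linear-solver update; the paper's actual update rule may differ.
    """
    r = residual(u, b, h)
    rKr = r @ apply_stiffness(r, h)
    if rKr == 0.0:  # residual already zero, nothing to update
        return u
    return u + (r @ r) / rKr * r

if __name__ == "__main__":
    n = 15                    # number of interior nodes (illustrative)
    h = 1.0 / (n + 1)         # uniform mesh size
    b = h * np.ones(n)        # lumped load vector for f(x) = 1
    u = np.zeros(n)           # stand-in for a neural operator prediction
    for _ in range(50):
        u = steepest_descent_step(u, b, h)
    print("energy:", energy(u, b, h))
    print("residual norm:", np.linalg.norm(residual(u, b, h)))
```

In this sketch, a direct-minimization strategy would correspond to using `energy` or the norm of `residual` as a label-free training loss, while an iterative-update strategy would correspond to using the output of a solver step such as `steepest_descent_step` as a target for the next training step. Both readings are interpretations of the abstract rather than details confirmed by it.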
