Ensemble and Mixture-of-Experts DeepONets For Operator Learning
Ramansh Sharma, Varun Shankar
TL;DR
The paper tackles operator learning for PDEs by enriching DeepONet trunks through an ensemble approach and a spatially local PoU-MoE trunk. It formalizes an ensemble DeepONet with multiple trunks and a single branch, and proves universal approximation properties for the ensemble, including a PoU-MoE variant that blends local trunks via partition-of-unity weights. Empirical results on 2D and 3D problems with sharp gradients show 2–4x relative improvements over standard DeepONets, with POD-PoU often delivering the best accuracy across challenging tasks, at the cost of increased training time. These findings highlight the value of combining global and local basis functions and motivate further work on adaptive partitioning and parallel implementations to manage computational overhead while preserving accuracy gains.
Abstract
We present a novel deep operator network (DeepONet) architecture for operator learning, the ensemble DeepONet, that allows for enriching the trunk network of a single DeepONet with multiple distinct trunk networks. This trunk enrichment allows for greater expressivity and generalization capabilities over a range of operator learning problems. We also present a spatial mixture-of-experts (MoE) DeepONet trunk network architecture that utilizes a partition-of-unity (PoU) approximation to promote spatial locality and model sparsity in the operator learning problem. We first prove that both the ensemble and PoU-MoE DeepONets are universal approximators. We then demonstrate that ensemble DeepONets containing a trunk ensemble of a standard trunk, the PoU-MoE trunk, and/or a proper orthogonal decomposition (POD) trunk can achieve 2-4x lower relative $\ell_2$ errors than standard DeepONets and POD-DeepONets on both standard and challenging new operator learning problems involving partial differential equations (PDEs) in two and three dimensions. Our new PoU-MoE formulation provides a natural way to incorporate spatial locality and model sparsity into any neural network architecture, while our new ensemble DeepONet provides a powerful and general framework for incorporating basis enrichment in scientific machine learning architectures for operator learning.
