
MODNO: Multi Operator Learning With Distributed Neural Operators

Zecheng Zhang

TL;DR

MODNO introduces a distributed training framework that enables a single Chen-Chen-type neural operator to learn multiple operators by pairing operator-specific output bases with a centralized input-encoding network. The method balances local (operator-specific) and global (shared) training losses to reduce parameter count and cost while preserving or improving accuracy relative to training a separate single-operator learning (SOL) model for each operator. Across five numerical experiments, MODNO frequently outperforms independent SOL models, even with reduced data, and is robust to time extrapolation in several cases. The approach offers practical efficiency advantages for multi-operator tasks and suggests extensions to discretization-invariant and physics-informed variants.

Abstract

Operator learning uses neural networks to approximate operators. Traditionally, the focus has been on single-operator learning (SOL). Recent advances have rapidly expanded this to the approximation of multiple operators using foundation models with millions or billions of trainable parameters, giving rise to research on multi-operator learning (MOL). In this paper, we present a novel distributed training approach that enables a single neural operator with significantly fewer parameters to tackle multi-operator learning effectively, without incurring additional average cost. Our method is applicable to various neural operators, such as Deep Operator Neural Networks (DON). The core idea is to learn the output basis functions for each operator independently, using that operator's dedicated data, while centralizing the learning of the input function encoding, which is shared by all operators and trained on the entire dataset. Through a systematic study of five numerical examples, we compare the accuracy and cost of training a separate neural operator for each operator against training one MOL model with our proposed method. Our results demonstrate enhanced efficiency and satisfactory accuracy. Moreover, they show that some operators with limited data can be constructed more effectively with the aid of data from analogous operators through MOL. This highlights a further potential of MOL to bolster operator learning.
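The core idea in the abstract (one shared input encoder, one output-basis network per operator) can be illustrated with a minimal NumPy sketch. This is not the paper's implementation: the layer sizes, the names `shared_branch`, `trunks`, `predict`, `local_loss`, and `global_loss`, and the plain averaging of local losses are all assumptions made for illustration, and only the forward pass and loss bookkeeping are shown.

```python
import numpy as np

rng = np.random.default_rng(0)

M_SENSORS = 16   # sensor points sampling the input function (assumed)
K_BASIS = 8      # dimension K of the coefficient/basis space (assumed)
N_OPERATORS = 3  # number of operators learned jointly (assumed)


def init_mlp(sizes):
    """Return a list of (weight, bias) pairs for a small tanh MLP."""
    return [(rng.normal(0.0, 1.0 / np.sqrt(m), (m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]


def mlp(params, x):
    """Apply the MLP: tanh on hidden layers, linear output layer."""
    for w, b in params[:-1]:
        x = np.tanh(x @ w + b)
    w, b = params[-1]
    return x @ w + b


# One input-encoding network shared by every operator (hypothetical name).
shared_branch = init_mlp([M_SENSORS, 32, K_BASIS])
# One output-basis network per operator (hypothetical name).
trunks = [init_mlp([1, 32, K_BASIS]) for _ in range(N_OPERATORS)]


def predict(op_index, u_sensors, x_query):
    """Approximate G_i(u)(x) as the inner product in R^K of the shared
    encoding of u and the basis values of operator i at x."""
    coeffs = mlp(shared_branch, u_sensors)   # (batch, K)
    basis = mlp(trunks[op_index], x_query)   # (n_query, K)
    return coeffs @ basis.T                  # (batch, n_query)


def local_loss(op_index, u_sensors, x_query, targets):
    """Mean squared error for one operator on its own data."""
    residual = predict(op_index, u_sensors, x_query) - targets
    return float(np.mean(residual ** 2))


def global_loss(batches):
    """Average of the local losses over all operators (assumed weighting)."""
    return float(np.mean([local_loss(i, *b) for i, b in enumerate(batches)]))
```

In a training loop built on this sketch, the parameters in `trunks[i]` would be updated only from `local_loss(i, ...)`, while `shared_branch` would be updated from `global_loss` over the data of all operators. How the two losses are actually weighted and scheduled is not specified in this section.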

Paper Structure (21 sections, 18 equations, 6 figures, 5 tables, 1 algorithm)


Figures (6)

  • Figure 1: Stacked version of DON \cite{lu2021learning}, a fundamental work in the area. $\bigotimes$ denotes the inner product in $\mathbb{R}^K$.
  • Figure 2: The terminal solutions of the wave equation (\ref{eqn_wave}), Klein-Gordon equation (\ref{eqn_klein}), and Sine-Gordon equation (\ref{eqn_sine}) with the same initial condition.
  • Figure 3: The terminal solutions of three porous media equations (\ref{eqn_porous_media}) of different degrees (two, three, and four) with the same initial condition.
  • Figure 4: The terminal solutions of the parabolic equation (\ref{eqn_parabolic}), viscous Burgers equation (\ref{eqn_viscous}), and Burgers equation (\ref{eqn_burgers}) with the same initial conditions. The terminal simulation times of the three equations are $0.5$, $1$, and $0.4$, respectively.
  • Figure 5: The terminal solutions of the KdV equation (\ref{eqn_kdv}), Cahn-Hilliard equation (\ref{eqn_ch}), and advection equation (\ref{eqn_adv}) with the same initial conditions. The terminal simulation times of the three equations are $1$, $1$, and $0.1$, respectively.
  • ...and 1 more figure