M$^{2}$M: Learning controllable Multi of experts and multi-scale operators are the Partial Differential Equations need

Aoming Liang, Zhaoyang Mu, Pengxiao Lin, Cong Wang, Mingming Ge, Ling Shao, Dixia Fan, Hao Tang

TL;DR

A framework of multi-scale, multi-expert neural operators that simulates and learns PDEs efficiently. It incorporates a controllable prior gating mechanism that determines which experts may be selected, improving the model's efficiency.

Abstract

Learning the evolutionary dynamics of Partial Differential Equations (PDEs) is critical in understanding dynamic systems, yet current methods insufficiently learn their representations. This is largely due to the multi-scale nature of the solution, where certain regions exhibit rapid oscillations while others evolve more slowly. This paper introduces a framework of multi-scale and multi-expert (M$^2$M) neural operators designed to simulate and learn PDEs efficiently. We employ a divide-and-conquer strategy to train a multi-expert gated network for the dynamic router policy. Our method incorporates a controllable prior gating mechanism that determines the selection rights of experts, enhancing the model's efficiency. To optimize the learning process, we have implemented a PI (Proportional, Integral) control strategy to adjust the allocation rules precisely. This universal controllable approach allows the model to achieve greater accuracy. We test our approach on benchmark 2D Navier-Stokes equations and provide a custom multi-scale dataset. M$^2$M can achieve higher simulation accuracy and offer improved interpretability compared to baseline methods.
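
The PI control strategy mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the class name `PIController`, the gains `kp` and `ki`, the clamping bounds, and the interpretation of the output as a scalar weight $\lambda$ are all assumptions made for illustration. The sketch only shows the generic proportional-plus-integral update driven by the gap between a target and a measured feedback signal.

```python
from dataclasses import dataclass


@dataclass
class PIController:
    """Generic PI controller (hypothetical names, not the paper's code)."""
    kp: float              # proportional gain (assumed hyperparameter)
    ki: float              # integral gain (assumed hyperparameter)
    lam_min: float = 0.0   # assumed lower bound on the controlled weight
    lam_max: float = 1.0   # assumed upper bound on the controlled weight
    integral: float = 0.0  # running sum of past errors

    def step(self, target: float, measured: float) -> float:
        """Return the clamped control output for one feedback reading."""
        error = target - measured
        self.integral += error
        lam = self.kp * error + self.ki * self.integral
        # Clamp so the weight stays inside the assumed valid range.
        return min(self.lam_max, max(self.lam_min, lam))


# Usage: call once per training step (or epoch) with the current feedback.
controller = PIController(kp=0.5, ki=0.1)
lam = controller.step(target=1.0, measured=0.0)
```

In a training loop, the returned value would scale whatever term the controller is meant to regulate; which quantity serves as target and feedback is a design choice this sketch leaves open.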

Paper Structure

This paper contains 28 sections, 21 equations, 16 figures, 6 tables, 1 algorithm.

Figures (16)

  • Figure 1: Framework of the proposed Multi-scale and Multi-expert (M$^2$M) model. The expert network contains different models, and $f_{gate}^{policy}$ decides which model is allocated to each spatial domain in the roll-out predictions. For more details, see the Proposed Method section.
  • Figure 2: Panel (a) shows the router policy during training. Panel (b) shows the framework of the PI controller in M$^2$M. By designing the target and feedback in the loop, $\lambda$ can be adjusted.
  • Figure 3: Results of one-step prediction on the custom multi-scale dataset at epochs 1, 10, and 100. The scale is set to $4$; the multi-scale ablation study is shown in the appendix.
  • Figure 4: Dynamic weight distribution of the router. Panels (a) and (b) show the distribution of the router output at the 1st and 100th epochs. The prior is set to [0000], meaning no prior is imposed on the router. TOP$_{k}$ is set to 2.
  • Figure 5: Results on the NS datasets with PID-M$^2$M. The number of scales is 1.
  • ...and 11 more figures
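
The router settings named in the Figure 4 caption (a prior vector over four experts and TOP$_{k}$ = 2) can be illustrated with a small sketch. This is an assumption-laden illustration rather than the paper's gating network: the function name `gated_topk` is hypothetical, and adding the prior to the router logits before the softmax is only one plausible way a prior could bias expert selection.

```python
import math


def gated_topk(logits, prior, k):
    """Return {expert_index: weight} for the k highest-scoring experts.

    Assumption: the prior is added to the router logits. An all-zero
    prior leaves the router's own preferences unchanged.
    """
    scores = [l + p for l, p in zip(logits, prior)]
    # Numerically stable softmax over all experts.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Keep the k most probable experts and renormalize their weights.
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return {i: probs[i] / norm for i in top}


# Four experts, no prior, two experts selected per location.
weights = gated_topk([2.0, 1.0, 0.5, 0.1], prior=[0, 0, 0, 0], k=2)
```

With a zero prior the two experts with the largest logits are selected; a large positive prior entry for an expert pushes it into the selected set regardless of its logit.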