Pretraining Codomain Attention Neural Operators for Solving Multiphysics PDEs
Md Ashiqur Rahman, Robert Joseph George, Mogab Elleithy, Daniel Leibovici, Zongyi Li, Boris Bonev, Colin White, Julius Berner, Raymond A. Yeh, Jean Kossaifi, Kamyar Azizzadenesheli, Anima Anandkumar
TL;DR
This work addresses the challenge of solving multiphysics PDEs when high-resolution data is limited. It introduces the Codomain Attention Neural Operator (CoDA-NO), a transformer-inspired neural operator that tokenizes input functions along the codomain and extends attention, normalization, and positional encoding to function spaces. Each physical variable is treated as a token; the key, query, and value maps are implemented as Fourier neural operators, attention weights are computed by permutation-equivariant integration over the domain, and Variable-Specific Positional Encoding together with a graph-based encoder/decoder allows the model to handle irregular meshes. CoDA-NO is pretrained in a self-supervised phase and then fine-tuned with supervision, so it can adapt to new variables and geometries with few parameter updates. Empirically, CoDA-NO achieves state-of-the-art or competitive results on the coupled Navier-Stokes and elastic wave (NS-EW) system, on Rayleigh-Bénard convection, and across PDEBench. It is data efficient, supports zero-shot super-resolution, and uses substantially fewer parameters than competing models, which illustrates its potential as a foundation model for multiphysics PDEs.
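The idea of attending across variables rather than across spatial locations can be illustrated with a small NumPy sketch. This is not the authors' implementation: a single toy 1D spectral convolution stands in for the Fourier neural operators used for the key, query, and value maps, and positional encoding, normalization, and the graph-based encoder/decoder are omitted. The names `spectral_op` and `codomain_attention`, the temperature scaling, and all shapes are assumptions made for illustration.

```python
import numpy as np

def spectral_op(u, weights, modes):
    """Toy 1D spectral convolution, shared across all tokens (assumed stand-in for an FNO).

    u:       (tokens, channels, n) real samples of each token function
    weights: (channels, channels, modes) complex multipliers on the lowest modes
    """
    u_hat = np.fft.rfft(u, axis=-1)
    out_hat = np.zeros_like(u_hat)
    # Mix channels mode by mode; higher frequencies are truncated to zero.
    out_hat[..., :modes] = np.einsum("tcm,cdm->tdm", u_hat[..., :modes], weights)
    return np.fft.irfft(out_hat, n=u.shape[-1], axis=-1)

def codomain_attention(u, wq, wk, wv, modes):
    """Single-head attention where each token is a function (one physical variable)."""
    q = spectral_op(u, wq, modes)
    k = spectral_op(u, wk, modes)
    v = spectral_op(u, wv, modes)
    # Function-space inner product: sum over channels, then approximate the
    # integral over the unit interval by the mean over a uniform grid.
    scores = np.einsum("icx,jcx->ij", q, k) / u.shape[-1]
    scores /= np.sqrt(u.shape[1])  # temperature; an assumption of this sketch
    scores -= scores.max(axis=-1, keepdims=True)  # numerically stable softmax
    attn = np.exp(scores)
    attn /= attn.sum(axis=-1, keepdims=True)
    # Each output token is a weighted combination of the value functions.
    return np.einsum("ij,jcx->icx", attn, v)

rng = np.random.default_rng(0)
tokens, channels, n, modes = 3, 2, 64, 8  # e.g. three variables on a 64-point grid
u = rng.standard_normal((tokens, channels, n))

def random_weights():
    shape = (channels, channels, modes)
    return 0.1 * (rng.standard_normal(shape) + 1j * rng.standard_normal(shape))

wq, wk, wv = random_weights(), random_weights(), random_weights()
out = codomain_attention(u, wq, wk, wv, modes)

# Reordering the variables reorders the outputs in the same way.
perm = np.array([2, 0, 1])
out_perm = codomain_attention(u[perm], wq, wk, wv, modes)
```

Because the operator weights are shared across tokens and defined on Fourier modes rather than grid points, the same weights accept any number of variables and any grid with at least `2 * modes` points, which is the property that makes adding variables and changing resolution possible in this sketch.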
Abstract
Existing neural operator architectures face challenges when solving multiphysics problems with coupled partial differential equations (PDEs) due to complex geometries, interactions between physical variables, and the limited amount of high-resolution training data. To address these issues, we propose the Codomain Attention Neural Operator (CoDA-NO), which tokenizes functions along the codomain, or channel space, enabling self-supervised pretraining on multiple PDE systems. Specifically, we extend positional encoding, self-attention, and normalization layers to function spaces. CoDA-NO can learn representations of different PDE systems with a single model. We evaluate CoDA-NO's potential as a backbone for learning multiphysics PDEs over multiple systems by considering few-shot learning settings. On complex downstream tasks with limited data, such as fluid flow simulations, fluid-structure interactions, and Rayleigh-Bénard convection, we find that CoDA-NO outperforms existing methods by over 36%.
