Learning in latent spaces improves the predictive accuracy of deep neural operators
Katiana Kontolati, Somdatta Goswami, George Em Karniadakis, Michael D. Shields
TL;DR
The paper addresses the difficulty of learning neural operators from high-dimensional PDE data by introducing Latent-DeepONet (L-DeepONet), which compresses input and output functions into a latent space with autoencoders, trains a DeepONet on this reduced representation, and decodes its predictions back to physical space. Across diverse time-dependent PDEs (brittle fracture, Rayleigh–Bénard convection, and shallow-water dynamics), the method achieves higher accuracy and substantially lower training cost than standard DeepONet and Fourier-based operators, and small latent dimensions (d ≤ 100) suffice for effective modeling. The work discusses advantages such as improved generalization and efficiency, and limitations including the need for separate dimensionality-reduction (DR) models for heterogeneous quantities and the lack of spatial interpolation or physics-informed constraints in the latent space, and it outlines avenues for future enhancements.
Abstract
Operator regression provides a powerful means of constructing discretization-invariant emulators for partial differential equations (PDEs) describing physical systems. Neural operators specifically employ deep neural networks to approximate mappings between infinite-dimensional Banach spaces. As data-driven models, neural operators require the generation of labeled observations, which for complex high-fidelity models yields high-dimensional datasets whose redundant and noisy features can hinder gradient-based optimization. Mapping these high-dimensional datasets to a low-dimensional latent space of salient features makes the data easier to work with and can also enhance learning. In this work, we investigate the latent deep operator network (L-DeepONet), an extension of standard DeepONet, which leverages latent representations of high-dimensional PDE input and output functions identified with suitable autoencoders. We illustrate that L-DeepONet outperforms the standard approach in terms of both accuracy and computational efficiency across diverse time-dependent PDEs, including the growth of fracture in brittle materials, convective fluid flows, and large-scale atmospheric flows exhibiting multiscale dynamical features.
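The encode, operate, decode pipeline described above can be illustrated with a minimal NumPy sketch of the forward pass only. It is not the authors' implementation: all weights are random and untrained, the layer sizes, the latent dimension `d = 16`, the number of basis functions `p = 8`, and the names `init_mlp`, `mlp`, and `l_deeponet_forward` are hypothetical, and a single autoencoder shared by input and output fields is assumed. In practice the autoencoder would be trained first and the DeepONet would then be trained on the encoded data.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_mlp(sizes):
    """Random (untrained) weights for a fully connected network."""
    return [(rng.normal(0.0, 1.0 / np.sqrt(m), size=(m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def mlp(params, x):
    """Forward pass: tanh on hidden layers, linear output layer."""
    for W, b in params[:-1]:
        x = np.tanh(x @ W + b)
    W, b = params[-1]
    return x @ W + b

# Hypothetical sizes: a 32x32 field flattened to D = 1024 values,
# latent dimension d = 16, and p = 8 basis functions.
D, d, p = 1024, 16, 8

encoder = init_mlp([D, 128, d])     # physical field -> latent vector
decoder = init_mlp([d, 128, D])     # latent vector -> physical field
branch = init_mlp([d, 64, d * p])   # latent input function -> coefficients
trunk = init_mlp([1, 64, p])        # query time t -> basis values

def l_deeponet_forward(u0, t):
    """Map fields u0 of shape (batch, D) and a scalar time t to fields at t."""
    z_in = mlp(encoder, u0)                        # (batch, d)
    coeffs = mlp(branch, z_in).reshape(-1, d, p)   # (batch, d, p)
    basis = mlp(trunk, np.array([[t]]))[0]         # (p,)
    z_out = coeffs @ basis                         # (batch, d) latent prediction
    return mlp(decoder, z_out)                     # (batch, D)

u0 = rng.normal(size=(4, D))   # four synthetic input fields
u_t = l_deeponet_forward(u0, t=0.5)
print(u_t.shape)
```

The point of the design is that the branch and trunk networks only ever see vectors of size `d` rather than `D`, which is where the reduced training cost reported in the paper would come from.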
