Learning in latent spaces improves the predictive accuracy of deep neural operators

Katiana Kontolati, Somdatta Goswami, George Em Karniadakis, Michael D. Shields

TL;DR

The paper tackles the difficulty of learning neural operators for high-dimensional PDE data by introducing Latent-DeepONet (L-DeepONet), which compresses inputs and outputs into a latent space via autoencoders, trains a DeepONet on this reduced representation, and decodes predictions back to the physical space. Across diverse time-dependent PDEs (brittle fracture, Rayleigh–Bénard convection, and shallow-water dynamics), the method achieves higher accuracy and substantially lower training costs than standard DeepONet and Fourier-based operators, with small latent dimensions (d ≤ 100) sufficing for effective modeling. The work discusses advantages such as improved generalization and efficiency, as well as limitations, including the need for separate dimensionality-reduction (DR) models for heterogeneous quantities and the lack of spatial interpolation or physics-informed constraints in the latent space, and outlines avenues for future enhancements.

Abstract

Operator regression provides a powerful means of constructing discretization-invariant emulators for partial-differential equations (PDEs) describing physical systems. Neural operators specifically employ deep neural networks to approximate mappings between infinite-dimensional Banach spaces. As data-driven models, neural operators require the generation of labeled observations, which in cases of complex high-fidelity models result in high-dimensional datasets containing redundant and noisy features, which can hinder gradient-based optimization. Mapping these high-dimensional datasets to a low-dimensional latent space of salient features can make it easier to work with the data and also enhance learning. In this work, we investigate the latent deep operator network (L-DeepONet), an extension of standard DeepONet, which leverages latent representations of high-dimensional PDE input and output functions identified with suitable autoencoders. We illustrate that L-DeepONet outperforms the standard approach in terms of both accuracy and computational efficiency across diverse time-dependent PDEs, e.g., modeling the growth of fracture in brittle materials, convective fluid flows, and large-scale atmospheric flows exhibiting multiscale dynamical features.

Paper Structure

This paper contains 6 sections, 1 theorem, 24 equations, 12 figures, 5 tables.

Key Result

Theorem 1

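Suppose that $X$ is a Banach space, $K_1 \subset X$ and $K_2 \subset \mathbb{R}^d$ are compact sets in $X$ and $\mathbb{R}^d$, respectively, and $V$ is a compact set in $C(K_1)$. Assume that $\mathcal{G}: V \rightarrow C(K_2)$ is a nonlinear continuous operator. Then, for any $\epsilon > 0$, there exist positive integers $m$, $p$, continuous vector functions $\mathbf{g}: \mathbb{R}^m \rightarrow \mathbb{R}^p$, $\mathbf{f}: \mathbb{R}^d \rightarrow \mathbb{R}^p$, and points $\eta_1, \dots, \eta_m \in K_1$ such that $\left| \mathcal{G}(\mathbf{x})(\zeta) - \langle \mathbf{g}(\mathbf{x}(\eta_1), \dots, \mathbf{x}(\eta_m)), \mathbf{f}(\zeta) \rangle \right| < \epsilon$ holds for all $\mathbf{x} \in V$ and $\zeta \in K_2$, where $\langle \cdot, \cdot \rangle$ denotes the dot product in $\mathbb{R}^p$.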
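
The approximation guaranteed by this theorem is an inner product between two vector-valued functions: one that sees the input function only through its values at finitely many points, and one that sees only the output coordinate. In DeepONet terminology these are usually called the branch and trunk networks. The following is a minimal NumPy sketch of that structure, not the authors' implementation. The networks are untrained, and the names `branch`, `trunk`, `eta`, `mlp`, `init`, and `deeponet`, along with all sizes, are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes (assumptions): m sensor points, p basis functions,
# d-dimensional output coordinate.
m, p, d = 16, 8, 1
eta = np.linspace(0.0, 1.0, m)  # hypothetical sensor locations in K_1 = [0, 1]


def init(n_in, n_hidden, n_out):
    """Random parameters for a single-hidden-layer network (untrained)."""
    return (
        rng.normal(size=(n_in, n_hidden)) / np.sqrt(n_in),
        np.zeros(n_hidden),
        rng.normal(size=(n_hidden, n_out)) / np.sqrt(n_hidden),
        np.zeros(n_out),
    )


def mlp(params, z):
    """Apply a tanh network to a batch of row vectors z."""
    W1, b1, W2, b2 = params
    return np.tanh(z @ W1 + b1) @ W2 + b2


branch = init(m, 32, p)  # plays the role of g: R^m -> R^p
trunk = init(d, 32, p)   # plays the role of f: R^d -> R^p


def deeponet(x_fn, zeta):
    """Evaluate <g(x(eta_1), ..., x(eta_m)), f(zeta)> at each query point."""
    sensors = x_fn(eta)                      # input function sampled at sensors, (m,)
    b = mlp(branch, sensors[None, :])        # branch output, (1, p)
    t = mlp(trunk, zeta.reshape(-1, d))      # trunk output, (n_query, p)
    return (t * b).sum(axis=1)               # dot product over the p components


zeta = np.linspace(0.0, 1.0, 5)
out = deeponet(np.sin, zeta)                 # one prediction per query coordinate
```

Because the input function enters only through its `m` sensor values, the branch input grows with the resolution of the data, which is one reason to compress high-dimensional fields before training the operator.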

Figures (12)

  • Figure 1: Latent DeepONet (L-DeepONet) framework for learning deep neural operators on latent spaces. In the first step, a multi-layer autoencoder is trained using a combined dataset of the high-dimensional input and output realizations of a PDE model, $\{\mathbf{x}_i, \mathbf{y}_i\}_{i=1}^{N}$. The trained encoder projects the data onto a latent space $\mathbb{R}^d$, and the resulting latent dataset, $\{\mathbf{x}^r_i, \mathbf{y}^r_i\}_{i=1}^{N}$, is then used to train a DeepONet model and learn the operator $\mathcal{G}_{\theta}$, where $\theta$ denotes the trainable parameters of the network. Finally, to evaluate the performance of the model on the original PDE outputs and perform inference, the pre-trained decoder is employed to map predicted samples back to the physically interpretable space.
  • Figure 2: Left: Results of the multi-layer autoencoders (MLAE) for all applications and different values of the latent dimensionality. Right: Results of the neural operators for all applications and all studied models. Violin plots represent $5$ independent training runs of each model using different random seeds.
  • Figure 3: Brittle fracture in a plate loaded in shear: results of a representative sample with $y_c = 0.55$ and $l_c = 0.6$ for all neural operators. The results of the L-DeepONet model consider the latent dimension, $d=64$. The neural operator is trained to approximate the growth of the crack for five time steps from a given initial location of the defect.
  • Figure 4: Rayleigh-Bénard convective flow: results of the temperature field of a representative sample for all neural operators. The results of the L-DeepONet model consider the latent dimension, $d=100$. The neural operator is trained to approximate the evolution of the temperature field over seven time steps from a realization of the initial temperature field.
  • Figure 5: Shallow water equations: results of the evolution of the velocity field through eight time steps for all the operator models considered in this work, for a representative realization of the initial perturbation to the height field. The results of the L-DeepONet model consider the latent dimension, $d=81$.
  • ...and 7 more figures
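
The three-step workflow described in the Figure 1 caption (fit an encoder/decoder on the combined inputs and outputs, learn the operator between latent codes, decode the predictions) can be sketched on toy data. This is only an illustration of the data flow under loud assumptions: a linear PCA projection stands in for the multi-layer autoencoder, an affine least-squares fit stands in for the DeepONet, and the synthetic 1D heat-equation data, the sizes, and the names `encode`, `decode`, and `latent_operator` are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
N, D, d = 200, 256, 4  # samples, grid points, latent dimension (assumed)

# Toy data (assumption): inputs are random mixtures of four sine modes,
# outputs are the same fields after a short time of 1D heat diffusion.
s = np.linspace(0.0, 1.0, D)
k = np.arange(1, 5)
modes = np.sin(np.pi * k[:, None] * s[None, :])          # (4, D)
a = rng.normal(size=(N, 4))
X = a @ modes                                            # inputs,  (N, D)
Y = (a * np.exp(-((np.pi * k) ** 2) * 0.01)) @ modes     # outputs, (N, D)

# Step 1: fit the encoder/decoder on the combined input-output dataset.
# PCA via SVD is a linear stand-in for the autoencoder.
Z = np.vstack([X, Y])
mean = Z.mean(axis=0)
_, _, Vt = np.linalg.svd(Z - mean, full_matrices=False)
basis = Vt[:d]                                           # (d, D)


def encode(u):
    return (u - mean) @ basis.T


def decode(z):
    return z @ basis + mean


# Step 2: learn the map between latent inputs and latent outputs.
# An affine least-squares fit stands in for the latent DeepONet.
n_train = 150
Xr, Yr = encode(X), encode(Y)
A = np.hstack([Xr[:n_train], np.ones((n_train, 1))])
W, *_ = np.linalg.lstsq(A, Yr[:n_train], rcond=None)


def latent_operator(xr):
    return np.hstack([xr, np.ones((len(xr), 1))]) @ W


# Step 3: predict in latent space, then decode back to the physical grid.
Y_pred = decode(latent_operator(encode(X[n_train:])))
rel_err = np.linalg.norm(Y_pred - Y[n_train:]) / np.linalg.norm(Y[n_train:])
```

In this toy setting the data lie exactly in a four-dimensional linear subspace and the operator is linear, so the stand-ins recover it almost exactly. The nonlinear PDE fields treated in the paper are what motivate a nonlinear autoencoder and a DeepONet in place of these stand-ins.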

Theorems & Definitions (1)

  • Theorem 1: Generalized Universal Approximation Theorem for Operators.