Adversarial Autoencoders in Operator Learning

Dustin Enyeart, Guang Lin

TL;DR

This work studies adversarial augmentation for autoencoder-based neural operators, focusing on DeepONets and Koopman autoencoders. A latent-space discriminator is added, and the encoders are trained to fool it, which encourages them to use the latent space more fully and improves operator learning when data are scarce. Across five differential-equation tasks (ODEs: $\theta$-pendulum, Lorenz, fluid attractor; PDEs: Burgers', KdV), the approach yields accuracy gains of up to $26.5\%$ with small training sets, at modest additional computational cost. The results support adversarial latent regularization as a practical, data-efficient enhancement for neural operators, and the code is publicly available for replication. The key mechanism is expressed as $s_n \approx R \circ K^{n} \circ E(s_0)$ in Koopman settings and $y = E_u(u)^{\top} E_x(x)$ in DeepONet settings.
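To make the two formulas above concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the helper names (`matvec`, `koopman_predict`, `deeponet_output`) and the toy encoders are assumptions chosen for illustration, and in practice $E$, $R$, $E_u$ and $E_x$ would be trained neural networks.

```python
from typing import Callable, List, Sequence

Vector = List[float]
Matrix = List[List[float]]


def matvec(matrix: Matrix, vector: Sequence[float]) -> Vector:
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def koopman_predict(
    encode: Callable[[Vector], Vector],   # E: physical state -> latent state
    decode: Callable[[Vector], Vector],   # R: latent state -> physical state
    K: Matrix,                            # discretized Koopman operator
    s0: Vector,
    n: int,
) -> Vector:
    """Approximate s_n as R(K^n E(s_0)) by stepping n times in latent space."""
    e = encode(s0)
    for _ in range(n):
        e = matvec(K, e)
    return decode(e)


def deeponet_output(
    branch: Callable[[Vector], Vector],   # E_u: input function samples -> encoding
    trunk: Callable[[float], Vector],     # E_x: evaluation point -> encoding
    u: Vector,
    x: float,
) -> float:
    """Compute y = E_u(u)^T E_x(x) as a dot product of the two encodings."""
    return sum(a * b for a, b in zip(branch(u), trunk(x)))


# Toy stand-ins for trained networks (hypothetical, for illustration only).
def identity(v: Vector) -> Vector:
    return list(v)


def toy_trunk(x: float) -> Vector:
    return [1.0, x]


K_toy = [[0.5, 0.0], [0.0, 2.0]]
s2 = koopman_predict(identity, identity, K_toy, [4.0, 1.0], n=2)
y = deeponet_output(identity, toy_trunk, [2.0, 3.0], 0.5)
```

With these toy choices, the Koopman rollout applies `K_toy` twice to the encoded state, and the DeepONet output is the dot product of `[2.0, 3.0]` with `[1.0, 0.5]`.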

Abstract

DeepONets and Koopman autoencoders are two prevalent neural operator architectures. These architectures are autoencoders. An adversarial addition to an autoencoder has improved the performance of autoencoders in various areas of machine learning. In this paper, the use of an adversarial addition for these two neural operator architectures is studied.

Paper Structure

This paper contains 13 sections, 10 equations, 5 figures, 2 tables.

Figures (5)

  • Figure 1: An autoencoder and an adversarial autoencoder: The left is an autoencoder, where an encoder maps the input $x$ into a latent space, and a decoder maps this encoding to the output $y$. The right is an adversarial autoencoder, which extends the autoencoder to include a discriminator that maps an encoding to an output $z$.
  • Figure 2: The sigmoid function
  • Figure 3: The DeepONet architecture: The input $u$ is the input function, and the input $x$ is the point where the output function is evaluated. Their encodings are denoted by $E_u$ and $E_x$, respectively. The output is denoted by $y$.
  • Figure 4: Discretization of the Koopman formulation into a numerical scheme: The physical states at successive time points are denoted by $s_0$, $s_1$, $\dots$, $s_{n-1}$ and $s_n$. The encoded states at successive time points are denoted by $e_0$, $e_1$, $\dots$, $e_{n-1}$ and $e_n$. The function $f$ is the true time evolution of the physical state over one time step. The discretized Koopman operator, encoder and decoder are denoted by $K$, $E$ and $R$, respectively.
  • Figure 5: Comparison of a DeepONet with and without an adversarial addition for the KdV equation: Both plots show the final time slices of the target and prediction. The left plot is without an adversarial addition. The right plot is with an adversarial addition. The model with the adversarial addition is visibly more accurate; in particular, it follows the true solution more closely at the main bend. Both models are still relatively inaccurate compared to models trained with a large amount of data.
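
The adversarial addition in Figures 1 and 2 can be sketched as follows. This is a hedged illustration, not the paper's training code: it assumes the standard adversarial-autoencoder setup in which a discriminator with a sigmoid output is trained with binary cross-entropy to separate samples from a chosen prior from encoder outputs, while the encoder is trained to fool it. The single linear layer and the names `discriminator`, `discriminator_loss` and `encoder_adversarial_loss` are hypothetical stand-ins.

```python
import math
from typing import List, Sequence


def sigmoid(t: float) -> float:
    """Numerically stable sigmoid, mapping a real score to (0, 1)."""
    if t >= 0.0:
        return 1.0 / (1.0 + math.exp(-t))
    exp_t = math.exp(t)
    return exp_t / (1.0 + exp_t)


def discriminator(encoding: Sequence[float], weights: Sequence[float], bias: float) -> float:
    """Toy one-layer discriminator: encoding -> probability z of 'came from the prior'."""
    score = sum(w * e for w, e in zip(weights, encoding)) + bias
    return sigmoid(score)


def bce(prob: float, label: float) -> float:
    """Binary cross-entropy for a single prediction, clipped to avoid log(0)."""
    eps = 1e-12
    p = min(max(prob, eps), 1.0 - eps)
    return -(label * math.log(p) + (1.0 - label) * math.log(1.0 - p))


def discriminator_loss(
    prior_samples: List[Sequence[float]],
    encodings: List[Sequence[float]],
    weights: Sequence[float],
    bias: float,
) -> float:
    """Discriminator objective: label prior samples 1 and encoder outputs 0."""
    losses = [bce(discriminator(z, weights, bias), 1.0) for z in prior_samples]
    losses += [bce(discriminator(e, weights, bias), 0.0) for e in encodings]
    return sum(losses) / len(losses)


def encoder_adversarial_loss(
    encodings: List[Sequence[float]],
    weights: Sequence[float],
    bias: float,
) -> float:
    """Encoder objective: make the discriminator label encodings as prior samples."""
    losses = [bce(discriminator(e, weights, bias), 1.0) for e in encodings]
    return sum(losses) / len(losses)


# An untrained discriminator (all-zero parameters) outputs 0.5 everywhere,
# so both losses equal log(2).
prior = [[0.3, -1.2], [0.8, 0.1]]
codes = [[2.0, 2.0], [1.5, 2.5]]
d_loss = discriminator_loss(prior, codes, weights=[0.0, 0.0], bias=0.0)
e_loss = encoder_adversarial_loss(codes, weights=[0.0, 0.0], bias=0.0)
```

In a full training loop these two losses would be minimized alternately, with the encoder's adversarial term added to the usual prediction loss of the DeepONet or Koopman autoencoder; the weighting between the two terms is a design choice not specified here.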