A Deep Learning approach for parametrized and time dependent Partial Differential Equations using Dimensionality Reduction and Neural ODEs

Alessandro Longhi, Danny Lathouwers, Zoltán Perkó

TL;DR

The main outcome of this work is the importance of exploiting DR as opposed to the recent trend of building large and complex architectures: it is shown that by leveraging DR the authors can deliver not only more accurate predictions, but also a considerably lighter and faster DL model compared to existing methodologies.

Abstract

Partial Differential Equations (PDEs) are central to science and engineering. Since solving them is computationally expensive, much effort has gone into approximating their solution operator with both traditional and, increasingly, Deep Learning (DL) techniques. However, a conclusive methodology capable of accounting for both (continuous) time and parameter dependency in such DL models is still lacking. In this paper, we propose an autoregressive, data-driven method for time-dependent, parametric and (typically) nonlinear PDEs that builds on the analogy with classical numerical solvers. We show how Dimensionality Reduction (DR) can be coupled with Neural Ordinary Differential Equations (NODEs) to learn the solution operator of arbitrary PDEs. The idea of our work is that the high-fidelity (i.e., high-dimensional) PDE solution space can be mapped into a reduced (low-dimensional) space, where the dynamics are governed by a (latent) Ordinary Differential Equation (ODE). Solving this (easier) ODE in the reduced space avoids solving the PDE in the high-dimensional solution space, thus decreasing the computational burden of repeated calculations, e.g., for uncertainty quantification or design optimization. The main outcome of this work is the importance of exploiting DR as opposed to the recent trend of building large and complex architectures: we show that by leveraging DR we can deliver not only more accurate predictions, but also a considerably lighter and faster DL model compared to existing methodologies.
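
As a rough illustration of the reduced-space idea, the sketch below swaps the learned components for linear stand-ins: a POD basis (from an SVD of solution snapshots) plays the role of the encoder and decoder, and a Galerkin-projected linear ODE plays the role of the latent dynamics. This is not the method proposed in the paper, which learns these maps with neural networks; the periodic heat-equation test problem, the grid size, and all names (`rk4_step`, `encode`, `decode`, `A_r`) are assumptions chosen for illustration.

```python
import numpy as np

# Hypothetical toy problem: 1D heat equation u_t = nu * u_xx on a periodic grid.
n, nu, dt, steps = 64, 0.1, 1e-3, 200
x = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
dx = x[1] - x[0]

# Periodic second-difference operator, so the full-order model is du/dt = A u.
I = np.eye(n)
A = nu * (np.roll(I, 1, axis=1) - 2.0 * I + np.roll(I, -1, axis=1)) / dx**2

def rk4_step(f, y, h):
    """One classical Runge-Kutta step for dy/dt = f(y)."""
    k1 = f(y)
    k2 = f(y + 0.5 * h * k1)
    k3 = f(y + 0.5 * h * k2)
    k4 = f(y + h * k3)
    return y + h / 6.0 * (k1 + 2.0 * k2 + 2.0 * k3 + k4)

# High-dimensional reference solution (n = 64 unknowns per time step).
u0 = np.sin(x) + 0.5 * np.cos(2.0 * x)
snapshots = [u0]
for _ in range(steps):
    snapshots.append(rk4_step(lambda u: A @ u, snapshots[-1], dt))
S = np.stack(snapshots, axis=1)  # shape (n, steps + 1)

# Linear dimensionality reduction: keep the r leading left singular vectors.
U, sigma, _ = np.linalg.svd(S, full_matrices=False)
r = 2
V = U[:, :r]

def encode(u):
    """High-dimensional state -> r latent coordinates."""
    return V.T @ u

def decode(eps):
    """Latent coordinates -> high-dimensional state."""
    return V @ eps

# Latent ODE d(eps)/dt = A_r eps obtained by Galerkin projection.
A_r = V.T @ A @ V

# Solve only the r-dimensional ODE, then decode the final state.
eps = encode(u0)
for _ in range(steps):
    eps = rk4_step(lambda e: A_r @ e, eps, dt)
u_pred = decode(eps)

rel_err = np.linalg.norm(u_pred - S[:, -1]) / np.linalg.norm(S[:, -1])
print(f"latent dimension {r} vs. full dimension {n}, relative error {rel_err:.2e}")
```

In this toy case the snapshots span only two directions, so two latent coordinates suffice. For nonlinear, parametric PDEs no such exact linear subspace is generally available, which is where a learned reduction and a learned latent ODE come in.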

Paper Structure

This paper contains 36 sections, 34 equations, 14 figures, 6 tables.

Figures (14)

  • Figure 1: Workings of our proposed method at testing time. The initial condition $s^0_r$ is mapped through the Encoder $\varphi_\theta$ into its latent representation $\varepsilon_0^{\pmb{\mu}}$. Subsequently the vector $\varepsilon_0^{\pmb{\mu}}$ is advanced in time autoregressively by repeated evaluation of the Processor $\pi_\theta$, conditioned on the vector of parameters $\pmb{\mu}$ and on the size of the temporal jump $\Delta t_{i+1,i}$. The Decoder $\psi_\theta$ is used to map back each predicted latent vector $\varepsilon_{i}^{\pmb{\mu},i}$ into the corresponding field $\tilde{s}_r(\mathbf{x},t_i|\pmb{\mu})$. Notice that $\varphi_\theta$ is applied only to the initial condition $s_r^0$.
  • Figure 2: A representation of the training procedure. a) The time series of fields $s_r(\mathbf{x},t_i|\pmb{\mu})$, with $i\in\{0,F\}$, is processed by the Encoder $\varphi_\theta$ and the corresponding latent vectors $\varepsilon_i^{\pmb{\mu}}$ are obtained; these are subsequently mapped back to the full space by means of the Decoder $\psi_\theta$ which generates the time series of reconstructed fields $\tilde{s}_r(\mathbf{x},t_i|\pmb{\mu})$, allowing for the computation of $\mathcal{L}_1$. b) The Processor $\pi_\theta$ receives as input the sequence of latent vectors $\varepsilon_i^{\pmb{\mu}}$ with $i\in\{0,F-1\}$ and predicts the latent vectors $\varepsilon_{i}^{\pmb{\mu},1}$ with $i\in\{1,F\}$. $\mathcal{L}_2^{T,1}$, where $T$ stands for Teacher-Forcing, is thus computed with inputs $\varepsilon_i^{\pmb{\mu}}$ and $\varepsilon_{i}^{\pmb{\mu},1}$. c) The Processor $\pi_\theta$ is applied autoregressively to the initial latent vector $\varepsilon_0^{\pmb{\mu}}$ and the whole time series of vectors $\varepsilon_{i}^{\pmb{\mu},i}$ is reconstructed with $i\in\{1,F\}$; $\mathcal{L}_2^{A,k_2}$, where $A$ stands for Autoregressive, is thus computed with inputs $\varepsilon_i^{\pmb{\mu}}$ and $\varepsilon_{i}^{\pmb{\mu},i}$. d) The Processor $\pi_\theta$ takes as input the sequence of latent vectors $\varepsilon_i^{\pmb{\mu}}$ with $i\in\{0,F-1\}$ and outputs for each $\varepsilon_i^{\pmb{\mu}}$ an intermediate vector $\varepsilon^{\pmb{\mu},1}_{m}$ with a time-step $\Delta t_{m,i-1}$ randomly sampled from $[0,\Delta t_{i,i-1}]$. Finally, $\pi_\theta$ advances in time each $\varepsilon^{\pmb{\mu},1}_{m}$ with a time-step of $\Delta t_{i,i-1}-\Delta t_{m,i-1}$ to get the predicted vectors $\tilde{\varepsilon}_{i}^{\pmb{\mu}}$; $\mathcal{L}_3$ is thus computed with inputs $\varepsilon_i^{\pmb{\mu}}$ and $\tilde{\varepsilon}_{i}^{\pmb{\mu}}$.
  • Figure 3: Distribution of the nRMSE across the test sample for the parametric 1D Advection. Regular font on the x-axis refers to training parameter values, while bold font refers to testing parameters (but in both cases testing initial conditions). We compare our methodology (red on right image) with other published methods (left image, taken from vcnef-hagnberger:2024).
  • Figure 4: Distribution of the nRMSE across the test sample for the parametric 1D Burgers'. Regular font on the x-axis refers to training parameter values, while bold font refers to testing parameters (but in both cases testing initial conditions). We compare our methodology (red on right image) with other published methods (left image, taken from vcnef-hagnberger:2024).
  • Figure 5: Comparison (on the test dataset) for the Molenkamp application of the nRMSE over time $t$ between our model (red) and the VCNeF (green). We study the difference between using at inference the same $\Delta t$ as in training ($\Delta t = 0.05\,s$) and using a smaller one ($\Delta t = 0.02\,s$). The nRMSE of our model increases slightly as $\Delta t$ decreases, while VCNeF struggles with inference at intermediate time-steps.
  • ...and 9 more figures
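
The test-time procedure described in the caption of Figure 1 can be sketched as follows. The networks here are untrained random-weight stand-ins, so the predicted fields are meaningless; only the control flow is the point. The sizes, the RK4 integrator inside `processor`, and all function and variable names are assumptions for illustration, not the authors' architecture. The last few lines compare one step of size $\Delta t$ against two sub-steps that sum to $\Delta t$, in the spirit of panel d) of Figure 2.

```python
import numpy as np

rng = np.random.default_rng(0)
n_x, latent_dim, n_mu = 128, 8, 1  # hypothetical grid, latent and parameter sizes

# Untrained random weights stand in for the learned Encoder, Processor and Decoder.
W_enc = rng.normal(scale=n_x ** -0.5, size=(latent_dim, n_x))
W_dec = rng.normal(scale=latent_dim ** -0.5, size=(n_x, latent_dim))
W_f = rng.normal(scale=0.3, size=(latent_dim, latent_dim + n_mu))

def encoder(s0):
    """Stand-in for phi_theta: field on the grid -> latent vector."""
    return np.tanh(W_enc @ s0)

def decoder(eps):
    """Stand-in for psi_theta: latent vector -> field on the grid."""
    return W_dec @ eps

def latent_rhs(eps, mu):
    """Right-hand side of the latent ODE, conditioned on the parameters mu."""
    return np.tanh(W_f @ np.concatenate([eps, mu]))

def processor(eps, mu, dt):
    """Stand-in for pi_theta: advance eps by dt with one RK4 step (assumed integrator)."""
    k1 = latent_rhs(eps, mu)
    k2 = latent_rhs(eps + 0.5 * dt * k1, mu)
    k3 = latent_rhs(eps + 0.5 * dt * k2, mu)
    k4 = latent_rhs(eps + dt * k3, mu)
    return eps + dt / 6.0 * (k1 + 2.0 * k2 + 2.0 * k3 + k4)

def rollout(s0, mu, times):
    """Encode once, advance autoregressively in latent space, decode every step."""
    eps = encoder(s0)  # the Encoder sees only the initial condition
    fields = [decoder(eps)]
    for t_prev, t_next in zip(times[:-1], times[1:]):
        eps = processor(eps, mu, t_next - t_prev)  # conditioned on mu and the jump size
        fields.append(decoder(eps))
    return np.stack(fields)

x = np.linspace(0.0, 1.0, n_x, endpoint=False)
s0 = np.sin(2.0 * np.pi * x)                     # hypothetical initial condition
mu = np.array([0.5])                             # hypothetical PDE parameter
times = np.array([0.0, 0.05, 0.10, 0.15, 0.20])  # query times; need not be uniform
pred = rollout(s0, mu, times)                    # one predicted field per query time

# Split-step check: one jump of 0.05 versus sub-steps of 0.02 and 0.03.
eps0 = encoder(s0)
full = processor(eps0, mu, 0.05)
split = processor(processor(eps0, mu, 0.02), mu, 0.03)
gap = np.linalg.norm(full - split)
print(pred.shape, gap)
```

Because the step size is an explicit input of `processor`, the same rollout can be queried at time-steps that differ from those seen in training, which is the setting examined in Figure 5.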