A practical existence theorem for reduced order models based on convolutional autoencoders

Nicola Rares Franco, Simone Brugiapaglia

TL;DR

The work provides a practical existence theorem for CNN-based autoencoder DL-ROMs by embedding them in a holomorphic-regularity framework and leveraging sparse high-dimensional approximation. It proves that there exist encoder/decoder networks and a reduced map with provable error bounds that separate a sampling term (exponentially decaying with training size) from an approximation term tied to latent dimension and smoothness. The theory formalizes practical training choices, such as latent regularization and convolutional decoders, and demonstrates applicability to a parametric diffusion problem in 1D, yielding explicit error bounds and architectural guidelines. While the results offer substantial theoretical grounding and align with practitioner heuristics, they highlight limitations in higher dimensions and the need for further research on extending the framework and improving scalability.

Abstract

In recent years, deep learning has gained increasing popularity in the fields of Partial Differential Equations (PDEs) and Reduced Order Modeling (ROM), providing domain practitioners with new powerful data-driven techniques such as Physics-Informed Neural Networks (PINNs), Neural Operators, Deep Operator Networks (DeepONets) and Deep-Learning based ROMs (DL-ROMs). In this context, deep autoencoders based on Convolutional Neural Networks (CNNs) have proven extremely effective, outperforming established techniques, such as the reduced basis method, when dealing with complex nonlinear problems. However, despite the empirical success of CNN-based autoencoders, there are only a few theoretical results supporting these architectures, usually stated in the form of universal approximation theorems. In particular, although the existing literature provides users with guidelines for designing convolutional autoencoders, the subsequent challenge of learning the latent features has been barely investigated. Furthermore, many practical questions remain unanswered, e.g., the number of snapshots needed for convergence or the neural network training strategy. In this work, using recent techniques from sparse high-dimensional function approximation, we fill some of these gaps by providing a new practical existence theorem for CNN-based autoencoders when the parameter-to-solution map is holomorphic. This regularity assumption arises in many relevant classes of parametric PDEs, such as the parametric diffusion equation, for which we discuss an explicit application of our general theory.

Paper Structure

This paper contains 12 sections, 6 theorems, 116 equations, 2 figures.

Key Result

Theorem 1

There are universal constants $c_{0},c_{1},c_{2},c_{3},c_{4}>0$ such that the following holds. Let $p\in\mathbb{N}$, $p\ge1$, and $\epsilon,\gamma>0$. Let $\varrho$ be the uniform probability distribution over $\Theta:=[-1,1]^{p}$. Let $\Omega=(0,1)$ and consider a (nonlinear) map belonging to $\mathcal{HA}_{\gamma,\epsilon,s}(\Theta)$, where $s\ge1$ (see Definition \ref{def:hidden-sobolev}). Fix a training ...

Figures (2)

  • Figure 1: A 2D convolutional layer acting on a given input (simplified setting: 1 channel at input/output, no activation or bias). The action of the convolutional layer can be visualized either in terms of a moving filter (A) or using the equivalent matrix representation (B): in both cases, despite mapping from $\mathbb{R}^{9}$ onto $\mathbb{R}^{4}$, the layer comes with only 4 learnable parameters, instead of the $9 \cdot 4=36$ of a dense layer.
  • Figure 2: Visualization of the transformation $f\mapsto \tilde{f}$ used in Lemma \ref{lemma:T}. The signal $f$ is duplicated and a polynomial perturbation is added to ensure (smooth) periodicity.
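
The parameter count in the Figure 1 caption can be checked numerically. The sketch below is illustrative and not taken from the paper: it assumes a 3x3 input, a 2x2 filter, stride 1 and no padding (the one natural configuration that maps $\mathbb{R}^{9}$ to $\mathbb{R}^{4}$ with 4 weights), and the helper names `conv2d_valid` and `conv_matrix` are hypothetical.

```python
import numpy as np

def conv2d_valid(x, k):
    """Moving-filter view: slide k over x (stride 1, no padding, no flip)."""
    H, W = x.shape
    h, w = k.shape
    out = np.empty((H - h + 1, W - w + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + h, j:j + w] * k)
    return out

def conv_matrix(k, in_shape):
    """Matrix view: dense M with M @ x.ravel() == conv2d_valid(x, k).ravel()."""
    H, W = in_shape
    h, w = k.shape
    oh, ow = H - h + 1, W - w + 1
    M = np.zeros((oh * ow, H * W))
    for i in range(oh):
        for j in range(ow):
            for a in range(h):
                for b in range(w):
                    # Output pixel (i, j) reads input pixel (i + a, j + b).
                    M[i * ow + j, (i + a) * W + (j + b)] = k[a, b]
    return M

# Assumed sizes: 3x3 input and 2x2 filter, so R^9 -> R^4.
x = np.arange(9.0).reshape(3, 3)
k = np.array([[1.0, 2.0], [3.0, 4.0]])  # the 4 learnable parameters

M = conv_matrix(k, x.shape)
y_filter = conv2d_valid(x, k).ravel()
y_matrix = M @ x.ravel()

n_entries = M.size                        # what a dense layer would learn
n_nonzero = np.count_nonzero(M)           # entries actually used
n_distinct = np.unique(M[M != 0]).size    # distinct weights shared across rows
```

Both views give the same output. The matrix has 36 entries, but only 16 are nonzero and they repeat the same 4 filter weights, which is the sparsity and weight sharing the caption refers to.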

Theorems & Definitions (28)

  • Definition 1
  • Definition 2
  • Definition 3
  • Definition 4
  • Definition 5
  • Definition 6
  • Definition 7
  • Definition 8
  • Definition 9
  • Definition 10
  • ...and 18 more