Neural-HSS: Hierarchical Semi-Separable Neural PDE Solver

Pietro Sittoni, Emanuele Zangrando, Angelo A. Casulli, Nicola Guglielmi, Francesco Tudisco

TL;DR

This work introduces Neural-HSS, a parameter-efficient architecture built upon the Hierarchical Semi-Separable (HSS) matrix structure that is provably data-efficient for a broad class of PDEs, and demonstrates that it can learn from PDE data in diverse domains, including electromagnetism, fluid dynamics, and biology.

Abstract

Deep learning-based methods have shown remarkable effectiveness in solving PDEs, largely due to their ability to enable fast simulations once trained. However, despite the availability of high-performance computing infrastructure, many critical applications remain constrained by the substantial computational costs associated with generating large-scale, high-quality datasets and training models. In this work, inspired by studies on the structure of Green's functions for elliptic PDEs, we introduce Neural-HSS, a parameter-efficient architecture built upon the Hierarchical Semi-Separable (HSS) matrix structure that is provably data-efficient for a broad class of PDEs. We theoretically analyze the proposed architecture, proving that it satisfies exactness properties even in very low-data regimes. We also investigate its connections with other architectural primitives, such as the Fourier neural operator layer and convolutional layers. We experimentally validate the data efficiency of Neural-HSS on the three-dimensional Poisson equation over a grid of two million points, demonstrating its superior ability to learn from data generated by elliptic PDEs in the low-data regime while outperforming baseline methods. Finally, we demonstrate its capability to learn from data arising from a broad class of PDEs in diverse domains, including electromagnetism, fluid dynamics, and biology.
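To make the parameter-efficiency claim concrete, the following is a minimal NumPy sketch of a hierarchical matrix whose off-diagonal blocks are stored in low-rank form. It is not the authors' implementation: it uses a simplified hierarchical off-diagonal low-rank layout without the nested bases of a full HSS representation, and the names `build_hodlr`, `matvec`, `to_dense`, and `num_params`, as well as the sizes, are hypothetical choices made for illustration.

```python
import numpy as np

def build_hodlr(n, levels, rank, rng):
    """Random n x n operator with dense leaves and low-rank off-diagonal blocks.

    Simplifying assumption: each off-diagonal block has its own factors
    (no nested bases as in a full HSS representation).
    """
    if levels == 0:
        return {"dense": rng.standard_normal((n, n))}
    h = n // 2
    return {
        "top": build_hodlr(h, levels - 1, rank, rng),
        "bottom": build_hodlr(n - h, levels - 1, rank, rng),
        # Off-diagonal blocks: A12 = U12 @ V12.T and A21 = U21 @ V21.T
        "U12": rng.standard_normal((h, rank)),
        "V12": rng.standard_normal((n - h, rank)),
        "U21": rng.standard_normal((n - h, rank)),
        "V21": rng.standard_normal((h, rank)),
    }

def matvec(node, x):
    """Apply the hierarchical operator to x without forming the dense matrix."""
    if "dense" in node:
        return node["dense"] @ x
    h = node["U12"].shape[0]
    x1, x2 = x[:h], x[h:]
    y1 = matvec(node["top"], x1) + node["U12"] @ (node["V12"].T @ x2)
    y2 = matvec(node["bottom"], x2) + node["U21"] @ (node["V21"].T @ x1)
    return np.concatenate([y1, y2])

def to_dense(node):
    """Assemble the equivalent dense matrix (only used to check matvec)."""
    if "dense" in node:
        return node["dense"]
    a12 = node["U12"] @ node["V12"].T
    a21 = node["U21"] @ node["V21"].T
    return np.block([[to_dense(node["top"]), a12],
                     [a21, to_dense(node["bottom"])]])

def num_params(node):
    """Count the stored entries of the hierarchical representation."""
    if "dense" in node:
        return node["dense"].size
    factors = sum(node[k].size for k in ("U12", "V12", "U21", "V21"))
    return num_params(node["top"]) + num_params(node["bottom"]) + factors

rng = np.random.default_rng(0)
n, levels, rank = 64, 2, 2  # illustrative sizes, not taken from the paper
op = build_hodlr(n, levels, rank, rng)
x = rng.standard_normal(n)

y_fast = matvec(op, x)
A_dense = to_dense(op)
n_params = num_params(op)
print("matches dense product:", np.allclose(y_fast, A_dense @ x))
print("stored entries:", n_params, "vs dense:", n * n)
```

The recursive product touches only the dense leaves and the thin factors, which is why the stored parameter count grows more slowly than the $n^2$ entries of a dense weight matrix.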

Paper Structure

This paper contains 47 sections, 2 theorems, 37 equations, 9 figures, 9 tables, 1 algorithm.

Key Result

Theorem 2.2

(Convolutional kernels are HSS approximable) Let $D\geq 1$, let $\Omega\subseteq \mathbb R^D$ be a compact set, and let $k:\Omega \to \mathbb R$ be an asymptotically smooth convolutional kernel (def:asympt_smoothness). Consider the operator […]. For a set of basis functions $\{\phi_j\}_{j=1}^K$, consider the discretization matrix […]. Then, for every $\eta$-admissible pair (def:strong_admissibility) of well-separ
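The intuition behind this kind of statement can be checked numerically. The sketch below is an illustration under stated assumptions, not the construction used in the theorem: it takes the one-dimensional kernel $k(r) = 1/|r|$ as an example of an asymptotically smooth kernel, samples it on two well-separated clusters $[0,1]$ and $[2,3]$ (both hypothetical choices), and measures the numerical rank of the resulting off-diagonal block with an SVD.

```python
import numpy as np

def kernel(r):
    """Example kernel k(r) = 1/|r| (an assumed choice for illustration)."""
    return 1.0 / np.abs(r)

n = 200
x = np.linspace(0.0, 1.0, n)  # source cluster
y = np.linspace(2.0, 3.0, n)  # target cluster, separated from x by a gap of 1

# Off-diagonal interaction block K[i, j] = k(x_i - y_j)
K_far = kernel(x[:, None] - y[None, :])

# Singular values in descending order
U, s, Vt = np.linalg.svd(K_far)

# Numerical rank: number of singular values above a relative tolerance
tol = 1e-10
numerical_rank = int(np.sum(s > tol * s[0]))

# Truncated reconstruction and its relative error in the spectral norm
r = numerical_rank
K_lowrank = (U[:, :r] * s[:r]) @ Vt[:r, :]
rel_err = np.linalg.norm(K_far - K_lowrank, 2) / s[0]

print("block size:", K_far.shape)
print("numerical rank at tol", tol, ":", numerical_rank)
print("relative error of truncation:", rel_err)
```

For well-separated clusters the numerical rank is typically far smaller than the block size $n$, which is the property that lets the off-diagonal blocks be stored as thin factors.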

Figures (9)

  • Figure 1: Model overview. The lifting and projection layers can be implemented either as HSS layers or as full-rank layers. We illustrate, as an example, the weight matrix structure with two different hierarchical levels.
  • Figure 2: Train size vs relative test error for different models. The models are trained on a 1D Poisson equation.
  • Figure 3: Left: Train size vs relative test error for different models. The models are trained on a 3D Poisson equation. Right: Timing of forward and backward for the different models.
  • Figure 4: Train size vs relative test error for different models. The models are trained on a 3D Poisson equation.
  • Figure 5: Ablation study on rank, outer rank, and levels for the Gray-Scott problem. The color indicates the test error after training, rescaled between the maximum and minimum observed values (note the scale of each colorbar).
  • ...and 4 more figures

Theorems & Definitions (8)

  • Definition 2.1
  • Theorem 2.2
  • Theorem 2.3
  • Definition 1.1
  • Definition 1.2: Cluster tree
  • Definition 1.3
  • Definition 1.4
  • proof