Adaptive quadratures for nonlinear approximation of low-dimensional PDEs using smooth neural networks

Alexandre Magueresse, Santiago Badia

TL;DR

This paper addresses the accuracy of numerical integration in physics-informed neural networks for low-dimensional PDEs by introducing adaptive quadratures built from CPWL approximations of smooth activations. By decomposing the domain into regions where the network is almost linear, the authors construct Gaussian quadratures on convex cells to evaluate the loss with high precision, and they prove a quadratic convergence rate for the CPWL approximation of the activation. They address the non-smoothness of CPWL networks by smoothing the activations, and their error analysis bounds the integration error in terms of derivative bounds. Numerical experiments on Poisson problems in 1D and 2D show that, compared to Monte Carlo integration, the adaptive quadrature needs fewer integration points, is more robust to parameter initialisation, and converges faster, especially for weak formulations and complex domains. The approach also handles polygonal domains without relying on tensor-product meshes, offering a practical pathway toward more reliable NN-based PDE solvers.

Abstract

Physics-informed neural networks (PINNs) and their variants have recently emerged as alternatives to traditional partial differential equation (PDE) solvers, but little literature has focused on devising accurate numerical integration methods for neural networks (NNs), which is essential for getting accurate solutions. In this work, we propose adaptive quadratures for the accurate integration of neural networks and apply them to loss functions appearing in low-dimensional PDE discretisations. We show that at opposite ends of the spectrum, continuous piecewise linear (CPWL) activation functions enable one to bound the integration error, while smooth activations ease the convergence of the optimisation problem. We strike a balance by considering a CPWL approximation of a smooth activation function. The CPWL activation is used to obtain an adaptive decomposition of the domain into regions where the network is almost linear, and we derive an adaptive global quadrature from this mesh. The loss function is then obtained by evaluating the smooth network (together with other quantities, e.g., the forcing term) at the quadrature points. We propose a method to approximate a class of smooth activations by CPWL functions and show that it has a quadratic convergence rate. We then derive an upper bound for the overall integration error of our proposed adaptive quadrature. The benefits of our quadrature are evaluated on a strong and weak formulation of the Poisson equation in dimensions one and two. Our numerical experiments suggest that compared to Monte-Carlo integration, our adaptive quadrature makes the convergence of NNs quicker and more robust to parameter initialisation while needing significantly fewer integration points and keeping similar training times.
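The quadratic convergence rate mentioned in the abstract can be illustrated with a small numerical experiment. The sketch below is not the authors' construction: it assumes uniformly spaced breakpoints on the truncated interval $[-4, 4]$ and plain nodal interpolation of $\tanh$, whereas the paper optimises the breakpoint positions. The helper names (`cpwl_interpolant`, `l2_error`, `uniform_breakpoints`) are hypothetical. Under these assumptions, doubling the number of linear pieces should reduce the $L^2$ error by a factor of about four, i.e. an observed rate close to 2.

```python
import math

def cpwl_interpolant(f, breakpoints):
    """Continuous piecewise linear interpolant of f at sorted breakpoints."""
    values = [f(x) for x in breakpoints]

    def g(x):
        # Constant extension outside the breakpoint range.
        if x <= breakpoints[0]:
            return values[0]
        if x >= breakpoints[-1]:
            return values[-1]
        # Binary search for the cell [breakpoints[lo], breakpoints[hi]] containing x.
        lo, hi = 0, len(breakpoints) - 1
        while hi - lo > 1:
            mid = (lo + hi) // 2
            if breakpoints[mid] <= x:
                lo = mid
            else:
                hi = mid
        t = (x - breakpoints[lo]) / (breakpoints[hi] - breakpoints[lo])
        return (1.0 - t) * values[lo] + t * values[hi]

    return g

def l2_error(f, g, a, b, intervals=20480):
    """Approximate the L2 norm of f - g on [a, b] with the composite trapezoid rule."""
    h = (b - a) / intervals
    total = 0.0
    for i in range(intervals + 1):
        x = a + i * h
        w = 0.5 if i in (0, intervals) else 1.0
        total += w * (f(x) - g(x)) ** 2
    return math.sqrt(total * h)

def uniform_breakpoints(a, b, pieces):
    """Assumption of this sketch: equally spaced breakpoints (the paper optimises them)."""
    return [a + (b - a) * i / pieces for i in range(pieces + 1)]

errors = []
for pieces in (16, 32, 64, 128):
    g = cpwl_interpolant(math.tanh, uniform_breakpoints(-4.0, 4.0, pieces))
    errors.append(l2_error(math.tanh, g, -4.0, 4.0))

# Observed convergence rate between successive refinements (expected to be near 2).
rates = [math.log2(errors[i] / errors[i + 1]) for i in range(len(errors) - 1)]
print(errors, rates)
```

Uniform breakpoints are the simplest choice for checking the rate; an optimised placement, as described in the paper, would lower the error constant but is not needed to observe the second-order behaviour.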

Paper Structure

This paper contains 51 sections, 5 theorems, 51 equations, 11 figures, 8 tables, 1 algorithm.

Key Result

Lemma 1

For all $\alpha > 0$ and $\rho \in \mathcal{A}^\alpha$,

Figures (11)

  • Figure 1: Examples of regularising families for the $\mathrm{ReLU}$ function and corresponding first and second derivatives.
  • Figure 2: Visualisation of $\pi_7[\tanh]$ and convergence plot of our proposed method in $L^2$ norm for $\mathrm{ReLU}_\varepsilon$ and $\tanh$. In the visualisation of $\pi_7[\tanh]$, the square pink points are fixed, while the circle green points are chosen to minimise the $L^2$ norm of $\tanh - \pi_7[\tanh]$.
  • Figure 3: Example of a mesh extraction, corresponding to a neural network with architecture $(2, m)$ with $m \geq 4$, that is $u = \rho \circ \bm{\Theta}$, where $\bm{\Theta}: \bm{x} \mapsto \bm{W} \bm{x} + \bm{b}$, $\bm{W} \in \mathbb{R}^{m \times 2}$ and $\bm{b} \in \mathbb{R}^{m}$. (a) Parallel hyperplanes associated with different breakpoints $\xi_1, \xi_2$ of $\pi[\rho]$ for one of the output coordinates of $\bm{\Theta}$ (they are orthogonal to one of the row vectors of $\bm{W}$). (b) Hyperplanes corresponding to all the output coordinates of $\bm{\Theta}$. (c) Clipping of the hyperplanes against the region boundary. (d) Pairwise intersection of the hyperplanes.
  • Figure 4: Learning curve (a) and pointwise error (b) for the strong Poisson problem with the $\tanh$ activation. The learning rate is $\eta = 10^{-2}$ and the points are resampled every epoch.
  • Figure 5: Learning curve (a) and pointwise error (b) for the strong Poisson problem with the $\mathrm{ReLU}_\varepsilon$ activation. The learning rate is $\eta = 10^{-2}$ and the points are resampled every $10$ epochs.
  • ...and 6 more figures
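
The mesh extraction described in the Figure 3 caption can be sketched in one dimension, where the hyperplanes reduce to points. The example below is a minimal illustration under stated assumptions, not the authors' implementation: the network is a single hidden layer $u(x) = \sum_j c_j \tanh(w_j x + b_j)$, the activation breakpoints $\xi_k$ are a hypothetical hand-picked set, and a 2-point Gauss-Legendre rule is placed on every cell. All function names and parameter values are made up for the example. Each breakpoint $\xi_k$ is mapped to the domain point $x = (\xi_k - b_j) / w_j$, the points are clipped against the domain boundary, and the smooth network is then evaluated at the resulting quadrature points.

```python
import math

# 2-point Gauss-Legendre rule on [-1, 1] (exact for polynomials up to degree 3).
GAUSS_NODES = (-1.0 / math.sqrt(3.0), 1.0 / math.sqrt(3.0))
GAUSS_WEIGHTS = (1.0, 1.0)

def network(x, weights, biases, coeffs):
    """Smooth one-hidden-layer network u(x) = sum_j c_j * tanh(w_j * x + b_j)."""
    return sum(c * math.tanh(w * x + b) for w, b, c in zip(weights, biases, coeffs))

def extract_mesh(weights, biases, activation_breakpoints, a, b):
    """Map each activation breakpoint xi to x = (xi - b_j) / w_j and clip to [a, b]."""
    points = {a, b}
    for w, bias in zip(weights, biases):
        if w == 0.0:
            continue  # a constant neuron induces no breakpoint in the domain
        for xi in activation_breakpoints:
            x = (xi - bias) / w
            if a < x < b:
                points.add(x)
    return sorted(points)

def adaptive_quadrature(mesh):
    """Place the Gauss rule on every cell of the mesh; return (points, weights)."""
    pts, wts = [], []
    for left, right in zip(mesh[:-1], mesh[1:]):
        half, mid = 0.5 * (right - left), 0.5 * (left + right)
        for node, weight in zip(GAUSS_NODES, GAUSS_WEIGHTS):
            pts.append(mid + half * node)
            wts.append(half * weight)
    return pts, wts

# Hypothetical parameters chosen only for illustration.
weights, biases, coeffs = [3.0, -1.5, 0.8], [0.5, 1.0, -0.2], [1.0, -2.0, 0.5]
activation_breakpoints = [-2.0, -1.0, 0.0, 1.0, 2.0]
a, b = -1.0, 1.0

mesh = extract_mesh(weights, biases, activation_breakpoints, a, b)
pts, wts = adaptive_quadrature(mesh)

# Integrate the smooth network at the adaptive quadrature points.
approx = sum(w * network(x, weights, biases, coeffs) for x, w in zip(pts, wts))

# Closed-form reference: the antiderivative of tanh(w x + b) is log(cosh(w x + b)) / w.
exact = sum(
    c / w * (math.log(math.cosh(w * b + bias)) - math.log(math.cosh(w * a + bias)))
    for w, bias, c in zip(weights, biases, coeffs)
)
print(len(mesh) - 1, "cells, integration error:", abs(approx - exact))
```

Because the cells follow the neurons' breakpoints, the mesh is finer where the network varies quickly and moves with the parameters during training. In two dimensions the same idea requires intersecting and clipping lines, as the Figure 3 caption describes, which this sketch does not attempt.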

Theorems & Definitions (16)

  • Lemma 1: Some properties of functions in $\mathcal{A}^\alpha$
  • proof
  • Definition 1: Partition induced by a function
  • Lemma 2: Distance between a function and its tangents
  • proof
  • Proposition 1: Best approximation in $\mathcal{A}_n$
  • proof
  • Lemma 3: Convexity of the mesh
  • proof
  • Proposition 2: Distance between two neural networks
  • ...and 6 more