
Power-Enhanced Residual Network for Function Approximation and Physics-Informed Inverse Problems

Amir Noorizadegan, D. L. Young, Y. C. Hon, C. S. Chen

TL;DR

The paper tackles training instability in deep neural networks arising from weight updates in the forward operation and gradient computation during backpropagation. It introduces the Power-Enhancing SkipResNet, a residual architecture that combines ideas from highway networks and ResNets and incorporates a power term $x^{(l-1),p}$ into the residual connection to stabilize gradient flow and enhance expressivity. Using 2D/3D function interpolation and a physics-informed neural network (PINN) for the inverse Burgers' equation, the authors demonstrate that the proposed architecture achieves higher accuracy and faster convergence than plain NNs, particularly for non-smooth functions, and yields stable weight dynamics across various depths. The work reports empirical results on synthetic functions and on real-world data such as Mt. Eden and the Stanford Bunny, and releases code at https://github.com/CMMAi/ResNet_for_PINN to facilitate adoption and replication.

Abstract

In this study, we investigate how the updating of weights during forward operation and the computation of gradients during backpropagation impact the optimization process, training procedure, and overall performance of neural networks, particularly multi-layer perceptrons (MLPs). This paper introduces a novel neural network structure called the Power-Enhancing residual network, inspired by highway networks and residual networks, designed to improve the network's capabilities for approximating both smooth and non-smooth functions in 2D and 3D settings. By incorporating power terms into residual elements, the architecture enhances the stability of weight updating, thereby facilitating better convergence and accuracy. The study explores network depth, width, and optimization methods, showing the architecture's adaptability and performance advantages. The results consistently show the high accuracy of the proposed Power-Enhancing residual network, particularly for non-smooth functions. Real-world examples also confirm its superiority over plain neural networks in terms of accuracy, convergence, and efficiency. Moreover, the proposed architecture is applied to solving the inverse Burgers' equation, where it also demonstrates superior performance. In conclusion, the Power-Enhancing residual network offers a versatile solution that significantly enhances neural network capabilities by emphasizing the importance of stable weight updates for effective training in deep neural networks. The code is available at: https://github.com/CMMAi/ResNet_for_PINN.
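
To make the idea of "power terms in residual elements" concrete, the following is a minimal sketch rather than the authors' implementation (their code is in the repository linked above). It assumes a hidden layer of the form $\tanh(Wx + b) + x^{p}$ with the power taken element-wise, equal input and output widths, and a tanh activation. The function names `dense`, `plain_layer`, and `power_residual_layer` are hypothetical and chosen only for illustration.

```python
import math


def dense(x, W, b):
    """Affine map W @ x + b using plain Python lists."""
    return [sum(w * xj for w, xj in zip(row, x)) + bi for row, bi in zip(W, b)]


def plain_layer(x, W, b):
    """Plain MLP layer: activation only, no skip connection."""
    return [math.tanh(z) for z in dense(x, W, b)]


def power_residual_layer(x, W, b, p=2):
    """Assumed form of a power-enhanced residual layer.

    Returns tanh(W x + b) + x**p, with the power applied element-wise.
    Input and output widths must match so the skip term can be added.
    p = 1 gives an ordinary identity skip connection (ResNet-style);
    p = 2 gives the element-wise square x * x.
    """
    h = plain_layer(x, W, b)
    return [hi + xi ** p for hi, xi in zip(h, x)]


if __name__ == "__main__":
    x = [0.5, -1.0]
    W = [[0.0, 0.0], [0.0, 0.0]]  # zero weights isolate the skip term
    b = [0.0, 0.0]
    print(power_residual_layer(x, W, b, p=2))  # skip term only: x squared
    print(power_residual_layer(x, W, b, p=1))  # skip term only: x itself
```

With zero weights the activation branch vanishes, so the output shows the skip term alone, which makes the difference between $p=1$ and $p=2$ easy to see. How often the skip is applied across layers (the "Skip" in SkipResNet) is not modeled here.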

Paper Structure (13 sections, 21 equations, 17 figures, 2 tables)

Figures (17)

  • Figure 1: Schematic of an MLP.
  • Figure 2: The neural network (interpolation stage) + physics (inverse Burgers' equation). Here, $x$ and $t$ represent two dimensions, each including $n$ samples.
  • Figure 3: Neural network architectures: (a) plain neural network (Plain NN), (b) residual network (ResNet), (c) power-enhanced SkipResNet, and (d) unraveled SQR-SkipResNet (panel (c) with $p=2$), where $\odot$ denotes element-wise multiplication.
  • Figure 4: The profiles of F1, F2, and F3.
  • Figure 5: Training profiles on F1 for different numbers of collocation points $n$. Dotted-line curves denote training error, and solid-line curves denote validation error.
  • ...and 12 more figures

Theorems & Definitions (4)

  • Example 1
  • Example 2
  • Example 3
  • Example 4