Quadratic neural networks for solving inverse problems

Leon Frischauf, Otmar Scherzer, Cong Shi

TL;DR

This work addresses the inverse problem $F:{\bf X}\to{\bf Y}$ by exploring neural-network ansatz functions with generalized, higher-order decision functions, focusing on shallow networks that include quadratic and radial forms. The authors establish universal approximation results for SBQNNs and CUNNs and derive $\mathcal{L}^1$-convergence rates for radial quadratic networks (RQNNs) using wavelet-frame constructions and approximation-to-identity (AtI) theory, yielding explicit $L^2$-error bounds that scale as $(N+1)^{-1/2}$ with the number of terms. They show that Gauss–Newton convergence is tractable for RQNNs under suitable nondegeneracy conditions, and they argue that higher-order, shallower architectures can offer clearer, more tractable convergence analyses than deep affine networks. The results suggest that quadratic or radial higher-order networks can achieve comparable approximation quality with fewer components and allow more transparent analysis for ill-posed inverse problems, with implications for practical reconstruction tasks and training dynamics.
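
As a rough reading of the stated rate, suppose the bound has the form $C\,(N+1)^{-1/2}$, where the constant $C$ is an assumption introduced here for illustration and is taken to be independent of $N$. Halving the bound then requires roughly four times as many terms:

$$
C\,(N'+1)^{-1/2} = \tfrac{1}{2}\,C\,(N+1)^{-1/2}
\quad\Longleftrightarrow\quad
N'+1 = 4\,(N+1).
$$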

Abstract

In this paper we investigate the solution of inverse problems with neural network ansatz functions with generalized decision functions. The relevant observation for this work is that such functions can approximate typical test cases, such as the Shepp-Logan phantom, better than standard neural networks. Moreover, we show that the convergence analysis of numerical methods for solving inverse problems with shallow generalized neural network functions leads to more intuitive convergence conditions than for deep affine linear neural networks.

Paper Structure

This paper contains 8 sections, 11 theorems, and 55 equations.

Key Result

Theorem 1

Let $\sigma:\mathbb{R} \to \mathbb{R}$ be a continuous discriminatory function. Then, for every function $g \in C([0,1]^n)$ and every $\epsilon>0$, there exists a function satisfying
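
The statement breaks off at this point in the summary. For orientation only, the classical universal approximation theorem of Cybenko (1989), which the label Cyb89 appears to refer to, concludes as sketched below. The symbols $G$, $N$, $\alpha_j$, $w_j$ and $\theta_j$ follow that classical result and are assumptions here; the paper's own notation may differ.

$$
G(x) = \sum_{j=1}^{N} \alpha_j\,\sigma\!\left(w_j^{T} x + \theta_j\right),
\qquad
|G(x) - g(x)| < \epsilon \quad \text{for all } x \in [0,1]^n,
$$

with $N \in \mathbb{N}$, $\alpha_j, \theta_j \in \mathbb{R}$ and $w_j \in \mathbb{R}^n$.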

Theorems & Definitions (32)

  • Definition 1: Affine linear neural network functions
  • Definition 2: Shallow generalized neural network function
  • Definition 3: Neural networks with generalized decision functions
  • Remark 1
  • Definition 4: Discriminatory function
  • Example 1
  • Theorem 1: Cyb89
  • Theorem 2: Generalized universal approximation theorem
  • Proof
  • Corollary 1: Universal approximation properties of SBQNNs and CUNNs
  • ...and 22 more
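
To make the idea of a shallow network with a generalized decision function concrete, here is a minimal Python sketch. It assumes a radial quadratic decision function of the form $w_j^{T}x + \xi_j\|x\|^2 + \theta_j$ inside a sigmoid activation. This parametrization, the function name `radial_quadratic_network`, and all parameter names are illustrative assumptions rather than the paper's definitions.

```python
import numpy as np


def sigmoid(t):
    """Logistic activation, used here as an example of a sigmoidal sigma."""
    return 1.0 / (1.0 + np.exp(-t))


def radial_quadratic_network(x, alpha, w, xi, theta):
    """Evaluate sum_j alpha_j * sigmoid(w_j . x + xi_j * ||x||^2 + theta_j).

    Assumed (hypothetical) parametrization of a shallow network whose
    decision function is affine plus a radial quadratic term.

    x:     (m, n) array of m input points in R^n
    alpha: (N,)   outer weights
    w:     (N, n) linear weights
    xi:    (N,)   coefficients of the squared norm
    theta: (N,)   biases
    """
    x = np.atleast_2d(x)
    sq_norm = np.sum(x**2, axis=1, keepdims=True)   # shape (m, 1)
    decision = x @ w.T + sq_norm * xi + theta       # shape (m, N)
    return sigmoid(decision) @ alpha                # shape (m,)


# One neuron with w = 0, xi = -k, theta = k * r^2 gives a smoothed
# indicator of the disk of radius r centred at the origin.
k, r = 50.0, 0.5
alpha = np.array([1.0])
w = np.zeros((1, 2))
xi = np.array([-k])
theta = np.array([k * r**2])

points = np.array([[0.0, 0.0], [1.0, 0.0]])  # inside and outside the disk
values = radial_quadratic_network(points, alpha, w, xi, theta)
```

In this sketch a single neuron already approximates the indicator of a disk, whereas an affine decision function $w^{T}x + \theta$ yields a smoothed half-space with one neuron. This is one way to read the abstract's remark about phantoms built from elliptical regions, such as the Shepp-Logan phantom.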