Convergence and Recovery Guarantees of Unsupervised Neural Networks for Inverse Problems

Nathan Buskulic; Jalal Fadili; Yvain Quéau

TL;DR

This work provides deterministic convergence and recovery guarantees for the class of unsupervised feedforward multilayer neural networks trained to solve inverse problems, and derives overparametrization bounds under which a two-layer Deep Inverse Prior network with a smooth activation function benefits from these guarantees.

Abstract

Neural networks have become a prominent approach to solving inverse problems in recent years. While a plethora of such methods has been developed to solve inverse problems empirically, clear theoretical guarantees for these methods are still lacking. On the other hand, many works have proved convergence of neural networks to optimal solutions in a more general setting, using overparametrization as a way to control the Neural Tangent Kernel. In this work, we investigate how to bridge these two worlds, and we provide deterministic convergence and recovery guarantees for the class of unsupervised feedforward multilayer neural networks trained to solve inverse problems. We also derive overparametrization bounds under which a two-layer Deep Inverse Prior network with a smooth activation function will benefit from our guarantees.
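To make the setting concrete, here is a minimal sketch of an unsupervised two-layer network fitted to a single observation by gradient descent, in the spirit of a Deep Inverse Prior. It is not the authors' implementation. The function name `train_dip`, the coordinate-selection operator `A`, the tanh activation, the fixed output layer `V`, the width `k`, and the step size are all assumptions chosen for illustration.

```python
import math
import random


def train_dip(A, y, k=200, d=4, steps=300, lr=0.5, seed=0):
    """Fit x = (1/sqrt(k)) * V * tanh(W u) so that A x matches y.

    Hypothetical helper: only the hidden layer W is trained, while the
    latent input u and the output layer V stay fixed (an assumption).
    """
    rng = random.Random(seed)
    m, n = len(A), len(A[0])

    # Fixed random latent input, normalized to unit norm.
    u = [rng.gauss(0.0, 1.0) for _ in range(d)]
    norm_u = math.sqrt(sum(c * c for c in u))
    u = [c / norm_u for c in u]

    # Gaussian initialization of both layers; k is the hidden width.
    W = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(k)]
    V = [[rng.gauss(0.0, 1.0) for _ in range(k)] for _ in range(n)]
    scale = 1.0 / math.sqrt(k)

    losses = []
    x = [0.0] * n
    for _ in range(steps):
        # Forward pass: z = W u, h = tanh(z), x = scale * V h.
        z = [sum(W[i][j] * u[j] for j in range(d)) for i in range(k)]
        h = [math.tanh(zi) for zi in z]
        x = [scale * sum(V[p][i] * h[i] for i in range(k)) for p in range(n)]

        # Residual in observation space and squared loss 0.5 * ||A x - y||^2.
        r = [sum(A[q][p] * x[p] for p in range(n)) - y[q] for q in range(m)]
        losses.append(0.5 * sum(c * c for c in r))

        # Backward pass: dL/dx = A^T r, then chain rule through tanh.
        gx = [sum(A[q][p] * r[q] for q in range(m)) for p in range(n)]
        gz = [
            scale * sum(V[p][i] * gx[p] for p in range(n)) * (1.0 - h[i] ** 2)
            for i in range(k)
        ]

        # Gradient step on W only: dL/dW[i][j] = gz[i] * u[j].
        for i in range(k):
            for j in range(d):
                W[i][j] -= lr * gz[i] * u[j]

    return x, losses


# Toy inverse problem (assumed): observe the first two of four coordinates.
A = [[1.0, 0.0, 0.0, 0.0],
     [0.0, 1.0, 0.0, 0.0]]
x_true = [1.0, -0.5, 0.3, 0.8]
y = [sum(A[q][p] * x_true[p] for p in range(4)) for q in range(2)]

x_hat, losses = train_dip(A, y)
print("initial loss:", losses[0], "final loss:", losses[-1])
```

In this toy run the hidden width is much larger than the signal dimension, which is the overparametrized regime the abstract refers to. The sketch only checks that the data-fit loss is driven down. Because this `A` has a nontrivial kernel, matching `y` says nothing about the unobserved coordinates of `x_true`, which is why recovery is a separate question from convergence.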

Paper Structure (30 sections, 16 theorems, 84 equations, 3 figures)

Introduction
Problem Statement
Contributions
Relation to Prior Work
Data-Driven Methods to Solve Inverse Problems
Deep Inverse Prior
Theory of Overparametrized Networks
Paper organization
Preliminaries
General Notations
Multilayer Neural Networks
KL Functions
Recovery Guarantees
Main Assumptions
Well-posedness
...and 15 more sections

Key Result

Proposition 3.1

Assume that ass:l_smooth, ass:phi_diff and ass:F_diff hold. Then there exists $T(\pmb{\theta}_0) \in ]0,+\infty]$ and a unique maximal solution $\pmb{\theta}(\cdot) \in \mathcal{C}^0([0,T(\pmb{\theta}_0)[)$ of eq:gradflow, and $\pmb{\theta}(\cdot)$ is $\mathcal{C}^1$ on every compact set of the int
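
For orientation only: the equation labelled eq:gradflow is not reproduced in this excerpt, so the following is an assumed generic form rather than the paper's exact statement. A gradient flow for an unsupervised network $\mathbf{g}(\mathbf{u},\pmb{\theta})$ with fixed latent input $\mathbf{u}$, fitted to an observation $\mathbf{y}$ through a forward operator $\mathbf{F}$ under a loss $\mathcal{L}_{\mathbf{y}}$, is typically written as

$$
\dot{\pmb{\theta}}(t) = -\nabla_{\pmb{\theta}}\, \mathcal{L}_{\mathbf{y}}\big(\mathbf{F}(\mathbf{g}(\mathbf{u},\pmb{\theta}(t)))\big), \qquad \pmb{\theta}(0) = \pmb{\theta}_0 .
$$

The symbols $\mathbf{g}$, $\mathbf{u}$, $\mathbf{F}$ and $\mathcal{L}_{\mathbf{y}}$ are notation assumed here for illustration. In this reading, $T(\pmb{\theta}_0)$ is the maximal time up to which the trajectory started at $\pmb{\theta}_0$ is defined.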

Figures (3)

Figure 1: Probability of converging to a zero-loss solution for networks with different architecture parameters confirming our theoretical predictions: linear dependency between $k$ and $m$ and at least quadratic dependency between $k$ and $n$. The blue line is a quadratic function representing the phase transition fitted on the data.
Figure 2: Effect of the noise on both the signal and the loss convergence in different contexts.
Figure 3: Convergence profile of different losses parametrized by $p$. The mean loss values at each iteration of 50 networks are plotted.

Theorems & Definitions (46)

Definition 2.1
Definition 2.2: KL inequality
Example 2.3: Convex functions with sufficient growth
Example 2.4: Uniformly convex functions
Example 2.5
Proposition 3.1
proof
Theorem 3.2
proof
Corollary 3.3
...and 36 more
