On the existence of optimal shallow feedforward networks with ReLU activation

Steffen Dereich; Sebastian Kassing

TL;DR

This work proves that global minima exist in the loss landscape when continuous target functions are approximated by shallow feedforward artificial neural networks with ReLU activation. It proposes a kind of closure of the search space, so that minimizers exist in the extended space.

Abstract

We prove existence of global minima in the loss landscape for the approximation of continuous target functions using shallow feedforward artificial neural networks with ReLU activation. This property is one of the fundamental artifacts separating ReLU from other commonly used activation functions. We propose a kind of closure of the search space so that in the extended space minimizers exist. In a second step, we show under mild assumptions that the newly added functions in the extension perform worse than appropriate representable ReLU networks. This then implies that the optimal response in the extended target space is indeed the response of a ReLU network.

Paper Structure (3 sections, 4 theorems, 80 equations)

Introduction
Generalized response of neural networks
Strict generalized responses are not better than representable ones

Key Result

Theorem 1.1

Let $d_{\mathrm{in}}, d \in \mathbb{N}$, $p > 1$ and $\mathfrak{d} = (d_{\mathrm{in}} + 2) d + 1$. Let $f \colon \mathbb{R}^{d_{\mathrm{in}}} \to \mathbb{R}$ and $h \colon \mathbb{R}^{d_{\mathrm{in}}} \to [0,\infty)$ be continuous functions and assume that $h^{-1}((0,\infty))$ is [...]. Then there exists $\theta \in \mathbb{R}^{\mathfrak{d}}$ such that $\mathrm{err}(\theta) = $ [...]
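
The parameter count $\mathfrak{d} = (d_{\mathrm{in}} + 2) d + 1$ in the theorem can be read off from a shallow network with $d$ hidden ReLU neurons: each neuron carries $d_{\mathrm{in}}$ inner weights, one bias and one outer weight, and the output adds a single bias. The following is a minimal sketch of that bookkeeping, not the paper's own code. The flat parameter layout, the function names, and the discretized weighted $p$-th power loss are all assumptions made for illustration; the statement above does not show the exact definition of $\mathrm{err}$.

```python
def num_params(d_in, d):
    # d neurons, each with d_in inner weights + 1 bias + 1 outer weight,
    # plus one output bias: (d_in + 2) * d + 1
    return (d_in + 2) * d + 1


def unpack(theta, d_in, d):
    # Hypothetical flat layout: inner weights, inner biases, outer weights, output bias.
    assert len(theta) == num_params(d_in, d)
    W = [theta[j * d_in:(j + 1) * d_in] for j in range(d)]
    b = theta[d * d_in:d * d_in + d]
    a = theta[d * d_in + d:d * d_in + 2 * d]
    c = theta[-1]
    return W, b, a, c


def response(theta, x, d_in, d):
    # Shallow ReLU network: c + sum_j a_j * max(w_j . x + b_j, 0)
    W, b, a, c = unpack(theta, d_in, d)
    out = c
    for j in range(d):
        pre = sum(w * xi for w, xi in zip(W[j], x)) + b[j]
        out += a[j] * max(pre, 0.0)
    return out


def discrete_err(theta, xs, f, h, p, d_in, d):
    # Assumed stand-in for err: a finite-sample weighted p-th power error.
    return sum(h(x) * abs(response(theta, x, d_in, d) - f(x)) ** p for x in xs)


# Example: |x| = ReLU(x) + ReLU(-x) with d_in = 1, d = 2, so 7 parameters.
theta_abs = [1.0, -1.0, 0.0, 0.0, 1.0, 1.0, 0.0]
samples = [[-2.0], [-0.5], [0.0], [1.5]]
err_abs = discrete_err(theta_abs, samples, lambda x: abs(x[0]), lambda x: 1.0, 2, 1, 2)
```

In this example the target $|x|$ is exactly representable, so the sampled error vanishes. The theorem concerns the harder case where the target is a general continuous function and the infimum of the error must still be attained by some parameter vector.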

Theorems & Definitions (10)

Theorem 1.1
Theorem 1.2
Example 1.3: Regression problem
Definition 2.1
Remark 2.2
Definition 2.3
Proposition 2.4
proof
Proposition 3.1
proof
