The Boundaries of Verifiable Accuracy, Robustness, and Generalisation in Deep Learning

Alexander Bastounis, Alexander N. Gorban, Anders C. Hansen, Desmond J. Higham, Danil Prokhorov, Oliver Sutton, Ivan Y. Tyukin, Qinghua Zhou

TL;DR

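There is a large family of classification tasks for which computing and verifying ideal stable and accurate neural networks is extremely challenging, if at all possible, even when such ideal solutions exist within the given class of neural architectures. The setting is the classical distribution-agnostic framework, with algorithms that minimise empirical risk, possibly with weight regularisation.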

Abstract

In this work, we assess the theoretical limitations of determining guaranteed stability and accuracy of neural networks in classification tasks. We consider the classical distribution-agnostic framework and algorithms that minimise empirical risk, potentially subject to weight regularisation. We show that there is a large family of tasks for which computing and verifying ideal stable and accurate neural networks in these settings is extremely challenging, if at all possible, even when such ideal solutions exist within the given class of neural architectures.
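
To make the terms in the abstract concrete, the following is a minimal sketch of a regularised empirical risk and a sampled stability check for a sign-output classifier. It is an illustration only and not the construction used in the paper: the linear model, the L2 penalty, the 0-1 loss, and the names `predict`, `regularised_empirical_risk`, `unstable_points`, `lam`, and `delta` are all assumptions introduced here.

```python
import math
import random


def predict(w, b, x):
    """Hypothetical linear classifier with a sign output, labels in {-1, +1}."""
    s = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if s >= 0 else -1


def regularised_empirical_risk(w, b, data, lam):
    """Mean 0-1 loss on the sample plus an L2 penalty on the weights."""
    errors = sum(1 for x, y in data if predict(w, b, x) != y)
    return errors / len(data) + lam * sum(wi * wi for wi in w)


def unstable_points(w, b, data, delta, trials=200, seed=0):
    """Return sample points whose predicted label flips under at least one
    random perturbation of Euclidean norm delta.

    This is a sampled check: an empty result does not prove stability.
    """
    rng = random.Random(seed)
    flagged = []
    for x, _ in data:
        base = predict(w, b, x)
        for _ in range(trials):
            # Draw a random direction and rescale it to norm delta.
            d = [rng.gauss(0.0, 1.0) for _ in x]
            norm = math.sqrt(sum(di * di for di in d)) or 1.0
            x_pert = [xi + delta * di / norm for xi, di in zip(x, d)]
            if predict(w, b, x_pert) != base:
                flagged.append(x)
                break
    return flagged


# Toy sample: every point is classified correctly, yet one lies close
# to the decision boundary and is sensitive to small perturbations.
data = [([1.0, 0.0], 1), ([-1.0, 0.0], -1), ([0.05, 0.0], 1)]
w, b = [1.0, 0.0], 0.0
risk = regularised_empirical_risk(w, b, data, lam=0.1)
flagged = unstable_points(w, b, data, delta=0.1)
```

In this toy sample the classifier makes no errors on the data, so the empirical risk alone gives no hint that one point can be flipped by a small perturbation. A random search like this one can find instabilities but cannot certify that none exist.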

Paper Structure

This paper contains 14 sections, 1 theorem, and 25 equations.

Key Result

Theorem 1

Consider the class of networks with architecture where $N_1\geq 2n$ and $N_2,\dots, N_{L-1}\geq 1$, and activation functions $g_\theta$ in layers $1,\dots,L-1$ satisfying conditions (eq:activation_functions), and the $\mathrm{sign}(\cdot)$ activation function in layer $L$. Let $\varepsilon\in(0,\sqrt{n}-1)$ and fix $0<\delta\leq \varepsilon/\sqrt{

Theorems & Definitions (1)

  • Theorem 1: Inevitability, typicality and undetectability of instability