Classification with neural networks with quadratic decision functions

Leon Frischauf; Otmar Scherzer; Cong Shi

TL;DR

This paper tests and compares neural networks with quadratic decision functions on two tasks, classification of handwritten digits from the MNIST dataset and classification of subspecies, and shows that the implementation can be based on the neural network structure provided by TensorFlow and Keras.

Abstract

Neural networks with quadratic decision functions have been introduced as alternatives to standard neural networks with affine linear ones. They are advantageous when the objects or classes to be identified are compact and of basic geometries such as circles or ellipses. In this paper we investigate the use of such ansatz functions for classification. In particular, we test and compare the algorithm on the MNIST dataset for classification of handwritten digits and on a dataset for classification of subspecies. We also show that the implementation can be based on the neural network structure in the software packages TensorFlow and Keras.
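To illustrate why a quadratic decision function suits compact, circular classes, the following minimal sketch compares nothing more than a single neuron evaluated on a few points. It is plain Python rather than TensorFlow or Keras, and it is not the authors' implementation. The form of the neuron, $\sigma(\xi \|\vec{x}\|^2 + {\bf w}^T \vec{x} + \theta)$, the function names, the circle (centre $(1, 2)$, radius $1.5$) and the threshold $0.5$ are all assumptions made for illustration.

```python
import math


def sigmoid(t):
    """Numerically stable logistic activation, mapping a real number to (0, 1)."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def radial_quadratic_neuron(x, xi, w, theta):
    """Assumed radial quadratic decision function: sigma(xi * ||x||^2 + w^T x + theta)."""
    squared_norm = sum(v * v for v in x)
    linear_part = sum(wi * v for wi, v in zip(w, x))
    return sigmoid(xi * squared_norm + linear_part + theta)


def classify(probability, threshold=0.5):
    """Turn the neuron output into a binary label by thresholding."""
    return 1 if probability > threshold else 0


# Hypothetical class region: the disc with centre c and radius r.
# Expanding r^2 - ||x - c||^2 = -||x||^2 + 2 c^T x + (r^2 - ||c||^2)
# gives the parameters of a single radial quadratic neuron.
centre = (1.0, 2.0)
radius = 1.5
xi = -1.0
w = (2.0 * centre[0], 2.0 * centre[1])
theta = radius ** 2 - (centre[0] ** 2 + centre[1] ** 2)

# Two points inside the disc, followed by two points outside it.
points = [(1.0, 2.0), (1.0, 3.4), (3.0, 2.0), (1.0, 0.4)]
labels = [classify(radial_quadratic_neuron(p, xi, w, theta)) for p in points]
print(labels)
```

With these parameters the argument of the activation is positive exactly inside the disc, so one neuron separates a bounded region from its surroundings. A single affine linear neuron, $\sigma({\bf w}^T \vec{x} + \theta)$, can only separate two half-planes.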

Paper Structure

This paper contains 10 sections, 2 theorems, 23 equations, 22 figures, 3 tables, 2 algorithms.

Introduction
Background on neural network functions
Binary classification
Clustering
Classification
Binary classification
Implementation of RQNNs
Subspecies classification
Binary classification of handwritten digits
Conclusion

Key Result

Lemma 2.1

Let $\sigma:\mathbb{R} \to \mathbb{R}$ be a continuous discriminatory function. (A function $\sigma : \mathbb{R} \to \mathbb{R}$ is called discriminatory (see Cyb89) if every measure $\mu$ on $[0,1]^n$ which satisfies $\int_{[0,1]^n} \sigma ({\bf w}^T {\vec{x}} + \theta)\,d\mu({\vec{x}}) = 0$ for all [...]) [...] where $\Psi[{\vec{p}}]$ is a RQNN from eq:radial_approximation.

Figures (22)

Figure 1: Ground truth data for the subspecies classification experiment. Only a subset of the $5000$ training data points is shown.
Figure 2: Neural network structure of a shallow neural network. The output layer gives a number, which can be interpreted as a probability and on which classification is performed by applying a threshold. The function $f$ in eq:nnk maps an input to a category label by composing a neural network function and a decision function.
Figure 3: Neural network structure of a deep neural network.
Figure 4: Classification via RQNN and $10$ training epochs
Figure 5: Classification via ALNN and $10$ training epochs
...and 17 more figures

Theorems & Definitions (9)

Definition 1: Neural networks with affine linear decision functions
Definition 2: Neural networks with deep affine linear decision functions
Definition 3: Neural networks with radial quadratic decision functions (RQNN)
Lemma 2.1: $L^\infty$-convergence FriSchShi24
Theorem 1: $L^2$-convergence rate of RQNN FriSchShi24
Remark 1
Remark 2
Definition 4: Clustering
Definition 5: Classification
