Classification with neural networks with quadratic decision functions

Leon Frischauf, Otmar Scherzer, Cong Shi

TL;DR

This paper tests and compares neural networks with quadratic decision functions on the MNIST dataset for classification of handwritten digits and on a subspecies classification task, and shows that the implementation can be based on the neural network structures provided by TensorFlow and Keras.

Abstract

Neural networks with quadratic decision functions have been introduced as alternatives to standard neural networks with affine linear decision functions. They are advantageous when the objects or classes to be identified are compact and have basic geometries such as circles or ellipses. In this paper we investigate the use of such ansatz functions for classification. In particular, we test and compare the algorithm on the MNIST dataset for classification of handwritten digits and on a subspecies classification task. We also show that the implementation can be based on the neural network structures provided by TensorFlow and Keras.
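To illustrate why a quadratic decision function suits compact, circular classes, the following is a minimal sketch in plain Python. It assumes a radial quadratic neuron of the form $\sigma(\mathbf{w}^T \vec{x} + \xi \|\vec{x}\|^2 + \theta)$ with a sigmoid $\sigma$; this form, the function names, and the circle parameters are illustrative assumptions, not the paper's own definitions or its TensorFlow/Keras implementation.

```python
import math

def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))

def affine_neuron(x, w, theta):
    # sigma(w^T x + theta): the decision boundary is a hyperplane.
    return sigmoid(sum(wi * c for wi, c in zip(w, x)) + theta)

def radial_quadratic_neuron(x, w, xi, theta):
    # Assumed form sigma(w^T x + xi * ||x||^2 + theta):
    # for xi != 0 the decision boundary is a sphere (a circle in 2D).
    sq_norm = sum(c * c for c in x)
    return sigmoid(sum(wi * c for wi, c in zip(w, x)) + xi * sq_norm + theta)

def classify(prob, threshold=0.5):
    # Threshold the neuron output, read as a probability.
    return 1 if prob > threshold else 0

# Hypothetical class: the disk with centre c and radius r.
# r^2 - ||x - c||^2 = 2 c^T x - ||x||^2 + (r^2 - ||c||^2),
# so one radial quadratic neuron represents it exactly.
c, r = (1.0, 2.0), 1.5
w = tuple(2.0 * ci for ci in c)
xi = -1.0
theta = r * r - sum(ci * ci for ci in c)

inside = (1.0, 2.5)   # distance 0.5 from the centre
outside = (3.0, 2.0)  # distance 2.0 from the centre
labels = [classify(radial_quadratic_neuron(p, w, xi, theta))
          for p in (inside, outside)]
print(labels)
```

A single affine neuron cannot reproduce this labelling for all points, because its decision boundary is a straight line and therefore cannot enclose a bounded region.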

Paper Structure

This paper contains 10 sections, 2 theorems, 23 equations, 22 figures, 3 tables, 2 algorithms.

Key Result

Lemma 2.1

Let $\sigma:\mathbb{R} \to \mathbb{R}$ be a continuous discriminatory function. (A function $\sigma : \mathbb{R} \to \mathbb{R}$ is called discriminatory (see Cyb89) if every measure $\mu$ on $[0,1]^n$ which satisfies $\int_{[0,1]^n} \sigma(\mathbf{w}^T \vec{x} + \theta)\,d\mu(\vec{x}) = 0$ for all [...]) [...] where $\Psi[\vec{p}]$ is an RQNN from eq:radial_approximation.

Figures (22)

  • Figure 1: Ground truth data for the subspecies classification experiment. Only part of the $5000$ training data points is shown.
  • Figure 2: Neural network structure of a shallow neural network. The output layer gives a number, which can be interpreted as a probability, on which classification is performed by thresholding. The function $f$ in eq:nnk maps an input to a category label by composing a neural network function and a decision function.
  • Figure 3: Neural network structure of a deep neural network.
  • Figure 4: Classification via RQNN and $10$ training epochs
  • Figure 5: Classification via ALNN and $10$ training epochs
  • ...and 17 more figures

Theorems & Definitions (9)

  • Definition 1: Neural networks with affine linear decision functions
  • Definition 2: Neural networks with deep affine linear decision functions
  • Definition 3: Neural networks with radial quadratic decision functions (RQNN)
  • Lemma 2.1: $L^\infty$-convergence (FriSchShi24)
  • Theorem 1: $L^2$-convergence rate of RQNN (FriSchShi24)
  • Remark 1
  • Remark 2
  • Definition 4: Clustering
  • Definition 5: Classification