Efficient Vectorized Backpropagation Algorithms for Training Feedforward Networks Composed of Quadratic Neurons

Mathew Mithra Noel, Venkataraman Muthiah-Nakarajan, Yug D Oswal

TL;DR

This work introduces fully vectorized training for quadratic neural networks (QNNs) and a reduced-parameter variant (RPQNN), enabling efficient backpropagation with quadratic decision boundaries. It derives forward and backward propagation formulas, demonstrates XOR learning with a single quadratic neuron, and proves that a final layer of quadratic neurons can separate datasets composed of $\mathcal{C}$ bounded clusters with a single layer of $\mathcal{C}$ neurons. The paper provides a thorough complexity analysis showing $O(n^3)$ time/space per layer for QNNs versus $O(n^2)$ for standard ANNs and RPQNNs, and validates the approach on nonlinear cluster data and MNIST where QNN/RPQNN offer accuracy gains with smaller networks. These results suggest quadratic neurons can provide substantial expressivity with manageable computational costs, motivating further exploration in compact architectures and other domains.
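As a rough check on those orders of growth, consider a layer with $n$ inputs and $n$ neurons (an assumption made here for illustration, with one bias per neuron). Using the per-neuron quadratic parameter counts given in the abstract, the approximate parameter totals per layer would be:

$$
\begin{aligned}
\text{standard ANN:}\quad & n\,(n + 1) = O(n^2) \\
\text{QNN:}\quad & n\left(\tfrac{n(n+1)}{2} + n + 1\right) = O(n^3) \\
\text{RPQNN:}\quad & n\,(n + n + 1) = O(n^2)
\end{aligned}
$$

For $n = 10$ this gives 110, 660, and 210 parameters respectively.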

Abstract

Higher-order artificial neurons, whose outputs are computed by applying an activation function to a higher-order multinomial function of the inputs, have been considered in the past but did not gain acceptance due to the extra parameters and computational cost. However, higher-order neurons have significantly greater learning capabilities, since their decision boundaries can be complex surfaces instead of just hyperplanes. The boundary of a single quadratic neuron can be a general hyper-quadric surface, allowing it to learn many nonlinearly separable datasets. Since quadratic forms can be represented by symmetric matrices, only $\frac{n(n+1)}{2}$ additional parameters are needed instead of $n^2$. A quadratic logistic regression model is first presented. Solutions to the XOR problem with a single quadratic neuron are considered. The complete vectorized equations for both forward and backward propagation in feedforward networks composed of quadratic neurons are derived. A reduced-parameter quadratic neural network model with just $n$ additional parameters per neuron, which provides a compromise between learning ability and computational cost, is presented. Comparisons on benchmark classification datasets are used to demonstrate that a final layer of quadratic neurons enables networks to achieve higher accuracy with significantly fewer hidden-layer neurons. In particular, this paper shows that any dataset composed of $\mathcal{C}$ bounded clusters can be separated with only a single layer of $\mathcal{C}$ quadratic neurons.
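To make the parameter-count argument concrete, here is a minimal NumPy sketch of a single quadratic neuron. It assumes the pre-activation has the form $\mathbf{x}^T\mathbf{Q}\mathbf{x} + \mathbf{w}^T\mathbf{x} + b$ followed by a sigmoid; that form, and the function names `quadratic_neuron`, `symmetrize`, and `extra_params_symmetric`, are illustrative assumptions rather than the paper's own notation or code.

```python
import numpy as np

def sigmoid(z):
    """Logistic activation."""
    return 1.0 / (1.0 + np.exp(-z))

def quadratic_neuron(X, Q, w, b):
    """Vectorized forward pass for one quadratic neuron (assumed form).

    X: (m, n) batch of inputs, Q: (n, n), w: (n,), b: scalar.
    Returns an (m,) array of activations.
    """
    # x^T Q x for every row of X at once, without a Python loop.
    quad = np.einsum("mi,ij,mj->m", X, Q, X)
    return sigmoid(quad + X @ w + b)

def symmetrize(Q):
    """Symmetric part of Q; it defines the same quadratic form as Q."""
    return 0.5 * (Q + Q.T)

def extra_params_symmetric(n):
    """Free entries in a symmetric n x n matrix: n(n+1)/2."""
    return n * (n + 1) // 2
```

Because $\mathbf{x}^T\mathbf{Q}\mathbf{x} = \mathbf{x}^T\tfrac{1}{2}(\mathbf{Q} + \mathbf{Q}^T)\mathbf{x}$, replacing `Q` with `symmetrize(Q)` leaves the output unchanged, which is why only the $\frac{n(n+1)}{2}$ entries on and above the diagonal need to be stored.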

Paper Structure

This paper contains 20 sections, 74 equations, 2 figures, 5 tables, 2 algorithms.

Figures (2)

  • Figure 1: A single quadratic neuron is able to separate the XOR dataset with a hyperbola or an ellipse. Two possible solutions are shown. (a) When $\mathbf{Q}$ is initialized with a random matrix, a hyperbolic decision boundary is obtained. (b) When $\mathbf{Q}$ is initialized with the identity matrix, an elliptical decision boundary is obtained.
  • Figure 2: A dataset that is not linearly separable and consists of 6 clusters. A single-layer, 6-neuron QNN is able to successfully learn this dataset. Different classes and the associated neuronal decision boundaries are shown in the same color.
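The two kinds of XOR solution described in Figure 1 can be checked numerically. The sketch below uses hand-picked parameters, not the values learned in the paper, and assumes the pre-activation $\mathbf{x}^T\mathbf{Q}\mathbf{x} + \mathbf{w}^T\mathbf{x} + b$ with class 1 predicted when it is positive. An indefinite $\mathbf{Q}$ gives a hyperbolic boundary and a negative-definite $\mathbf{Q}$ gives an elliptical one.

```python
import numpy as np

# XOR dataset: label is 1 when exactly one input is 1.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 1, 1, 0])

def pre_activation(X, Q, w, b):
    """x^T Q x + w^T x + b for each row of X (assumed neuron form)."""
    return np.einsum("mi,ij,mj->m", X, Q, X) + X @ w + b

def predict(X, Q, w, b):
    """Class 1 where the pre-activation is positive (sigmoid > 0.5)."""
    return (pre_activation(X, Q, w, b) > 0).astype(int)

# Hyperbolic boundary: -4*x1*x2 + 2*x1 + 2*x2 - 0.5 = 0.
# Q has eigenvalues -2 and +2 (indefinite).
Q_hyp = np.array([[0.0, -2.0], [-2.0, 0.0]])
w_hyp = np.array([2.0, 2.0])
b_hyp = -0.5

# Elliptical boundary: 1 - (4*u**2 + 0.25*v**2) = 0,
# with u = x1 + x2 - 1 and v = x1 - x2, expanded in x1, x2.
# Q has eigenvalues -8 and -0.5 (negative definite).
Q_ell = np.array([[-4.25, -3.75], [-3.75, -4.25]])
w_ell = np.array([8.0, 8.0])
b_ell = -3.0

pred_hyp = predict(X, Q_hyp, w_hyp, b_hyp)
pred_ell = predict(X, Q_ell, w_ell, b_ell)
print(pred_hyp, pred_ell)
```

Both parameter sets classify all four XOR points correctly with a single neuron, something no single linear neuron can do.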