QuasiNet: a neural network with trainable product layers

Kristína Malinovská, Slavomír Holenda, Ľudovít Malinovský

TL;DR

QuasiNet introduces trainable product layers via quasi-exponentiation to address hard logical and pattern-recognition tasks (e.g., XOR, parity, two spirals) on which networks with few hidden neurons struggle. It combines a tanh-activated hidden layer with a product-based output layer and learns the multiplicative interactions through gradient descent, so it avoids complex-number computations and remains compatible with standard backpropagation. Empirical results show better convergence and efficiency than a baseline MLP on XOR and parity, and strong performance on the two-spirals task with relatively few parameters. The work suggests broad applicability in deep learning and cognitive robotics, offering a principled way to incorporate learnable multiplicative interactions with potential improvements in explainability and efficiency.
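To make the architecture described above more concrete, here is a minimal NumPy sketch of a forward pass through a tanh hidden layer followed by a product output layer. The summary does not state the exact quasi-exponentiation formula, so the factor used here, `1 - sigmoid(v) * (1 - h)`, is an assumption chosen only because it stays real-valued and behaves like `h**0 = 1` and `h**1 = h` at the two extremes of the gate. The names `forward`, `W_hid`, `b_hid`, and `V` are hypothetical; the paper's own equations should be consulted for the actual definition.

```python
import numpy as np


def sigmoid(x):
    """Logistic function, used here to squash product weights into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))


def forward(x, W_hid, b_hid, V):
    """Tanh hidden layer followed by a product output layer (illustrative only).

    ASSUMPTION: each factor is 1 - sigmoid(v) * (1 - h). With the gate at 0 the
    factor is 1 (the hidden unit is ignored); with the gate at 1 the factor is h.
    This interpolation is a stand-in for quasi-exponentiation, not a confirmed
    reproduction of the paper's formula.
    """
    h = np.tanh(x @ W_hid + b_hid)                    # (batch, n_hid)
    gate = sigmoid(V)                                 # (n_hid, n_out)
    # Broadcast to (batch, n_hid, n_out): one factor per hidden/output pair.
    factors = 1.0 - gate[None, :, :] * (1.0 - h[:, :, None])
    return factors.prod(axis=1)                       # (batch, n_out)


# Usage: a batch of 4 two-dimensional inputs, 3 hidden units, 1 output.
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 2))
W_hid = rng.normal(size=(2, 3))
b_hid = np.zeros(3)
V = rng.normal(size=(3, 1))
y = forward(x, W_hid, b_hid, V)
```

Because every operation is differentiable and real-valued, a layer of this kind can in principle be trained with ordinary gradient descent, which is the property the summary emphasizes.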

Abstract

Classical neural networks achieve only limited convergence on hard problems such as XOR or parity when the number of hidden neurons is small. To improve the success rate of neural networks on these problems, we propose a new neural network model inspired by existing models with so-called product neurons, together with a learning rule derived from classical error backpropagation, which elegantly solves the problem of mutually exclusive situations. Unlike existing product neurons, whose weights are preset and not adaptable, our product layers are trainable. We tested the model and compared its success rate to that of a classical multilayer perceptron (MLP) on the aforementioned problems as well as on other hard problems such as the two spirals. Our results indicate that our model is clearly more successful than the classical MLP and has the potential to be used in many tasks and applications.
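For readers unfamiliar with the benchmark, the following sketch generates the full truth table of the n-bit parity task, of which XOR is the n = 2 case. The 0/1 input coding and the convention that the target is 1 for an odd number of ones are assumptions made for illustration; the paper may use a different encoding (for example, a bipolar one). The function name `parity_dataset` is hypothetical.

```python
import itertools

import numpy as np


def parity_dataset(n):
    """Return all 2**n binary inputs and their parity targets.

    ASSUMPTION: inputs are coded as 0/1 and the target is 1 when the number
    of ones in the input is odd, 0 otherwise.
    """
    X = np.array(list(itertools.product([0, 1], repeat=n)), dtype=float)
    y = (X.sum(axis=1) % 2).reshape(-1, 1)
    return X, y


# XOR is 2-bit parity: four patterns, two of each class.
X_xor, y_xor = parity_dataset(2)

# Parity 7 has 2**7 = 128 patterns, split evenly between the two classes.
X_p7, y_p7 = parity_dataset(7)
```

Flipping any single input bit flips the target, which is why parity is hard for networks with few hidden units: no single input is informative on its own.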

Paper Structure

This paper contains 9 sections, 9 equations, 7 figures, and 1 table.

Figures (7)

  • Figure 1: The schematic depiction of our model adapted from ghosh1992.
  • Figure 2: Results from XOR experiments with varying hidden layer size (max. 500 epochs) in terms of convergence (left) and average training epochs to convergence (right), including non-converging runs in the mean.
  • Figure 3: Results from parity 7 experiments: convergence (top), training epochs (bottom) for MLP baseline (left) and QuasiNet (blue).
  • Figure 4: Results from parity 8 experiments with varying hidden layer size (100 nets, max. 5000 epochs) in terms of convergence (left) and average training epochs to convergence (right), including non-converging runs in the mean.
  • Figure 5: Results from $n$-parity experiments: minimum size of the hidden layer $h$ for maximum number of converging networks.
  • ...and 2 more figures