Training neural networks without backpropagation using particles

Deepak Kumar

TL;DR

This work trains neural networks without backpropagation by introducing a neuron-wise particle swarm optimization (PSO) framework. Each neuron is treated as an independent subproblem with its own set of particles, and the best particle's weights from every neuron are assembled to form the full network, enabling batch-wise, derivative-free learning. Experiments on synthetic datasets and the real-world Rice and Dry bean datasets show performance comparable to standard MLPs trained with backpropagation, while offering parallelizable updates and flexible connectivity. The approach provides a scalable alternative to gradient-based optimization and suggests avenues for extension to other architectures. Code is available at the GitHub repository given in the abstract.

Abstract

A neural network is a group of neurons stacked in multiple layers to mimic the biological neurons in the human brain. For several decades, neural networks have been trained with the backpropagation algorithm, which is based on gradient descent, and several variants have been developed to improve it. Backpropagation optimizes the loss function of the network, but several local minima exist in the manifold of the constructed network, so training can yield several different solutions, each matching one of these minima. Gradient descent cannot avoid local minima and, depending on the initialization, gets stuck in one of them. Particle swarm optimization (PSO) was proposed to select the best local minimum in the search space of the loss function. However, the search is limited to the particles instantiated by the PSO algorithm, so it sometimes fails to select the best solution. The proposed approach overcomes both the local-minima problem of gradient descent and this limitation of PSO by training individual neurons separately, so that the neurons collectively solve the problem as a network. Our code and data are available at https://github.com/dipkmr/train-nn-wobp/
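To make the idea of derivative-free, per-neuron training concrete, the following is a minimal sketch of a textbook PSO loop applied to the weights of a single sigmoid neuron. It is an illustration under stated assumptions, not the paper's implementation: the function names, the mean-squared-error objective, and the values of `inertia`, `c1`, and `c2` are all hypothetical, and the craziness term mentioned in the figure captions is omitted. The abstract does not say how each neuron's objective is defined, so the sketch scores the neuron directly against target labels, which only makes sense for an output neuron.

```python
import math
import random


def neuron_output(weights, x):
    """Sigmoid neuron; the last entry of `weights` is the bias."""
    z = sum(w * xi for w, xi in zip(weights, x)) + weights[-1]
    z = max(-60.0, min(60.0, z))  # clamp to keep math.exp from overflowing
    return 1.0 / (1.0 + math.exp(-z))


def batch_loss(weights, batch):
    """Mean squared error of one neuron over a batch of (x, target) pairs."""
    return sum((neuron_output(weights, x) - t) ** 2 for x, t in batch) / len(batch)


def pso_train_neuron(batch, n_inputs, k=5, steps=50,
                     inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Textbook PSO over k candidate weight vectors (assumed hyperparameters)."""
    rng = random.Random(seed)
    dim = n_inputs + 1  # one weight per input, plus a bias
    pos = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(k)]
    vel = [[0.0] * dim for _ in range(k)]
    pbest = [p[:] for p in pos]                      # best position per particle
    pbest_loss = [batch_loss(p, batch) for p in pos]
    g = min(range(k), key=lambda i: pbest_loss[i])
    gbest, gbest_loss = pbest[g][:], pbest_loss[g]   # best position in the swarm
    initial_best = gbest_loss

    for _ in range(steps):
        for i in range(k):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # No gradient is used: only remembered positions and losses.
                vel[i][d] = (inertia * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            loss = batch_loss(pos[i], batch)
            if loss < pbest_loss[i]:
                pbest[i], pbest_loss[i] = pos[i][:], loss
                if loss < gbest_loss:
                    gbest, gbest_loss = pos[i][:], loss
    return gbest, gbest_loss, initial_best


# Toy batch: the target equals the first input, so the classes are linearly separable.
batch = [((0.0, 0.0), 0.0), ((0.0, 1.0), 0.0), ((1.0, 0.0), 1.0), ((1.0, 1.0), 1.0)]
weights, final_loss, initial_loss = pso_train_neuron(batch, n_inputs=2)
```

Because the swarm only compares loss values, the activation and the loss need not be differentiable, and each neuron's swarm could in principle be updated in parallel with the others.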

Paper Structure

This paper contains 16 sections, 5 equations, 8 figures, 2 tables.

Figures (8)

  • Figure 1: A three-layer neural network (MLP) consisting of input, hidden, and output layers. For illustration, the input, hidden, and output layers have 3, 6, and 2 neurons, respectively.
  • Figure 2: A three-layer neural network in which each node holds 'k' particles. The 'k' particles update their weights using the PSO approach. For illustration, the number of particles per node is 5.
  • Figure 3: Flowchart of the proposed method, including the batch-wise style of training the network.
  • Figure 4: Synthetic data with linearly separable classes. The normalized loss after each epoch is shown for a basic MLP architecture, PSO without the craziness term, and the proposed method using Octave and Python.
  • Figure 5: Synthetic data with nonlinearly separable classes. The normalized loss after each epoch is shown for a basic MLP architecture, PSO without the craziness term, and the proposed method using Octave and Python.
  • ...and 3 more figures
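
The layout described in the captions of Figures 1 and 2 (a 3-6-2 network with 5 particles per neuron) can be sketched as a data structure. This is a hedged illustration only: the helper names `init_swarms`, `assemble`, and `forward`, the sigmoid activation, and the uniform initialization range are assumptions, and `best_index` is a placeholder for whatever per-neuron selection rule the paper actually uses.

```python
import math
import random

LAYER_SIZES = [3, 6, 2]  # input, hidden, output neurons, as in Figure 1
K = 5                    # particles per neuron, as in Figure 2


def init_swarms(layer_sizes, k, rng):
    """One swarm per non-input neuron; each particle is a weight vector plus bias."""
    swarms = []
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        layer = [[[rng.uniform(-1, 1) for _ in range(n_in + 1)] for _ in range(k)]
                 for _ in range(n_out)]
        swarms.append(layer)
    return swarms


def assemble(swarms, best_index):
    """Pick one particle per neuron to form an ordinary list of layer weights."""
    return [[neuron[best_index[l][j]] for j, neuron in enumerate(layer)]
            for l, layer in enumerate(swarms)]


def forward(network, x):
    """Forward pass through the assembled network (sigmoid is an assumption)."""
    a = list(x)
    for layer in network:
        # zip stops at len(a), so the trailing bias ws[-1] is added separately.
        a = [1.0 / (1.0 + math.exp(-(sum(w * ai for w, ai in zip(ws, a)) + ws[-1])))
             for ws in layer]
    return a


rng = random.Random(0)
swarms = init_swarms(LAYER_SIZES, K, rng)
# Placeholder choice: use particle 0 of every neuron as its "best" particle.
best_index = [[0] * n for n in LAYER_SIZES[1:]]
network = assemble(swarms, best_index)
out = forward(network, [0.2, 0.5, 0.1])
n_particles = sum(len(neuron) for layer in swarms for neuron in layer)
```

With these sizes the structure holds (6 + 2) × 5 = 40 particles in total, and only one particle per neuron takes part in any given forward pass.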