Impacts of Darwinian Evolution on Pre-trained Deep Neural Networks

Guodong Du, Runhua Jiang, Senqiao Yang, Haoyang Li, Wei Chen, Keren Li, Sim Kuan Goh, Ho-Kin Tang

TL;DR

The paper addresses the inefficiencies and overfitting tendencies of gradient-based training by introducing a Darwinian framework that treats pretrained DNNs as evolving populations. It uses two stages: BP-based pretraining to seed a population of weights, followed by differential evolution to iteratively mutate, recombine, and select fitter networks using the cross-entropy loss as the fitness proxy. Empirical results across MNIST, Fashion-MNIST, CIFAR-10/100, and ImageNet show that this neuro-evolution approach improves accuracy, reduces overfitting without explicit regularization in the DE phase, and enhances robustness to corruptions such as MNIST-C and CIFAR-10-C, with lower time complexity than standard backpropagation. The framework generalizes across architectures and data scales, suggesting practical applicability and potential for adaptive DE refinements in deeper, larger models.

Abstract

Darwinian evolution of the biological brain is documented through multiple lines of evidence, although the modes of evolutionary change remain unclear. Drawing inspiration from evolved neural systems (e.g., the visual cortex), deep learning models have demonstrated superior performance in visual tasks, among others. While the success of training deep neural networks has relied on back-propagation (BP) and its variants to learn representations from data, BP does not incorporate the evolutionary processes that govern biological neural systems. This work proposes a neural network optimization framework based on evolutionary theory. Specifically, BP-trained deep neural networks for visual recognition tasks, taken from the final training epochs, are treated as the primordial ancestors (initial population). The population is then evolved with differential evolution. Extensive experiments are carried out to examine the relationships between Darwinian evolution and neural network optimization, including the correspondence between datasets, environment, models, and living species. The empirical results show that the proposed framework has positive impacts on the network, with reduced over-fitting and an order of magnitude lower time complexity compared to BP. Moreover, the experiments show that the proposed framework performs well on deep neural networks and big datasets.
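
The two-stage procedure summarized above (gradient-based training to seed a population, then differential evolution with cross-entropy loss as the fitness) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the linear softmax classifier and synthetic data standing in for a DNN and an image dataset, the classic DE/rand/1/bin scheme, the values `F=0.5` and `CR=0.9`, the population of six snapshots, and the helper names `cross_entropy`, `gd_snapshots`, and `differential_evolution`.

```python
import numpy as np

def cross_entropy(w, X, y, n_classes):
    """Mean cross-entropy of a linear softmax classifier with flat weights w."""
    W = w.reshape(X.shape[1], n_classes)
    logits = X @ W
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(y)), y].mean()

def gd_snapshots(X, y, n_classes, epochs=30, keep=6, lr=0.5):
    """Stage 1 stand-in (assumption): plain gradient descent.
    Returns the flat weights from the last `keep` epochs as the initial population."""
    W = np.zeros((X.shape[1], n_classes))
    onehot = np.eye(n_classes)[y]
    snapshots = []
    for epoch in range(epochs):
        logits = X @ W
        logits -= logits.max(axis=1, keepdims=True)
        probs = np.exp(logits)
        probs /= probs.sum(axis=1, keepdims=True)
        W -= lr * (X.T @ (probs - onehot)) / len(y)  # softmax regression gradient step
        if epoch >= epochs - keep:
            snapshots.append(W.ravel().copy())
    return np.stack(snapshots)

def differential_evolution(pop, fitness, generations=50, F=0.5, CR=0.9, seed=0):
    """Stage 2: DE/rand/1/bin with greedy selection (lower fitness is better)."""
    rng = np.random.default_rng(seed)
    pop = pop.copy()
    scores = np.array([fitness(ind) for ind in pop])
    n, dim = pop.shape
    for _ in range(generations):
        for i in range(n):
            # Mutation: combine three distinct individuals other than i.
            others = [j for j in range(n) if j != i]
            a, b, c = rng.choice(others, size=3, replace=False)
            mutant = pop[a] + F * (pop[b] - pop[c])
            # Binomial crossover: take at least one component from the mutant.
            cross = rng.random(dim) < CR
            cross[rng.integers(dim)] = True
            trial = np.where(cross, mutant, pop[i])
            # Selection: the trial replaces the parent only if it is no worse.
            trial_score = fitness(trial)
            if trial_score <= scores[i]:
                pop[i], scores[i] = trial, trial_score
    best = int(np.argmin(scores))
    return pop[best], float(scores[best])

# Synthetic 3-class problem (assumption: stands in for an image dataset).
data_rng = np.random.default_rng(0)
X = data_rng.normal(size=(200, 5))
y = (X @ data_rng.normal(size=(5, 3))).argmax(axis=1)

population = gd_snapshots(X, y, n_classes=3)
fitness = lambda w: cross_entropy(w, X, y, 3)
ancestor_loss = min(fitness(w) for w in population)
best_w, evolved_loss = differential_evolution(population, fitness)
print(f"best ancestor loss: {ancestor_loss:.4f}, evolved loss: {evolved_loss:.4f}")
```

Because selection is greedy, the best loss in the population can never get worse from one generation to the next. This sketch evaluates fitness on the training data only, so it says nothing about the generalization or robustness effects reported in the paper.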

Paper Structure

This paper contains 9 sections, 7 equations, 4 figures, 5 tables, and 1 algorithm.

Figures (4)

  • Figure 1: Conceptual illustration of the proposed Darwinian evolution on DNNs with a primordial ancestor. In analogy to Darwinian evolution, the dataset provides the environment in which different types of DNNs strive to survive. Neuro-evolution (natural selection and inheritance) applies to different network architectures, as well as to the trainable weights within the same architecture. Pretrained DNNs are used as the primordial ancestors for evolutionary algorithms (EAs) to evolve and select the 'elite' solution. The complexity of the EA is low compared to backpropagation.
  • Figure 2: The starting point of neuro-evolution: primordial ancestor vs. primordial soup. The left and middle panels show how the losses and accuracies change over epochs for DNNs trained by Adam (blue curves) and by DE (red markers) using the Adam-trained network as the primordial ancestor, with and without regularization. Adam requires regularization to handle overfitting, while DE does not. The right panel evolves a neural network from the primordial soup, i.e., from random initialization, which has difficulty converging.
  • Figure 3: The impact of environmental changes on the neural network during training and testing. The upper left panel shows the relationship between accuracy and the complexity of the dataset (MNIST with data augmentation) and model (LeNet). As complexity increases, accuracy increases for both Adam and DE, and DE achieves higher accuracy than Adam. The upper right panel illustrates the influence of regularization, which improves accuracy. The two bottom panels show the performance of Adam and DE on the corrupted, out-of-distribution MNIST-C and CIFAR-10-C datasets, using LeNet1, LeNet5, and ResNet. DE has a lower mean corruption error than Adam.
  • Figure 4: Generalization of DE on different datasets and deep learning models. Datasets include MNIST, Fashion-MNIST (Xiao et al., 2017), CIFAR-100, and ImageNet, while models include LeNet1, LeNet5, MLP, RNN, and ResNet18. The blue lines indicate Adam as the optimizer, and the red points indicate DE as the optimizer.