Training Feedforward Neural Networks with Bayesian Hyper-Heuristics

Arné Schreuder; Anna Bosman; Andries Engelbrecht; Christopher Cleghorn

TL;DR

This research introduces a novel population-based Bayesian hyper-heuristic (BHH) that is used to train feedforward neural networks (FFNNs) and provides an automated method for finding the best heuristic to train the FFNNs at various stages of the training process.

Abstract

Training feedforward neural networks (FFNNs) can benefit from an automated process in which a high-level probabilistic heuristic seeks out the best heuristic to train the network. This research introduces a novel population-based Bayesian hyper-heuristic (BHH) that is used to train FFNNs. The performance of the BHH is compared to that of ten popular low-level heuristics, each with different search behaviours. The chosen heuristic pool consists of classic gradient-based heuristics as well as meta-heuristics (MHs). The empirical process is executed on fourteen datasets consisting of classification and regression problems with varying characteristics. The BHH is shown to be able to train FFNNs well and to provide an automated method for finding the best heuristic to train the FFNNs at various stages of the training process.

Paper Structure

This paper contains 30 sections, 10 equations, 4 figures, 8 tables, and 1 algorithm.

Introduction
Artificial Neural Networks
Heuristics
Hyper-Heuristics
Probability
Bayesian Hyper-Heuristics
Heuristic Pool
Proxies
Entity Pool
Performance Log
Credit Assignment Strategy
Selection Mechanism
Optimisation Step
Maximum A Posteriori Estimation
Hyper-Parameters
...and 15 more sections

Figures (4)

Figure 1: An illustration of the architecture and high level components of the BHH.
Figure 2: Mapping of proxied heuristic state update operations as implemented by the BHH.
Figure 3: Descriptive plots for the average ranks of all low-level heuristics compared to three heuristic pool variants of the BHH baseline configuration, per dataset, across all independent runs and epochs.
Figure 4: Critical difference plots for the average ranks of all low-level heuristics compared to three heuristic pool variants of the baseline BHH, across all datasets, runs and epochs.
