PropNEAT -- Efficient GPU-Compatible Backpropagation over NeuroEvolutionary Augmenting Topology Networks

Michael Merry, Patricia Riddle, Jim Warren

Abstract

We introduce PropNEAT, a fast backpropagation implementation of NEAT that uses a bidirectional mapping between the genome graph and a layer-based architecture, preserving the NEAT genomes whilst enabling efficient GPU backpropagation. We test PropNEAT on 58 binary classification datasets from the Penn Machine Learning Benchmarks database, comparing its performance against logistic regression, dense neural networks and random forests, as well as a densely retrained variant of the final PropNEAT model. PropNEAT had the second best overall performance, behind Random Forest. The differences between models were not statistically significant, except between Random Forest and logistic regression and between Random Forest and the PropNEAT retrain models. PropNEAT was substantially faster than a naive backpropagation method, and both were substantially faster and performed better than the original NEAT implementation. We demonstrate that the per-epoch training time for PropNEAT scales linearly with network depth, and that backpropagation runs efficiently on GPU implementations. This implementation could be extended to support reinforcement learning or convolutional networks, and is able to find sparser and smaller networks with potential for applications in low-power contexts.

Paper Structure

This paper contains 33 sections, 12 equations, 5 figures, 7 tables.

Figures (5)

  • Figure 1: An example network produced by NEAT - Blue: Inputs; Green: Outputs; White: Hidden nodes; Red: Unreachable nodes. This graph exemplifies the major challenges of the naive approach to backpropagation (see 3.3). First, there are unreachable nodes X and Y, which lie on no directed path from the inputs to the output: X cannot be reached from the inputs, and Y does not connect to the output. Second, there are skip-layer connections, such as the one from 2 to the output. Node 2 reaches the output by two paths, 2-3-O and 2-O, so the direct connection skips a layer. Implemented naively, this requires 13 operations (one for each connection).
  • Figure 2: The resulting solution produced by PropNEAT. The graph traversals identify the unreachable nodes, and these are removed. Nodes at the same depth from the input are grouped into layers, in this case [1,2] and [3,4,5] at depths 1 and 2 respectively. Where there are skip-layer connections (e.g., 2-O), the outputs of the shallower layer are concatenated to the outputs of the subsequent layer and otherwise treated as normal. The subsequent weights layer is then applied across all of these inputs. This provides a consistent layer-based structure that can be mapped to tensor algebra operations. After the graph-traversal operations, and excluding concatenation as trivial, this requires 3 tensor operations (one for each layer connection).
  • Figure 3: Scatter, correlation and histogram plots for PropNEAT model complexity over all iterations for all datasets. "True" models are the highest-performing on validation data and are used for the final analysis. "False" models are other candidates with lower validation performance. Significance is shown with * indicating p<0.05, ** indicating p<0.01, *** indicating p<0.001.
  • Figure 4: Correlation plot of average time per epoch (s), width (n nodes), depth (n layers) and size (n nodes). Colours are datasets. Significance is shown with * indicating p<0.05, ** indicating p<0.01, *** indicating p<0.001.
  • Figure 5: The resulting mapping produced by PropNEAT. The graph traversals identify the unreachable nodes, and these are removed. Nodes at the same depth from the input are grouped into layers, in this case [1,2] and [3,4,5] at depths 1 and 2 respectively. Where there are skip-layer connections (e.g., 2-O), the outputs of the shallower layer are concatenated to the outputs of the subsequent layer and otherwise treated as normal. The subsequent weights layer is then applied across all of these inputs. This provides a consistent layer-based structure that can be mapped to tensor algebra operations. After the graph-traversal operations, and excluding concatenation as trivial, this requires 3 tensor operations (one for each layer connection).
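
The graph-to-layer mapping described in the captions for Figures 1, 2 and 5 can be illustrated with a minimal sketch. This is not the authors' implementation. The function names (`reachable`, `layer_genome`), the edge list `EDGES` and the node labels are hypothetical, and the example graph is a small stand-in rather than the network drawn in Figure 1. The sketch also assumes that the genome is acyclic and that "depth" means the longest path from any input, which the captions do not state explicitly.

```python
from collections import defaultdict


def reachable(starts, adjacency):
    """Return every node reachable from `starts` by following `adjacency`."""
    seen, stack = set(starts), list(starts)
    while stack:
        node = stack.pop()
        for nxt in adjacency[node]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def layer_genome(edges, inputs, outputs):
    """Prune unreachable nodes, then group the rest into layers by depth.

    Assumption: the graph is acyclic and depth is the longest path from
    any input node. Returns (layers, skip_edges).
    """
    fwd, bwd = defaultdict(list), defaultdict(list)
    for src, dst in edges:
        fwd[src].append(dst)
        bwd[dst].append(src)

    # A node is kept only if it is reachable from an input AND reaches an output.
    live = reachable(inputs, fwd) & reachable(outputs, bwd)
    live_edges = [(s, d) for s, d in edges if s in live and d in live]

    # Longest-path depth by repeated relaxation (terminates on acyclic graphs).
    depth = {node: 0 for node in live}
    for _ in range(len(live)):
        changed = False
        for s, d in live_edges:
            if depth[d] < depth[s] + 1:
                depth[d] = depth[s] + 1
                changed = True
        if not changed:
            break

    layers = defaultdict(list)
    for node in sorted(live):
        layers[depth[node]].append(node)

    # A skip edge jumps more than one layer; its source output would be
    # concatenated onto the inputs of the layer it feeds.
    skip_edges = [(s, d) for s, d in live_edges if depth[d] - depth[s] > 1]
    return [layers[k] for k in sorted(layers)], skip_edges


# Hypothetical genome: "X" has no path from the inputs, "Y" has no path to "O".
EDGES = [
    ("a", "1"), ("a", "2"), ("b", "2"),
    ("1", "3"), ("2", "3"), ("2", "4"),
    ("3", "O"), ("4", "O"),
    ("2", "O"),   # skip-layer connection
    ("X", "3"),   # X is unreachable from the inputs
    ("4", "Y"),   # Y never reaches the output
]

layers, skip_edges = layer_genome(EDGES, inputs=["a", "b"], outputs=["O"])
print(layers)      # [['a', 'b'], ['1', '2'], ['3', '4'], ['O']]
print(skip_edges)  # [('2', 'O')]
```

In this toy graph the four layers give three consecutive layer pairs, so a dense implementation would need one weight matrix per pair instead of one operation per connection. A real implementation would additionally build masked weight tensors for each pair and keep the mapping back to genome connection ids so that trained weights can be written back into the genome.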