Task-Specific Activation Functions for Neuroevolution using Grammatical Evolution
Benjamin David Winter, William John Teahan
TL;DR
This work addresses the challenge of selecting activation functions that suit a specific task by automatically evolving them with Grammatical Evolution (GE). Using a grammar written in Backus–Naur form (BNF), the proposed method, Neuvo GEAF, maps genotypes to novel activation functions and integrates them into neural networks without increasing parameter counts. Across four binary classification datasets, task-specific evolved activations consistently improve F1-scores by 2.4%–9.4% over ReLU, demonstrating potential for efficient, high-performance networks suited to edge devices. The approach highlights the value of neuroevolution in designing activation functions and sets the stage for combining activation-function evolution with broader architecture search and multi-objective optimization.
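The genotype-to-function mapping mentioned above can be illustrated with a minimal sketch of the standard GE mapping (each integer codon, taken modulo the number of available productions, picks the rule for the leftmost non-terminal). The grammar, the function set, and the names `GRAMMAR`, `map_genotype`, and `compile_activation` are assumptions made for illustration; they are not the grammar or code used by Neuvo GEAF.

```python
import math

# Hypothetical BNF-style grammar (illustrative only, not the paper's grammar).
# Each non-terminal maps to a list of productions; a production is a token list.
GRAMMAR = {
    "<expr>": [
        ["(", "<expr>", "<op>", "<expr>", ")"],
        ["<func>", "(", "<expr>", ")"],
        ["x"],
        ["<const>"],
    ],
    "<op>": [["+"], ["-"], ["*"]],
    "<func>": [["tanh"], ["sigmoid"], ["relu"]],
    "<const>": [["0.5"], ["1.0"], ["2.0"]],
}


def _sigmoid(z):
    # Numerically stable logistic function for scalar inputs.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


# Primitives that an evolved expression is allowed to call.
SAFE_FUNCS = {
    "tanh": math.tanh,
    "sigmoid": _sigmoid,
    "relu": lambda z: max(0.0, z),
}


def map_genotype(genotype, start="<expr>", max_wraps=2):
    """Map a list of integer codons to an expression string.

    Returns None when the genotype (including wrap-arounds) is exhausted
    before every non-terminal has been expanded, i.e. an invalid individual.
    """
    symbols = [start]
    output = []
    used = 0
    limit = len(genotype) * (max_wraps + 1)
    while symbols:
        sym = symbols.pop(0)          # always expand the leftmost symbol
        if sym not in GRAMMAR:        # terminal: emit it unchanged
            output.append(sym)
            continue
        if used >= limit:             # ran out of codons
            return None
        rules = GRAMMAR[sym]
        codon = genotype[used % len(genotype)]  # wrap around the genotype
        used += 1
        symbols = list(rules[codon % len(rules)]) + symbols
    return " ".join(output)


def compile_activation(expression):
    """Turn an expression string produced by the grammar into a callable."""
    code = compile(expression, "<evolved>", "eval")
    return lambda x: eval(code, {"__builtins__": {}}, {**SAFE_FUNCS, "x": x})


phenotype = map_genotype([0, 2, 2, 1, 0, 2])
activation = compile_activation(phenotype)
print(phenotype, activation(1.0))
```

Because the evolved expression only replaces the element-wise non-linearity, swapping it into a network in place of ReLU leaves the number of weights and biases unchanged. A real system would evaluate the expression on tensors and would need a differentiable implementation for training; this scalar version only shows the mapping step.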
Abstract
Activation functions play a critical role in the performance and behaviour of neural networks, significantly impacting their ability to learn and generalise. Traditional activation functions, such as ReLU, sigmoid, and tanh, have been widely used with considerable success. However, these functions may not provide optimal performance for every task and dataset. In this paper, we introduce Neuvo GEAF, an approach that uses grammatical evolution (GE) to automatically evolve novel activation functions tailored to specific neural network architectures and datasets. Experiments conducted on well-known binary classification datasets show statistically significant improvements in F1-score (between 2.4% and 9.4%) over ReLU using identical network architectures. Notably, these performance gains were achieved without increasing the network's parameter count, supporting the trend toward more efficient neural networks that can operate effectively on resource-constrained edge devices. These findings suggest that evolved activation functions can provide significant performance improvements for compact networks while maintaining energy efficiency during both training and inference.
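For reference, the F1-score used as the comparison metric is the harmonic mean of precision $P$ and recall $R$; the standard definition for binary classification, written in terms of true positives (TP), false positives (FP), and false negatives (FN), is given below. The abstract does not state whether the reported 2.4%–9.4% gains are relative changes or absolute percentage points, so the formula makes no assumption about that.

```latex
\[
P = \frac{TP}{TP + FP}, \qquad
R = \frac{TP}{TP + FN}, \qquad
F_1 = \frac{2PR}{P + R} = \frac{2\,TP}{2\,TP + FP + FN}
\]
```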
