Activation Functions: Comparison of Trends in Practice and Research for Deep Learning
Chigozie Nwankpa, Winifred Ijomah, Anthony Gachagan, Stephen Marshall
TL;DR
This survey catalogs the activation functions (AFs) used in deep learning, contrasts their theoretical advantages with practical deployment trends, and analyzes how state-of-the-art AFs from research have diffused into real-world architectures. It highlights a persistent reliance on ReLU and Softmax in practice despite evidence that newer AFs can offer training and generalization benefits. The work provides a consolidated reference of AF families, variants, and their reported benefits, and identifies gaps between research results and application choices. Implications include guiding practitioners in selecting activation functions and motivating future comparative studies on flagship architectures and datasets.
Abstract
Deep neural networks have been successfully used in diverse emerging domains to solve complex real-world problems, with many more deep learning (DL) architectures being developed to date. To achieve these state-of-the-art performances, DL architectures use activation functions (AFs) to perform diverse computations between the hidden layers and the output layers of any given DL architecture. This paper presents a survey of the existing AFs used in deep learning applications and highlights the recent trends in their use. The novelty of this paper is that it compiles the majority of the AFs used in DL and outlines the current trends in the application and usage of these functions in practical deep learning deployments against the state-of-the-art research results. This compilation will aid in making effective decisions in the choice of the most suitable and appropriate activation function for any given application, ready for deployment. This paper is timely because most research papers on AFs highlight similar works and results, while this paper will be the first to compile the trends in AF applications in practice against the research results from the deep learning literature to date.
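As a brief illustration of what an AF computes, the following is a minimal sketch of the two functions that the summary above identifies as dominant in practice: ReLU, typically applied in hidden layers, and Softmax, typically applied at the output layer. The function names, the plain-list representation, and the sample values are assumptions chosen for illustration; they are not taken from the paper.

```python
import math


def relu(x):
    """Rectified linear unit: keep positive values, replace the rest with 0."""
    return [max(0.0, v) for v in x]


def softmax(x):
    """Map raw scores to non-negative values that sum to 1."""
    m = max(x)  # subtract the maximum before exponentiating, for numerical stability
    exps = [math.exp(v - m) for v in x]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical pre-activation values for a single hidden layer.
hidden = relu([-1.5, 0.0, 2.0])
# Treat the hidden outputs as output-layer scores purely for illustration.
probs = softmax(hidden)
```

Here ReLU zeroes the negative input, and Softmax assigns the largest probability to the largest score while keeping the outputs normalized.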
