Recent Advances in Recurrent Neural Networks
Hojjat Salehinejad, Sharan Sankar, Joseph Barfett, Errol Colak, Shahrokh Valaee
TL;DR
The paper surveys recurrent neural networks (RNNs), detailing their fundamentals, training challenges such as vanishing/exploding gradients, and a comprehensive taxonomy of architectures from simple RNNs to LSTMs, GRUs, and memory-augmented variants. It reviews optimization and regularization techniques, including BPTT, SGD/Adam, Hessian-free methods, and dropout-type schemes, highlighting how these address training stability and generalization. The survey covers a wide range of architectures (BRNN, MDLSTM, Grid LSTM, SCRN, unitary/orthogonal RNNs) and their applications across text, speech, image, and video domains, illustrating both successes and open challenges. It also discusses potential directions such as unitary/orthogonal RNNs, deeper integration with external memory, and domain-specific extensions to 3D data and multimedia signals.
Abstract
Recurrent neural networks (RNNs) are capable of learning features and long-term dependencies from sequential and time-series data. RNNs have a stack of non-linear units where at least one connection between units forms a directed cycle. A well-trained RNN can model any dynamical system; however, training RNNs is mostly plagued by issues in learning long-term dependencies. In this paper, we present a survey on RNNs and several new advances for newcomers and professionals in the field. The fundamentals and recent advances are explained and the research challenges are introduced.
