Comparative Study of CNN and RNN for Natural Language Processing
Wenpeng Yin, Katharina Kann, Mo Yu, Hinrich Schütze
TL;DR
The paper addresses how to choose between CNNs and RNNs for NLP through a broad, controlled comparison across tasks ranging from sentiment analysis to semantic matching. It systematically evaluates CNN, GRU, and LSTM architectures, each trained from scratch with per-task hyperparameter tuning, to show when each architecture excels. The findings indicate that RNNs are generally better at capturing global sentence structure and long-range dependencies, while CNNs excel on tasks driven by local key phrases; which architecture is preferable therefore depends on the task, and the two provide complementary information. The work offers practical guidance for architecture selection and highlights how sensitive performance is to hidden size and batch size.
Abstract
Deep neural networks (DNNs) have revolutionized the field of natural language processing (NLP). Convolutional neural networks (CNNs) and recurrent neural networks (RNNs), the two main types of DNN architectures, are widely explored for handling various NLP tasks. CNNs are supposed to be good at extracting position-invariant features and RNNs at modeling units in sequence. The state of the art on many NLP tasks often switches back and forth as a result of the battle between CNNs and RNNs. This work is the first systematic comparison of CNNs and RNNs on a wide range of representative NLP tasks, aiming to give basic guidance for DNN selection.
