Comparative Study of CNN and RNN for Natural Language Processing

Wenpeng Yin, Katharina Kann, Mo Yu, Hinrich Schütze

TL;DR

The paper addresses how to choose between CNNs and RNNs for NLP through a broad, controlled comparison across tasks ranging from sentiment analysis to semantic matching. It evaluates CNN, GRU, and LSTM architectures, each trained from scratch with hyperparameters tuned per task, to reveal when each architecture excels. The findings show that RNNs generally capture global sentence structure and long-range dependencies, while CNNs excel on tasks driven by local key phrases; which architecture performs better depends on the task, and the two provide complementary information. The work offers practical guidance for architecture selection and highlights how sensitive performance is to hidden size and batch size.

Abstract

Deep neural networks (DNN) have revolutionized the field of natural language processing (NLP). Convolutional neural network (CNN) and recurrent neural network (RNN), the two main types of DNN architectures, are widely explored to handle various NLP tasks. CNN is supposed to be good at extracting position-invariant features and RNN at modeling units in sequence. The state of the art on many NLP tasks often switches due to the battle between CNNs and RNNs. This work is the first systematic comparison of CNN and RNN on a wide range of representative NLP tasks, aiming to give basic guidance for DNN selection.
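
The contrast between position-invariant features and sequence modeling can be illustrated with a small NumPy sketch. It is not the paper's implementation: the random weights, the two-token "key phrase", the zero-vector padding, and the helper names `cnn_encode` and `rnn_encode` are all assumptions made for illustration, and a plain tanh recurrence stands in for the GRU and LSTM models the paper evaluates.

```python
import numpy as np

rng = np.random.default_rng(0)
dim, hidden = 4, 3

# Hypothetical two-token key phrase and a zero vector used as padding.
phrase = rng.normal(size=(2, dim))
pad = np.zeros((1, dim))

# The same phrase at two different positions, padded to the same length.
early = np.vstack([pad, phrase, pad, pad, pad])
late = np.vstack([pad, pad, pad, phrase, pad])

# Width-2 convolution filters (no bias), followed by max-over-time pooling.
filters = rng.normal(size=(hidden, 2 * dim))

def cnn_encode(sentence):
    # Slide a window of two tokens over the sentence and flatten each window.
    windows = np.stack([sentence[i:i + 2].ravel() for i in range(len(sentence) - 1)])
    # Max over positions keeps the strongest response, wherever it occurred.
    return np.tanh(windows @ filters.T).max(axis=0)

# Plain tanh recurrence (no bias); the final hidden state is the sentence vector.
w_in = rng.normal(size=(hidden, dim))
w_rec = rng.normal(size=(hidden, hidden))

def rnn_encode(sentence):
    state = np.zeros(hidden)
    for token in sentence:
        state = np.tanh(w_in @ token + w_rec @ state)
    return state

cnn_same = np.allclose(cnn_encode(early), cnn_encode(late))
rnn_same = np.allclose(rnn_encode(early), rnn_encode(late))
print("CNN output unchanged by shift:", cnn_same)
print("RNN output unchanged by shift:", rnn_same)
```

In this toy setup both sentences produce the same set of convolution windows, so max pooling returns the same vector, whereas the recurrent state is updated a different number of times after the phrase and therefore depends on where the phrase sits.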

Paper Structure

This paper contains 16 sections, 3 equations, 3 figures, 2 tables.

Figures (3)

  • Figure 1: Three typical DNN architectures.
  • Figure 2: Distributions of sentence lengths (left) and accuracies of different length ranges (right).
  • Figure 3: Accuracy for sentiment classification (top) and MRR for WikiQA (bottom) as a function of three hyperparameters: learning rate (left), hidden size (center) and batch size (right).
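
For readers unfamiliar with the MRR metric named in the Figure 3 caption, the sketch below computes mean reciprocal rank under its standard definition: the average, over questions, of one divided by the rank of the first correct candidate. The function name `mean_reciprocal_rank` and the toy relevance lists are hypothetical and not taken from the paper; scoring a question with no correct candidate as zero is also an assumption of this sketch.

```python
def mean_reciprocal_rank(ranked_relevance):
    """Average of 1/rank of the first correct candidate per question.

    ranked_relevance holds one list per question; each list gives the
    candidates in ranked order, with True marking a correct answer.
    """
    total = 0.0
    for labels in ranked_relevance:
        for rank, is_correct in enumerate(labels, start=1):
            if is_correct:
                total += 1.0 / rank
                break  # only the first correct candidate counts
    return total / len(ranked_relevance)

# Toy rankings: first correct answers at ranks 2, 1, and 4.
toy = [
    [False, True, False],
    [True, False],
    [False, False, False, True],
]
score = mean_reciprocal_rank(toy)
print(round(score, 4))
```

Here the reciprocal ranks are 1/2, 1, and 1/4, so the score is their mean.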