Fine-tuning BERT with Bidirectional LSTM for Fine-grained Movie Reviews Sentiment Analysis

Gibson Nkhata, Susan Gauch, Usman Anjum, Justin Zhan

TL;DR

The paper presents a BERT-based framework augmented with BiLSTM to perform both binary and fine-grained sentiment analysis on movie reviews, introducing a heuristic to compute an overall polarity from per-review predictions. It evaluates on diverse datasets (IMDb, SST, MR, Amazon) and explores accuracy-enhancement strategies, finding NLPAUG improves five-class SST-5 performance to 60.48% while SMOTE provides little or no gain. The approach achieves competitive or state-of-the-art results, including 97.67% accuracy on IMDb binary and 59.48% on SST-5, surpassing baselines in several settings. The work also contributes a method to derive an aggregate sentiment polarity from the model outputs, showing robustness across dataset variants and classification schemes, with potential for extension to other pretrained models.

Abstract

Sentiment Analysis (SA) is instrumental in understanding people's viewpoints, facilitating social media monitoring, recognizing products and brands, and gauging customer satisfaction. Consequently, SA has evolved into an active research domain within Natural Language Processing (NLP). Many approaches outlined in the literature devise intricate frameworks aimed at achieving high accuracy, focusing exclusively on either binary sentiment classification or fine-grained sentiment classification. In this paper, our objective is to fine-tune the pre-trained BERT model with Bidirectional LSTM (BiLSTM) to enhance both binary and fine-grained SA, specifically for movie reviews. Our approach involves conducting sentiment classification for each review, followed by computing the overall sentiment polarity across all reviews. We present our findings on binary classification as well as fine-grained classification utilizing benchmark datasets. Additionally, we implement and assess two accuracy improvement techniques, Synthetic Minority Oversampling Technique (SMOTE) and NLP Augmenter (NLPAUG), to bolster the model's generalization in fine-grained sentiment classification. Finally, a heuristic algorithm is employed to calculate the overall polarity of predicted reviews from the BERT+BiLSTM output vector. Our approach performs comparably with state-of-the-art (SOTA) techniques in both classifications. For instance, in binary classification, we achieve 97.67% accuracy, surpassing the leading SOTA model, NB-weighted-BON+dv-cosine, by 0.27% on the renowned IMDb dataset. For five-class classification on SST-5, while the top SOTA model, RoBERTa+large+Self-explaining, attains 55.5% accuracy, our model achieves 59.48% accuracy, surpassing the BERT-large baseline by 3.6%.
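
The abstract does not spell out the heuristic used to derive an overall polarity from the per-review predictions. As a rough illustration only, the sketch below assumes a simple plurality vote over predicted class labels; the function name `overall_polarity`, the label encoding, and the tie-breaking rule are all hypothetical and are not taken from the paper.

```python
from collections import Counter
from typing import Sequence


def overall_polarity(predicted_labels: Sequence[int]) -> int:
    """Return the most frequent predicted class label.

    Assumption: plurality vote stands in for the paper's heuristic.
    Ties are broken toward the smallest label id (an arbitrary choice).
    """
    if not predicted_labels:
        raise ValueError("need at least one prediction")
    counts = Counter(predicted_labels)
    best = max(counts.values())
    # Among the labels sharing the highest count, pick the smallest id.
    return min(label for label, n in counts.items() if n == best)


# Hypothetical five-class predictions (0 = very negative ... 4 = very positive).
predictions = [4, 3, 4, 1, 4, 2, 3]
print(overall_polarity(predictions))  # prints 4
```

The same function covers the binary setting if the labels are restricted to 0 and 1. A weighted scheme, for example averaging class probabilities instead of counting hard labels, would be an equally plausible reading of "overall polarity".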

Paper Structure

This paper contains 41 sections, 2 equations, 4 figures, 6 tables, 4 algorithms.

Figures (4)

  • Figure 1: Simplified diagram of BERT
  • Figure 2: Fine-tuning part of BERT with BiLSTM
  • Figure 3: Binary Tree Splitting
  • Figure 4: Overview of our work