Beyond the Star Rating: A Scalable Framework for Aspect-Based Sentiment Analysis Using LLMs and Text Classification

Vishal Patil, Shree Vaishnavi Bacha, Revanth Yamani, Yidan Sun, Mayank Kejriwal

TL;DR

This study proposes a hybrid approach that uses LLMs for aspect identification while employing classic machine-learning methods for sentiment classification at scale, demonstrating that combining LLMs with traditional machine learning approaches can effectively automate aspect-based sentiment analysis of large-scale customer feedback.

Abstract

Customer-provided reviews have become an important source of information for business owners and other customers alike. However, effectively analyzing millions of unstructured reviews remains challenging. While large language models (LLMs) show promise for natural language understanding, their application to large-scale review analysis has been limited by computational costs and scalability concerns. This study proposes a hybrid approach that uses LLMs for aspect identification while employing classic machine-learning methods for sentiment classification at scale. Using ChatGPT to analyze sampled restaurant reviews, we identified key aspects of dining experiences and developed sentiment classifiers using human-labeled reviews, which we subsequently applied to 4.7 million reviews collected over 17 years from a major online platform. Regression analysis reveals that our machine-labeled aspects significantly explain variance in overall restaurant ratings across different aspects of dining experiences, cuisines, and geographical regions. Our findings demonstrate that combining LLMs with traditional machine learning approaches can effectively automate aspect-based sentiment analysis of large-scale customer feedback, suggesting a practical framework for both researchers and practitioners in the hospitality industry and, potentially, other service sectors.

Paper Structure

This paper contains 14 sections, 3 equations, 5 figures, and 12 tables.

Figures (5)

  • Figure 1: Example of aspect-based sentiment analysis on two restaurant reviews. Identified aspects in the review text, such as service, food, and ambiance, are highlighted in bold. Positive sentiments are highlighted in green, and negative sentiments in red.
  • Figure 2: Average classification accuracy of machine learning algorithms using different text vectorization methods (TF-IDF with and without class imbalance adjustment, and fastText) across all six aspects.
  • Figure 3: Performance comparison of machine learning models across restaurant review aspects and sentiment classifications. This figure presents F1-scores and overall accuracy for Support Vector Machines (SVM), Logistic Regression (L.R.), and different vectorization approaches (TF-IDF with/without class imbalance handling, and fastText) across six restaurant aspects. For each aspect, F1-scores are shown for three sentiment categories (Negative, Neutral/Irrelevant, Positive).
  • Figure 4: Visualization of control variable effects from the main regression models. Panel (a) shows the effects of cuisine from Model 2, and panel (b) shows the effects of state from Model 3. Horizontal bars represent effect sizes, with 95% confidence intervals shown as error bars. Categories are ordered by effect magnitude, with numerical estimates displayed alongside each bar. Gray bars indicate non-significant effects ($p>0.05$). Significance levels: *** $p<0.001$, ** $p<0.01$, * $p<0.05$.
  • Figure 5: Workflow visualization for the classification and regression analysis of Yelp restaurant reviews, detailing the training of one-stage and two-stage classifiers, prediction processes based on review relevance and sentiment, and a final regression analysis to model overall business ratings from aggregated sentiment predictions.
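
The two-stage prediction and aggregation steps named in the Figure 5 caption can be illustrated with a minimal sketch. It assumes that stage one decides whether a review is relevant to an aspect and stage two assigns a sentiment only to relevant reviews; this reading is inferred from the caption and is not confirmed by the text shown here. The keyword lexicons, the function names, and the two-aspect setup are hypothetical stand-ins for the trained classifiers and six aspects described in the paper, and the per-business averaging is one plausible way to produce regression inputs, not necessarily the authors' choice.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical keyword lexicons standing in for trained per-aspect classifiers.
LEXICON = {
    "food": {
        "mention": {"food", "dish", "pizza"},
        "pos": {"delicious", "tasty"},
        "neg": {"cold", "bland"},
    },
    "service": {
        "mention": {"service", "waiter", "staff"},
        "pos": {"friendly", "attentive"},
        "neg": {"rude", "slow"},
    },
}


def tokenize(text):
    """Lowercase whitespace tokens with surrounding punctuation stripped."""
    return [word.strip(".,!?").lower() for word in text.split()]


def is_relevant(text, aspect):
    """Stage 1 (stand-in): does the review mention the aspect at all?"""
    return any(tok in LEXICON[aspect]["mention"] for tok in tokenize(text))


def sentiment(text, aspect):
    """Stage 2 (stand-in): +1 positive, -1 negative, 0 undecided."""
    toks = tokenize(text)
    score = sum(t in LEXICON[aspect]["pos"] for t in toks)
    score -= sum(t in LEXICON[aspect]["neg"] for t in toks)
    return (score > 0) - (score < 0)


def two_stage_predict(text, aspect):
    """Return None for irrelevant reviews, otherwise a label in {-1, 0, 1}."""
    if not is_relevant(text, aspect):
        return None
    return sentiment(text, aspect)


def aggregate(reviews):
    """Average predicted sentiment per (business, aspect), skipping irrelevant reviews."""
    buckets = defaultdict(list)
    for business_id, text in reviews:
        for aspect in LEXICON:
            label = two_stage_predict(text, aspect)
            if label is not None:
                buckets[(business_id, aspect)].append(label)
    return {key: mean(labels) for key, labels in buckets.items()}


# Toy reviews invented for illustration only.
reviews = [
    ("b1", "The pizza was delicious!"),
    ("b1", "Rude waiter and slow service."),
    ("b2", "Friendly staff, but the food was cold and bland."),
]
features = aggregate(reviews)
```

Here `features` maps each (business, aspect) pair to a mean sentiment score, which could serve as a predictor column in a regression on overall business ratings. A one-stage variant would instead train a single three-class classifier per aspect, matching the Negative, Neutral/Irrelevant, and Positive categories listed for Figure 3.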