Yelp Dataset Challenge: Review Rating Prediction

Nabiha Asghar

TL;DR

This work predicts Yelp restaurant review ratings from free-form text by casting the task as a multiclass classification problem. It systematically compares 16 models formed by combining four feature extraction schemes (unigrams, bigrams, trigrams, and LSI) with four classifiers (logistic regression, Naive Bayes, perceptrons, and linear SVC). The strongest results come from the top 10,000 unigram+bigram TF-IDF features with logistic regression, which reach about 64% validation accuracy; test performance is somewhat lower, possibly because of overfitting. The study finds n-gram-based representations more effective than LSI for this task and outlines practical avenues for future feature engineering and modeling enhancements across domains.

Abstract

Review websites, such as TripAdvisor and Yelp, allow users to post online reviews for various businesses, products and services, and have recently been shown to have a significant influence on consumer shopping behaviour. An online review typically consists of free-form text and a star rating out of 5. The problem of predicting a user's star rating for a product, given the user's text review for that product, is called Review Rating Prediction and has lately become a popular, albeit hard, problem in machine learning. In this paper, we treat Review Rating Prediction as a multi-class classification problem, and build sixteen different prediction models by combining four feature extraction methods, namely (i) unigrams, (ii) bigrams, (iii) trigrams and (iv) Latent Semantic Indexing, with four machine learning algorithms, namely (i) logistic regression, (ii) Naive Bayes classification, (iii) perceptrons, and (iv) linear Support Vector Classification. We analyse the performance of each of these sixteen models to identify the best model for predicting ratings from reviews. We use the dataset provided by Yelp for training and testing the models.
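To make the setup concrete, the following is a minimal sketch of the 4 x 4 model grid and of unigram+bigram TF-IDF feature extraction, written in plain Python. It is an illustration under stated assumptions, not the paper's implementation: the whitespace tokenizer, the plain log(N/df) IDF weighting, the choice of "top k" by corpus frequency, and the names `MODEL_GRID`, `ngrams` and `tfidf_vectors` are all hypothetical choices made here.

```python
import math
from collections import Counter
from itertools import product

# The 4 x 4 grid named in the abstract (labels are illustrative strings only).
FEATURES = ["unigrams", "bigrams", "trigrams", "LSI"]
CLASSIFIERS = ["logistic regression", "Naive Bayes", "perceptron", "linear SVC"]
MODEL_GRID = list(product(FEATURES, CLASSIFIERS))  # 16 (feature, classifier) pairs


def ngrams(text, n_max=2):
    """Return all word n-grams of length 1..n_max.

    Assumption: lowercasing and whitespace splitting stand in for
    whatever tokenization the paper actually uses.
    """
    tokens = text.lower().split()
    grams = []
    for n in range(1, n_max + 1):
        grams += [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    return grams


def tfidf_vectors(docs, n_max=2, top_k=10000):
    """Build TF-IDF vectors over the top_k most frequent n-grams.

    Assumption: "top" means highest corpus frequency, and IDF is the
    unsmoothed log(N / df); the paper may define both differently.
    """
    doc_grams = [Counter(ngrams(d, n_max)) for d in docs]

    # Corpus-wide n-gram counts, used to pick the vocabulary.
    totals = Counter()
    for counts in doc_grams:
        totals.update(counts)
    vocab = [gram for gram, _ in totals.most_common(top_k)]

    # Document frequency and inverse document frequency per n-gram.
    n_docs = len(docs)
    df = {g: sum(1 for counts in doc_grams if g in counts) for g in vocab}
    idf = {g: math.log(n_docs / df[g]) for g in vocab}

    # One dense vector per document: raw term count times IDF.
    vectors = [[counts[g] * idf[g] for g in vocab] for counts in doc_grams]
    return vocab, vectors
```

Each resulting vector could then be passed, together with its star rating as the class label, to any of the four classifiers. Note that with this unsmoothed IDF, an n-gram that appears in every review receives weight zero.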

Paper Structure

This paper contains 23 sections, 1 equation, 5 figures.

Figures (5)

  • Figure 1: A Typical User Review: Free-form Text & a Star Rating
  • Figure 2: Descriptive Stats: Yelp Businesses & Reviews
  • Figure 3: RMSE plots for (a) Unigrams, and (b) Unigrams & Bigrams
  • Figure 4: Accuracy plots for (a) Unigrams, and (b) Unigrams & Bigrams
  • Figure 5: Latent Semantic Indexing (LSI)
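Figures 3 and 4 report RMSE and accuracy. For reference, a minimal sketch of these two metrics for predicted star ratings, using their standard definitions, is given below. The function names are hypothetical, and the paper's own evaluation code is not shown in this section.

```python
import math


def accuracy(y_true, y_pred):
    """Fraction of reviews whose predicted star rating matches the true one."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error between true and predicted star ratings."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(y_true))
```

Accuracy treats every wrong prediction equally, whereas RMSE penalizes a prediction of 1 star for a 5-star review more heavily than a prediction of 4 stars, which is why the two metrics can rank models differently.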