A NLP Approach to "Review Bombing" in Metacritic PC Videogames User Ratings

Javier Coronado-Blázquez

TL;DR

The paper tackles review bombing of PC games on Metacritic. It builds an English-language review dataset from Kaggle data and applies NLP classification to separate review bombing from negative but genuine reviews. A Multinomial Naive Bayes model with TF-IDF features reaches 0.88 accuracy on a validation set, and the analysis identifies five concept groups driving bombing: company-brand factors, originality expectations, economic considerations, sentiment, and frustration. The work offers actionable insights for platform moderation and product design, and its methodology is replicable and could generalize to other scoring platforms.

Abstract

Many videogames suffer "review bombing" (a large volume of unusually low scores that in many cases do not reflect the real quality of the product) when rated by users. Taking Metacritic's 50,000+ user score aggregations for PC games in the English language, we use a Natural Language Processing (NLP) approach to understand the main words and concepts appearing in such cases, reaching 0.88 accuracy on a validation set when distinguishing between merely bad ratings and review bombing. Because they uncover the patterns driving this phenomenon, these results could be used to further mitigate these situations.
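As a rough illustration of the kind of classifier described above, the following minimal sketch trains a Multinomial Naive Bayes model on TF-IDF features with scikit-learn. The toy reviews, the label names `RB` and `Non-RB`, and the choice of English stop-word filtering are assumptions made for this example; they are not taken from the paper's dataset or its preprocessing.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Invented toy reviews for illustration only (not from the Metacritic data).
train_texts = [
    "greedy publisher, overpriced dlc, refund now",
    "another lazy copy of last year's game, pure cash grab",
    "boycott this company, they lied to players",
    "controls feel clunky and the levels are boring",
    "too many bugs and a weak story, not fun",
    "graphics are dated and the ending is rushed",
]
# Hypothetical labels: RB = review bombing, Non-RB = negative but genuine.
train_labels = ["RB", "RB", "RB", "Non-RB", "Non-RB", "Non-RB"]

# TF-IDF turns each review into a weighted bag-of-words vector;
# MultinomialNB then fits per-class term likelihoods on those vectors.
model = make_pipeline(TfidfVectorizer(stop_words="english"), MultinomialNB())
model.fit(train_texts, train_labels)

new_texts = [
    "overpriced cash grab from a greedy company",
    "boring levels and clunky controls",
]
predictions = model.predict(new_texts)
for text, label in zip(new_texts, predictions):
    print(f"{label}: {text}")
```

On a real dataset, accuracy would be measured on a held-out validation split rather than on a handful of hand-written sentences.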

Paper Structure (5 sections, 4 figures, 1 table)

This paper contains 5 sections, 4 figures, 1 table.

Figures (4)

  • Figure 1: Metascore–user rating scores for PC games (1995–2023). Shaded areas show user-preferred games (average user score > Metascore), critic-preferred titles (average user score < Metascore), and games that may have suffered review bombing (Metascore minus average user score > 4.0). The colour code traces the Metascore.
  • Figure 2: User ratings for Metacritic PC game reviews in the sample of potential review bombing titles (left panel) and the sample of non-potential review bombing titles (right panel). See text for details.
  • Figure 3: Normalized confusion matrix of the best NLP model, Multinomial Naive Bayes (MNB), for Review Bombing (RB) and Non-Review Bombing (Non-RB).
  • Figure 4: Word cloud of the most relevant concepts in review bombing user ratings. See text for more details.
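
The three regions named in the Figure 1 caption can be sketched as a simple rule on a title's Metascore and average user score. This is a minimal illustration, not the paper's code: the function name `classify_title` is hypothetical, and it assumes both scores are on the same 0 to 10 scale (for example, a 0 to 100 Metascore divided by 10), which the 4.0 threshold in the caption suggests but does not state.

```python
def classify_title(metascore_10: float, avg_user_score: float,
                   gap_threshold: float = 4.0) -> str:
    """Assign a title to one of the regions named in the Figure 1 caption.

    Assumes both scores share a 0-10 scale (an assumption of this sketch).
    """
    gap = metascore_10 - avg_user_score
    if gap > gap_threshold:
        # Critics rate the game far above users: potential review bombing.
        return "potential review bombing"
    if avg_user_score > metascore_10:
        return "user-preferred"
    if avg_user_score < metascore_10:
        return "critic-preferred"
    return "tie"


# Hypothetical scores, for illustration only.
print(classify_title(8.5, 3.0))
print(classify_title(7.0, 8.0))
print(classify_title(8.0, 6.5))
```

Checking the gap first means a title above the threshold is reported as a potential review bombing case rather than as merely critic-preferred, even though it satisfies both conditions.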