AiGen-FoodReview: A Multimodal Dataset of Machine-Generated Restaurant Reviews and Images on Social Media

Alessandro Gambetti, Qiwei Han

TL;DR

AiGen-FoodReview addresses the rising threat of machine-generated restaurant reviews and accompanying images in online ecosystems. The authors assemble a dataset of 20,144 review-image pairs, divided into authentic pairs and pairs generated with GPT-4-Turbo and DALL-E-2, and evaluate unimodal and multimodal detectors. FLAVA achieves 99.80% multimodal detection accuracy, while handcrafted readability and photographic features offer scalable, interpretable alternatives with comparable performance. By openly releasing the data and detectors, the work provides a benchmark for developing defenses against synthetic content in social media and online markets.

Abstract

Online reviews in the form of user-generated content (UGC) significantly impact consumer decision-making. However, the pervasive issue of not only human fake content but also machine-generated content challenges UGC's reliability. Recent advances in Large Language Models (LLMs) may pave the way to fabricate indistinguishable fake generated content at a much lower cost. Leveraging OpenAI's GPT-4-Turbo and DALL-E-2 models, we craft AiGen-FoodReview, a multi-modal dataset of 20,144 restaurant review-image pairs divided into authentic and machine-generated. We explore unimodal and multimodal detection models, achieving 99.80% multimodal accuracy with FLAVA. We use attributes from readability and photographic theories to score reviews and images, respectively, demonstrating their utility as hand-crafted features in scalable and interpretable detection models, with comparable performance. The paper contributes by open-sourcing the dataset and releasing fake review detectors, recommending its use in unimodal and multimodal fake review detection tasks, and evaluating linguistic and visual features in synthetic versus authentic data.

Paper Structure (23 sections, 1 equation, 5 figures, 5 tables)

Figures (5)

  • Figure 1: Example of how a scraped review (with images attached by the same user) is displayed on Yelp. The red label indicates the user's elite status. Other variables include user location, user number of friends, number of previous reviews posted, and number of images posted. Name and image were anonymized and blurred.
  • Figure 2: Diagram of the data generation methodology. Elite reviews are used as a source of information in the prompt to query the GPT-4-Turbo model to generate a fake review. Next, the generated fake review is used as information to produce a related synthetic image. Finally, review-image pairs, both authentic and generated, are aggregated into the final dataset to form the negative class (Class 0) and positive class (Class 1), respectively.
  • Figure 3: Generated images from DALL-E-2 (left column) and DALL-E-3 (right column) for the same prompt.
  • Figure 4: Representation of unimodal and multimodal feature combinations for detection models.
  • Figure 5: SHAP evaluation: top 5 influential features to predict generated reviews in descending order. A concentration of red dots in the $x>0$ region, with the blue ones in the $x<0$ region, implies a positive correlation with the target variable, and vice versa. For example, ARI is positively correlated, while FR is negatively correlated.
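
The abstract describes scoring reviews with readability attributes, and Figure 5 names ARI among the most influential features. As a minimal sketch of how one such hand-crafted feature could be computed, the snippet below assumes that ARI denotes the Automated Readability Index and uses its standard formula, 4.71 × (characters / words) + 0.5 × (words / sentences) − 21.43. The function name and the regex-based tokenization are illustrative assumptions, not the authors' implementation.

```python
import re


def automated_readability_index(text: str) -> float:
    """Approximate ARI for a review (hypothetical helper, not from the paper).

    ARI = 4.71 * (characters / words) + 0.5 * (words / sentences) - 21.43
    """
    # Assumption: a "word" is a run of letters, digits, or apostrophes.
    words = re.findall(r"[A-Za-z0-9']+", text)
    if not words:
        return 0.0  # convention chosen here for empty input

    # Assumption: each run of . ! ? closes one sentence. This is a
    # simplification (a decimal such as "4.5" counts as a sentence break).
    sentences = max(1, len(re.findall(r"[.!?]+", text)))

    # Count only letters and digits as characters.
    chars = sum(len(w.replace("'", "")) for w in words)

    return 4.71 * chars / len(words) + 0.5 * len(words) / sentences - 21.43


if __name__ == "__main__":
    simple = "The food was great. We loved it!"
    ornate = "The exquisitely caramelized scallops exemplified extraordinary craftsmanship."
    print(automated_readability_index(simple))
    print(automated_readability_index(ornate))
```

A score like this would form one column of a feature table fed to an interpretable classifier. Longer words and longer sentences raise the score, so the ornate example scores higher than the simple one.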