Baitradar: A Multi-Model Clickbait Detection Algorithm Using Deep Learning

Bhanuka Gamage, Adnan Labib, Aisha Joomun, Chern Hong Lim, KokSheik Wong

TL;DR

The paper addresses YouTube clickbait by introducing BaitRadar, a multi-model deep learning ensemble that combines six attribute-specific models (title, thumbnail, comments, tags, statistics, and audio transcript) by averaging their outputs, which keeps the classifier robust when some data is missing. It collects a large multi-modal dataset (~14k videos) and shows that the individual models vary in accuracy, with the Tags and Audio Transcript models performing strongest, while the six-model fusion achieves ~98% accuracy with latency under 2 seconds. The work reports significant gains over prior single-signal approaches, shows good generalization to new videos, and discusses limitations in borderline cases and potential biases. In practice, BaitRadar offers a scalable, fast, and robust way to flag clickbait on YouTube, with implications for safer user experiences and improved content recommendations.

Abstract

Following the rising popularity of YouTube, a problem called clickbait has emerged on the platform: videos that provoke users to click by using attractive titles and thumbnails. As a result, users end up watching a video whose content does not match what the title advertised. This study addresses the issue by proposing an algorithm called BaitRadar, which uses a deep learning technique in which six inference models are jointly consulted to make the final classification decision. These models focus on different attributes of the video: title, comments, thumbnail, tags, video statistics, and audio transcript. The final classification is obtained by averaging the outputs of the models, which provides a robust and accurate result even in situations where some data is missing. The proposed method is tested on 1,400 YouTube videos. On average, a test accuracy of 98% is achieved with an inference time of less than 2 s.
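
The averaging step described in the abstract can be illustrated with a minimal sketch. The abstract states only that the outputs of multiple models are averaged; the rest is assumed here for illustration: each model emits a clickbait probability in [0, 1], a model with missing input is represented by `None` and skipped, and the decision threshold is 0.5. The function name `fuse_scores` and the dictionary keys are hypothetical and not taken from the paper.

```python
from typing import Mapping, Optional, Tuple


def fuse_scores(
    scores: Mapping[str, Optional[float]],
    threshold: float = 0.5,  # assumed decision threshold, not from the paper
) -> Tuple[float, bool]:
    """Average the scores of the models that produced an output.

    Models whose input attribute was missing are passed as None and are
    left out of the average (an assumption about missing-data handling).
    """
    available = [p for p in scores.values() if p is not None]
    if not available:
        raise ValueError("no model produced a score")
    mean = sum(available) / len(available)
    return mean, mean >= threshold


# Hypothetical scores for one video; comments and transcript are unavailable.
example = {
    "title": 1.0,
    "thumbnail": 0.75,
    "comments": None,
    "tags": 0.5,
    "statistics": 0.75,
    "audio_transcript": None,
}
mean_score, is_clickbait = fuse_scores(example)
```

Because the average is taken only over the models that returned a score, a video with no comments or no transcript can still be classified from its remaining attributes, which is one plausible reading of the robustness claim.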

Paper Structure

This paper contains 9 sections, 1 equation, 2 figures, 1 table.

Figures (2)

  • Figure 1: Combined Model Architecture - BaitRadar
  • Figure 2: Accuracies of different model combinations