BigTokDetect: A Clinically-Informed Vision-Language Modeling Framework for Detecting Pro-Bigorexia Videos on TikTok

Minh Duc Chu, Kshitij Pawar, Zihao He, Roxanna Sharifi, Ross Sonnenblick, Magdalayna Curry, Laura D'Adamo, Lindsay Young, Stuart B Murray, Kristina Lerman

TL;DR

BigTokDetect detects pro-bigorexia content on TikTok with a clinically informed, multimodal framework that fuses visual, audio, and textual signals. It introduces BigTok, the first expert-annotated multimodal dataset of over 2,200 TikTok videos labeled by clinicians across a detailed taxonomy, which supports evaluation of vision-language models under zero-shot, few-shot, and fine-tuning paradigms. The study finds that multimodal fusion improves detection performance by 5–15%, with video cues contributing the most discriminative information, and that domain-specific fine-tuning lets open-source models close the gap with proprietary API models on fine-grained subcategory detection. Severity prediction remains challenging, but expert-guided annotation and modality integration provide a foundation for scalable, clinically informed content moderation. Overall, the work establishes a reproducible, domain-adapted framework for moderating nuanced online harms in emerging mental health domains.

Abstract

Social media platforms face escalating challenges in detecting harmful content that promotes muscle dysmorphic behaviors and cognitions (bigorexia). This content can evade moderation by camouflaging as legitimate fitness advice and disproportionately affects adolescent males. We address this challenge with BigTokDetect, a clinically informed framework for identifying pro-bigorexia content on TikTok. We introduce BigTok, the first expert-annotated multimodal benchmark dataset of over 2,200 TikTok videos labeled by clinical psychiatrists across five categories and eighteen fine-grained subcategories. Comprehensive evaluation of state-of-the-art vision-language models reveals that while commercial zero-shot models achieve the highest accuracy on broad primary categories, supervised fine-tuning enables smaller open-source models to perform better on fine-grained subcategory detection. Ablation studies show that multimodal fusion improves performance by 5 to 15 percent, with video features providing the most discriminative signals. These findings support a grounded moderation approach that automates detection of explicit harms while flagging ambiguous content for human review, and they establish a scalable framework for harm mitigation in emerging mental health domains.

Paper Structure

This paper contains 58 sections, 11 figures, 15 tables.

Figures (11)

  • Figure 1: BigTok pipeline overview. Left: Expert annotation process with dual annotation and consensus resolution. Right: Multimodal feature extraction (visual, audio, text) and classification evaluation across VLMs using zero-shot, few-shot, and fine-tuning approaches for primary category, subcategory, and severity prediction tasks.
  • Figure 2: Annotation interface on the Amazon Mechanical Turk platform.
  • Figure 3: Inter-annotator agreement matrix for primary categories. Each cell shows the number of videos where Annotator A assigned the row category and Annotator B assigned the column category. Diagonal elements represent perfect agreement, while off-diagonal elements indicate disagreements between annotators.
  • Figure 4: Inter-annotator agreement matrix for subcategories. Each cell shows the number of videos where Annotator A assigned the row subcategory and Annotator B assigned the column subcategory. The matrix reveals patterns of confusion between semantically related subcategories and overall annotation consistency.
  • Figure 5: Distribution of Severity Scores. The histogram shows most values concentrated between 1.0 and 2.0, with fewer cases at higher severity levels.
  • ...and 6 more figures
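
The inter-annotator agreement matrices described for Figures 3 and 4 can be illustrated with a minimal sketch. It builds the matrix exactly as the captions describe it (rows are Annotator A's labels, columns are Annotator B's) and then derives Cohen's kappa from it. Cohen's kappa is a standard agreement statistic added here as an assumption; the listing above does not say which statistic the paper reports. The category names and labels are hypothetical placeholders, not the BigTok taxonomy or data.

```python
def agreement_matrix(labels_a, labels_b, categories):
    """Count videos by (Annotator A label, Annotator B label).

    Row i / column j holds the number of videos where Annotator A chose
    categories[i] and Annotator B chose categories[j].
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both annotators must label the same videos")
    index = {name: i for i, name in enumerate(categories)}
    matrix = [[0] * len(categories) for _ in categories]
    for a, b in zip(labels_a, labels_b):
        matrix[index[a]][index[b]] += 1
    return matrix


def cohen_kappa(matrix):
    """Chance-corrected agreement computed from a square count matrix."""
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    # Observed agreement: share of videos on the diagonal.
    observed = sum(matrix[i][i] for i in range(n)) / total
    # Expected agreement if both annotators labeled independently
    # according to their own marginal label frequencies.
    row_share = [sum(matrix[i]) / total for i in range(n)]
    col_share = [sum(matrix[i][j] for i in range(n)) / total for j in range(n)]
    expected = sum(r * c for r, c in zip(row_share, col_share))
    if expected == 1.0:
        return 1.0  # degenerate case: only one category was ever used
    return (observed - expected) / (1.0 - expected)


# Hypothetical placeholder categories and labels for six videos.
categories = ["cat_a", "cat_b", "cat_c"]
annotator_a = ["cat_a", "cat_a", "cat_b", "cat_b", "cat_c", "cat_c"]
annotator_b = ["cat_a", "cat_b", "cat_b", "cat_b", "cat_c", "cat_a"]

matrix = agreement_matrix(annotator_a, annotator_b, categories)
kappa = cohen_kappa(matrix)
for name, row in zip(categories, matrix):
    print(name, row)
print(f"kappa = {kappa:.2f}")
```

Off-diagonal cells with large counts point to pairs of categories that annotators tend to confuse, which is the pattern the Figure 4 caption refers to for semantically related subcategories.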