DariMis: Harm-Aware Modeling for Dari Misinformation Detection on YouTube

Jawid Ahmad Baktash, Mosa Ebrahimi, Mohammad Zarif Joya, Mursal Dawodi

Abstract

Dari, the primary language of Afghanistan, is spoken by tens of millions of people yet remains largely absent from the misinformation detection literature. We address this gap with DariMis, the first manually annotated dataset of 9,224 Dari-language YouTube videos, labeled across two dimensions: Information Type (Misinformation, Partly True, True) and Harm Level (Low, Medium, High). A central empirical finding is that these dimensions are structurally coupled, not independent: 55.9 percent of Misinformation carries at least Medium harm potential, compared with only 1.0 percent of True content. This enables Information Type classifiers to function as implicit harm-triage filters in content moderation pipelines. We further propose a pair-input encoding strategy that represents the video title and description as separate BERT segment inputs, explicitly modeling the semantic relationship between headline claims and body content, a key signal of misleading information. An ablation study against single-field concatenation shows that pair-input encoding yields a 7.0 percentage point gain in Misinformation recall (60.1 percent to 67.1 percent), the safety-critical minority class, despite modest overall macro F1 differences (0.09 percentage points). We benchmark a Dari/Farsi-specialized model (ParsBERT) against XLM-RoBERTa-base; ParsBERT achieves the best test performance with accuracy of 76.60 percent and macro F1 of 72.77 percent. Bootstrap 95 percent confidence intervals are reported for all metrics, and we discuss both the practical significance and statistical limitations of the results.
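
The contrast between pair-input encoding and single-field concatenation can be illustrated with a minimal sketch. It assumes the standard BERT sentence-pair layout (`[CLS] A [SEP] B [SEP]` with segment ids 0 and 1) and uses a toy whitespace tokenizer in place of a real subword tokenizer. The function names, the `max_len` value, and the truncation policy (description truncated before title) are hypothetical choices for illustration, not details taken from the paper.

```python
from typing import List, Tuple

CLS, SEP = "[CLS]", "[SEP]"


def encode_pair(title: str, description: str,
                max_len: int = 16) -> Tuple[List[str], List[int]]:
    """Pair-input layout: [CLS] title [SEP] description [SEP].

    Title tokens get segment id 0, description tokens get segment id 1,
    so the encoder can tell which field every token came from.
    """
    a = title.split()        # toy whitespace tokenizer (assumption)
    b = description.split()
    # Reserve three positions for [CLS] and the two [SEP] markers.
    budget = max_len - 3
    # Assumed policy: shorten the description first, then the title.
    b = b[: max(budget - len(a), 0)]
    a = a[:budget]
    tokens = [CLS] + a + [SEP] + b + [SEP]
    # [CLS], the title, and the first [SEP] belong to segment 0;
    # the description and the final [SEP] belong to segment 1.
    segment_ids = [0] * (len(a) + 2) + [1] * (len(b) + 1)
    return tokens, segment_ids


def encode_single(title: str, description: str,
                  max_len: int = 16) -> Tuple[List[str], List[int]]:
    """Single-field baseline: both fields concatenated into one segment."""
    words = (title + " " + description).split()[: max_len - 2]
    tokens = [CLS] + words + [SEP]
    return tokens, [0] * len(tokens)


if __name__ == "__main__":
    pair_tokens, pair_segments = encode_pair(
        "shocking cure found", "no study supports this claim")
    print(pair_tokens)
    print(pair_segments)
```

In the single-field baseline every token carries segment id 0, so the boundary between headline and body is not marked; in the pair layout the boundary is explicit in both the extra `[SEP]` and the segment ids.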

Paper Structure (38 sections, 1 equation, 7 figures, 7 tables)

Figures (7)

  • Figure 1: Class distributions in DariMis (a) Information Type and (b) Harm Level. Partly True dominates at 60%; Low harm accounts for 74.1% of videos.
  • Figure 2: Harm Level within each Information Type: (left) absolute counts; (right) row-normalised proportions. Misinformation has 55.9% of instances at Medium or High harm versus only 1.0% for True content.
  • Figure 3: Heatmaps of the Information Type $\times$ Harm Level cross-tabulation: (a) raw counts; (b) row-normalised percentages. The concentration of Misinformation in the Medium and High harm columns, and the near-exclusive Low-harm profile of True content, confirm the structural accuracy-harm coupling.
  • Figure 4: Overview of the proposed pair-input modeling framework for DariMis. The video title and description are routed into separate BERT segment inputs (Seg A and Seg B), enabling cross-segment self-attention to capture headline-body semantic inconsistencies, a primary signal of misleading content. The predicted Information Type implicitly encodes harm level (55.9% of Misinformation carries $\geq$ Medium harm), enabling downstream moderation triage without a dedicated harm classifier.
  • Figure 5: Test-set performance of both models on all four metrics. ParsBERT with pair-input encoding achieves the highest macro F1 and best Misinformation recall.
  • ...and 2 more figures
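
The row-normalised view used in Figures 2 and 3 can be sketched as follows. This is a minimal illustration: the counts in `toy_counts` are invented placeholders, not DariMis figures, and the helper names `row_normalise` and `at_least_medium` are hypothetical.

```python
from typing import Dict

HARM_LEVELS = ("Low", "Medium", "High")

Table = Dict[str, Dict[str, float]]


def row_normalise(counts: Dict[str, Dict[str, int]]) -> Table:
    """Divide each Information Type row by its own total.

    Each row of the result sums to 1, so rows of very different sizes
    can be compared on the same scale.
    """
    shares: Table = {}
    for info_type, row in counts.items():
        total = sum(row.get(h, 0) for h in HARM_LEVELS)
        shares[info_type] = {
            h: (row.get(h, 0) / total if total else 0.0) for h in HARM_LEVELS
        }
    return shares


def at_least_medium(shares: Table) -> Dict[str, float]:
    """Share of each Information Type rated Medium or High harm."""
    return {t: row["Medium"] + row["High"] for t, row in shares.items()}


# Invented placeholder counts, NOT the DariMis cross-tabulation.
toy_counts = {
    "Misinformation": {"Low": 40, "Medium": 35, "High": 25},
    "Partly True": {"Low": 70, "Medium": 25, "High": 5},
    "True": {"Low": 99, "Medium": 1, "High": 0},
}

if __name__ == "__main__":
    for info_type, share in at_least_medium(row_normalise(toy_counts)).items():
        print(f"{info_type}: {share:.1%} at Medium or High harm")
```

Applied to the real cross-tabulation, the same per-row computation is what yields statements of the form "55.9% of Misinformation carries at least Medium harm".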