FactGuard: Agentic Video Misinformation Detection via Reinforcement Learning

Zehao Li, Hongwei Yu, Hao Jiang, Qiang Sheng, Yilong Xu, Baolong Bi, Yang Li, Zhenlong Yuan, Yujun Cai, Zhaoqi Wang

TL;DR

FactGuard is an agentic framework for video misinformation detection that formulates verification as an iterative reasoning process built on MLLMs. It is trained with a two-stage strategy that combines domain-specific agentic supervised fine-tuning with decision-aware reinforcement learning to optimize tool usage and calibrate risk-sensitive decision making.

Abstract

Multimodal large language models (MLLMs) have substantially advanced video misinformation detection through unified multimodal reasoning, but they often rely on fixed-depth inference and place excessive trust in internally generated assumptions, particularly in scenarios where critical evidence is sparse, fragmented, or requires external verification. To address these limitations, we propose FactGuard, an agentic framework for video misinformation detection that formulates verification as an iterative reasoning process built upon MLLMs. FactGuard explicitly assesses task ambiguity and selectively invokes external tools to acquire critical evidence, enabling progressive refinement of reasoning trajectories. To further strengthen this capability, we introduce a two-stage training strategy that combines domain-specific agentic supervised fine-tuning with decision-aware reinforcement learning to optimize tool usage and calibrate risk-sensitive decision making. Extensive experiments on FakeSV, FakeTT, and FakeVV demonstrate FactGuard's state-of-the-art performance and validate its excellent robustness and generalization capacity.
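
The iterative process described in the abstract (assess ambiguity, selectively call an external tool, refine the conclusion) can be illustrated with a minimal sketch. This is not the authors' implementation: the names `Assessment`, `Trace`, `verify`, `toy_assess`, and `toy_tool`, the scalar ambiguity score, the `threshold` gate, and the `max_steps` budget are all assumptions made for illustration. In FactGuard the assessment would come from an MLLM and the tool from an external evidence source; here both are replaced by toy stand-ins.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class Assessment:
    label: str        # hypothetical verdict: "fake" or "real"
    ambiguity: float  # hypothetical score: 0.0 (clear) to 1.0 (highly ambiguous)


@dataclass
class Trace:
    evidence: List[str] = field(default_factory=list)
    tool_calls: int = 0


def verify(
    claim: str,
    assess: Callable[[str, List[str]], Assessment],
    tool: Callable[[str], str],
    threshold: float = 0.5,
    max_steps: int = 3,
) -> Tuple[Assessment, Trace]:
    """Re-assess the claim until ambiguity is low enough or the tool budget runs out."""
    trace = Trace()
    assessment = assess(claim, trace.evidence)
    # Call the tool only while the case is still ambiguous (selective invocation).
    while assessment.ambiguity > threshold and trace.tool_calls < max_steps:
        trace.evidence.append(tool(claim))
        trace.tool_calls += 1
        # Refine the conclusion using the evidence gathered so far.
        assessment = assess(claim, trace.evidence)
    return assessment, trace


def toy_assess(claim: str, evidence: List[str]) -> Assessment:
    # Toy stand-in for the model: ambiguity halves with each piece of evidence.
    label = "fake" if evidence else "real"
    return Assessment(label=label, ambiguity=0.8 / (2 ** len(evidence)))


def toy_tool(claim: str) -> str:
    # Toy stand-in for an external search or verification tool.
    return f"search result for: {claim}"


result, trace = verify("video shows event X", toy_assess, toy_tool)
confident_result, confident_trace = verify(
    "video shows event X", toy_assess, toy_tool, threshold=0.9
)
```

In the first call the initial ambiguity exceeds the threshold, so the loop acquires evidence once before settling on a verdict. In the second call the threshold is loose enough that no tool is invoked, which mirrors the idea of spending tool calls only on ambiguous inputs.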

Paper Structure (39 sections, 10 equations, 7 figures, 4 tables)

Figures (7)

  • Figure 1: Comparison of video misinformation detection methods and our proposed FactGuard in terms of (a) explainability analysis and (b) comprehensive performance analysis.
  • Figure 2: Pipeline of FactGuard. The upper part illustrates the inference-time agentic verification process, where FactGuard assesses uncertainty based on the ambiguity of the input and selectively invokes external tools to acquire additional evidence before refining its reasoning and producing a final decision. The lower part depicts the training pipeline, which combines supervised fine-tuning with decision-aware reinforcement learning to reinforce structured reasoning, calibrated tool usage, and risk-sensitive verification behavior.
  • Figure 3: Key advantages of FactGuard. (a) MLLM-based methods with enhanced reasoning may induce cross-modal hallucination in ambiguous cases by over-relying on internally generated assumptions, treating them as grounded evidence without acquiring or validating critical supporting information. (b) FactGuard formulates misinformation verification as an uncertainty-aware, tool-assisted decision-making process that adaptively refines its conclusions, enabling reliable verification in open and dynamic environments.
  • Figure 4: Qualitative analysis of model reasoning. Representative reasoning traces under correct predictions show that FactGuard produces more coherent and evidence-grounded reasoning than Qwen2.5-VL and Fact-R1, highlighting improved interpretability.
  • Figure 5: Additional Case Study of FactGuard.
  • ...and 2 more figures