AI-Generated Video Detection via Spatio-Temporal Anomaly Learning

Jianfa Bai, Man Lin, Gang Cao

TL;DR

AIGVDet is an AI-generated video detection scheme that captures forensic traces with a two-branch spatio-temporal convolutional neural network (CNN): two ResNet sub-detectors are trained separately to identify anomalies in the spatial and optical flow domains, respectively.

Abstract

The advancement of generation models has led to the emergence of highly realistic artificial intelligence (AI)-generated videos. Malicious users can easily create non-existent videos to spread false information. This letter proposes an effective AI-generated video detection (AIGVDet) scheme by capturing the forensic traces with a two-branch spatio-temporal convolutional neural network (CNN). Specifically, two ResNet sub-detectors are learned separately for identifying the anomalies in spatial and optical flow domains, respectively. Results of such sub-detectors are fused to further enhance the discrimination ability. A large-scale generated video dataset (GVD) is constructed as a benchmark for model training and evaluation. Extensive experimental results verify the high generalization and robustness of our AIGVDet scheme. Code and dataset will be available at https://github.com/multimediaFor/AIGVDet.
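
To make the fusion step concrete, the sketch below combines per-frame scores from the two sub-detectors into one video-level decision. It is a minimal illustration and not the authors' implementation: the abstract does not state the fusion rule, so the weighted average, the per-video mean, the 0.5 threshold, and the names `fuse_scores` and `classify_video` are all assumptions. The scores stand in for the sigmoid outputs of the spatial and optical flow ResNet branches.

```python
from statistics import fmean
from typing import Sequence, Tuple


def fuse_scores(spatial_prob: float, flow_prob: float, weight: float = 0.5) -> float:
    """Weighted average of the two branch probabilities.

    Assumption: the source only says the results are fused; a convex
    combination is one simple decision-level choice.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return weight * spatial_prob + (1.0 - weight) * flow_prob


def classify_video(
    spatial_probs: Sequence[float],
    flow_probs: Sequence[float],
    weight: float = 0.5,
    threshold: float = 0.5,
) -> Tuple[float, bool]:
    """Return (fused score, is_generated) for one video.

    spatial_probs: hypothetical per-frame "generated" probabilities from
        the RGB-frame branch.
    flow_probs: hypothetical per-map probabilities from the optical flow
        branch (N frames give N - 1 flow maps, so lengths may differ).
    """
    # Assumption: video-level score is the mean of the per-frame scores.
    video_spatial = fmean(spatial_probs)
    video_flow = fmean(flow_probs)
    fused = fuse_scores(video_spatial, video_flow, weight)
    return fused, fused >= threshold


if __name__ == "__main__":
    score, is_generated = classify_video([0.9, 0.7, 0.8], [0.4, 0.2])
    print(f"fused score = {score:.2f}, generated = {is_generated}")
```

Setting `weight` to 1.0 or 0.0 reduces the sketch to a single branch, which gives a simple way to compare the fused decision against each sub-detector on its own.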

Paper Structure (10 sections, 4 equations, 3 figures, 3 tables)

This paper contains 10 sections, 4 equations, 3 figures, 3 tables.

Figures (3)

  • Figure 1: The first frame and the first three optical flow maps of the videos generated by three models, along with those of a real video.
  • Figure 2: Overall pipeline of the proposed generated video detection scheme AIGVDet, where RAFT (Teed and Deng, 2020) is the method used to compute the optical flow maps.
  • Figure 3: Robustness evaluation results against post H.264 compression with different quality factors (CRFs).