Domain Generalized Recaptured Screen Image Identification Using SWIN Transformer

Preeti Mehta, Aman Sagar, Suchi Kumari

TL;DR

This work tackles the detection of recaptured LCD screen images under domain shift and scale variation. It presents DAST-DG, a cascaded data-augmentation strategy combined with a SWIN Transformer-based domain-generalization framework, in which a feature generator is trained adversarially against a domain discriminator and the backbone provides a multi-stage hierarchical representation. Experiments on the NTU-ROSE, ICL, and Mturk datasets show strong intra-domain performance and improved cross-domain generalization, with an accuracy of approximately 82% and a precision of 95% on high-variance data, surpassing several baselines. The approach has practical value for forensic applications such as detecting insurance fraud, face spoofing, and video piracy, because it enables robust detection across diverse capture conditions and displays.

Abstract

An increasing number of classification approaches have been developed to address image rebroadcast and recapturing, a standard attack strategy in insurance fraud, face spoofing, and video piracy. However, most of them neglect scale variations and domain generalization scenarios, and they perform poorly under domain shifts, which are typically made worse by inter-domain and cross-domain scale variances. To overcome these issues, we propose a cascaded data augmentation and SWIN transformer domain generalization framework (DAST-DG). Initially, we examine the disparity in dataset representation. A feature generator is trained to make authentic images from various domains indistinguishable. This process is then applied to recaptured images, creating a dual adversarial learning setup. Extensive experiments demonstrate that our approach is practical and surpasses state-of-the-art methods across different databases. Our model achieves an accuracy of approximately 82% with a precision of 95% on high-variance datasets.

Paper Structure (25 sections, 4 equations, 10 figures, 9 tables)


Figures (10)

  • Figure 1: Left: Traditional domain generalization methods map the source domains into a common feature space but still cannot obtain a selective class boundary on the testing dataset. Right: Our DAST-DG method clusters all the original image samples while separating the recaptured image sets from the various domains to learn a class boundary.
  • Figure 2: Introduction to inter-, intra-, and cross-domain recapture detection. Our model aims to learn a shared feature space that is invariant to domain and scale variance. Both the training and testing phases contain original and recaptured images.
  • Figure 3: Architecture of the proposed SWIN Transformer
  • Figure 4: The proposed SWIN Transformer builds a stratified feature map by merging image patches in subsequent layers, capturing high- and low-resolution details, similar to the concept of wavelets.
  • Figure 5: Block diagram of two successive SWIN Transformer blocks
  • ...and 5 more figures
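
The stratified feature map described in the Figure 4 caption can be illustrated with a small shape calculation. This is a minimal sketch, not the authors' implementation: it assumes the commonly used Swin-T settings (a 224×224 input, 4×4 patches, an embedding dimension of 96, and four stages), none of which are stated in this excerpt, and the function and class names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StageShape:
    stage: int
    height: int    # feature-map height in tokens
    width: int     # feature-map width in tokens
    channels: int  # feature dimension per token


def swin_stage_shapes(image_size=224, patch_size=4, embed_dim=96, num_stages=4):
    """Return the feature-map shape after each stage of a Swin-style hierarchy.

    All default values are assumptions (standard Swin-T), not values taken
    from the paper.
    """
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")

    side = image_size // patch_size  # tokens per side after patch embedding
    channels = embed_dim
    shapes = []
    for stage in range(1, num_stages + 1):
        if stage > 1:
            # Patch merging: each 2x2 group of neighbouring tokens is
            # concatenated (4C features) and linearly projected to 2C,
            # halving the spatial resolution and doubling the channels.
            if side % 2 != 0:
                raise ValueError("token grid must be even before patch merging")
            side //= 2
            channels *= 2
        shapes.append(StageShape(stage, side, side, channels))
    return shapes


if __name__ == "__main__":
    for s in swin_stage_shapes():
        print(f"stage {s.stage}: {s.height}x{s.width} tokens, {s.channels} channels")
```

Under these assumed settings the token grid shrinks from 56×56 to 7×7 while the channel count grows from 96 to 768, so early stages retain fine spatial detail and later stages summarize coarser structure.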