Domain Generalized Recaptured Screen Image Identification Using SWIN Transformer

Preeti Mehta; Aman Sagar; Suchi Kumari

TL;DR

This work tackles recaptured LCD screen image detection under domain shifts and scale variation. It presents DAST-DG, a cascaded data-augmentation strategy combined with a SWIN Transformer-based domain-generalization framework, featuring a feature generator adversarially trained against a domain discriminator and a multi-stage hierarchical representation. Experiments across the NTU-ROSE, ICL, and Mturk datasets demonstrate strong intra-domain performance and improved cross-domain generalization, with accuracy around 82% and precision up to 95% on high-variance data, surpassing several baselines. The approach offers practical benefits for forensic tasks such as detecting insurance fraud, face spoofing, and video piracy by enabling robust detection across diverse capture conditions and displays.

Abstract

An increasing number of classification approaches have been developed to address image rebroadcasting and recapturing, a standard attack strategy in insurance fraud, face spoofing, and video piracy. However, most of them neglect scale variations and domain generalization scenarios, and they perform poorly under domain shifts, which are typically made worse by inter-domain and cross-domain scale variances. To overcome these issues, we propose a cascaded data augmentation and SWIN transformer domain generalization framework (DAST-DG). Initially, we examine the disparity in dataset representation. A feature generator is trained to make authentic images from various domains indistinguishable. This process is then applied to recaptured images, creating a dual adversarial learning setup. Extensive experiments demonstrate that our approach is practical and surpasses state-of-the-art methods across different databases. Our model achieves an accuracy of approximately 82% with a precision of 95% on high-variance datasets.
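
The dual adversarial setup described in the abstract can be illustrated with a minimal sketch. It assumes a cross-entropy domain discriminator applied separately to original and recaptured features; the abstract does not specify the loss, so the function names (`domain_cross_entropy`, `dual_adversarial_losses`) and the min-max form below are illustrative assumptions, not the authors' implementation.

```python
import math


def softmax(logits):
    """Numerically stable softmax over one list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def domain_cross_entropy(domain_logits, domain_labels):
    """Mean cross-entropy of a domain discriminator over a batch.

    domain_logits: one list of per-domain scores per image.
    domain_labels: index of the source domain each image came from.
    """
    losses = [
        -math.log(softmax(logits)[label])
        for logits, label in zip(domain_logits, domain_labels)
    ]
    return sum(losses) / len(losses)


def dual_adversarial_losses(orig_logits, orig_domains, recap_logits, recap_domains):
    """Hypothetical dual adversarial objective (an assumption, not the paper's loss).

    The discriminator minimises its domain-classification error on both
    original and recaptured images; the feature generator receives the
    negated loss, so it is rewarded when domains become indistinguishable.
    """
    disc_loss = (
        domain_cross_entropy(orig_logits, orig_domains)
        + domain_cross_entropy(recap_logits, recap_domains)
    )
    gen_loss = -disc_loss
    return disc_loss, gen_loss


# Three source domains; uniform logits mean the discriminator cannot tell
# them apart, so each cross-entropy term equals ln(3).
uniform = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
disc_loss, gen_loss = dual_adversarial_losses(uniform, [0, 2], uniform, [1, 1])
```

In this toy case the discriminator is at chance level, which is the state the feature generator is trying to reach for both image classes.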

Paper Structure (25 sections, 4 equations, 10 figures, 9 tables)

Introduction
Motivation
Major Contributions
Paper Organization
Related Work
Aliasing Artefacts
Blurriness Artefacts
Noise Artefacts
Contrast, Colour and Texture Non-Uniformity Artefacts
Automatic Extracted Artefacts
Preliminary
Proposed Methodology
Stage 1: Initial Embedding and Transformation
Stage 2: Hierarchical Representation
Stages 3 and 4: Further Hierarchical Representation
...and 10 more sections

Figures (10)

Figure 1: Left: Traditional domain generalization methods align the source domains to acquire a common feature space, but they cannot obtain a selective class boundary on the testing dataset. Right: Our DAST-DG method clusters all the original image samples while separating the recaptured image sets from various domains to learn a class boundary.
Figure 2: Introduction to inter-, intra-, and cross-domain recapture detection. Our model aims to learn a shared feature space that is invariant to domain and scale variations. Both the training and testing phases contain original and recaptured images.
Figure 3: Architecture of the proposed SWIN transformer
Figure 4: The proposed SWIN Transformer builds a hierarchical feature map by merging image patches in subsequent layers, capturing high- and low-resolution details in a manner similar to the wavelet concept.
Figure 5: Block diagram of two successive SWIN Transformer blocks
...and 5 more figures
