Copy-Move Forgery Detection and Question Answering for Remote Sensing Image

Ze Zhang; Enyuan Zhao; Di Niu; Jie Nie; Xinyue Liang; Lei Huang

TL;DR

This work defines the Remote Sensing Copy-Move Question Answering (RSCMQA) task, which jointly detects copy-move forgeries and reasons about tampered remote sensing (RS) images through question answering. It introduces five large, region-rich datasets (RS-CMQA, RS-CMQA-B, Real-RSCM, RS-TQA, RS-TQA-B) and presents the Copy-Move Forgery Perception Framework (CMFPF), which uses region-discrimination-guided prompts to inject tampering cues into both the visual and textual modalities. The approach yields state-of-the-art results across multiple RS-CMQA datasets, is robust to blurred tampering, and transfers across related tasks, outperforming general VQA and RSVQA baselines. By providing rich datasets and a targeted multimodal framework, the work advances practical tampering perception for land-resource monitoring and national defense applications.

Abstract

Driven by practical demands in land resource monitoring and national defense security, this paper introduces the Remote Sensing Copy-Move Question Answering (RSCMQA) task. Unlike traditional Remote Sensing Visual Question Answering (RSVQA), RSCMQA focuses on interpreting complex tampering scenarios and inferring relationships between objects. We present a suite of global RSCMQA datasets, comprising images from 29 different regions across 14 countries. Specifically, we propose five distinct datasets: the basic dataset RS-CMQA, the category-balanced dataset RS-CMQA-B, the high-authenticity dataset Real-RSCM, the extended dataset RS-TQA, and the extended category-balanced dataset RS-TQA-B. These datasets fill a critical gap in the field while remaining comprehensive, balanced, and challenging. Furthermore, we introduce a region-discrimination-guided multimodal copy-move forgery perception framework (CMFPF), which improves the accuracy of answering questions about tampered images by leveraging prompts about the differences and connections between the source and tampered domains. Extensive experiments demonstrate that our method provides a stronger benchmark for RSCMQA than general VQA and RSVQA models. Our datasets and code are publicly available at https://github.com/shenyedepisa/RSCMQA.
