Synthetic dataset of ID and Travel Document

Carlos Boned, Maxime Talarmain, Nabil Ghanmi, Guillaume Chiron, Sanket Biswas, Ahmad Montaser Awal, Oriol Ramos Terrades

TL;DR

State-of-the-art models trained on this dataset are compared to the performance achieved on larger, but private, datasets; the dataset will help the document image analysis community progress in the task of ID document verification.

Abstract

This paper presents a new synthetic dataset of ID and travel documents, called SIDTD. The SIDTD dataset was created to help train and evaluate systems for detecting forged ID documents. Such a dataset has become a necessity because ID documents contain personal information, so a public dataset of real documents cannot be released. Moreover, forged documents are scarce compared to legitimate ones, and the way they are generated varies from one fraudster to another, resulting in a class with high intra-class variability. In this paper, we train state-of-the-art models on this dataset and compare their performance to that achieved on larger, but private, datasets. The creation of this dataset will help the document image analysis community progress in the task of ID document verification.

Paper Structure (15 sections, 5 figures, 2 tables)

Figures (5)

  • Figure 1: Crop & Replace Composite PAI example. The signature of two ID documents of the same nationality is replaced.
  • Figure 2: Inpainting Composite PAI example. The name field in the ID document is replaced with the same content rendered in a different font.
  • Figure 3: Workflow followed to generate forged videos and clips.
  • Figure 4: Distribution of videos according to: a) the phone's year of release; b) the resolution of the smartphone's main rear camera (in megapixels).
  • Figure 5: Examples of SIDTD video clips of fake documents with different backgrounds, lighting, and devices: a) natural light with a table background, recorded with a Xiaomi Mi Max 2; b) natural light with an outdoor floor background, recorded with a Samsung Galaxy A70; c) low lighting with a chair background, recorded with a Xiaomi Redmi Note Pro 11+; d) artificial indoor light with a keyboard background, recorded with a Xiaomi Mi A3.