ONOT: a High-Quality ICAO-compliant Synthetic Mugshot Dataset
Nicolò Di Domenico, Guido Borghi, Annalisa Franco, Davide Maltoni
TL;DR
The paper tackles privacy and bias issues in facial datasets by introducing ONOT, a synthetic mugshot dataset designed to comply with ISO/ICAO requirements for eMRTD applications. It presents a scalable generation pipeline based on a fine-tuned diffusion model that produces 960k images across 15k pseudo-classes, filters them against stringent ISO/ICAO constraints, applies intra- and inter-class identity consistency checks, and adds a Print&Scan simulation. Only a subset of the generated identities survives the ISO and consistency tests (4032 identities after the ISO check; 55–255 after the identity consistency check, depending on the thresholds), which reveals inherent challenges and bias patterns in synthetic, standards-aligned face data. The work provides reproducible prompts and a release strategy to enable further research in Morphing Attack Detection, Face Quality Assessment, and related document-analysis tasks, contributing a standards-aligned resource for privacy-preserving evaluation and benchmarking.
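The intra- and inter-class identity consistency checks mentioned above can be illustrated with a minimal sketch. It is not the authors' implementation: the use of cosine similarity on face embeddings, the greedy keep-or-drop strategy, the function name `consistent_identities`, and the threshold values are all assumptions made for illustration, and the toy 2-D vectors stand in for embeddings that a real face recognition model would produce.

```python
import math
from itertools import combinations


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def consistent_identities(embeddings, intra_min, inter_max):
    """Return the identities that pass both consistency checks.

    embeddings: dict mapping an identity label to a list of vectors
    intra_min:  minimum similarity required between images of one identity
    inter_max:  maximum similarity allowed between images of two identities
    (hypothetical parameters; the paper's actual thresholds are not given here)
    """
    # Intra-class check: all image pairs of one identity must look alike.
    intra_ok = [
        ident
        for ident, vecs in embeddings.items()
        if all(cosine(a, b) >= intra_min for a, b in combinations(vecs, 2))
    ]

    # Inter-class check (greedy): keep an identity only if none of its
    # images is too similar to an image of an already kept identity.
    kept = []
    for ident in intra_ok:
        clash = any(
            cosine(a, b) > inter_max
            for other in kept
            for a in embeddings[ident]
            for b in embeddings[other]
        )
        if not clash:
            kept.append(ident)
    return kept


# Toy data: two clean identities, one near-duplicate, one inconsistent.
toy = {
    "id_a": [[1.0, 0.0], [0.98, 0.2]],
    "id_b": [[0.0, 1.0], [0.1, 0.99]],
    "id_c": [[1.0, 0.05], [0.99, 0.1]],  # nearly identical to id_a
    "id_d": [[1.0, 0.0], [0.0, 1.0]],    # images disagree with each other
}
kept = consistent_identities(toy, intra_min=0.9, inter_max=0.5)
print(kept)
```

In this sketch, stricter thresholds (a higher `intra_min` or a lower `inter_max`) leave fewer identities, which is consistent with the surviving count depending on the thresholds chosen.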
Abstract
State-of-the-art AI-based generative models are now a viable solution to the privacy issues and biases that arise when collecting datasets containing personal information, such as faces. Following this intuition, in this paper we introduce ONOT, a synthetic dataset specifically focused on the generation of high-quality faces that adhere to the requirements of the ISO/IEC 39794-5 standard, which, following the guidelines of the International Civil Aviation Organization (ICAO), defines the interchange formats of face images in electronic Machine-Readable Travel Documents (eMRTD). The strictly controlled and varied mugshot images included in ONOT are useful in research fields related to the analysis of face images in eMRTD, such as Morphing Attack Detection and Face Quality Assessment. The dataset is publicly released together with the details of the generation procedure, in order to improve reproducibility and enable future extensions.
