Multi-Domain Biometric Recognition using Body Embeddings

Anirudh Nanduri; Siyuan Huang; Rama Chellappa

TL;DR

The paper tackles cross-spectral biometric recognition across VIS and infrared bands (SWIR, MWIR, LWIR) by focusing on body embeddings. It proposes a Cross-Spectral Semantic Body Identification framework based on a vision transformer that fuses global and local semantic features to achieve robust template-level matching across spectra, trained with a joint identity and triplet loss. Empirical results show body embeddings outperform face embeddings in MWIR/LWIR, and a VIS-pretrained ViT with simple finetuning attains state-of-the-art mAP on LLCM, while domain-aware sampling and local features enhance cross-domain consistency. The work demonstrates that reusing VIS-trained models with careful architectural design can effectively bridge large modality gaps in multi-domain infrared recognition, offering practical benefits for low-light surveillance and related applications.

Abstract

Biometric recognition becomes increasingly challenging as we move away from the visible spectrum to infrared imagery, where domain discrepancies significantly impact identification performance. In this paper, we show that body embeddings perform better than face embeddings for cross-spectral person identification in the medium-wave infrared (MWIR) and long-wave infrared (LWIR) domains. Due to the lack of multi-domain datasets, previous research on cross-spectral body identification, also known as Visible-Infrared Person Re-Identification (VI-ReID), has primarily focused on individual infrared bands, such as near-infrared (NIR) or LWIR, separately. We address the multi-domain body recognition problem using the IARPA Janus Benchmark Multi-Domain Face (IJB-MDF) dataset, which enables matching of short-wave infrared (SWIR), MWIR, and LWIR images against RGB (VIS) images. We leverage a vision transformer architecture to establish benchmark results on the IJB-MDF dataset and, through extensive experiments, provide insights into the interrelation of infrared domains, the adaptability of VIS-pretrained models, the role of local semantic features in body embeddings, and effective training strategies for small datasets. Additionally, we show that finetuning a body model, pretrained exclusively on VIS data, with a simple combination of cross-entropy and triplet losses achieves state-of-the-art mAP scores on the LLCM dataset.
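
As a rough illustration of the "simple combination of cross-entropy and triplet losses" mentioned in the abstract, the sketch below computes a joint identity and triplet loss for a single sample in plain Python. It is not the authors' implementation: the function names, the Euclidean distance, the margin of 0.3, and the equal weighting of the two terms are all assumptions made for illustration.

```python
import math


def cross_entropy(logits, label):
    """Softmax cross-entropy for one sample (the identity classification term)."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]


def euclidean(a, b):
    """Euclidean distance between two embeddings (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=0.3):
    """Hinge triplet loss; the margin value is a hypothetical default."""
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


def joint_loss(logits, label, anchor, positive, negative, margin=0.3, weight=1.0):
    """Identity loss plus weighted triplet loss; `weight` is an assumed hyperparameter."""
    return cross_entropy(logits, label) + weight * triplet_loss(
        anchor, positive, negative, margin
    )


# Toy example: a VIS anchor, an infrared positive of the same identity,
# and an infrared negative of a different identity.
vis_anchor = [0.0, 0.0]
ir_positive = [0.1, 0.0]
ir_negative = [3.0, 4.0]
loss = joint_loss([2.0, 0.5, -1.0], 0, vis_anchor, ir_positive, ir_negative)
```

In a cross-spectral setting, the anchor and positive would typically come from different domains (for example VIS and MWIR) so that the triplet term pulls same-identity embeddings together across spectra.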
