Visual-textual Dermatoglyphic Animal Biometrics: A First Case Study on Panthera tigris
Wenshuo Li, Majid Mirmehdi, Tilo Burghardt
TL;DR
This work addresses the limitations of vision-only animal re-identification (Re-ID) by introducing dermatoglyphic ACE textual descriptors that encode coat-pattern topology. It couples a robust text-based ACE encoding with a visual-textual co-synthesis pipeline that generates large-scale, biologically grounded synthetic data, and trains a cross-modal retrieval system. Key findings include near-perfect text-only Re-ID (≈99.8%), substantial gains from synthetic data in text-to-image retrieval, and improved robustness from anchor permutation. The approach advances explainable, language-guided animal biometrics, with practical implications for ecological monitoring and data-efficient Re-ID across modalities.
Abstract
Biologists have long combined visuals with textual field notes to re-identify (Re-ID) animals. Contemporary AI tools automate this for species with distinctive morphological features but remain largely image-based. Here, we extend Re-ID methodologies by incorporating precise dermatoglyphic textual descriptors, an approach used in forensics but new to ecology. We demonstrate that these specialist semantics abstract and encode animal coat topology using human-interpretable language tags. Drawing on 84,264 manually labelled minutiae across 3,355 images of 185 tigers (Panthera tigris), we evaluate this visual-textual methodology, revealing novel capabilities for cross-modal identity retrieval. To optimise performance, we developed a text-image co-synthesis pipeline to generate 'virtual individuals', each comprising dozens of life-like visuals paired with dermatoglyphic text. Benchmarking against real-world scenarios shows this augmentation significantly boosts AI accuracy in cross-modal retrieval while alleviating data scarcity. We conclude that dermatoglyphic language-guided biometrics can overcome vision-only limitations, enabling textual-to-visual identity recovery underpinned by human-verifiable matchings. This represents a significant advance towards explainability in Re-ID and a language-driven unification of descriptive modalities in ecological monitoring.
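
To make the notion of cross-modal identity retrieval concrete, the following is a minimal sketch of how a text query could be matched against a gallery of image embeddings and scored by rank-1 accuracy. It assumes that text and images have already been mapped into a shared embedding space; the vectors, the identity names (`tiger_A` and so on), and the helper functions `cosine`, `rank_gallery`, and `rank1_accuracy` are hypothetical illustrations, not the encoders or evaluation protocol used in this work.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def rank_gallery(text_embedding, gallery):
    """Return gallery identities sorted by descending similarity to the text query."""
    scored = [(cosine(text_embedding, emb), identity) for identity, emb in gallery]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [identity for _, identity in scored]

def rank1_accuracy(queries, gallery):
    """Fraction of text queries whose top-ranked gallery image has the true identity."""
    hits = sum(
        1 for true_id, emb in queries
        if rank_gallery(emb, gallery)[0] == true_id
    )
    return hits / len(queries)

# Toy, made-up embeddings (assumption): one image embedding per individual.
gallery = [
    ("tiger_A", [0.9, 0.1, 0.0]),
    ("tiger_B", [0.1, 0.9, 0.1]),
    ("tiger_C", [0.0, 0.2, 0.9]),
]

# Toy, made-up text-description embeddings paired with their true identity.
queries = [
    ("tiger_A", [1.0, 0.2, 0.1]),
    ("tiger_B", [0.2, 1.0, 0.0]),
    ("tiger_C", [0.1, 0.1, 1.0]),
]

print(rank_gallery(queries[0][1], gallery))
print(rank1_accuracy(queries, gallery))
```

In this toy setting every text query ranks its own individual first. With real data, the ranked list is also what a human verifier would inspect when checking a proposed match.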
