CLRecogEye: Curriculum Learning towards exploiting convolution features for Dynamic Iris Recognition
Geetanjali Sharma, Gaurav Jaswal, Aditya Nigam, Raghavendra Ramachandra
TL;DR
This work tackles robustness gaps in iris recognition under rotation, scale, reflections, and blur by introducing CLRecogEye, a curriculum-guided 3D-CNN that learns spatio-spatial-temporal iris features from stacked patch sequences. The method preprocesses iris data, converts it into a 4D patch-based representation, and uses a Modified Inflated 3D backbone with rectangular filters, trained with alternating Triplet and ArcFace losses to balance inter-class separation and intra-class compactness. Key contributions include the first explicit modeling of non-rigid motion via patch stacks, a lightweight 3D-CNN with temporal modeling, and a deep metric learning framework with curriculum guidance. Evaluations on CASIA Lamp-V3/V4, Blue Iris, and Dark Iris demonstrate competitive closed- and open-set performance and robust cross-condition generalization, while revealing areas for improvement in segmentation quality and data balance for future work.
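For reference, the two losses named above are usually written as follows. These are the standard textbook forms, not necessarily the exact variants used in this work. The symbols are assumptions introduced here for illustration: f is the embedding network, (a, p, n) an anchor/positive/negative triplet, \alpha the triplet margin, s the ArcFace scale, m the angular margin, and \theta_j the angle between an embedding and the weight vector of class j. The summary does not specify the alternation schedule between the two losses, so none is shown.

```latex
% Triplet loss (standard form): pull the positive closer than the negative by margin \alpha
\mathcal{L}_{\text{triplet}} = \max\left(0,\; \lVert f(a) - f(p) \rVert_2^2 - \lVert f(a) - f(n) \rVert_2^2 + \alpha\right)

% ArcFace loss (standard form) for a sample with ground-truth class y
\mathcal{L}_{\text{ArcFace}} = -\log \frac{e^{s\cos(\theta_{y} + m)}}{e^{s\cos(\theta_{y} + m)} + \sum_{j \neq y} e^{s\cos\theta_{j}}}
```

In these forms, the triplet term acts on relative distances between samples, while the ArcFace term adds an angular margin to the classification boundary of each identity.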
Abstract
Iris authentication algorithms have achieved impressive recognition performance, making them highly promising for real-world applications such as border control, citizen identification, criminal investigations, and commercial systems. However, their robustness is still challenged by variations in rotation, scale, specular reflections, and defocus blur. In addition, most existing approaches rely on straightforward point-to-point comparisons, typically using cosine or L2 distance, without effectively leveraging the spatio-spatial-temporal structure of iris patterns. To address these limitations, we propose a novel and generalized matching pipeline that learns rich spatio-spatial-temporal representations of iris features. Our approach first splits each iris image along one dimension, generating a sequence of sub-images that serve as input to a 3D-CNN, enabling the network to capture both spatial and spatio-spatial-temporal cues. To further enhance the modeling of spatio-spatial-temporal feature dynamics, we train the model in a curriculum manner. This design allows the network to embed temporal dependencies directly into the feature space, improving discriminability in the deep metric domain. The framework is trained end-to-end with triplet and ArcFace losses, enforcing highly discriminative embeddings despite challenges like rotation, scale, reflections, and blur, and yielding a robust and generalizable solution for iris authentication.
GitHub code: https://github.com/GeetanjaliGTZ/CLRecogEye
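The splitting step described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function name iris_to_patch_stack, the choice of the width axis as the split dimension, the equal-width non-overlapping strips, and the 64x512 input size in the usage note are all assumptions made for illustration. The output layout (channel, depth, height, width) is the one commonly expected by 3D-CNN layers, with the patch index playing the role of the depth ("temporal") axis.

```python
import numpy as np


def iris_to_patch_stack(image: np.ndarray, num_patches: int) -> np.ndarray:
    """Split a 2D iris image into equal-width strips and stack them.

    Hypothetical helper for illustration only. Returns an array of shape
    (1, num_patches, H, W // num_patches): one channel, with the patch
    index acting as the depth axis of a 3D-CNN input.
    """
    if image.ndim != 2:
        raise ValueError("expected a 2D grayscale image of shape (H, W)")
    height, width = image.shape
    patch_width = width // num_patches
    if num_patches < 1 or patch_width == 0:
        raise ValueError("num_patches must be between 1 and the image width")

    # Drop trailing columns that do not fill a whole patch (assumption).
    trimmed = image[:, : patch_width * num_patches]

    # (H, W') -> (H, num_patches, patch_width) -> (num_patches, H, patch_width)
    patches = trimmed.reshape(height, num_patches, patch_width).transpose(1, 0, 2)

    # Add a leading channel axis: (1, num_patches, H, patch_width)
    return patches[np.newaxis, ...]
```

As a usage example, a 64x512 array passed with num_patches=8 yields a stack of shape (1, 8, 64, 64), where the second patch equals columns 64 to 127 of the input. Whether the strips overlap, and along which axis the split is taken, is not stated in the abstract, so those details may differ from the released code.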
