Exploring Complementarity and Explainability in CNNs for Periocular Verification Across Acquisition Distances

Fernando Alonso-Fernandez, Kevin Hernandez Diaz, Jose M. Buades, Kiran Raja, Josef Bigun

TL;DR

This work addresses periocular verification under varying acquisition distances by comparing three CNNs of increasing complexity (SqueezeNet, MobileNetv2, ResNet50) trained on a large pool of ocular crops from VGGFace2 and evaluated on UBIPr. It evaluates two similarity metrics, applies score-level fusion via logistic regression, and leverages explainability tools (LIME heatmaps and Jensen–Shannon divergence) to analyze attention patterns, revealing complementary regions across networks. The study finds that while ResNet50 is strongest individually, fusing all three networks yields substantial gains, achieving state-of-the-art results on UBIPr and demonstrating that architectural diversity can enhance robustness to distance variations. It also highlights the value of explainability in guiding architectural decisions and fusion strategies, with practical implications for robust periocular biometrics in unconstrained settings.

Abstract

We study the complementarity of different CNNs for periocular verification at different distances on the UBIPr database. We train three architectures of increasing complexity (SqueezeNet, MobileNetv2, and ResNet50) on a large set of eye crops from VGGFace2. We analyse performance with cosine and $\chi^2$ metrics, compare different network initialisations, and apply score-level fusion via logistic regression. In addition, we use LIME heatmaps and Jensen–Shannon divergence to compare attention patterns of the CNNs. While ResNet50 consistently performs best individually, the fusion provides substantial gains, especially when combining all three networks. Heatmaps show that networks usually focus on distinct regions of a given image, which explains their complementarity. Our method significantly outperforms previous works on UBIPr, achieving a new state-of-the-art.
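As an illustration of the comparison and fusion steps named in the abstract, the following minimal sketch computes a cosine similarity and a $\chi^2$ distance between two feature vectors, and fits a logistic-regression fusion over per-network scores. It is not the authors' implementation: the function names, the particular $\chi^2$ form (which assumes non-negative features), the plain gradient-descent training loop, and the toy scores are all assumptions made for this example.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two feature vectors (higher = more similar)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def chi2_distance(a, b, eps=1e-12):
    """One common chi-squared form (assumed here): sum (a-b)^2 / (a+b).
    Assumes non-negative features; lower = more similar."""
    return sum((x - y) ** 2 / (x + y + eps) for x, y in zip(a, b))

def train_fusion(scores, labels, lr=0.1, epochs=2000):
    """Fit logistic-regression weights on per-network scores.
    scores: list of [s_net1, s_net2, ...]; labels: 1 = genuine, 0 = impostor.
    Plain batch gradient descent, chosen only to keep the sketch dependency-free."""
    n_feat = len(scores[0])
    n = len(scores)
    w = [0.0] * n_feat
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_feat
        grad_b = 0.0
        for s, y in zip(scores, labels):
            z = b + sum(wi * si for wi, si in zip(w, s))
            p = 1.0 / (1.0 + math.exp(-z))
            err = p - y
            for j in range(n_feat):
                grad_w[j] += err * s[j]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def fuse(w, b, s):
    """Fused score: the log-odds output of the trained model."""
    return b + sum(wi * si for wi, si in zip(w, s))

# Toy similarity scores from three hypothetical networks (illustrative only).
train_scores = [
    [0.90, 0.80, 0.85], [0.70, 0.90, 0.80], [0.80, 0.75, 0.90],  # genuine
    [0.20, 0.30, 0.10], [0.40, 0.20, 0.30], [0.10, 0.35, 0.25],  # impostor
]
train_labels = [1, 1, 1, 0, 0, 0]
weights, bias = train_fusion(train_scores, train_labels)
fused_genuine = fuse(weights, bias, [0.85, 0.80, 0.90])
fused_impostor = fuse(weights, bias, [0.20, 0.25, 0.30])
```

The toy scores are similarities, so higher means a better match. If distances such as $\chi^2$ were fused instead, the model would be expected to learn negative weights, or the scores could be negated beforehand.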

Paper Structure

This paper contains 6 sections, 5 figures, 1 table.

Figures (5)

  • Figure 1: Example images from the databases employed. The relative scale differences among normalised UBIPr images are shown, as well as their resulting size.
  • Figure 2: Ocular verification results (EER %) on UBIPr for scale variation experiments (ImageNet initialization, $\chi^2$ distance). The figure shows the performance of the individual networks (top) and of the different fusion combinations (bottom). The top plot also shows the fusion of all networks (best fusion case) for comparison with the individual networks.
  • Figure 3: Average LIME heatmaps on UBIPr per distance (columns) and CNN (rows).
  • Figure 4: Jensen–Shannon divergence between the heatmaps generated by the networks. The 3D scatter plot on the left represents divergence values across images for each pair of CNNs. The three plots on the right show the 2D projections onto each pair of axes. Correlation values are also given. The clouds are computed with images of the entire database (all distances).
  • Figure 5: Individual heatmaps where the three networks diverge the least (left) and the most (right). The numeric values indicate the average JS divergence between the three possible network pairs.
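
The Jensen–Shannon comparison described in the captions of Figures 4 and 5 can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's code: heatmaps are taken to be 2D lists of importance values, negative values are clipped to zero before normalising to a probability distribution, and base-2 logarithms are used so that the divergence lies in [0, 1]. The helper names are hypothetical.

```python
import math
from itertools import combinations

def normalise(heatmap):
    """Flatten a 2D heatmap and scale it to sum to 1.
    Assumption: negative values are clipped to zero first."""
    flat = [max(v, 0.0) for row in heatmap for v in row]
    total = sum(flat)
    if total <= 0.0:
        # Degenerate map: fall back to a uniform distribution.
        return [1.0 / len(flat)] * len(flat)
    return [v / total for v in flat]

def kl_divergence(p, q):
    """Kullback-Leibler divergence in bits; terms with p_i = 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)

def js_divergence(h1, h2):
    """Jensen-Shannon divergence between two heatmaps of equal size."""
    p, q = normalise(h1), normalise(h2)
    m = [(pi + qi) / 2.0 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)

def mean_pairwise_js(heatmaps):
    """Average JS divergence over all pairs of heatmaps, e.g. the three
    network pairs mentioned in the Figure 5 caption."""
    pairs = list(combinations(heatmaps, 2))
    return sum(js_divergence(a, b) for a, b in pairs) / len(pairs)

# Toy 2x2 heatmaps standing in for three networks (illustrative only).
h_a = [[1.0, 0.0], [0.0, 0.0]]
h_b = [[0.0, 0.0], [0.0, 1.0]]
h_c = [[1.0, 0.0], [0.0, 0.0]]
avg_js = mean_pairwise_js([h_a, h_b, h_c])
```

Identical heatmaps give a divergence of 0, and heatmaps with no overlapping mass give 1 under the base-2 convention, so larger values indicate that two networks attend to more distinct regions.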