A Responsible Face Recognition Approach for Small and Mid-Scale Systems Through Personalized Neural Networks

Sebastian Groß, Stefan Heindorf, Philipp Terhörst

TL;DR

This paper introduces Model-Template (MOTE), a face recognition approach that replaces fixed vector templates with small per-identity neural classifiers, each trained from a single reference sample augmented by KDE-generated synthetic templates. The design aims to improve privacy, fairness, and explainability in small- and mid-scale systems while preserving competitive recognition performance. Key contributions include a KDE-based template generation pipeline, an identity-specific classifier training framework with a privacy-preserving loss, and empirical evidence of improved privacy (near-random success for gender-inference attacks), tunable per-individual fairness, and interpretable decisions via Grad-CAM++. Trade-offs include increased storage (≈7.6× per identity) and longer enrollment time, but the approach is practical for deployments that prioritize responsible operation over raw efficiency.

Abstract

Traditional face recognition systems rely on extracting fixed face representations, known as templates, to store and verify identities. These representations are typically generated by neural networks that often lack explainability and raise concerns regarding fairness and privacy. In this work, we propose a novel model-template (MOTE) approach that replaces vector-based face templates with small personalized neural networks. This design enables more responsible face recognition for small and medium-scale systems. During enrollment, MOTE creates a dedicated binary classifier for each identity, trained to determine whether an input face matches the enrolled identity. Each classifier is trained using only a single reference sample, along with synthetically balanced samples to allow adjusting fairness at the level of a single individual during enrollment. Extensive experiments across multiple datasets and recognition systems demonstrate substantial improvements in fairness and particularly in privacy. Although the method increases inference time and storage requirements, it presents a strong solution for small- and mid-scale applications where fairness and privacy are critical.
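
To make the contrast in the abstract concrete, the following is a minimal sketch of the two verification paths, not the authors' implementation. It assumes a logistic regression as a stand-in for the small personalized network, noisy copies of the reference embedding as a stand-in for the synthetic genuine samples, and random unit vectors as imposters. All names (`similarity_verify`, `IdentityClassifier`, the thresholds, and the noise level) are hypothetical.

```python
import math
import random


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def similarity_verify(probe, stored_template, threshold=0.5):
    """Traditional path: compare the probe with a stored vector template."""
    return cosine_similarity(probe, stored_template) >= threshold


class IdentityClassifier:
    """Stand-in for a personalized model (assumption: logistic regression)."""

    def __init__(self, dim):
        self.w = [0.0] * dim
        self.b = 0.0

    def score(self, x):
        z = sum(wi * xi for wi, xi in zip(self.w, x)) + self.b
        return 1.0 / (1.0 + math.exp(-z))

    def fit(self, samples, labels, lr=0.5, epochs=200):
        n = len(samples)
        for _ in range(epochs):
            grad_w = [0.0] * len(self.w)
            grad_b = 0.0
            for x, y in zip(samples, labels):
                err = self.score(x) - y  # log-loss gradient w.r.t. the logit
                grad_b += err
                for i, xi in enumerate(x):
                    grad_w[i] += err * xi
            self.w = [wi - lr * g / n for wi, g in zip(self.w, grad_w)]
            self.b -= lr * grad_b / n
        return self

    def verify(self, probe, threshold=0.5):
        """Model-template path: the stored model itself makes the decision."""
        return self.score(probe) >= threshold


def unit(v):
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


rng = random.Random(0)
dim = 8
reference = unit([rng.gauss(0, 1) for _ in range(dim)])
# Assumption: genuine training samples are noisy copies of the one reference.
genuine = [[x + rng.gauss(0, 0.05) for x in reference] for _ in range(50)]
imposters = [unit([rng.gauss(0, 1) for _ in range(dim)]) for _ in range(50)]

# Enrollment trains one binary classifier for this identity.
clf = IdentityClassifier(dim).fit(genuine + imposters, [1] * 50 + [0] * 50)
accepts_reference = clf.verify(reference)
accepts_opposite = clf.verify([-x for x in reference])
```

In this sketch, what is stored per identity is the classifier's parameters (`clf.w`, `clf.b`) rather than the reference embedding itself, which is the structural difference the abstract describes.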

Paper Structure

This paper contains 20 sections, 5 equations, 4 figures, 5 tables.

Figures (4)

  • Figure 1: Verification process of the proposed and the traditional face recognition approach: (a) Our proposed model-template (MOTE) method uses a classifier-based approach where individual neural networks are trained for each identity to make genuine/imposter decisions directly from extracted face features; (b) Traditional similarity-based face recognition compares extracted templates against stored templates using a similarity function. The key difference lies in how verification decisions are made: MOTE leverages identity-specific models that learn to distinguish between genuine and imposter features, while similarity-based methods rely on fixed distance metrics between feature vectors.
  • Figure 2: Visualization of the template generation process - (0) Original embeddings with one color per identity, (1) Computing centroids for each identity, (2) Normalizing templates by subtracting centroids, (3) Splitting normalized templates by attribute (gender), (4) Computing separate Kernel Density Estimation (KDE) models for each attribute, and (5) Generating new templates with a variable balancing factor and controllable sample sizes.
  • Figure 3: Analysis of verification performance in terms of ROC curves - The false non-match rate (FNMR) is reported over a range of false match rates (FMR), comparing the proposed MOTE approach against traditional (Trd.) face recognition methods (either ArcFace or MagFace) across multiple datasets. Across all datasets and face recognition systems, MOTE shows weaker but comparable performance.
  • Figure 4: Analysis of Explainability - Grad-CAM++ Visualization of MOTE demonstrating cross-subject behavior. The leftmost column shows template images used to train individual classifiers, featuring varying lighting conditions, backgrounds, and poses. The top row displays the corresponding test images. The heatmaps in subsequent rows demonstrate how each personalized model attends to facial features when making decisions, with red regions indicating areas of highest importance for identity verification.
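
The template generation steps listed in the Figure 2 caption can be sketched roughly as follows. This is an illustrative sketch under stated assumptions, not the paper's pipeline: it assumes an isotropic Gaussian kernel with a fixed bandwidth, interprets the balancing factor as the share of samples drawn from the first attribute group, and places the sampled residuals around a chosen centre embedding. The function names, the group labels "f" and "m", and all parameter values are hypothetical.

```python
import random
from collections import defaultdict


def mean_vector(vectors):
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def fit_residual_pools(embeddings, identities, attributes):
    """Steps 1-3: centroid per identity, subtract it, split by attribute."""
    by_identity = defaultdict(list)
    for emb, ident in zip(embeddings, identities):
        by_identity[ident].append(emb)
    centroids = {i: mean_vector(vs) for i, vs in by_identity.items()}

    pools = defaultdict(list)
    for emb, ident, attr in zip(embeddings, identities, attributes):
        pools[attr].append([x - c for x, c in zip(emb, centroids[ident])])
    return pools


def sample_kde(pool, bandwidth, rng):
    """Step 4: draw one sample from a Gaussian KDE over `pool`.

    Sampling from a Gaussian KDE is equivalent to picking a stored point
    uniformly at random and adding Gaussian noise with the kernel bandwidth.
    """
    base = rng.choice(pool)
    return [x + rng.gauss(0.0, bandwidth) for x in base]


def generate_templates(pools, centre, n_samples, balance,
                       groups=("f", "m"), bandwidth=0.05, seed=0):
    """Step 5: synthetic templates with a balancing factor and sample size.

    Assumption: `balance` is the fraction drawn from groups[0], and sampled
    residuals are added to `centre` to obtain full templates.
    """
    rng = random.Random(seed)
    n_first = round(balance * n_samples)
    plan = [groups[0]] * n_first + [groups[1]] * (n_samples - n_first)
    return [
        (g, [c + r for c, r in zip(centre, sample_kde(pools[g], bandwidth, rng))])
        for g in plan
    ]


# Toy data (step 0): four identities, five 4-dimensional embeddings each.
data_rng = random.Random(1)
embeddings, identities, attributes = [], [], []
for ident, attr in [(0, "f"), (1, "f"), (2, "m"), (3, "m")]:
    true_centre = [data_rng.gauss(0, 1) for _ in range(4)]
    for _ in range(5):
        embeddings.append([c + data_rng.gauss(0, 0.1) for c in true_centre])
        identities.append(ident)
        attributes.append(attr)

pools = fit_residual_pools(embeddings, identities, attributes)
synthetic = generate_templates(pools, centre=embeddings[0], n_samples=8, balance=0.25)
```

Because the residual pools are kept separate per attribute, the mix of generated samples can be changed at enrollment time by adjusting `balance` and `n_samples` without touching the stored residuals.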