How Knowledge Distillation Mitigates the Synthetic Gap in Fair Face Recognition

Pedro C. Neto, Ivona Colakovic, Sašo Karakatič, Ana F. Sequeira

TL;DR

The paper tackles the retraction of real face-recognition (FR) datasets by exploring knowledge distillation (KD) from a Teacher trained on real data to smaller Student models trained on synthetic or mixed data. It presents an ethnicity-aware dataset merging strategy and evaluates multiple architectures and losses, showing that KD improves both accuracy and fairness and mitigates the performance gap between real and synthetic data. A mix of 70% real and 30% synthetic data often matches or surpasses real-data KD performance while also reducing bias across ethnic groups, making training with synthetic data more viable and privacy-preserving. The findings highlight the practical impact of KD in fair, privacy-conscious FR systems and point to future work on expanding architectures, refining sampling, and exploring the trade-off between training-time complexity and deployment efficiency.

Abstract

Leveraging the capabilities of Knowledge Distillation (KD) strategies, we devise a strategy to fight the recent retraction of face recognition datasets. Given a pretrained Teacher model trained on a real dataset, we show that carefully utilising synthetic datasets, or a mix of real and synthetic datasets, to distil knowledge from this teacher to smaller students can yield surprising results. To this end, we trained 33 different models with and without KD, on different datasets, with different architectures and losses. Our findings are consistent: using KD leads to performance gains across all ethnicities and to decreased bias. In addition, it helps to mitigate the performance gap between real and synthetic datasets. This approach addresses the limitations of synthetic data training, improving both the accuracy and fairness of face recognition models.
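
To make the idea of an ethnicity-aware mix of real and synthetic data more concrete, the following is a minimal sketch and not the authors' implementation. It assumes each dataset is a list of hypothetical `(identity_id, ethnicity_label)` pairs and that the mix is applied at the identity level with an equal budget per ethnicity. The function name `mix_by_ethnicity`, its parameters, and the group labels are illustrative assumptions.

```python
import random
from collections import defaultdict


def mix_by_ethnicity(real, synthetic, per_group, real_fraction=0.7, seed=0):
    """Sample `per_group` identities per ethnicity, split real vs. synthetic.

    `real` and `synthetic` are lists of (identity_id, ethnicity_label) pairs.
    This is an illustrative assumption about the data layout, not the
    paper's actual sampling procedure.
    """
    rng = random.Random(seed)
    n_real = round(per_group * real_fraction)  # e.g. 7 of 10 for a 70/30 mix
    n_syn = per_group - n_real

    def group(items):
        # Bucket identity ids by their ethnicity label.
        groups = defaultdict(list)
        for identity, ethnicity in items:
            groups[ethnicity].append(identity)
        return groups

    real_groups, syn_groups = group(real), group(synthetic)
    mixed = {}
    # Only ethnicities present in both sources can be mixed.
    for ethnicity in sorted(set(real_groups) & set(syn_groups)):
        r, s = real_groups[ethnicity], syn_groups[ethnicity]
        if len(r) < n_real or len(s) < n_syn:
            raise ValueError(f"not enough identities for group {ethnicity!r}")
        # Sample without replacement so every group gets the same budget.
        mixed[ethnicity] = rng.sample(r, n_real) + rng.sample(s, n_syn)
    return mixed


if __name__ == "__main__":
    labels = ["African", "Asian", "Caucasian", "Indian"]  # illustrative labels
    real = [(f"real_{g}_{i}", g) for g in labels for i in range(20)]
    synthetic = [(f"syn_{g}_{i}", g) for g in labels for i in range(20)]
    for g, ids in mix_by_ethnicity(real, synthetic, per_group=10).items():
        print(g, ids)
```

Under these assumptions every ethnicity contributes the same number of identities, so the merged set stays balanced regardless of how many identities each source holds per group.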

Paper Structure (21 sections, 2 equations, 2 figures, 4 tables)

Figures (2)

  • Figure 1: Visual representation of the proposed approach. It includes the merging of several real or synthetic datasets using an ethnicity-aware sampling approach. At the top, a frozen (snowflake icon) pretrained teacher distils knowledge to a student in training (fire icon).
  • Figure 2: Sample images from each dataset: BalancedFace (real data) in the first row, IDiff-Face (synthetic) in the second row, and DCFace (synthetic) in the third row.