AugGen: Synthetic Augmentation using Diffusion Models Can Improve Recognition
Parsa Rahimi, Damien Teney, Sebastien Marcel
TL;DR
AugGen tackles privacy-sensitive face recognition (FR) by training a self-contained, class-conditional diffusion model on the target dataset and generating mixed-identity samples to augment the training set of the discriminator (the FR model). It introduces a principled class-mixing scheme that forms new class conditions $\mathbf{c}^{*} = \alpha \mathbf{c}^{i} + \beta \mathbf{c}^{j}$ from pairs of existing class conditions $\mathbf{c}^{i}$ and $\mathbf{c}^{j}$, selects $\alpha,\beta$ via grid search to maximize a combined dissimilarity and similarity objective, and samples from the resulting conditions to build the augmentation set $\mathrm{D}^{aug}$. The discriminator trained on the mix of real and augmented data exhibits tighter intra-class compactness and stronger inter-class separation, delivering 1–12% gains across 8 benchmarks and often rivaling architectural improvements while using less real data. The results demonstrate that carefully integrated synthetic data can mitigate privacy concerns and meaningfully boost FR performance, though the approach relies on substantial upfront computation and highlights the need for better generative proxy metrics for downstream tasks.
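The class-mixing step $\mathbf{c}^{*} = \alpha \mathbf{c}^{i} + \beta \mathbf{c}^{j}$ and the grid search over $\alpha,\beta$ can be illustrated with a minimal Python sketch. The function names `mix_conditions`, `grid_search_weights`, and `toy_score` are hypothetical, and the scoring function is a toy stand-in: the actual dissimilarity and similarity objective is not specified in this summary, so it is passed in as a caller-supplied callable.

```python
import itertools
import numpy as np


def mix_conditions(c_i, c_j, alpha, beta):
    """Return the mixed class condition c* = alpha * c_i + beta * c_j."""
    c_i = np.asarray(c_i, dtype=float)
    c_j = np.asarray(c_j, dtype=float)
    return alpha * c_i + beta * c_j


def grid_search_weights(c_i, c_j, candidates, score_fn):
    """Evaluate every (alpha, beta) pair from `candidates` and return the
    pair with the highest score. `score_fn` is supplied by the caller."""
    best_score, best_pair = None, None
    for alpha, beta in itertools.product(candidates, repeat=2):
        c_mix = mix_conditions(c_i, c_j, alpha, beta)
        score = score_fn(c_mix, c_i, c_j)
        if best_score is None or score > best_score:
            best_score, best_pair = score, (alpha, beta)
    return best_pair


def toy_score(c_mix, c_i, c_j):
    """Toy objective (an assumption, NOT the paper's measure): prefer a mix
    that is equally far from both source conditions and has unit norm."""
    c_i = np.asarray(c_i, dtype=float)
    c_j = np.asarray(c_j, dtype=float)
    balance = abs(np.linalg.norm(c_mix - c_i) - np.linalg.norm(c_mix - c_j))
    norm_gap = abs(np.linalg.norm(c_mix) - 1.0)
    return -(balance + norm_gap)


# Two one-hot class conditions standing in for identities i and j.
c_i = [1.0, 0.0]
c_j = [0.0, 1.0]
alpha, beta = grid_search_weights(c_i, c_j, [0.3, 0.5, 0.7, 0.9], toy_score)
c_star = mix_conditions(c_i, c_j, alpha, beta)
```

In a full pipeline, `score_fn` would presumably generate images from the diffusion model conditioned on each candidate `c_mix` and score them with the discriminator, which is where most of the upfront computation would go; the sketch only shows the search structure.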
Abstract
The increasing reliance on large-scale datasets in machine learning poses significant privacy and ethical challenges, particularly in sensitive domains such as face recognition (FR). Synthetic data generation offers a promising alternative; however, most existing methods depend heavily on external datasets or pre-trained models, increasing complexity and resource demands. In this paper, we introduce AugGen, a self-contained synthetic augmentation technique. AugGen strategically samples from a class-conditional generative model trained exclusively on the target FR dataset, eliminating the need for external resources. Evaluated across 8 FR benchmarks, including IJB-C and IJB-B, our method achieves 1–12% performance improvements, outperforming models trained solely on real data and surpassing state-of-the-art synthetic data generation approaches, while using less real data. Notably, these gains often exceed those from architectural enhancements, underscoring the value of synthetic augmentation in data-limited scenarios. Our findings demonstrate that carefully integrated synthetic data can both mitigate privacy constraints and substantially enhance recognition performance. Paper website: https://parsa-ra.github.io/auggen/.
