AIM-Fair: Advancing Algorithmic Fairness via Selectively Fine-Tuning Biased Models with Contextual Synthetic Data

Zengqun Zhao; Ziquan Liu; Yu Cao; Shaogang Gong; Ioannis Patras

TL;DR

AIM-Fair tackles fairness in facial attribute classification (FAC) when demographic labels are unavailable. It combines Contextual Synthetic Data Generation (CSDG), built on a latent diffusion model, with a gradient-based Selective Fine-Tuning (SFT) framework that updates only the parameters that are more sensitive to bias and less sensitive to domain shift. The three-component pipeline (CSDG, SMG, SFT) uses GPT-4-driven prompts to produce diverse synthetic data with no manual annotation, and identifies the parameter subset to update from gradient differences across biased real, biased synthetic, and unbiased synthetic data, so that fine-tuning improves fairness while preserving utility. On CelebA and UTKFace, AIM-Fair achieves better fairness (lower EO, higher WST) while maintaining or improving ACC compared with fully and partially fine-tuned baselines and with generative-data approaches such as DiGA. The work points to a scalable, annotation-free way to debias FAC models through context-rich synthetic data and parameter-wise selective updates.

Abstract

Recent advances in generative models have sparked research on improving model fairness with AI-generated data. However, existing methods often face limitations in the diversity and quality of synthetic data, leading to compromised fairness and overall model accuracy. Moreover, many approaches rely on the availability of demographic group labels, which are often costly to annotate. This paper proposes AIM-Fair, aiming to overcome these limitations and harness the potential of cutting-edge generative models in promoting algorithmic fairness. We investigate a fine-tuning paradigm starting from a biased model initially trained on real-world data without demographic annotations. This model is then fine-tuned using unbiased synthetic data generated by a state-of-the-art diffusion model to improve its fairness. Two key challenges are identified in this fine-tuning paradigm: 1) the low quality of synthetic data, which can still occur even with advanced generative models, and 2) the domain and bias gap between real and synthetic data. To address the limitation of synthetic data quality, we propose Contextual Synthetic Data Generation (CSDG) to generate data using a text-to-image (T2I) diffusion model with prompts generated by a context-aware LLM, ensuring both data diversity and control of bias in synthetic data. To resolve domain and bias shifts, we introduce a novel selective fine-tuning scheme in which only model parameters more sensitive to bias and less sensitive to domain shift are updated. Experiments on the CelebA and UTKFace datasets show that AIM-Fair improves model fairness while maintaining utility, outperforming both fully and partially fine-tuned approaches to model fairness.
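
The selective fine-tuning idea described above can be illustrated with a small sketch. This is not the authors' implementation: the scoring rule (bias gap minus domain gap), the `keep_ratio` parameter, and the function names `selective_mask` and `masked_sgd_step` are all assumptions made for illustration. It assumes per-parameter gradients are available from three data sources (biased real, biased synthetic, unbiased synthetic) and selects parameters whose gradients change a lot when the bias changes but little when the domain changes.

```python
def selective_mask(g_real_biased, g_syn_biased, g_syn_unbiased, keep_ratio=0.5):
    """Return a 0/1 mask over parameters (hypothetical scoring rule).

    Each argument is a flat list of per-parameter gradients computed on
    one data source. All names and the scoring rule are illustrative
    assumptions, not the paper's exact procedure.
    """
    n = len(g_real_biased)
    if len(g_syn_biased) != n or len(g_syn_unbiased) != n:
        raise ValueError("gradient lists must have the same length")

    # Bias sensitivity: same domain (synthetic), different bias level.
    bias_gap = [abs(b - u) for b, u in zip(g_syn_biased, g_syn_unbiased)]
    # Domain sensitivity: same bias level, different domain (real vs synthetic).
    domain_gap = [abs(r - b) for r, b in zip(g_real_biased, g_syn_biased)]

    # Prefer parameters that react to bias but not to the domain shift.
    scores = [b - d for b, d in zip(bias_gap, domain_gap)]

    k = max(1, round(keep_ratio * n))
    top = sorted(range(n), key=lambda i: scores[i], reverse=True)[:k]

    mask = [0] * n
    for i in top:
        mask[i] = 1
    return mask


def masked_sgd_step(params, grads, mask, lr=0.1):
    """Plain SGD step that leaves masked-out parameters unchanged."""
    return [p - lr * g * m for p, g, m in zip(params, grads, mask)]


if __name__ == "__main__":
    g_real_biased = [1.0, 1.0, 1.0, 1.0]
    g_syn_biased = [1.0, 3.0, 1.0, 0.0]
    g_syn_unbiased = [0.0, 3.0, -1.0, 0.0]

    mask = selective_mask(g_real_biased, g_syn_biased, g_syn_unbiased)
    print(mask)
    # Fine-tune on the unbiased synthetic gradients, selected parameters only.
    print(masked_sgd_step([1.0, 1.0, 1.0, 1.0], g_syn_unbiased, mask, lr=0.5))
```

In a real model the gradients would be tensors per layer and the mask would be applied inside the optimizer step; the flat lists here only keep the example self-contained.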
