Projected Boosting with Fairness Constraints: Quantifying the Cost of Fair Training Distributions

Amir Asiaee, Kaveh Aryan

TL;DR

This work addresses enforcing group fairness in boosting by introducing FairProj, a projection-based approach that decouples the ensemble-induced distribution used for coefficient updates from a fairness-constrained training distribution. At each round, FairProj projects the current exponential-weights distribution $q^t$ onto a convex fairness set to obtain $w^t$, trains the weak learner on $w^t$, and computes the boosting coefficient $\alpha_t$ under $q^t$, preserving the AdaBoost-style exponential-loss dynamics. Theoretical contributions include a tight edge-transfer bound $\gamma_t^{(q)} \geq \gamma_t^{(w)} - \delta_t$ with $\delta_t = \sqrt{\mathrm{KL}(w^t \| q^t)/2}$ and an exponential loss bound $L_{\mathrm{exp}}(f_T) \leq n \exp\big(-2 \sum_{t=1}^T (\gamma_t^{(w)} - \delta_t)^2\big)$, which explicitly quantifies the cost of enforcing fairness in boosting. The paper also provides a closed-form KL-projection solution, a practical algorithm (FairProj) with warm-starting, and empirical evidence across standard benchmarks showing meaningful fairness-accuracy tradeoffs and stable training dynamics. Overall, it offers a principled framework to analyze and quantify the tradeoffs between fairness constraints and boosting performance, enabling targeted, data-driven fairness in sequential ensemble learning.
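The two bounds above can be checked numerically on toy data. The following is a minimal sketch, not the paper's code. It assumes labels and weak-learner outputs in $\{-1, +1\}$ and takes the edge under a distribution $p$ to be $\tfrac{1}{2}\sum_i p_i y_i h(x_i)$ (one half minus the weighted error); the paper's Definition 5.1 may be stated differently. The helper names (`kl_divergence`, `edge`, `fairness_cost`, `exp_loss_bound`) are hypothetical, and clipping negative effective edges to zero is an assumption of this sketch, since squaring a negative value would overstate progress.

```python
import numpy as np

def kl_divergence(w, q):
    """KL(w || q) for discrete distributions; entries with w_i = 0 contribute 0."""
    w, q = np.asarray(w, dtype=float), np.asarray(q, dtype=float)
    mask = w > 0
    return max(float(np.sum(w[mask] * np.log(w[mask] / q[mask]))), 0.0)

def edge(dist, y, h):
    """Assumed edge: 1/2 minus weighted error, i.e. 0.5 * sum_i dist_i * y_i * h_i."""
    return 0.5 * float(np.sum(dist * y * h))

def fairness_cost(w, q):
    """delta = sqrt(KL(w || q) / 2), the fairness cost term from the text."""
    return float(np.sqrt(kl_divergence(w, q) / 2.0))

def exp_loss_bound(n, edges_w, costs):
    """n * exp(-2 * sum_t (gamma_w_t - delta_t)^2).

    Negative effective edges are clipped to 0 (an assumption of this sketch).
    """
    effective = np.asarray(edges_w, dtype=float) - np.asarray(costs, dtype=float)
    effective = np.clip(effective, 0.0, None)
    return n * float(np.exp(-2.0 * np.sum(effective ** 2)))

# Toy round: random labels, random predictions, two strictly positive distributions.
rng = np.random.default_rng(0)
n = 200
y = rng.choice([-1.0, 1.0], size=n)
h = rng.choice([-1.0, 1.0], size=n)
q = rng.random(n) + 0.1
q /= q.sum()
w = rng.random(n) + 0.1
w /= w.sum()

gamma_q, gamma_w = edge(q, y, h), edge(w, y, h)
delta = fairness_cost(w, q)
print("edge under q:", gamma_q)
print("edge under w minus delta:", gamma_w - delta)
print("loss bound after 2 toy rounds:", exp_loss_bound(n, [0.3, 0.3], [0.1, 0.1]))
```

In this sketch `w` is an arbitrary distribution rather than an actual fairness projection of `q`; the edge-transfer inequality only needs the two distributions and their KL divergence, so the check still applies.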

Abstract

Boosting algorithms enjoy strong theoretical guarantees: when weak learners maintain positive edge, AdaBoost achieves geometric decrease of exponential loss. We study how to incorporate group fairness constraints into boosting while preserving analyzable training dynamics. Our approach, FairBoost, projects the ensemble-induced exponential-weights distribution onto a convex set of distributions satisfying fairness constraints (as a reweighting surrogate), then trains weak learners on this fair distribution. The key theoretical insight is that projecting the training distribution reduces the effective edge of weak learners by a quantity controlled by the KL-divergence of the projection. We prove an exponential-loss bound where the convergence rate depends on weak learner edge minus a "fairness cost" term $\delta_t = \sqrt{\mathrm{KL}(w^t \| q^t)/2}$. This directly quantifies the accuracy-fairness tradeoff in boosting dynamics. Experiments on standard benchmarks validate the theoretical predictions and demonstrate competitive fairness-accuracy tradeoffs with stable training curves.
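One standard route to a fairness cost of this form goes through Pinsker's inequality. The derivation below is a sketch under assumptions that are not stated in this summary: labels and weak-learner outputs lie in $\{-1, +1\}$, and the edge under a distribution $p$ is $\gamma^{(p)} = \tfrac{1}{2}\sum_i p_i\, y_i h_t(x_i)$. The paper's own proof of the edge-transfer lemma is not reproduced here and may differ.

```latex
\begin{align*}
\gamma_t^{(w)} - \gamma_t^{(q)}
  &= \tfrac{1}{2} \sum_{i=1}^{n} \bigl(w_i^t - q_i^t\bigr)\, y_i h_t(x_i) \\
  &\le \tfrac{1}{2} \lVert w^t - q^t \rVert_1
   = \mathrm{TV}(w^t, q^t)
   && \text{since } \lvert y_i h_t(x_i) \rvert = 1 \\
  &\le \sqrt{\mathrm{KL}(w^t \,\|\, q^t)/2} = \delta_t
   && \text{(Pinsker's inequality)}
\end{align*}
```

Rearranging gives $\gamma_t^{(q)} \geq \gamma_t^{(w)} - \delta_t$: a weak learner trained on the fair distribution keeps its edge under the ensemble distribution up to the cost of the projection.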

Paper Structure

This paper contains 60 sections, 5 theorems, 38 equations, 3 figures, 2 tables, 1 algorithm.

Key Result

Lemma 4.1

Let $q \in \Delta_n$ and let $\mathcal{C}_\epsilon$ be the fairness constraint set of Definition 3.2. Define: The optimal dual variables solve: The projected distribution is: and the projection divergence is:
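To make the idea of a KL projection via the dual concrete, here is a minimal sketch for a simplified, hypothetical setting: a single linear constraint $|\sum_i w_i a_i| \leq \epsilon$, where `a` is an assumed per-example score vector whose weighted mean measures the violation. In that setting the minimizer of $\mathrm{KL}(w \| q)$ is an exponential tilt $w_i \propto q_i \exp(-\lambda a_i)$ with a scalar dual variable $\lambda$, which the sketch finds by bisection. The paper's constraint set and its closed-form expressions are not reproduced in this summary, so the function `kl_project` and its parameters are illustrative assumptions, not the paper's formulas.

```python
import numpy as np

def kl_project(q, a, eps, tol=1e-12, max_iter=200):
    """Project q onto {w in simplex : |sum_i w_i a_i| <= eps} in KL(w || q).

    Assumes q is strictly positive (true for exponential weights).
    Returns (w, lam, kl) with w_i proportional to q_i * exp(-lam * a_i).
    """
    q = np.asarray(q, dtype=float)
    a = np.asarray(a, dtype=float)
    v = float(q @ a)
    if abs(v) <= eps:
        return q.copy(), 0.0, 0.0           # q is already feasible
    s = 1.0 if v > 0 else -1.0
    b = s * a                                # flip sign so that q @ b > eps
    if b.min() >= eps:
        raise ValueError("constraint set is empty or only reachable in the limit")

    def tilt(mu):
        logits = np.log(q) - mu * b
        logits -= logits.max()               # stabilize the exponentials
        w = np.exp(logits)
        return w / w.sum()

    # The tilted mean tilt(mu) @ b is nonincreasing in mu, so bisection applies.
    lo, hi = 0.0, 1.0
    while tilt(hi) @ b > eps:                # grow the bracket until hi is feasible
        lo, hi = hi, 2.0 * hi
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        if tilt(mid) @ b > eps:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    w = tilt(hi)                             # hi stays on the feasible side
    mask = w > 0
    kl = float(np.sum(w[mask] * np.log(w[mask] / q[mask])))
    return w, s * hi, kl

# Toy example: a_i = +1 for a hypothetical group A, -1 for group B,
# so sum_i w_i a_i is the difference in total weight between the groups.
rng = np.random.default_rng(0)
n = 50
q = rng.random(n) + 0.1
q /= q.sum()
a = np.where(np.arange(n) < 15, 1.0, -1.0)
w, lam, kl = kl_project(q, a, eps=0.05)
print("violation before:", q @ a, "after:", w @ a)
print("dual variable:", lam, "fairness cost:", np.sqrt(kl / 2.0))
```

Returning the feasible end of the bracket keeps the constraint satisfied up to floating-point error. With several constraints the dual variable becomes a vector and bisection would have to be replaced by a multivariate solver.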

Figures (3)

  • Figure 1: EOpp fairness-accuracy tradeoffs across Adult (A=sex), German (A=sex), and COMPAS (A=race). FairProj traces a Pareto curve as $\epsilon$ varies (smaller is tighter). On Adult and German, tightening $\epsilon$ yields large EOpp reductions with moderate accuracy loss; on COMPAS the gains are smaller. EG achieves lower gaps but can incur a larger accuracy cost.
  • Figure 2: DP fairness-accuracy tradeoffs across datasets for the mass-balanced DP surrogate. Smaller $\epsilon$ can reduce DP gap relative to unconstrained AdaBoost, at the cost of early termination and reduced accuracy. Reweighing is often competitive on DP since it directly targets group-label mass balancing.
  • Figure 3: Training dynamics for FairProj with $\epsilon = 0.25$ on Adult. Top-left: Exponential loss decreases monotonically. Top-right: Edge under fair distribution $\gamma_t^{(w)}$ (blue), edge under ensemble distribution $\gamma_t^{(q)}$ (orange), and effective edge lower bound $\gamma_t^{(w)} - \delta_t$ (green dashed). The bound from Lemma 5.2 holds at every round. Bottom-left: Fairness cost $\delta_t$ stabilizes after initial rounds. Bottom-right: Constraint violation stays below threshold $\epsilon$.
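The monotone decrease of the exponential loss in Figure 3 is what one would expect when the coefficient $\alpha_t$ is computed under $q^t$, whichever distribution the weak learner was trained on. The sketch below illustrates a single AdaBoost-style step under assumptions not spelled out in this summary: labels and predictions in $\{-1, +1\}$, exponential weights $q_i \propto \exp(-y_i f(x_i))$, and the usual coefficient $\alpha = \tfrac{1}{2}\ln\frac{1 + 2\gamma^{(q)}}{1 - 2\gamma^{(q)}}$. The function name `boosting_step` is hypothetical.

```python
import numpy as np

def boosting_step(margins, y, h):
    """One AdaBoost-style update with the coefficient computed under q.

    margins: current y_i * f(x_i); y, h: labels and predictions in {-1, +1}.
    Requires |gamma_q| < 1/2, i.e. h is neither perfect nor perfectly wrong under q.
    """
    u = np.exp(-margins)                     # unnormalized exponential weights
    loss_before = float(u.sum())
    q = u / loss_before                      # ensemble-induced distribution
    gamma_q = 0.5 * float(np.sum(q * y * h))
    alpha = 0.5 * np.log((1 + 2 * gamma_q) / (1 - 2 * gamma_q))
    new_margins = margins + alpha * y * h
    loss_after = float(np.exp(-new_margins).sum())
    return alpha, new_margins, loss_before, loss_after, gamma_q

# Toy data: arbitrary current margins and a random weak learner.
rng = np.random.default_rng(1)
n = 100
y = rng.choice([-1.0, 1.0], size=n)
h = rng.choice([-1.0, 1.0], size=n)
margins = rng.normal(size=n)

alpha, margins_next, loss_before, loss_after, gamma_q = boosting_step(margins, y, h)
print("loss ratio:", loss_after / loss_before)
print("sqrt(1 - 4 gamma_q^2):", np.sqrt(1 - 4 * gamma_q ** 2))
```

With this choice of $\alpha$ the loss shrinks by the factor $\sqrt{1 - 4(\gamma^{(q)})^2} \leq \exp(-2(\gamma^{(q)})^2)$ per round, so any lower bound on $\gamma^{(q)}$, such as $\gamma^{(w)} - \delta$ when that quantity is nonnegative, translates into a bound on the loss.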

Theorems & Definitions (16)

  • Definition 3.1: Exponential-weights distribution
  • Definition 3.2: Fairness constraint set
  • Remark 3.3: Reweighting surrogate, not classifier fairness
  • Example 3.4: Equal opportunity surrogate
  • Lemma 4.1: KL projection via dual
  • proof
  • Remark 4.2: Computational cost
  • Definition 5.1: Edge
  • Lemma 5.2: Edge transfer bound
  • proof
  • ...and 6 more