M$^2$FGB: A Min-Max Gradient Boosting Framework for Subgroup Fairness

Jansen S. B. Pereira, Giovani Valdrighi, Marcos Medeiros Raimundo

TL;DR

This work tackles subgroup fairness in supervised learning by introducing M²FGB, a min-max gradient-boosting framework that minimizes a combined objective aimed at the worst-group loss. It uses a primal-dual boosting scheme with dual variables to balance overall and group-specific losses, and it employs differentiable proxy losses for fairness metrics such as equalized loss, equality of opportunity, and demographic parity. The approach is shown to converge under mild conditions and delivers competitive accuracy and fairness on German Credit, COMPAS, ENEM, and ACSIncome, with favorable computation relative to existing min-max methods. The method is versatile for both classification and regression on tabular data and is released as open-source code for practical deployment.

Abstract

In recent years, fairness in machine learning has emerged as a critical concern: developed and deployed predictive models should not produce disadvantageous predictions for marginalized groups. It is essential to mitigate discrimination against individuals based on protected attributes such as gender and race. In this work, we apply subgroup justice concepts to gradient-boosting machines designed for supervised learning problems. Our approach extends gradient-boosting methodologies to a broader range of objective functions, which combine conventional losses, such as those used in classification and regression, with a min-max fairness term. We study relevant theoretical properties of the solution of the min-max optimization problem. The optimization process explores the primal-dual problems at each boosting round. This generic framework can be adapted to diverse fairness concepts. The proposed min-max primal-dual gradient boosting algorithm is theoretically shown to converge under mild conditions and empirically shown to be a powerful and flexible approach to address binary and subgroup fairness.
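
The primal-dual idea described above can be illustrated with a small sketch. This is not the authors' implementation: it assumes the combined objective has the form $(1-\lambda)\,\mathcal{L} + \lambda \max_z \overline{\mathcal{L}}_z$, relaxes the max with dual weights `mu` on the simplex, uses the logistic loss, and replaces the tree-fitting step of a boosting round with a plain gradient step on the raw scores. All function names (`group_losses`, `combined_gradient`, `dual_step`) are hypothetical.

```python
import numpy as np

def sigmoid(s):
    return 1.0 / (1.0 + np.exp(-s))

def log_loss(y, s):
    # Per-sample binary cross-entropy on raw scores s (stable form).
    return np.logaddexp(0.0, s) - y * s

def group_losses(y, s, z, n_groups):
    # Mean loss inside each group g = 0, ..., n_groups - 1.
    per_sample = log_loss(y, s)
    return np.array([per_sample[z == g].mean() for g in range(n_groups)])

def combined_gradient(y, s, z, mu, lam):
    # Gradient w.r.t. the scores of the assumed objective
    # (1 - lam) * mean loss + lam * sum_g mu[g] * (mean loss of group g).
    n = len(y)
    counts = np.bincount(z, minlength=len(mu))
    weights = (1.0 - lam) / n + lam * mu[z] / counts[z]
    return weights * (sigmoid(s) - y)

def dual_step(mu, losses, lr):
    # Exponentiated-gradient ascent: shifts weight toward the groups
    # with the largest loss while keeping mu on the simplex.
    w = mu * np.exp(lr * losses)
    return w / w.sum()

# Toy run: two groups, the second one smaller.
y = np.array([1.0, 0.0, 1.0, 0.0, 1.0, 0.0])
z = np.array([0, 0, 0, 0, 1, 1])
s = np.zeros(6)          # raw scores, standing in for the boosted ensemble
mu = np.full(2, 0.5)     # dual variables, one per group
for _ in range(20):
    mu = dual_step(mu, group_losses(y, s, z, 2), lr=0.5)   # dual update
    s = s - 1.0 * combined_gradient(y, s, z, mu, lam=0.5)  # primal update
print(group_losses(y, s, z, 2), mu)
```

In an actual gradient-boosting setting, the per-sample gradient returned by `combined_gradient` would be the target that each new tree is fitted to, rather than being applied to the scores directly.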

Paper Structure

This paper contains 17 sections, 2 theorems, 19 equations, 7 figures, 2 tables, 1 algorithm.

Key Result

Theorem 1

A predictor (classifier or regressor) $f_{m}$ that solves the min-max problem has all group-wise losses lower than or equal to those of any predictor with equal group-wise losses, i.e., $f_{eq}$ such that $\overline{\mathcal{L}}_z(\bm{y}, f_{eq}(\bm{x})) = \overline{\mathcal{L}}_{\hat{z}}(\ldots$

Figures (7)

  • Figure 1: Comparison between LGBM and M²FGB. The first column displays the decrease in log cross-entropy $\mathcal{L}$ for the model. In the following two columns, each of the 8 lines displays the evolution, for a specific group $z$, of the differentiable loss function ($\overline{\mathcal{L}}_z$) and of the non-differentiable metric. $\lambda$ was set to 0.5.
  • Figure 2: M²FGB solutions at different fairness strength values ($\lambda$ parameter). Increasing $\lambda$ improves performance on the worst-performing group, at a cost to the overall loss.
  • Figure 3: Average performance and fairness of the algorithms on the test set over 1000 executions. 95% confidence intervals are displayed but are very narrow. Hyperparameter optimization was executed at different levels of the trade-off between performance and fairness. M²FGB achieved higher or competitive accuracy and WG TP rate on all studied datasets.
  • Figure 4: Average performance and fairness of hyperparameter optimization performed at multiple values of $\alpha$ under the positive rate criterion. This setting may result in models that only predict the positive class, as occurred with LGBM on the German Credit and COMPAS datasets, and with M²FGB on German Credit. On the other datasets, M²FGB was able to obtain fairness gains without a high cost in performance.
  • Figure 5: Average performance and fairness of hyperparameter optimization performed at multiple values of $\alpha$ in a regression task.
  • ...and 2 more figures
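
The captions above report a "WG TP rate". As a minimal sketch, assuming this denotes the lowest true positive rate across groups (the helper name `worst_group_tpr` is hypothetical and not taken from the paper's code), such a non-differentiable metric could be computed as:

```python
import numpy as np

def worst_group_tpr(y_true, y_pred, groups):
    # True positive rate per group, computed over that group's positives;
    # the worst-group value is the minimum over groups.
    rates = []
    for g in np.unique(groups):
        positives = (groups == g) & (y_true == 1)
        if positives.any():  # skip groups with no positive examples
            rates.append(y_pred[positives].mean())
    if not rates:
        raise ValueError("no group contains positive examples")
    return min(rates)

y_true = np.array([1, 1, 0, 1, 1, 0])
y_pred = np.array([1, 0, 0, 1, 1, 1])
groups = np.array([0, 0, 0, 1, 1, 1])
print(worst_group_tpr(y_true, y_pred, groups))
```

Because this metric depends on thresholded predictions, it cannot be used directly as a boosting objective, which is why a differentiable proxy loss is tracked alongside it.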

Theorems & Definitions (4)

  • Theorem 1: No unnecessary harm
  • proof
  • Theorem 2: Fairness monotonicity
  • proof