Bridging Jensen Gap for Max-Min Group Fairness Optimization in Recommendation

Chen Xu, Yuxin Li, Wenjie Wang, Liang Pang, Jun Xu, Tat-Seng Chua

TL;DR

This work tackles the Jensen gap that arises when optimizing group max-min fairness (MMF) in recommender systems under mini-batch training. It reformulates the MMF-constrained objective as a group-weighted accuracy objective via a dual formulation and introduces FairDual, a scalable algorithm that updates group weights through dual mirror-gradient steps and converges to the global optimum at a sub-linear rate. The authors prove a bound on the Jensen gap under mini-batch training with random shuffle and demonstrate that FairDual reduces the gap while improving both ranking accuracy (NDCG/MRR) and fairness (MMF) across six backbones on three public datasets. The approach enables efficient, theoretically grounded fairness optimization in industrial-scale RS, with code publicly available.
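To make the idea of updating group weights by mirror-gradient steps concrete, the following is a minimal sketch of one standard instance of mirror descent on the probability simplex, the exponentiated-gradient update. It relies on the identity that the worst-off group's utility equals the minimum of a weighted sum of group utilities over all weight vectors on the simplex. The function name `mirror_step`, the learning rate, and the utility values are hypothetical choices for illustration; this is not the FairDual algorithm itself, whose exact update is not given in this summary.

```python
import numpy as np


def mirror_step(weights, group_utility, lr=0.5):
    """One entropic mirror-descent (exponentiated-gradient) step on the simplex.

    The step decreases <weights, group_utility>, so probability mass moves
    toward groups whose utility is currently low.
    """
    logits = np.log(weights) - lr * group_utility
    logits -= logits.max()  # shift for numerical stability before exponentiating
    new_weights = np.exp(logits)
    return new_weights / new_weights.sum()


# Hypothetical per-group utilities; group 2 is the worst-off group.
group_utility = np.array([0.9, 0.5, 0.1])

# Start from uniform weights and apply a few steps.
weights = np.full(3, 1.0 / 3.0)
for _ in range(20):
    weights = mirror_step(weights, group_utility)

print(weights)
```

With fixed utilities the weights concentrate on the worst-off group, which is the behavior a group-weighted objective needs in order to protect that group. In a training loop the utilities would change between steps, so the weights would keep adapting.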

Abstract

Group max-min fairness (MMF) is commonly used as an optimization objective in fairness-aware recommender systems (RS), as it aims to protect marginalized item groups and ensure a fair competition platform. However, our theoretical analysis indicates that integrating an MMF constraint violates the assumption of sample independence during optimization, causing the loss function to deviate from linear additivity. This nonlinearity introduces a Jensen gap between the model's convergence point and the optimal point when mini-batch sampling is applied. Both theoretical and empirical studies show that the Jensen gap widens as the mini-batch size decreases and the group size increases. Some methods that use heuristic re-weighting or debiasing strategies have the potential to bridge the Jensen gap, but they either lack theoretical guarantees or incur heavy computational costs. To overcome these limitations, we first demonstrate theoretically that the MMF-constrained objective can be reformulated as a group-weighted optimization objective. We then present an efficient and effective algorithm named FairDual, which uses a dual optimization technique to minimize the Jensen gap. Our theoretical analysis shows that FairDual achieves a sub-linear convergence rate to the globally optimal solution and that the Jensen gap is bounded under a mini-batch sampling strategy with random shuffle. Extensive experiments using six large-scale RS backbone models on three publicly available datasets demonstrate that FairDual outperforms all baselines in terms of both accuracy and fairness. Our data and code are shared at https://github.com/XuChen0427/FairDual.
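The Jensen gap described above can be illustrated with a toy computation. Because the minimum over groups is a concave function, averaging the per-batch minimum can never exceed the minimum computed on the full data. The sketch below is an assumption-laden illustration, not the paper's loss or experimental setup: the data are random, each sample contributes utility to exactly one group, and the names `min_group_utility` and `jensen_gap` are hypothetical. Batches come from a random shuffle split into equal-size chunks.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_groups = 1024, 7

# Hypothetical data: each sample gives a utility in [0, 1) to exactly one group.
group_of = rng.integers(0, n_groups, size=n_samples)
utility = rng.random(n_samples)


def min_group_utility(idx):
    """Utility of the worst-off group, normalized by the number of samples in idx."""
    totals = np.bincount(group_of[idx], weights=utility[idx], minlength=n_groups)
    return totals.min() / len(idx)


def jensen_gap(batch_size):
    """Full-data max-min objective minus the average of per-batch objectives."""
    perm = rng.permutation(n_samples)          # random shuffle
    batches = perm.reshape(-1, batch_size)     # equal-size mini-batches
    batch_mean = np.mean([min_group_utility(b) for b in batches])
    full = min_group_utility(np.arange(n_samples))
    return full - batch_mean


for b in (8, 128, 1024):
    print(b, jensen_gap(b))
```

In this toy setting the gap is non-negative for every batch size and vanishes when a single batch covers all samples. Small batches often miss at least one group entirely, which drives the per-batch minimum to zero and widens the gap.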

Paper Structure

This paper contains 31 sections, 7 theorems, 61 equations, 6 figures, 8 tables, 1 algorithm.

Key Result

Theorem 1

For a vector $\bm{x}\in\mathbb{R}^n$, $\bm{x}^i$ denotes the vector with each element raised to the power $i$. Similarly, $\log(\bm{x})$ denotes the vector whose elements are $\log(\bm{x}_i)$. Let $\bm{A}\in\mathbb{R}^{|\mathcal{I}|\times|\mathcal{G}|}$ be the item-group adjacency matrix, and let $\hat{\bm{A}}$ be the row-normalized version of $\bm{A}$: $\hat{\bm{A}}=\text{diag}(\bm{A}\bm

Figures (6)

  • Figure 1: Loss convergence simulation with 1000 users and 1000 items. Sub-figures (a) and (b) illustrate the distance from the optimal point (i.e., the Jensen gap) w.r.t. mini-batch size and group size, respectively. Sub-figure (a) uses a fixed group size (G=7) under different batch sizes, while sub-figure (b) uses a fixed batch size (B=32) under different group sizes. Sub-figure (c) shows the convergence trace under different batch sizes.
  • Figure 2: Overall workflow of FairDual under every two batches $j$ and $j+1$.
  • Figure 3: Sub-figure (a) shows a simulation of how the Jensen gap $J(B)$ changes w.r.t. batch size $B$ and group size $|\mathcal{G}|$ for all baselines and FairDual. Sub-figures (b) and (c) are conducted on the MIND dataset under BigRec. Sub-figure (b) shows how NDCG and MMF change w.r.t. the accuracy-fairness trade-off coefficient $\lambda$. Sub-figure (c) presents a case study on the advantaged group News and the worst-off group Sports, showing their weight $\bm{s}_g$, group score $\bm{w}_g$, and t-SNE embeddings for UNI and FairDual.
  • Figure 4: Sub-figure (a) and (b) describe the NDCG and MMF changes w.r.t. freeze parameter updating gap $\beta$.
  • Figure 5: Sub-figure (a) and (b) describe the NDCG and MMF changes w.r.t. dual learning rate $\eta$.
  • ...and 1 more figure

Theorems & Definitions (14)

  • Theorem 1
  • Theorem 2
  • Theorem 3
  • Theorem 4: Bound on Jensen Gap
  • proof
  • Lemma 1
  • proof
  • proof
  • Lemma 2
  • proof
  • ...and 4 more