Fair Recommendations with Limited Sensitive Attributes: A Distributionally Robust Optimization Approach

Tianhao Shi, Yang Zhang, Jizhi Zhang, Fuli Feng, Xiangnan He

TL;DR

This work tackles fairness in recommender systems when only a subset of sensitive attributes is available, a setting motivated by privacy and legal constraints. It introduces Distributionally Robust Fair Optimization (DRFO), which hedges against reconstruction errors by minimizing worst-case unfairness over an ambiguity set of distributions around the reconstructed attributes. The method has two parts: it builds an ambiguity set, measured in total variation (TV) distance, whose size is derived from the reconstruction errors, and it optimizes a distributionally robust optimization (DRO) objective that enforces fairness for every distribution in that set. The paper supports the method with a theoretical robustness analysis and with experiments. On MovieLens-1M and Tenrec, DRFO is fairer than reconstruction-based baselines, particularly as the proportion of unknown sensitive attributes grows, while incurring only modest losses in recommendation accuracy. The approach also accommodates users who refuse reconstruction by assigning them broad ambiguity sets, preserving fairness in real-world, privacy-conscious scenarios.

Abstract

As recommender systems are indispensable in various domains such as job searching and e-commerce, providing equitable recommendations to users with different sensitive attributes becomes an imperative requirement. Prior approaches for enhancing fairness in recommender systems presume the availability of all sensitive attributes, which can be difficult to obtain due to privacy concerns or inadequate means of capturing these attributes. In practice, the efficacy of these approaches is limited, pushing us to investigate ways of promoting fairness with limited sensitive attribute information. Toward this goal, it is important to reconstruct missing sensitive attributes. Nevertheless, reconstruction errors are inevitable due to the complexity of real-world sensitive attribute reconstruction problems and legal regulations. Thus, we pursue fair learning methods that are robust to reconstruction errors. To this end, we propose Distributionally Robust Fair Optimization (DRFO), which minimizes the worst-case unfairness over all potential probability distributions of missing sensitive attributes instead of the reconstructed one to account for the impact of the reconstruction errors. We provide theoretical and empirical evidence to demonstrate that our method can effectively ensure fairness in recommender systems when only limited sensitive attributes are accessible.
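
To make "worst-case unfairness over all potential probability distributions" concrete, the following is a minimal sketch of the inner maximization for a single group, assuming a TV-distance ball around a reconstructed distribution. It is not the paper's training algorithm, which this page does not reproduce. The names `worst_case_mean`, `q_hat`, and `rho` are hypothetical, and the scores are treated as fixed numbers rather than model outputs being optimized.

```python
import numpy as np

def worst_case_mean(values, q_hat, rho):
    """Largest value of sum(q * values) over distributions q on the same
    support with TV(q, q_hat) <= rho, where TV = 0.5 * sum |q - q_hat|.

    For a linear objective the maximizer is greedy: take up to `rho` mass
    from the lowest-valued points and place it on the highest-valued point.
    """
    values = np.asarray(values, dtype=float)
    q = np.asarray(q_hat, dtype=float).copy()
    top = int(np.argmax(values))
    # No more mass can be moved than currently lies outside `top`.
    budget = min(rho, 1.0 - q[top])
    moved = 0.0
    for i in np.argsort(values):  # ascending: lowest values first
        if i == top:
            continue
        take = min(q[i], budget - moved)
        q[i] -= take
        moved += take
        if moved >= budget:
            break
    q[top] += moved
    return float(q @ values), q

# Illustrative numbers only (assumed, not from the paper).
scores = [0.2, 0.5, 0.9]   # predicted scores for three users in one group
q_hat = [0.5, 0.3, 0.2]    # reconstructed weights over those users
nominal = float(np.dot(q_hat, scores))
worst, q_star = worst_case_mean(scores, q_hat, rho=0.1)
print(nominal, worst)
```

A larger `rho` widens the ball and can only raise the worst-case value, which is why a poorly reconstructed group (or a user who refuses reconstruction) leads to a more conservative fairness constraint. The worst case in the other direction follows by negating `values`.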

Paper Structure

This paper contains 28 sections, 2 theorems, 13 equations, 5 figures, 2 tables, 1 algorithm.

Key Result

Theorem 1

Assuming that the reconstructed sensitive attributes $\hat{S}$ have the same prior distribution as the true sensitive attributes $S$, i.e., $P(\hat{S})=P(S)$, the TV distance between $Q^{(s)}$ and $\hat{Q}^{(s)}$ is upper-bounded by the probability of incorrectly reconstructing the sensitive attribu

Figures (5)

  • Figure 1: Illustration of FLrSA and DRFO for providing fair recommendations with limited sensitive attributes. After the reconstruction of unknown sensitive attributes, the FLrSA directly applies fair learning with the reconstructed distribution. Conversely, DRFO builds an ambiguity set that encompasses the unknown true distribution and guarantees fairness across the entire ambiguity set.
  • Figure 2: Fairness comparison between baselines and DRFO on ML-1M and Tenrec for varying known sensitive attribute ratios. Lower DP values indicate better fairness.
  • Figure 3: Fairness performance under different levels of reconstruction errors for sensitive attributes.
  • Figure 4: Absolute difference of average predicted scores of different groups from global average predictions. Higher difference means more unfairness. 'K' stands for 'known', and 'U' stands for 'unknown'. 'S=0 (K)' denotes the users with the known sensitive attribute of 0, similarly for others.
  • Figure 5: Fairness results in scenarios where a portion of users does not allow reconstruction of their attributes among the users with unknown sensitive attributes.
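
The quantity described in the Figure 4 caption, the absolute difference between each group's average predicted score and the global average, can be sketched as follows. This is an illustration under assumptions: the exact DP formula used in the paper is not shown on this page, and the function name `group_gaps` and the toy inputs are hypothetical.

```python
import numpy as np

def group_gaps(scores, groups):
    """Absolute gap between each group's mean predicted score and the
    mean over all users. A larger gap indicates more unfairness."""
    scores = np.asarray(scores, dtype=float)
    groups = np.asarray(groups)
    overall = scores.mean()
    return {
        int(g): float(abs(scores[groups == g].mean() - overall))
        for g in np.unique(groups)
    }

# Toy data (assumed): four users, binary sensitive attribute.
gaps = group_gaps(scores=[0.8, 0.6, 0.4, 0.2], groups=[0, 0, 1, 1])
print(gaps)
```

Splitting users further into known and unknown attribute subsets, as the 'K' and 'U' labels in Figure 4 do, only requires passing a group label that encodes both pieces of information.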

Theorems & Definitions (3)

  • Theorem 1
  • Definition A.1: Total Variation Distance
  • Theorem 2
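
For reference, the standard definition of total variation distance is given below; the paper's own statement of Definition A.1 is not reproduced on this page and may use different notation. The second equality holds for distributions on a discrete space. The ambiguity-set line reuses the page's symbols $Q^{(s)}$ and $\hat{Q}^{(s)}$, while the radius $\rho_s$ and the set name $\mathcal{B}$ are assumed notation.

```latex
\mathrm{TV}(P, Q) \;=\; \sup_{A} \,\lvert P(A) - Q(A) \rvert
\;=\; \frac{1}{2} \sum_{x} \lvert P(x) - Q(x) \rvert

% A TV ball of radius \rho_s around the reconstructed distribution (assumed notation):
\mathcal{B}\bigl(\rho_s;\, \hat{Q}^{(s)}\bigr) \;=\;
\bigl\{\, Q \;:\; \mathrm{TV}\bigl(Q, \hat{Q}^{(s)}\bigr) \le \rho_s \,\bigr\}
```

TV distance always lies in $[0, 1]$, so a radius of 1 makes the ball contain every distribution on the support.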