Fair Submodular Cover

Wenjing Chen; Shuo Xing; Samson Zhou; Victoria G. Crawford

TL;DR

This work introduces Fair Submodular Cover (FSC): find a minimum-size subset whose monotone submodular objective value reaches a threshold $\tau$ while satisfying per-group fairness bounds. Exploiting the duality between FSC and fair submodular maximization (FSM), the authors develop two conversion schemes (convert-fair and convert-continuous) that turn bicriteria guarantees for FSM into guarantees for FSC, preserving fairness via a $\beta$-extension of the fairness matroid. They then give three bicriteria algorithms for FSM (two discrete: greedy-fair-bi and threshold-fairness-bi; one continuous: cont-thresh-greedy-bi) that pair with the conversions to yield FSC algorithms whose approximation ratios approach the best known for submodular cover without fairness. Experiments on maximum-coverage instances, using real datasets such as Twitch_5000 and Corel5k, show that the fair algorithms produce more balanced group representation at the cost of larger solution sizes.

Abstract

Submodular optimization is a fundamental problem with many applications in machine learning, often involving decision-making over datasets with sensitive attributes such as gender or age. In such settings, it is often desirable to produce a diverse solution set that is fairly distributed with respect to these attributes. Motivated by this, we initiate the study of Fair Submodular Cover (FSC), where, given a ground set $U$, a monotone submodular function $f:2^U\to\mathbb{R}_{\ge 0}$, and a threshold $\tau$, the goal is to find a balanced subset $S\subseteq U$ of minimum cardinality such that $f(S)\ge\tau$. We first introduce discrete algorithms for FSC that achieve a bicriteria approximation ratio of $(\frac{1}{\epsilon}, 1-O(\epsilon))$. We then present a continuous algorithm that achieves a $(\ln\frac{1}{\epsilon}, 1-O(\epsilon))$-bicriteria approximation ratio, which matches the best approximation guarantee of submodular cover without a fairness constraint. Finally, we complement our theoretical results with a number of empirical evaluations that demonstrate the effectiveness of our algorithms on instances of maximum coverage.
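To make the cover objective concrete, here is a minimal Python sketch of the classic greedy baseline for submodular cover without fairness, applied to a coverage function. It is not one of the paper's algorithms; the names `coverage`, `greedy_cover`, and `toy_sets` are illustrative assumptions.

```python
from typing import Dict, Set


def coverage(sets: Dict[str, Set[int]], chosen: Set[str]) -> int:
    """f(S): number of distinct items covered by the chosen sets.

    Coverage is a standard example of a monotone submodular function.
    """
    covered: Set[int] = set()
    for key in chosen:
        covered |= sets[key]
    return len(covered)


def greedy_cover(sets: Dict[str, Set[int]], tau: int) -> Set[str]:
    """Unconstrained greedy baseline (no fairness): repeatedly add the
    element with the largest marginal gain until f(S) >= tau."""
    chosen: Set[str] = set()
    while coverage(sets, chosen) < tau:
        base = coverage(sets, chosen)
        remaining = [k for k in sets if k not in chosen]
        best = max(
            remaining,
            key=lambda k: coverage(sets, chosen | {k}) - base,
            default=None,
        )
        # Stop if nothing is left or no element improves the objective.
        if best is None or coverage(sets, chosen | {best}) == base:
            raise ValueError("threshold tau is not reachable")
        chosen.add(best)
    return chosen


# Hypothetical toy instance: four candidate sets over items 1..5.
toy_sets = {"a": {1, 2, 3}, "b": {3, 4}, "c": {5}, "d": {1, 2}}
solution = greedy_cover(toy_sets, tau=4)
```

On this toy instance the baseline stops after two picks. FSC asks for the same kind of minimum-size cover, with the additional requirement that the chosen elements be balanced across groups.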

Paper Structure (26 sections, 18 theorems, 51 equations, 2 figures, 7 algorithms)

Introduction
Related Work
Preliminaries
Conversion Algorithms for FSC
Bicriteria Algorithms for FSM
Discrete Bicriteria Algorithms for FSM
Continuous Algorithms for FSM
Experiments
Omitted Lemma of Section \ref{sec:problem_setup}
Appendix for Section \ref{sec:conv}
Proof of Theorem \ref{thm:convert}
Converting theorem for continuous algorithms
Appendix for Section \ref{sec:alg_for_FSM}
Appendix for Section \ref{sec:discrete}
Proof of Theorem \ref{thm:greedy}
...and 11 more sections

Key Result

Theorem 1

Assuming $\sum_{c\in[N]}\min\{q_c, \frac{|U_c|}{\beta(1+\alpha)|OPT|}\}\geq 1$, any $(\gamma,\beta)$-bicriteria approximation algorithm for FSM that returns a solution set in time $\mathcal{T}(n,\kappa)$ can be converted into an approximation algorithm for FSC that is a $((1+\alpha)\beta,\gamma)$-bi
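The $(1+\alpha)$ factor in the statement suggests a guess-the-budget conversion: try cardinality budgets $\kappa$ that grow geometrically by $(1+\alpha)$, run an FSM routine at each budget, and stop once the returned set reaches $\gamma\tau$. The Python sketch below illustrates only that idea under simplifying assumptions. It is not the paper's convert-fair procedure and omits the $\beta$-extension of the fairness matroid and any lower bounds. `toy_fsm`, `convert`, and the per-group cap $\lceil q_c\kappa\rceil$ are hypothetical stand-ins.

```python
import math
from typing import Callable, Dict, Set


def toy_fsm(f: Callable[[Set[str]], float], group_of: Dict[str, str],
            q: Dict[str, float], kappa: int) -> Set[str]:
    """Hypothetical FSM stand-in: greedy under |S| <= kappa with an assumed
    per-group upper cap of ceil(q_c * kappa)."""
    caps = {c: math.ceil(q[c] * kappa) for c in q}
    used = {c: 0 for c in q}
    chosen: Set[str] = set()
    while len(chosen) < kappa:
        candidates = [x for x in group_of
                      if x not in chosen and used[group_of[x]] < caps[group_of[x]]]
        if not candidates:
            break
        best = max(candidates, key=lambda x: f(chosen | {x}))
        chosen.add(best)
        used[group_of[best]] += 1
    return chosen


def convert(fsm: Callable[[int], Set[str]], f: Callable[[Set[str]], float],
            tau: float, gamma: float, alpha: float, n: int) -> Set[str]:
    """Guess the budget geometrically; return the first set with f(S) >= gamma * tau."""
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    kappa = 1.0
    while True:
        budget = min(n, math.ceil(kappa))
        candidate = fsm(budget)
        if f(candidate) >= gamma * tau:
            return candidate
        if budget >= n:
            raise ValueError("gamma * tau is not reachable")
        kappa *= 1 + alpha


# Hypothetical toy instance: coverage objective with two groups.
sets = {"a": {1, 2, 3}, "b": {3, 4}, "c": {5, 6}, "d": {1, 2}}
group_of = {"a": "g1", "b": "g1", "c": "g2", "d": "g2"}
q = {"g1": 0.5, "g2": 0.5}


def f(chosen: Set[str]) -> int:
    return len(set().union(*(sets[k] for k in chosen)))


result = convert(lambda k: toy_fsm(f, group_of, q, k), f,
                 tau=5, gamma=1.0, alpha=0.5, n=len(sets))
```

Because the budget overshoots the smallest sufficient one by at most a factor of $(1+\alpha)$, the size guarantee of the FSM routine carries over with that extra factor, consistent with the $((1+\alpha)\beta,\gamma)$ form in the statement.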

Figures (2)

Figure 1: Performance comparison on the Twitch_5000 dataset for Maximum Coverage. \ref{fig:radar-greedy-bi}, \ref{fig:radar-greedy-fairness-bi}, \ref{fig:radar-threshold-fairness-bi} illustrate the distribution of users speaking different languages in the solutions produced by various algorithms with $\tau = 2400$. $f$: the value of the objective submodular function. Cost: the size of the returned solution. Fairness difference: $(\max_c |S \cap U_c| - \min_c |S \cap U_c|) / |S|$.
Figure 2: Performance comparison on the Corel dataset for Set Covering. \ref{fig:radar-greedy-bi-set}, \ref{fig:radar-greedy-fairness-bi-set}, \ref{fig:radar-threshold-fairness-bi-set} illustrate the distribution of images across various categories in the solutions produced by different algorithms with $\tau = 300$. $f$: the value of the objective submodular function. Cost: the size of the returned solution. Fairness difference: $(\max_c |S \cap U_c| - \min_c |S \cap U_c|) / |S|$.
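The fairness difference reported in both captions can be computed directly from a solution and a group assignment. The following Python sketch implements the formula as written in the captions. The function name, argument layout, and toy data are assumptions made for illustration.

```python
from collections import Counter
from typing import Dict, Iterable, List


def fairness_difference(solution: Iterable[str], group_of: Dict[str, str],
                        groups: List[str]) -> float:
    """(max_c |S ∩ U_c| - min_c |S ∩ U_c|) / |S| over the listed groups.

    Groups with no selected element count as 0, so they lower the minimum.
    """
    members = list(solution)
    if not members:
        raise ValueError("fairness difference is undefined for an empty solution")
    counts = Counter(group_of[x] for x in members)
    per_group = [counts.get(c, 0) for c in groups]
    return (max(per_group) - min(per_group)) / len(members)


# Hypothetical toy data: four selected users, three language groups.
toy_groups = {"u1": "en", "u2": "en", "u3": "en", "u4": "fr"}
diff = fairness_difference(["u1", "u2", "u3", "u4"], toy_groups, ["en", "fr", "de"])
```

A value of 0 means every group contributes the same number of elements; values near 1 mean the solution is dominated by a single group.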

Theorems & Definitions (34)

Definition 1
Definition 2
Definition 3
Definition 4
Theorem 1
Theorem 2
Lemma 1
Lemma 2
Theorem 3
Theorem 4
...and 24 more
