Centralized Selection with Preferences in the Presence of Biases

L. Elisa Celis, Amit Kumar, Nisheeth K. Vishnoi, Andrew Xu

TL;DR

This paper studies centralized selection in which multiple institutions, each with a limited capacity, select from candidates who each have preferences over the institutions. It presents an algorithm, with proof, that achieves near-optimal group fairness with respect to preferences while also nearly maximizing the true utility under distributional assumptions.

Abstract

This paper considers the scenario in which there are multiple institutions, each with a limited capacity for candidates, and candidates, each with preferences over the institutions. A central entity evaluates the utility of each candidate to the institutions, and the goal is to select candidates for each institution in a way that maximizes utility while also considering the candidates' preferences. The paper focuses on the setting in which candidates are divided into multiple groups and the observed utilities of candidates in some groups are biased, that is, systematically lower than their true utilities. The first result is that, in these biased settings, prior algorithms can lead to selections with sub-optimal true utility and significant discrepancies in the fraction of candidates from each group that get their preferred choices. Subsequently, an algorithm is presented along with proof that it produces selections that achieve near-optimal group fairness with respect to preferences while also nearly maximizing the true utility under distributional assumptions. Further, extensive empirical validation of these results in real-world and synthetic settings, in which the distributional assumptions may not hold, is presented.
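
The first result described above (biased observed utilities leading to lower true utility and unequal access to preferred choices) can be illustrated with a small simulation. This is only a sketch under assumptions that this excerpt does not state: bias is modeled as a multiplicative factor `beta` applied to one group's true utility, preferences are uniformly random rankings, and the selection rule processes candidates in decreasing order of observed utility, giving each their most preferred institution that still has capacity. The function name `simulate` and all parameter values are hypothetical and are not taken from the paper.

```python
import random


def simulate(n_per_group=500, num_institutions=5, capacity=40, beta=0.5, seed=0):
    """Toy centralized selection with two groups (assumed model, not the paper's).

    Group 1's observed utility is its true utility scaled by `beta`
    (an assumed bias model). Returns the per-group fraction of candidates
    who receive their first choice and the total true utility selected.
    """
    rng = random.Random(seed)
    candidates = []
    for group in (0, 1):
        for _ in range(n_per_group):
            true_u = rng.random()  # true utility, uniform on [0, 1]
            observed = true_u * beta if group == 1 else true_u
            prefs = list(range(num_institutions))
            rng.shuffle(prefs)  # uniformly random preference ranking
            candidates.append((observed, true_u, group, prefs))

    # Process candidates from highest to lowest observed utility.
    candidates.sort(key=lambda c: c[0], reverse=True)
    remaining = [capacity] * num_institutions
    got_first = [0, 0]
    total_true_utility = 0.0
    for observed, true_u, group, prefs in candidates:
        for rank, inst in enumerate(prefs):
            if remaining[inst] > 0:
                remaining[inst] -= 1
                total_true_utility += true_u
                if rank == 0:
                    got_first[group] += 1
                break  # candidate is placed; move on to the next one

    fractions = [count / n_per_group for count in got_first]
    return fractions, total_true_utility


if __name__ == "__main__":
    for beta in (1.0, 0.5):
        fractions, utility = simulate(beta=beta)
        print(f"beta={beta}: first-choice fractions={fractions}, "
              f"true utility={utility:.1f}")
```

Because the random draws do not depend on `beta`, the two runs share the same candidates and preferences, so any difference between them comes only from the assumed bias. The fairness-aware algorithm the paper proposes is not reproduced here.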

Paper Structure

This paper contains 86 sections, 28 theorems, 72 equations, 13 figures, 1 table, 5 algorithms.

Key Result

Theorem 4.1

Consider an instance where the utilities of the candidates are drawn from the uniform distribution on $[0,1]$ and the distribution over preferences is arbitrary. Assume $n_1 = n_2 = K$. Then, $\mathscr{P}(\mathcal{A}_{\rm st}) \leq \beta + O\left(\frac{p\sqrt{\log n}}{\sqrt{n}}\right)$, ...

Figures (13)

  • Figure 1: Preference-based fairness as measured by $\mathscr{P}^{(1)}$ and $\mathscr{P}^{(3)}$ using either gender or birth-category as the protected attribute with data from the 2009 JEE test under centralized admission (see \ref{sec:simulation:real_world_data} for details). The $x$-axis denotes $\phi$, the dispersion parameter of the Mallows distribution. Error bars denote the standard error of the mean over 50 iterations. Institution-wise constraints achieve near-optimal preference-based fairness.
  • Figure 2: Preference-based fairness as measured by $\mathscr{P}^{(1)}$, with synthetic data where non-i.i.d. preferences are generated from Mallows distributions. The $x$-axis denotes $\gamma$, the Kendall-Tau distance between the central rankings, and the error bars denote the standard error of the mean over 50 iterations. We observe that institution-wise constraints achieve higher preference-based fairness than group-wise and unconstrained settings.
  • Figure 3: Illustration of the proof of \ref{thm:specialcaselog}: The figure on the left plots the density $f(x)$. The quantities $A_1, A_2, A_3$ are the areas under the curve for the intervals $[0, \Delta], [\Delta, \Delta/\beta], [\Delta/\beta,1]$. The figure on the right plots $g(x):=\ln(f(x))$, which is a concave function. The line $L(x)$ is shown in red.
  • Figure 4: $\mathscr{P}^{(1)}$, $\mathscr{P}^{(3)}$, and $\mathscr{U}$ measured for synthetic data when the dispersion parameter for the Mallows distribution is varied over $\phi \in [0, 1]$. In (a), we see the preference-based fairness measured by $\mathscr{P}^{(1)}$ when utilities are generated from $\mathcal{D}_{\rm Gauss}$. In (b), we measure $\mathscr{P}^{(3)}$ when utilities are generated from $\mathcal{D}_{\rm Gauss}$. (c) shows the utility ratio when utilities are generated from $\mathcal{D}_{\rm Gauss}$. (d), (e), and (f) show the results of (a), (b), and (c), respectively, when utilities are generated from $\mathcal{D}_{\rm Pareto}$. Our main observation is that $\phi$ does not have a large impact on preference-based fairness for $\mathcal{A}_{\rm inst\text{-}wise}$, while $\mathcal{A}_{\rm group}$ and $\mathcal{A}_{\rm st}$ generally increase with $\phi$. See \ref{sec:additional:empirical:dispersionpref} for details and discussion. The $x$-axis denotes $\alpha$, the $y$-axis denotes $\mathscr{P}^{(1)}$, $\mathscr{P}^{(3)}$, or $\mathscr{U}$, and the error bars denote the standard error of the mean over 50 iterations.
  • Figure 5: Preference-based fairness as measured by the top-1 metric, $\mathscr{P}^{(1)}$, with synthetic data where non-i.i.d. preferences are generated from Mallows distributions. The $x$-axis denotes $\gamma$, the Kendall-Tau distance between the central rankings, and the error bars denote the standard error of the mean over 50 iterations. (a) shows $\mathscr{P}^{(1)}$ when utilities are generated from $\mathcal{D}_{\rm Gauss}$ with $\beta = \frac{2}{4}$. (b) modifies $\beta$ to $\beta = \frac{3}{4}$. (c) and (d) are equivalent to (a) and (b), respectively, when utilities are drawn from $\mathcal{D}_{\rm Pareto}$. See \ref{sec:simulation:synthetic_data} and \ref{sec:additional:empirical:additional_plot_synthetic} for details and discussion. We observe that institution-wise constraints achieve higher preference-based fairness than group-wise and unconstrained settings.
  • ...and 8 more figures

Theorems & Definitions (55)

  • Remark 3.1
  • Theorem 4.1
  • Theorem 4.2
  • Lemma 4.3: Informal; see \ref{lem:concentrationbounds}
  • Lemma 4.4: Informal; see \ref{cl:AB}
  • Lemma 4.4
  • Theorem 7.1: Hoeffding's Bound
  • Theorem 7.2: dubhashibook
  • Definition 7.3
  • Proposition 7.4
  • ...and 45 more