Diversity-Fair Online Selection

Ming Hu, Yanzhi Li, Tongwen Wu

TL;DR

The paper tackles online diversity-aware selection where a recruiter must maximize the minimum, dimension-weighted utility across $d$ diversity attributes under adversarial candidate arrivals. It introduces a bilevel randomized framework: a higher level outputs ex-ante selection probabilities guided by a fluid LP benchmark, while a lower level implements online dependent rounding to meet capacity constraints with no loss in objective value. Two scenarios are analyzed: a fixed-capacity setting with marginal information, achieving a competitive ratio of $1/(4\sqrt{d}\lceil\log_2 d\rceil)$, and an unknown-capacity setting with increasing capacity per round, achieving $\Omega(1/d^{3/4})$ under mild boundedness assumptions. An overarching impossibility barrier of $O(1/d^{1/3})$ is established, highlighting the polynomial degradation with $d$ for any policy, which motivates the proposed structured bilevel approaches. The results provide guidance for dynamically arriving candidates in crowdsourcing and long-horizon hiring while prioritizing core diversity and compensating underrepresented dimensions, with implications for fairness-aware online allocation in practice.
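
The lower-level idea, turning ex-ante probabilities into 0/1 decisions online without exceeding capacity, can be illustrated with a small sketch. The code below uses systematic sampling with a single shared random offset, a standard dependent-rounding technique that is swapped in here for illustration; it is not claimed to be the paper's rounding procedure. The function name `online_systematic_rounding` and its parameters are hypothetical, and the sketch assumes each probability lies in $[0, 1]$ and is revealed one at a time.

```python
import random


def online_systematic_rounding(probs, rng=None):
    """Round ex-ante selection probabilities to 0/1 decisions, one arrival at a time.

    Assumption (not from the paper): this is plain systematic sampling.
    Candidate t is selected with marginal probability probs[t], and the total
    number selected never exceeds ceil(sum(probs)).
    """
    rng = rng or random.Random()
    threshold = rng.random()  # shared uniform offset in [0, 1)
    prefix = 0.0              # running sum of the probabilities seen so far
    decisions = []
    for p in probs:
        if not 0.0 <= p <= 1.0:
            raise ValueError("each probability must lie in [0, 1]")
        prefix += p
        if prefix > threshold:
            # The running sum crossed the next point of the grid
            # {offset, offset + 1, offset + 2, ...}: select this candidate.
            decisions.append(1)
            threshold += 1.0
        else:
            decisions.append(0)
    return decisions


# Hypothetical higher-level output: four arrivals, capacity k = 2.
picks = online_systematic_rounding([0.5, 0.25, 0.75, 0.5], rng=random.Random(7))
```

Because every decision depends only on the running sum and one shared offset, the rule needs no knowledge of future arrivals. When the probabilities sum to the capacity $k$, the number of selections is $k$ up to floating-point edge cases and never more.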

Abstract

Online selection problems frequently arise in applications such as crowdsourcing and employee recruitment. Existing research typically focuses on candidates with a single attribute. However, crowdsourcing tasks often require contributions from individuals across various demographics. Further motivated by the dynamic nature of crowdsourcing and hiring, we study the diversity-fair online selection problem, in which a recruiter must make real-time decisions to foster workforce diversity across many dimensions. We propose two scenarios for this problem. The fixed-capacity scenario, suited for short-term hiring for crowdsourced workers, provides the recruiter with a fixed capacity to fill temporary job vacancies. In contrast, in the unknown-capacity scenario, recruiters optimize diversity across recruitment seasons with increasing capacities, reflecting that the firm honors diversity consideration in a long-term employee acquisition strategy. By modeling the diversity over $d$ dimensions as a max-min fairness objective, we show that no policy can surpass a competitive ratio of $O(1/d^{1/3})$ for either scenario, indicating that any achievable result inevitably decays by some polynomial factor in $d$. To this end, we develop bilevel hierarchical randomized policies that ensure compliance with the capacity constraint. For the fixed-capacity scenario, leveraging marginal information about the arriving population allows us to achieve a competitive ratio of $1/(4\sqrt{d} \lceil \log_2 d \rceil)$. For the unknown-capacity scenario, we establish a competitive ratio of $\Omega(1/d^{3/4})$ under mild boundedness conditions. In both bilevel hierarchical policies, the higher level determines ex-ante selection probabilities and then informs the lower level's randomized selection that ensures no loss in efficiency. Both policies prioritize core diversity and then adjust for underrepresented dimensions.
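
To get a feel for how these guarantees scale with $d$, the sketch below evaluates the explicit fixed-capacity ratio $1/(4\sqrt{d}\lceil\log_2 d\rceil)$ and tabulates it next to the growth rates $d^{-3/4}$ and $d^{-1/3}$. The asymptotic bounds hide constants that are not stated here; setting them to 1 is an assumption made purely for illustration, so only the trend in $d$ is meaningful, not the absolute values. The function names are hypothetical.

```python
import math


def fixed_capacity_ratio(d):
    """Explicit fixed-capacity guarantee 1 / (4 * sqrt(d) * ceil(log2 d)), for d >= 2."""
    if d < 2:
        raise ValueError("d must be at least 2 so that ceil(log2 d) is positive")
    ceil_log2 = (d - 1).bit_length()  # exact integer ceil(log2 d) for d >= 1
    return 1.0 / (4.0 * math.sqrt(d) * ceil_log2)


def rate(d, exponent):
    """Growth rate d**(-exponent) with the hidden constant set to 1 (assumption)."""
    return d ** (-exponent)


if __name__ == "__main__":
    print(f"{'d':>5} {'fixed-capacity':>15} {'d^(-3/4)':>10} {'d^(-1/3)':>10}")
    for d in (2, 4, 16, 64, 256):
        print(f"{d:>5} {fixed_capacity_ratio(d):>15.5f} "
              f"{rate(d, 0.75):>10.5f} {rate(d, 1 / 3):>10.5f}")
```

The integer expression `(d - 1).bit_length()` is used instead of `math.ceil(math.log2(d))` to avoid floating-point rounding when $d$ is near a power of two.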

Paper Structure

This paper contains 51 sections, 11 theorems, 79 equations, 3 figures, 3 algorithms.

Key Result

Proposition 1

For any instance $\mathcal{I}$, the optimal objective value $\mathrm{OPT}(\mathcal{I})$ of Problem (eq:fluid) is equal to the performance $\mathrm{OFF}(\mathcal{I})$ of the optimal offline algorithm.

Figures (3)

  • Figure 1: The Relationships among FC, UC, UFC, UUC, and FCS scenarios.
  • Figure 2: Bilevel Solution Approach.
  • Figure EC.1: Hard Instances for FHC Scenario.

Theorems & Definitions (15)

  • Remark 1
  • Proposition 1: Equivalence with the optimal offline algorithm
  • Remark 2
  • Definition 1
  • Proposition 2
  • Theorem 1
  • Proposition 3
  • Lemma 1
  • Lemma 2
  • Lemma 3
  • ...and 5 more