Fairness with Exponential Weights

Stephen Pasteris, Chris Hicks, Vasilios Mavroudis

TL;DR

This work introduces Few, a meta-algorithm that converts any efficient Hedge-based base learner into a fair contextual bandit (and batch classifier) that guarantees exact statistical parity on every trial. By leveraging online-to-batch conversion and a carefully designed policy construction, Few matches the asymptotic regret of running Exp4 for each protected group while preserving computational efficiency, with extensions to dynamic distributions, massive context spaces, and hierarchical decompositions. The framework also provides reductions to handle empirical fairness in non-stationary environments, as well as approximate parity guarantees in large or infinite context spaces under IID assumptions. Overall, the approach delivers provable fairness guarantees with practical time/space complexity, enabling scalable, fair decision-making in online and batch settings across diverse contexts.

Abstract

Motivated by the need to remove discrimination in certain applications, we develop a meta-algorithm that can convert any efficient implementation of an instance of Hedge (or, equivalently, an algorithm for discrete Bayesian inference) into an efficient algorithm for the equivalent contextual bandit problem that guarantees exact statistical parity on every trial. Relative to any comparator with statistical parity, the resulting algorithm has the same asymptotic regret bound as running the corresponding instance of Exp4 for each protected characteristic independently. Provided that the Hedge instance admits non-stationarity, we can also handle a time-varying distribution with respect to which statistical parity is enforced, which is useful when the true population is unknown and must be estimated from the data received so far. Via online-to-batch conversion we can handle the equivalent batch classification problem with exact statistical parity, giving results that we believe are novel and important in their own right.
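To make the parity constraint concrete, the following is a minimal sketch of checking statistical parity for a stochastic policy. It assumes the standard reading of the term: the probability of each action, averaged over the contexts of a protected group, is the same for every group. The names `action_marginals` and `has_statistical_parity`, the dictionary layout, and the toy population are all hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Tuple

# Hypothetical data layout (an assumption, not the paper's notation):
#   policy[(group, context)] -> list of action probabilities
#   population[group][context] -> P(context | group)
Policy = Dict[Tuple[str, str], List[float]]
Population = Dict[str, Dict[str, float]]


def action_marginals(policy: Policy, population: Population) -> Dict[str, List[float]]:
    """Average the policy's action distribution over each group's contexts."""
    num_actions = len(next(iter(policy.values())))
    marginals = {}
    for group, contexts in population.items():
        totals = [0.0] * num_actions
        for context, prob in contexts.items():
            for action, p in enumerate(policy[(group, context)]):
                totals[action] += prob * p
        marginals[group] = totals
    return marginals


def has_statistical_parity(policy: Policy, population: Population, tol: float = 1e-9) -> bool:
    """True if every group has the same marginal action distribution (up to tol)."""
    rows = list(action_marginals(policy, population).values())
    return all(abs(x - y) <= tol for row in rows[1:] for x, y in zip(rows[0], row))


# Toy population: the two groups have different context distributions.
population = {
    "g0": {"low": 0.5, "high": 0.5},
    "g1": {"low": 0.25, "high": 0.75},
}

# A policy that ignores the group and acts on the context alone.
group_blind = {
    ("g0", "low"): [1.0, 0.0], ("g0", "high"): [0.0, 1.0],
    ("g1", "low"): [1.0, 0.0], ("g1", "high"): [0.0, 1.0],
}

# A policy adjusted for g1 so that both groups see action 0 half of the time.
adjusted = {
    ("g0", "low"): [1.0, 0.0], ("g0", "high"): [0.0, 1.0],
    ("g1", "low"): [1.0, 0.0], ("g1", "high"): [1.0 / 3.0, 2.0 / 3.0],
}
```

In this toy setting the group-blind policy fails the check, because the groups differ in how often each context occurs, while the adjusted policy passes. This illustrates why parity has to be enforced against a population distribution rather than by simply hiding the protected characteristic.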

Paper Structure

This paper contains 45 sections, 19 theorems, 170 equations.

Key Result

Theorem 4.1

Given a base algorithm with inductive bias $\vartheta\in\Delta_{\mathcal{H}}$ and learning rate $\hat{\eta}:=\eta/\sqrt{KT}$ for some $\eta>0$, Few gives us the following regret for the fair bandit problem. For any fair policy: and any $\vartheta^*\in\mathcal{E}(\tilde{\pi})$ we have: where: The asymptotic time complexity of each trial $t\in[T]$ is that of calling, for each $c\in\mathcal{C}$, ...
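As a rough illustration of the ingredients named in the theorem, the sketch below shows a plain Hedge (exponential weights) update started from a prior over hypotheses and run with the learning rate $\hat{\eta}=\eta/\sqrt{KT}$. It is not the Few algorithm and not the paper's base algorithm. It assumes that $K$ is the number of actions and $T$ the number of trials, and the function names are hypothetical.

```python
import math
from typing import List


def learning_rate(eta: float, num_actions: int, num_trials: int) -> float:
    """Scaled rate eta / sqrt(K * T); K = actions and T = trials are assumptions."""
    return eta / math.sqrt(num_actions * num_trials)


def hedge_update(weights: List[float], losses: List[float], rate: float) -> List[float]:
    """One exponential-weights step: reweight by exp(-rate * loss), then renormalise."""
    unnormalised = [w * math.exp(-rate * loss) for w, loss in zip(weights, losses)]
    total = sum(unnormalised)
    return [u / total for u in unnormalised]


# The prior over three hypotheses plays the role of the inductive bias.
prior = [0.5, 0.25, 0.25]
rate = learning_rate(eta=1.0, num_actions=4, num_trials=100)

# Hypothesis 0 suffers loss 1 on this trial; the other two suffer none.
posterior = hedge_update(prior, losses=[1.0, 0.0, 0.0], rate=rate)
```

After the update, the hypothesis that incurred loss has less weight than it started with, and the two loss-free hypotheses keep their relative proportions. In a bandit setting the loss vector would typically be replaced by an importance-weighted estimate, as in Exp4. That step is omitted here.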

Theorems & Definitions (19)

  • Theorem 4.1
  • Theorem 4.2
  • Lemma 3.1
  • Lemma 3.2
  • Lemma 3.3
  • Lemma 3.4
  • Lemma 3.5
  • Lemma 3.6
  • Lemma 3.7
  • Lemma 3.8
  • ...and 9 more