Balancing the Scales: Reinforcement Learning for Fair Classification

Leon Eshuijs; Shihan Wang; Antske Fokkens

TL;DR

This work reframes fair classification as a Contextual Multi-Armed Bandit problem and leverages reward scaling to counter imbalances across classes and protected groups. It evaluates three RL algorithms—LinUCB, DQN_bandit, and PPO_bandit—as well as a supervised loss-reweighting baseline, on BiasBios (multi-class) and Emoji (binary) datasets, demonstrating that reward scaling can improve fairness with varying effects on accuracy. The study finds that deep RL methods excel in multi-class settings while a classical CMAB approach can be highly effective in binary tasks, and that Equal Opportunity-based scaling often yields robust fairness with modest accuracy trade-offs. These findings suggest RL-based fairness tools can adapt to diverse data imbalances and highlight practical considerations for deploying fair classifiers in real-world settings.

Abstract

Fairness in classification tasks has traditionally focused on removing bias from neural representations, but recent trends favor algorithmic methods that embed fairness into the training process. These methods steer models towards fair performance and avoid the loss of valuable information that representation manipulation can cause. Reinforcement Learning (RL), with its capacity for learning through interaction and for adjusting reward functions to encourage desired behaviors, emerges as a promising tool in this domain. In this paper, we explore the use of RL to mitigate bias in imbalanced classification by scaling the reward function. We employ the contextual multi-armed bandit framework and adapt three popular RL algorithms to suit our objectives, demonstrating a novel approach to mitigating bias.
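To make the idea of reward scaling concrete, the following is a minimal sketch, not the paper's implementation. It assumes a simple inverse-frequency scale per (class, protected group) cell and a reward of zero for wrong predictions; the function names `inverse_frequency_scales` and `scaled_reward`, the toy data, and the scaling formula itself are all illustrative assumptions. The scaling functions actually studied in the paper, such as the Equal Opportunity-based one, may differ.

```python
from collections import Counter


def inverse_frequency_scales(labels, groups):
    """Hypothetical scale per (class, group) cell.

    Assumption: a cell's scale is the size it would have if its class were
    perfectly balanced across groups, divided by its actual size.
    """
    cell_counts = Counter(zip(labels, groups))
    class_counts = Counter(labels)
    # Number of distinct protected groups observed within each class.
    groups_per_class = Counter(label for label, _ in cell_counts)

    scales = {}
    for (label, group), n_cell in cell_counts.items():
        balanced_size = class_counts[label] / groups_per_class[label]
        scales[(label, group)] = balanced_size / n_cell
    return scales


def scaled_reward(action, label, group, scales):
    """Bandit reward: the cell's scale if the chosen arm matches the label.

    Assumption: wrong predictions receive a reward of 0.
    """
    return scales[(label, group)] if action == label else 0.0


# Toy data: a balanced class (5/5) and an imbalanced one (9/1).
labels = ["professor"] * 10 + ["nurse"] * 10
groups = ["f"] * 5 + ["m"] * 5 + ["f"] * 9 + ["m"] * 1
scales = inverse_frequency_scales(labels, groups)
```

Under these assumptions, both cells of the balanced class keep a scale of 1, while in the imbalanced class the minority cell receives a larger reward than the majority cell, so correct predictions on under-represented examples count for more during learning.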

Paper Structure (44 sections, 15 equations, 6 figures, 11 tables, 1 algorithm)

Introduction
Related work
Bias Mitigation
Reinforcement Learning for Classification
Methodology
Contextual Multi-Armed Bandit
Reinforcement Learning Algorithms
LinUCB
$\textbf{DQN}_{\textbf{bandit}}$
$\textbf{PPO}_{\textbf{bandit}}$
Reward Scales
Supervised Learning: Loss reweighting
Experiments
Dataset
Context Vectors
...and 29 more sections

Figures (6)

Figure 1: Overview of the classification setup with input vector $x$, and output class $a$ for Reinforcement Learning and Supervised Learning, highlighting the place of the reward scaling matrix $\mathcal{W}_{RS}$
Figure 2: Reward scales under the different scaling functions for two professions with different gender imbalances: Professor (50/50) and Nurse (90/10).
Figure 3: TPR gap plotted against the gender distribution per profession for LinUCB. Left: without reward scaling. Right: with EO reward scaling.
Figure 4: Performance (Accuracy) and Fairness (GAP) on the Emoji dataset using different stereotyping ratios. All models use the scaling of $\mathcal{W}^{EO}$.
Figure 5: Evaluation accuracy of the different algorithms on the full 28 classes and on the 8-class subset of the Bias in Bios dataset.
...and 1 more figures
