PyCFRL: A Python library for counterfactually fair offline reinforcement learning via sequential data preprocessing

Jianhan Zhang, Jitao Wang, Chengchun Shi, John D. Piette, Donglin Zeng, Zhenke Wu

TL;DR

This work tackles fairness in sequential decision making by addressing counterfactual fairness within offline reinforcement learning. It introduces PyCFRL, a library that performs data preprocessing to remove sensitive-attribute influence from states and rewards, enabling the learning of counterfactually fair policies via offline RL methods such as fitted Q-iteration. The framework provides end-to-end tooling for preprocessing, policy learning, and evaluation of both policy value and counterfactual unfairness in a target environment, and demonstrates a data-example workflow with comparative baselines. The results illustrate that counterfactually fair policies can achieve low unfairness while maintaining competitive performance, highlighting the practical potential of CF-aware RL in high-stakes domains. Future work includes handling variable-length episodes, integrating with broader RL ecosystems, and extending state reconstruction beyond additive models.

Abstract

Reinforcement learning (RL) aims to learn and evaluate a sequential decision rule, often referred to as a "policy", that maximizes the population-level benefit in an environment across possibly infinitely many time steps. However, the sequential decisions made by an RL algorithm, while optimized to maximize overall population benefits, may disadvantage individuals in minority or socioeconomically disadvantaged groups. To address this problem, we introduce PyCFRL, a Python library for ensuring counterfactual fairness in offline RL. PyCFRL implements a novel data preprocessing algorithm for learning counterfactually fair RL policies from offline datasets and provides tools to evaluate the values and counterfactual unfairness levels of RL policies. We describe the high-level functionalities of PyCFRL and demonstrate one of its major use cases through a data example. The library is publicly available on PyPI and GitHub (https://github.com/JianhanZhang/PyCFRL), and detailed tutorials can be found in the PyCFRL documentation (https://pycfrl-documentation.netlify.app).
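
To make the preprocessing idea concrete, the following is a minimal single-time-step sketch in plain NumPy. It does not use the PyCFRL API. The function name `residualize_states`, the group-mean additive model, and the choice to concatenate the reconstructed states across all sensitive-attribute levels are assumptions made for illustration only. The library's actual algorithm is sequential and also preprocesses rewards, so this should be read as a rough intuition rather than the authors' method.

```python
import numpy as np


def residualize_states(states, z):
    """Hypothetical helper (not part of PyCFRL): one-step additive-model preprocessing.

    states: (n, d) array of observed states.
    z:      (n,) array of sensitive-attribute levels.
    Returns an (n, d * n_levels) array of stacked reconstructed states.
    """
    states = np.asarray(states, dtype=float)
    z = np.asarray(z)
    levels = np.unique(z)  # sorted unique sensitive-attribute levels

    # Assumed additive model: state = group_mean[z] + residual.
    group_means = np.stack([states[z == lv].mean(axis=0) for lv in levels])

    # Residual: the part of the state not explained by the sensitive attribute.
    idx = np.searchsorted(levels, z)
    residuals = states - group_means[idx]

    # Reconstruct the state under every level, then concatenate the copies.
    counterfactuals = [residuals + group_means[k] for k in range(len(levels))]
    return np.concatenate(counterfactuals, axis=1)


# Toy data: both groups share the same residual noise; group 1 is shifted by 1.5.
rng = np.random.default_rng(0)
n = 200
z = np.repeat([0, 1], n // 2)
eps = rng.normal(size=(n // 2, 2))
states = np.vstack([eps, eps]) + np.where(z[:, None] == 1, 1.5, 0.0)

processed = residualize_states(states, z)
```

In this toy setup, two individuals with the same residual but different sensitive-attribute values receive the same preprocessed state, so a policy that only sees `processed` cannot act differently based on the sensitive attribute alone. An offline RL method such as fitted Q-iteration would then be trained on the preprocessed data.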

Paper Structure

This paper contains 11 sections and 2 tables.