What Constitutes a Less Discriminatory Algorithm?
Benjamin Laufer, Manish Raghavan, Solon Barocas
TL;DR
The paper addresses how to define and locate less discriminatory algorithms (LDAs) under disparate impact law, arguing that purely quantitative definitions fail without held-out data and proposing a reasonableness standard based on projected performance. It formalizes LDAs as a comparison between a baseline $h^0$ and a candidate $h'$, analyzes the mathematical structure of achievable accuracy-disparity trade-offs via a feasible set (a polygon in the utility-disparity plane), and proves that finding the exact least-discriminatory algorithm is NP-hard while providing approximation approaches. It also demonstrates, through empirical studies on the Adult and German Credit datasets, that simple randomized search procedures can yield out-of-sample disparity reductions with little or no loss in utility in some settings. Overall, the work bridges legal concepts with algorithmic design by offering a rigorous framework for searching for LDAs and clarifying the practical limits and strategies for such searches.
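As a rough illustration of the baseline-versus-candidate comparison described above, the following minimal sketch checks whether a candidate $h'$ looks less discriminatory than a baseline $h^0$ on held-out data. The specific measures are assumptions made for illustration, not the paper's definitions: utility is taken to be accuracy, disparity is taken to be the absolute gap in selection rates between two groups, and the function names and the `utility_tolerance` parameter are hypothetical.

```python
from typing import Sequence


def utility(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Accuracy on held-out data (an assumed stand-in for utility)."""
    return sum(int(t == p) for t, p in zip(y_true, y_pred)) / len(y_true)


def disparity(y_pred: Sequence[int], group: Sequence[int]) -> float:
    """Absolute gap in selection rates between group 0 and group 1
    (an assumed stand-in for disparity)."""
    rates = []
    for g in (0, 1):
        preds = [p for p, a in zip(y_pred, group) if a == g]
        rates.append(sum(preds) / len(preds))
    return abs(rates[0] - rates[1])


def is_candidate_lda(
    y_true: Sequence[int],
    group: Sequence[int],
    pred_baseline: Sequence[int],
    pred_candidate: Sequence[int],
    utility_tolerance: float = 0.0,
) -> bool:
    """Return True if the candidate has strictly lower held-out disparity
    and loses no more than `utility_tolerance` in utility (hypothetical rule)."""
    u0 = utility(y_true, pred_baseline)
    u1 = utility(y_true, pred_candidate)
    d0 = disparity(pred_baseline, group)
    d1 = disparity(pred_candidate, group)
    return d1 < d0 and u1 >= u0 - utility_tolerance


# Toy held-out sample: four individuals in each of two groups.
y_true = [1, 1, 0, 0, 1, 0, 0, 0]
group = [0, 0, 0, 0, 1, 1, 1, 1]
pred_baseline = [1, 1, 1, 0, 1, 0, 0, 0]   # predictions of h^0
pred_candidate = [1, 1, 0, 0, 1, 1, 0, 0]  # predictions of h'

print(utility(y_true, pred_baseline), disparity(pred_baseline, group))
print(utility(y_true, pred_candidate), disparity(pred_candidate, group))
print(is_candidate_lda(y_true, group, pred_baseline, pred_candidate))
```

In this toy sample both models make one error out of eight, but the candidate selects the two groups at equal rates, so it passes the check. The comparison is only informative when the sample was not used to fit or select either model, which is the role held-out data plays in the summary above.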
Abstract
Disparate impact doctrine offers an important legal apparatus for targeting discriminatory data-driven algorithmic decisions. A recent body of work has focused on conceptualizing one particular construct from this doctrine: the less discriminatory alternative, an alternative policy that reduces disparities while meeting the same business needs of a status quo or baseline policy. However, attempts to operationalize this construct in the algorithmic setting must grapple with some thorny challenges and ambiguities. In this paper, we attempt to raise and resolve important questions about less discriminatory algorithms (LDAs). How should we formally define LDAs, and how does this interact with different societal goals they might serve? And how feasible is it for firms or plaintiffs to computationally search for candidate LDAs? We find that formal LDA definitions face fundamental challenges when they attempt to evaluate and compare predictive models in the absence of held-out data. As a result, we argue that LDA definitions cannot be purely quantitative, and must rely on standards of "reasonableness." We then identify both mathematical and computational constraints on firms' ability to efficiently conduct a proactive search for LDAs, but we provide evidence that these limits are "weak" in a formal sense. By defining LDAs formally, we put forward a framework in which both firms and plaintiffs can search for alternative models that comport with societal goals.
