Binary Choice under Asymmetric Loss in a Data-Rich Environment: Theory and an Application to Algorithmic Fairness

Andrii Babii, Xi Chen, Eric Ghysels, Rohit Kumar

TL;DR

The paper tackles binary decision problems in data-rich settings under asymmetric, covariate-dependent losses by introducing a loss-based reweighting strategy that reduces to convexified empirical risk minimization. By replacing the nonconvex indicator loss with convex surrogates $\phi$, it derives a general excess-risk bound that ties the binary decision risk to the convexified risk, enabling use of standard ML methods (logistic regression, boosting, deep nets, SVM) with weights determined by the losses. The authors provide finite-sample rates for parametric models, high-dimensional LASSO variants, and deep neural networks under a Tsybakov-type margin condition, and demonstrate that the approach can reproduce minimax-optimal behavior in certain regimes. They illustrate the method with Monte Carlo simulations and a substantive empirical application to pretrial detention fairness using the Broward COMPAS dataset, showing how covariate-driven loss functions can reduce disparities and align classifier performance with welfare-inspired objectives. Overall, the work offers a distribution-free, conceptually transparent framework for cost-sensitive binary decisions in high dimensions, with direct implications for algorithmic fairness and policy design.

Abstract

We study the binary choice problem in a data-rich environment with asymmetric loss functions. The econometrics literature covers nonparametric binary choice problems but does not offer computationally attractive solutions in data-rich environments. The machine learning literature has many algorithms but is focused mostly on loss functions that are independent of covariates. We show that theoretically valid decisions on binary outcomes with general loss functions can be achieved via a very simple loss-based reweighting of logistic regression or state-of-the-art machine learning techniques. We apply our analysis to algorithmic fairness in pretrial detentions.
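The reweighting idea in the abstract can be illustrated with a small, self-contained sketch. It is not the authors' implementation: the data-generating process, the cost values (a false negative costing 4, a false positive costing 1), and the helper names `fit_weighted_logit`, `weight_fn`, and `predict` are all assumptions made for illustration. The sketch minimizes the weighted logistic risk $\frac{1}{n}\sum_i w_i \log(1+e^{-Y_i f(X_i)})$ by plain gradient descent with labels in $\{-1,1\}$, and assumes the weights are nonnegative.

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_weighted_logit(X, y, w, lr=0.5, steps=500):
    """Gradient descent on (1/n) * sum_i w_i * log(1 + exp(-y_i * score_i)).

    y_i is in {-1, +1}; w_i >= 0; score_i = theta[0] + x_i . theta[1:].
    """
    n, d = len(X), len(X[0])
    theta = [0.0] * (d + 1)  # intercept followed by slopes
    for _ in range(steps):
        grad = [0.0] * (d + 1)
        for xi, yi, wi in zip(X, y, w):
            score = theta[0] + sum(t * v for t, v in zip(theta[1:], xi))
            # d/ds log(1 + exp(-y*s)) = -y * sigmoid(-y*s)
            g = -wi * yi * sigmoid(-yi * score)
            grad[0] += g
            for j, v in enumerate(xi):
                grad[j + 1] += g * v
        theta = [t - lr * gj / n for t, gj in zip(theta, grad)]
    return theta

def predict(theta, xi):
    score = theta[0] + sum(t * v for t, v in zip(theta[1:], xi))
    return 1 if score >= 0 else -1

# Synthetic data (assumption): one covariate, logistic conditional probability.
random.seed(0)
n = 300
X = [[random.uniform(-2.0, 2.0)] for _ in range(n)]
y = [1 if random.random() < sigmoid(2.0 * x[0]) else -1 for x in X]

def weight_fn(yi, xi):
    # Hypothetical costs: missing a positive (Y = 1) is four times as costly.
    # xi is unused here but is available for covariate-dependent costs.
    return 4.0 if yi == 1 else 1.0

w_sym = [1.0] * n
w_asym = [weight_fn(yi, xi) for xi, yi in zip(X, y)]

theta_u = fit_weighted_logit(X, y, w_sym)   # symmetric loss
theta_w = fit_weighted_logit(X, y, w_asym)  # asymmetric loss

# Share of observations assigned the decision +1 under each fit.
share_u = sum(predict(theta_u, xi) == 1 for xi in X) / n
share_w = sum(predict(theta_w, xi) == 1 for xi in X) / n
```

Raising the cost of false negatives shifts the fitted intercept upward, so the reweighted rule chooses $+1$ more often. The same weights could instead be passed to any learner that accepts per-observation weights.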

Paper Structure

This paper contains 23 sections, 22 theorems, 191 equations, 9 figures, 9 tables.

Key Result

Proposition 2.1

The optimal binary decision $f^*$ solves $\inf_{f}\mathbb{E}\left[\omega(Y,X)\mathbf{1}_{-Yf(X)\geq 0}\right]$ over decisions $f(X)\in\{-1,1\}$, with $\omega(Y,X)\triangleq Ya(X)+b(X)$, $a(x) = \ell_{-1,1}(x)-\ell_{1,1}(x) + \ell_{-1,-1}(x) - \ell_{1,-1}(x)$, $b(x) = \ell_{-1,1}(x) - \ell_{1,1}(x) + \ell_{1,-1}(x) - \ell_{-1,-1}(x)$, and $\ell_{f,y}(x)\triangleq\ell(f,y,x)$.
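Expanding the definitions of $a$ and $b$ gives $\omega(1,x)=2\,(\ell_{-1,1}(x)-\ell_{1,1}(x))$ and $\omega(-1,x)=2\,(\ell_{1,-1}(x)-\ell_{-1,-1}(x))$, that is, twice the extra cost of a wrong decision relative to the correct one. The sketch below checks this numerically at a single covariate value and confirms that minimizing the $\omega$-weighted misclassification cost picks the same decision as minimizing expected loss directly. The cost values and the names `omega`, `direct_risk`, and `weighted_risk` are hypothetical, chosen only for illustration.

```python
def omega(y, loss):
    """Weight from Proposition 2.1 at a fixed covariate value x.

    `loss` maps (f, y) -> cost l_{f,y}(x), with f, y in {-1, +1}.
    """
    a = loss[(-1, 1)] - loss[(1, 1)] + loss[(-1, -1)] - loss[(1, -1)]
    b = loss[(-1, 1)] - loss[(1, 1)] + loss[(1, -1)] - loss[(-1, -1)]
    return y * a + b

# Hypothetical loss entries at some x (assumption, not taken from the paper).
loss = {(1, 1): 0.5, (-1, -1): 0.0, (-1, 1): 3.5, (1, -1): 1.0}

w_pos = omega(1, loss)    # weight attached to observations with Y = +1
w_neg = omega(-1, loss)   # weight attached to observations with Y = -1

# The weight is twice the excess cost of deciding wrongly when Y = y.
excess_pos = loss[(-1, 1)] - loss[(1, 1)]
excess_neg = loss[(1, -1)] - loss[(-1, -1)]

def direct_risk(f, p):
    # Expected loss of decision f when P(Y = 1 | x) = p.
    return p * loss[(f, 1)] + (1 - p) * loss[(f, -1)]

def weighted_risk(f, p):
    # Expected omega-weighted misclassification cost of decision f.
    return p * w_pos * (f != 1) + (1 - p) * w_neg * (f != -1)

# Compare the two decision rules on a grid of conditional probabilities.
grid = [(i + 0.5) / 20 for i in range(20)]
direct = [min((-1, 1), key=lambda f: direct_risk(f, p)) for p in grid]
weighted = [min((-1, 1), key=lambda f: weighted_risk(f, p)) for p in grid]
```

For the weighted problem to be a sensible cost-sensitive classification problem, both weights should be nonnegative, which holds whenever a wrong decision costs at least as much as the correct one.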

Figures (9)

  • Figure 1: Indicator function with convexifications corresponding to logistic regression (logistic), boosting (exponential), and support vector machines (hinge).
  • Figure 2: Directed graph of our deep learning architecture with $d=4$ covariates, $L=3$ hidden layers of width $\mathbf{w}=(4,3,5)$ neurons, and 2 outer ReLU neurons. The orange neuron takes covariates $X\in\mathbf{R}^d$ as an input and produces $c(X)\in\mathbf{R}$, which is fed directly into 2 ReLU neurons.
  • Figure 3: Asymmetric Binary Choice. Each subplot corresponds to changes in the FP cost (left) or FN cost (right) for Logit (top) and boosting (bottom). The figure shows that introducing asymmetries in the loss function can equalize the False Positive and the False Negative rates across groups. Setting: $\rho=0.2$, $\sigma=0.1$, $\tau=0$, $n=1{,}000$. Results based on $5{,}000$ Monte Carlo experiments.
  • Figure 4: Training and Test AUC of LASSO-Logit path. The figure shows that the highest test AUC is achieved for the model with 152 covariates.
  • Figure 5: Asymmetric Binary Choice: Penalizing False Negative Mistakes. The figure shows that increasing the cost of false negative mistakes in the loss function is enough to balance FP and FN rates.
  • ...and 4 more figures

Theorems & Definitions (50)

  • Example 2.1: Lending decisions
  • Example 2.2: Social planner with a disadvantaged group
  • Proposition 2.1
  • Proposition 3.1
  • Theorem 3.1
  • Example 3.1: Logistic convexification
  • Example 3.2: Exponential convexification
  • Example 3.3: Hinge convexification
  • Theorem 4.1
  • Theorem 4.2
  • ...and 40 more