Optimal and Fair Encouragement Policy Evaluation and Learning

Angela Zhou

TL;DR

This work addresses the challenge of learning optimal encouragement policies in settings where participation cannot be forced, by integrating causal inference with constrained, fairness-aware policy optimization. It introduces a covariate-conditional exclusion restriction to identify the causal impact of encouragement on take-up and downstream outcomes, and develops a two-stage learning framework with robust, variance-sensitive guarantees under overlap violations. The methodology supports resource-parity fairness constraints and extends to algorithmic recommendations, using doubly robust estimators and reductions to constrained cost-sensitive classification. Through SNAP reminders, health-insurance enrollment, and electronic-monitoring case studies, the paper demonstrates how disentangling treatment efficacy from take-up dynamics improves both efficiency and equity, while acknowledging context-specific trade-offs. The results highlight that effective and fair decision rules require careful targeting of encouragements and a nuanced understanding of disparities in access and responsiveness, rather than purely optimizing overall outcomes.

Abstract

In consequential domains, it is often impossible to compel individuals to take treatment, so that optimal policy rules are merely suggestions in the presence of human non-adherence to treatment recommendations. Under heterogeneity, covariates may predict take-up of treatment and final outcome, but differently. While optimal treatment rules optimize causal outcomes across the population, access parity constraints or other fairness considerations on who receives treatment can be important. For example, in social services, a persistent puzzle is the gap in take-up of beneficial services among those who may benefit from them the most. We study causal identification and robust estimation of optimal treatment rules, including under potential violations of positivity. We consider fairness constraints such as demographic parity in treatment take-up, and other constraints, via constrained optimization. Our framework can be extended to handle algorithmic recommendations under an often-reasonable covariate-conditional exclusion restriction, using our robustness checks for lack of positivity in the recommendation. We develop a two-stage algorithm for solving over parametrized policy classes under general constraints to obtain variance-sensitive regret bounds. We illustrate the methods in three case studies based on data from reminders of SNAP benefits recertification, randomized encouragement to enroll in insurance, and from pretrial supervised release with electronic monitoring. While the specific remedy to inequities in algorithmic allocation is context-specific, it requires studying both take-up of decisions and downstream outcomes of them.

Paper Structure (60 sections, 19 theorems, 114 equations, 6 figures, 4 tables, 2 algorithms)

Introduction
Related Work
Algorithmic advice in operations
Problem Setup
Analysis: What drives utility and budget unfairness under naive ITT targeting?
Unfairness arises from differences in marginal levels, and joint dependence between treatment effect and nudgeability.
Improvement and disparity at a single threshold.
Case study: Text message reminders for SNAP recertification
Background
Descriptives: heterogeneity vs nudgeability.
Method
Resource parity--constrained optimal decision rules
Doubly robust estimation
Robust estimation with treatment overlap but without recommendation overlap
Robust extrapolation under violations of overlap
...and 45 more sections

Key Result

Proposition 1

Assume the distribution of $s(X,A)$ is continuous (no atoms). Consider optimizing the ITT effect of encouragements under a global encouragement budget. Let $q_s:[0,1]\to\mathbb{R}$ denote the (right-continuous) quantile function of $s$, $q_s(t) \coloneqq \inf\{z\in\mathbb{R}: \mathbb{P}(s(X,A)\le z)\ge t\}$. Then the optimal policy is a threshold policy at the top-$b$ quantile of the encoura
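A top-$b$ quantile threshold rule of the kind described in Proposition 1 can be illustrated with a minimal Python sketch. This is not the paper's implementation: the function name `top_b_threshold_policy`, the synthetic data, and the choice to build the score as a product of a take-up lift and a treatment effect (echoing the thresholding score in the Figure 3 caption) are all assumptions made for illustration, and the empirical quantile from `np.quantile` stands in for $q_s$.

```python
import numpy as np


def top_b_threshold_policy(scores, b):
    """Encourage the units whose score falls in the top-b fraction.

    `scores` stands in for s(X, A); `b` is the encouragement budget in [0, 1].
    Returns a boolean encouragement vector and the cutoff that was used.
    """
    scores = np.asarray(scores, dtype=float)
    if not 0.0 <= b <= 1.0:
        raise ValueError("budget b must lie in [0, 1]")
    if b == 0.0:
        # No budget: encourage nobody.
        return np.zeros(scores.shape, dtype=bool), np.inf
    # Empirical (1 - b) quantile of the score plays the role of q_s(1 - b).
    cutoff = np.quantile(scores, 1.0 - b)
    return scores >= cutoff, cutoff


# Synthetic, illustrative data only (not from the paper).
rng = np.random.default_rng(0)
n = 1000
compliance_lift = rng.uniform(0.0, 0.5, size=n)   # hypothetical take-up lift
treatment_effect = rng.normal(0.1, 0.05, size=n)  # hypothetical effect of treatment
scores = compliance_lift * treatment_effect        # hypothetical score s

encourage, cutoff = top_b_threshold_policy(scores, b=0.2)
print(f"share encouraged: {encourage.mean():.3f}, cutoff: {cutoff:.4f}")
```

Because the synthetic scores are continuous, ties at the cutoff are not a concern here, which mirrors the no-atoms assumption; with discrete scores the rule would need an explicit tie-breaking step to respect the budget.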

Figures (6)

Figure 1: SNAP recertification case study: Heterogeneous treatment effects $\tau$, compliance, and their product (heterogeneous encouragement effects), density distribution plots by race.
Figure 2: Comparison of average group outcomes under budget allocations for targeted treatment of different beneficiary shares ($E[Y(\pi_{\%})\mid A=a]$) with self-selection ($E[Y(\pi_0)\mid A=a]$) or no-reminder ($E[Y(0)\mid A=a]$) status quo.
Figure 3: SNAP recertification case study. Each figure indicates the performance metric (conditional encouragement effect, i.e., compliance score $\times$ heterogeneous treatment effect; heterogeneous treatment effect; and compliance score induced by a threshold policy that thresholds on $\left(p_{1 \mid 1}-p_{1 \mid 0}\right) \times \tau$).
Figure 4: Groupwise separate budgets: Which budget allocations would achieve equal improvements?
Figure 5: Distribution of treatment effect by gender, lift in treatment probabilities $p_{11a}-p_{01a} =P(T=1\mid R=1,A=a,X)-P(T=1\mid R=0,A=a,X)$, and plot of $p_{11a}-p_{01a}$ vs. $\tau.$
...and 1 more figure

Theorems & Definitions (38)

Example 1: Pricing: Demand vs. revenue vs. long-term outcomes
Example 2: Healthcare: Adherence vs. treatment efficacy
Example 3: Takeup of social services and digital outreach
Example 4: Algorithmic advice
Proposition 1: Quantile-threshold optimality
Theorem 3: Spearman lower bound for disparity at the pooled cut
Proposition 2: Regression adjustment identification
Proposition 3: Threshold solutions under resource constraints
Proposition 4: Policy value generalization
Proposition 5: Variance-reduced estimation
...and 28 more
