Optimal and Fair Encouragement Policy Evaluation and Learning
Angela Zhou
TL;DR
This work addresses the challenge of learning optimal encouragement policies in settings where participation cannot be forced, by integrating causal inference with constrained, fairness-aware policy optimization. It introduces a covariate-conditional exclusion restriction to identify the causal impact of encouragement on take-up and downstream outcomes, and develops a two-stage learning framework with robust, variance-sensitive guarantees under overlap violations. The methodology supports resource-parity fairness constraints and extends to algorithmic recommendations, using doubly robust estimators and reductions to constrained cost-sensitive classification. Through SNAP reminders, health-insurance enrollment, and electronic-monitoring case studies, the paper demonstrates how disentangling treatment efficacy from take-up dynamics improves both efficiency and equity, while acknowledging context-specific trade-offs. The results highlight that effective and fair decision rules require careful targeting of encouragements and a nuanced understanding of disparities in access and responsiveness, rather than purely optimizing overall outcomes.
Abstract
In consequential domains, it is often impossible to compel individuals to take treatment, so that optimal policy rules are merely suggestions in the presence of human non-adherence to treatment recommendations. Under heterogeneity, covariates may predict take-up of treatment and final outcome, but differently. While optimal treatment rules optimize causal outcomes across the population, access parity constraints or other fairness considerations on who receives treatment can be important. For example, in social services, a persistent puzzle is the gap in take-up of beneficial services among those who may benefit from them the most. We study causal identification and robust estimation of optimal treatment rules, including under potential violations of positivity. We consider fairness constraints such as demographic parity in treatment take-up, and other constraints, via constrained optimization. Our framework can be extended to handle algorithmic recommendations under an often-reasonable covariate-conditional exclusion restriction, using our robustness checks for lack of positivity in the recommendation. We develop a two-stage algorithm for optimizing over parametrized policy classes under general constraints, which yields variance-sensitive regret bounds. We illustrate the methods in three case studies based on data from reminders of SNAP benefits recertification, from randomized encouragement to enroll in insurance, and from pretrial supervised release with electronic monitoring. While the specific remedy to inequities in algorithmic allocation is context-specific, it requires studying both the take-up of decisions and their downstream outcomes.
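To make the estimation idea concrete, the following is a minimal sketch of a standard doubly robust (AIPW) policy-value estimate together with a demographic-parity gap in take-up. It is not the paper's estimator or its two-stage algorithm. It assumes a binary encouragement `r`, a known or pre-estimated encouragement propensity, and pre-fitted regression predictions `mu0` and `mu1`; all function and variable names are hypothetical.

```python
import numpy as np

def dr_scores(target, r, propensity, mu0, mu1):
    """Doubly robust (AIPW) scores per encouragement arm.

    target:     observed variable (final outcome Y, or take-up indicator T)
    r:          observed binary encouragement (0 or 1)
    propensity: estimate of P(R = 1 | X), assumed strictly inside (0, 1)
    mu0, mu1:   predictions of E[target | R = 0, X] and E[target | R = 1, X]
    Returns an (n, 2) array; column k holds the score for encouragement k.
    """
    target = np.asarray(target, dtype=float)
    r = np.asarray(r)
    e1 = np.asarray(propensity, dtype=float)
    mu0 = np.asarray(mu0, dtype=float)
    mu1 = np.asarray(mu1, dtype=float)
    # Regression prediction plus an inverse-propensity-weighted residual.
    s1 = mu1 + (r == 1) / e1 * (target - mu1)
    s0 = mu0 + (r == 0) / (1.0 - e1) * (target - mu0)
    return np.column_stack([s0, s1])

def policy_value(scores, policy):
    """Average score under a deterministic policy (array of 0/1 recommendations)."""
    policy = np.asarray(policy, dtype=int)
    return float(np.mean(scores[np.arange(len(policy)), policy]))

def takeup_parity_gap(takeup_scores, policy, group):
    """Difference in estimated take-up between group == 1 and group == 0."""
    policy = np.asarray(policy, dtype=int)
    group = np.asarray(group)
    in_a = group == 1
    in_b = group == 0
    return (policy_value(takeup_scores[in_a], policy[in_a])
            - policy_value(takeup_scores[in_b], policy[in_b]))
```

Passing the final outcome as `target` gives an estimate of the policy's value, and passing the take-up indicator gives the estimated take-up that a parity constraint would bound. A constrained learner would then search a policy class for a high `policy_value` subject to a small absolute `takeup_parity_gap`.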
