Machine Learning Who to Nudge: Causal vs Predictive Targeting in a Field Experiment on Student Financial Aid Renewal

Susan Athey; Niall Keleher; Jann Spiess

TL;DR

This paper tackles the problem that nudges may not be equally effective for all individuals by leveraging causal ML to estimate how treatment effects vary with covariates in a large field experiment on FAFSA renewal. It compares targeting policies based on non-parametric CATE estimates, baseline outcome predictions, and hybrids that blend predictive power with causal insight, using data from two RCTs with over $N=66{,}000$ students in total. The key finding is that targeting those with intermediate baseline outcomes using a causal or semi-parametric approach yields substantial gains—roughly two-thirds to three-quarters of the total average gain from nudging—while naive predictive targeting can underperform. The results emphasize the value of integrating predictive modeling with careful causal inference for policy design, highlight the challenges of estimating heterogeneity in noisy settings, and offer practical guidance on when simple, regularized models can match or outperform fully non-parametric approaches.

Abstract

In many settings, interventions may be more effective for some individuals than others, so that targeting interventions may be beneficial. We analyze the value of targeting in the context of a large-scale field experiment with over 53,000 college students, where the goal was to use "nudges" to encourage students to renew their financial-aid applications before a non-binding deadline. We begin with baseline approaches to targeting. First, we target based on a causal forest that estimates heterogeneous treatment effects and then assigns students to treatment according to those estimated to have the highest treatment effects. Next, we evaluate two alternative targeting policies, one targeting students with low predicted probability of renewing financial aid in the absence of the treatment, the other targeting those with high probability. The predicted baseline outcome is not the ideal criterion for targeting, nor is it a priori clear whether to prioritize low, high, or intermediate predicted probability. Nonetheless, targeting on low baseline outcomes is common in practice, for example because the relationship between individual characteristics and treatment effects is often difficult or impossible to estimate with historical data. We propose hybrid approaches that incorporate the strengths of both predictive approaches (accurate estimation) and causal approaches (correct criterion); we show that targeting intermediate baseline outcomes is most effective in our specific application, while targeting based on low baseline outcomes is detrimental. In one year of the experiment, nudging all students improved early filing by an average of 6.4 percentage points over a baseline average of 37% filing, and we estimate that targeting half of the students using our preferred policy attains around 75% of this benefit.
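The comparison of targeting rules described in the abstract can be illustrated with a minimal sketch on simulated data. Everything in it is an assumption made for illustration and not taken from the paper: the function names `target_top_fraction` and `policy_value_ipw`, the simulated effect shape (largest at intermediate baseline probabilities), the 50% treatment probability, and the use of true baseline probabilities as scores in place of cross-fitted predictions. It also uses a plain inverse-propensity weighted estimator with a known propensity, which is simpler than the augmented estimator named in the figure captions.

```python
import numpy as np


def target_top_fraction(score, fraction):
    """Return a boolean rule that treats the `fraction` of units with the highest score."""
    n_treat = int(round(fraction * len(score)))
    rule = np.zeros(len(score), dtype=bool)
    rule[np.argsort(-score)[:n_treat]] = True
    return rule


def policy_value_ipw(y, w, rule, propensity):
    """Inverse-propensity weighted estimate of the mean outcome under `rule`.

    y: observed outcomes; w: boolean randomized treatment indicator;
    rule: boolean assignment the policy would make; propensity: known P(w = True).
    """
    agrees = w == rule  # units whose randomized arm matches the policy
    prob_of_arm = np.where(rule, propensity, 1.0 - propensity)
    return float(np.mean(agrees * y / prob_of_arm))


# Simulated experiment (hypothetical numbers, not the paper's data).
rng = np.random.default_rng(0)
n = 200_000
propensity = 0.5
p0 = rng.uniform(0.05, 0.95, size=n)   # baseline renewal probability without a nudge
tau = 0.6 * p0 * (1.0 - p0)            # assumed effect, peaking at intermediate baselines
w = rng.random(n) < propensity         # randomized treatment assignment
y = (rng.random(n) < p0 + w * tau).astype(float)

# Three targeting scores; each policy treats the top half by its score.
scores = {
    "low_baseline": -p0,
    "high_baseline": p0,
    "intermediate_baseline": -np.abs(p0 - 0.5),
}
values = {
    name: policy_value_ipw(y, w, target_top_fraction(score, 0.5), propensity)
    for name, score in scores.items()
}
for name, value in values.items():
    print(f"{name}: {value:.4f}")
```

Under this assumed effect shape, the intermediate-baseline rule should come out ahead of the low-baseline rule, mirroring the qualitative pattern the abstract reports; the magnitudes carry no meaning beyond the simulation. For scale in the abstract's own numbers, attaining around 75% of a 6.4 percentage point gain corresponds to roughly 4.8 percentage points.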

Paper Structure (14 sections, 19 equations, 13 figures, 6 tables)


Introduction
Experiment and Data
Estimating Treatment-Effect Heterogeneity
Causal vs Predictive Targeting
Targeting based on non-parametric treatment effect estimates
Targeting based on baseline predictions
Improving Targeting by Combining Predictive and Causal Modeling
Model-based targeting from baseline predictions
Targeting based on a hybrid model that adapts to heterogeneity
Comparison of non-parametric, simple, and hybrid targeting strategies
Non-parametric targeting with baseline predictions as a feature
Conclusion
Additional Tables and Figures
Evaluation of Assignment Policies

Figures (13)

Figure 1: Example reminder text messages sent to students in the treatment group.
Figure 2: Histograms of estimated treatment effects across years and set of covariates, using honest estimates from the causal forest method.
Figure 3: Average treatment effects by quartiles of estimated treatment effects. The $x$-axis divides the sample into quartiles of predicted cross-fitted treatment effects using ten folds. The $y$-axis plots group-level estimates from an augmented inverse-propensity weighted estimator using either an estimated propensity score ("AIPW") or the known propensity score ("Known propensity score"), along with 95% confidence intervals.
Figure 4: Average treatment effects by enrollment status (red) and by whether predicted cross-fitted treatment effects are below or above the quantile corresponding to the proportion of enrolled students (blue), using all data up to the start of the intervention. The $y$-axis plots a simple difference-in-averages for actual enrollment as well as two augmented inverse-propensity weighted estimators, one using the estimated propensity score ("AIPW") and one using the known propensity score ("Known propensity score"), along with 95% confidence intervals.
Figure 5: Total estimated FAFSA renewal rate ($y$-axis) by targeting a given fraction ($x$-axis) of students according to different cross-fitted predictions in the 2017 data, including targeting by estimated treatment effects using the causal forest ("Non-Parametric CATE"), by a random-forest prediction of outcomes absent treatment ("Negative Baseline" for low baseline treated first, and "Positive Baseline" the reverse), and by predicted or actual enrollment ("Predicted Enrollment" and "Enrollment"). Shown are model-free unbiased augmented inverse propensity weighted estimates with 95% confidence intervals that represent the point-wise uncertainty of the difference in renewal rate relative to the random policy ("Random") that assigns the same fraction to treatment, with details provided in \ref{apx:inference}.
...and 8 more figures
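For readers unfamiliar with the augmented inverse-propensity weighted (AIPW) estimates named in the captions above, the standard textbook form of the per-unit score is sketched here. The notation is an assumption for illustration and may differ from the paper's: $Y_i$ is the outcome, $W_i \in \{0,1\}$ the treatment indicator, $X_i$ the covariates, $\hat{\mu}_w(x)$ a cross-fitted outcome prediction under arm $w$, and $e(x)$ the propensity score (known in a randomized experiment, or estimated).

$$
\hat{\Gamma}_i = \hat{\mu}_1(X_i) - \hat{\mu}_0(X_i) + \frac{W_i - e(X_i)}{e(X_i)\,\bigl(1 - e(X_i)\bigr)} \bigl(Y_i - \hat{\mu}_{W_i}(X_i)\bigr)
$$

Averaging $\hat{\Gamma}_i$ over a group $G$ (for example, a quartile of predicted treatment effects) gives that group's estimated average treatment effect, $\hat{\tau}_G = \frac{1}{|G|} \sum_{i \in G} \hat{\Gamma}_i$.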
