Policy Learning under Endogeneity Using Instrumental Variables
Yan Liu
TL;DR
This work addresses policy learning under endogenous treatment by introducing encouragement rules that manipulate the instrument and by leveraging the marginal treatment effect (MTE) as a policy-invariant parameter to identify social welfare. It integrates MTE with Empirical Welfare Maximization (EWM) to estimate optimal, interpretable rules and derives regret bounds for feasible rules, with extensions to multiple instruments and budget-constrained settings. The author applies the approach to Indonesian data, showing that targeted tuition subsidies based on observed characteristics yield larger welfare gains than universal subsidies. Overall, the framework provides a principled, interpretable method for designing individualized policies in the presence of treatment endogeneity.
Abstract
I propose a framework for learning individualized policy rules from observational data when treatment selection is endogenous and an instrumental variable is available. I introduce encouragement rules that manipulate the instrument. By incorporating the marginal treatment effect (MTE) as a policy-invariant parameter, I establish identification of the social welfare criterion for the optimal encouragement rule. Focusing on binary encouragement rules, I propose estimating the optimal encouragement rule via the Empirical Welfare Maximization (EWM) method and derive the convergence rate of the welfare loss. I apply my method to advise on the optimal tuition subsidy assignment in Indonesia.
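To make the EWM step concrete, the following is a minimal sketch and not the paper's estimator. It assumes that a per-unit score `gain[i]`, an estimate of unit i's welfare gain from being encouraged, has already been computed; in the paper such a quantity would come from the MTE-based identification result, which is not reproduced here. The function name `ewm_threshold_rule`, the single covariate `x`, and the rule class "encourage if x <= c" are all hypothetical choices made for illustration.

```python
import numpy as np


def ewm_threshold_rule(x, gain):
    """Pick the cutoff c that maximizes the empirical welfare gain
    mean(gain * 1{x <= c}) over the hypothetical rule class
    "encourage unit i if x[i] <= c".

    x    : one observed covariate per unit (illustrative assumption)
    gain : assumed, pre-estimated welfare gain from encouraging each unit

    Returns (c, welfare). c is -inf when encouraging nobody is best.
    """
    x = np.asarray(x, dtype=float)
    gain = np.asarray(gain, dtype=float)

    # Sort units by the covariate so each cutoff corresponds to a prefix.
    order = np.argsort(x, kind="stable")
    x_sorted = x[order]
    cum_welfare = np.cumsum(gain[order]) / x.size

    # Only the last position of each distinct x value is a valid cutoff,
    # because tied units must receive the same assignment.
    is_last = np.append(x_sorted[1:] != x_sorted[:-1], True)
    cand_c = x_sorted[is_last]
    cand_w = cum_welfare[is_last]

    best = int(np.argmax(cand_w))
    if cand_w[best] <= 0.0:
        # No cutoff beats the empty rule, whose welfare gain is zero.
        return float("-inf"), 0.0
    return float(cand_c[best]), float(cand_w[best])


# Toy data, made up for illustration only.
cutoff, welfare = ewm_threshold_rule([1, 2, 3, 4], [2.0, 1.0, -1.0, -3.0])
print(cutoff, welfare)  # 2.0 0.75
```

An exhaustive search over cutoffs is feasible here because the rule class is one-dimensional. Richer rule classes, budget constraints, or multiple instruments, which the paper covers, would need a different optimization routine.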
