Hybrid Meta-learners for Estimating Heterogeneous Treatment Effects
Zhongyuan Liang, Lars van der Laan, Ahmed Alaa
TL;DR
The paper addresses robust estimation of conditional average treatment effects $\tau(x)$ from observational data by uniting the two dominant meta-learning paradigms, indirect (PO-based) and direct (CATE-based) learners, in a single method: the Hybrid Learner (H-learner). The H-learner jointly optimizes two intermediate functions $f_0, f_1$ so that their difference approximates $\tau(x)$ while simultaneously regularizing them toward accurate PO estimates. The balance between the two goals is controlled by a tunable parameter $\lambda$, and fitting proceeds in two stages using a pseudo-outcome $Y_{\varphi}$. Theoretical results in the linear setting yield a closed-form solution and a bias–variance analysis showing the conditions under which the H-learner outperforms both the direct and indirect baselines. Empirical evaluations on semi-synthetic and benchmark datasets (IHDP and ACIC 2016) show that the H-learner consistently lies on the Pareto frontier, attaining state-of-the-art PEHE by adaptively balancing the two inductive biases. Overall, the H-learner provides a robust, model-agnostic framework that combines the strengths of both regularization strategies, with practical implications for reliable CATE estimation under varying data-generating processes.
Abstract
Estimating conditional average treatment effects (CATE) from observational data involves modeling decisions that differ from supervised learning, particularly concerning how to regularize model complexity. Previous approaches can be grouped into two primary "meta-learner" paradigms that impose distinct inductive biases. Indirect meta-learners first fit and regularize separate potential outcome (PO) models and then estimate CATE by taking their difference, whereas direct meta-learners construct and directly regularize estimators for the CATE function itself. Neither approach consistently outperforms the other across all scenarios: indirect learners perform well when the PO functions are simple, while direct learners perform better when the CATE is simpler than the individual PO functions. In this paper, we introduce the Hybrid Learner (H-learner), a novel regularization strategy that interpolates between the direct and indirect regularizations depending on the dataset at hand. The H-learner achieves this by learning intermediate functions whose difference closely approximates the CATE without necessarily requiring accurate individual approximations of the POs themselves. We demonstrate that intentionally allowing suboptimal fits to the POs improves the bias-variance tradeoff in estimating CATE. Experiments conducted on semi-synthetic and real-world benchmark datasets illustrate that the H-learner consistently operates at the Pareto frontier, effectively combining the strengths of both direct and indirect meta-learners.
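To make the interpolation idea concrete, the following is a minimal linear sketch, not the authors' implementation. It assumes a specific objective that the summary above does not spell out: a weight $1-\lambda$ on the squared error of the factual outcome fit (the indirect, PO-style term) and a weight $\lambda$ on the squared error between $f_1(x) - f_0(x)$ and a pseudo-outcome (the direct, CATE-style term). The pseudo-outcome used here is the inverse-propensity-weighted transformed outcome with a known propensity score, which is an illustrative choice and may differ from the paper's $Y_{\varphi}$. The function names `ipw_pseudo_outcome`, `fit_hybrid_linear`, and `predict_cate`, and the small `ridge` term, are hypothetical.

```python
import numpy as np


def ipw_pseudo_outcome(y, w, e):
    """Transformed outcome (assumed choice): its conditional mean equals the
    CATE when e is the true propensity score P(W=1 | X)."""
    return (w - e) / (e * (1.0 - e)) * y


def fit_hybrid_linear(X, w, y, y_pseudo, lam, ridge=1e-6):
    """Fit linear f_0(x) = x @ beta0 and f_1(x) = x @ beta1 by minimizing

        (1 - lam) * sum_i (y_i - f_{w_i}(x_i))^2
          + lam   * sum_i (y_pseudo_i - (f_1(x_i) - f_0(x_i)))^2

    lam = 0 reduces to two separate outcome regressions (indirect style);
    lam = 1 only constrains the difference f_1 - f_0 (direct style).
    The tiny ridge term keeps the system solvable when lam = 1.
    """
    w = np.asarray(w, dtype=float)
    d = X.shape[1]
    # Factual-outcome rows: only the coefficients of the received arm are active.
    A_po = np.hstack([X * (1.0 - w)[:, None], X * w[:, None]])
    # CATE rows: prediction is f_1(x) - f_0(x).
    A_cate = np.hstack([-X, X])
    # Stack both weighted least-squares problems into one system.
    A = np.vstack([np.sqrt(1.0 - lam) * A_po, np.sqrt(lam) * A_cate])
    b = np.concatenate([np.sqrt(1.0 - lam) * y, np.sqrt(lam) * y_pseudo])
    theta = np.linalg.solve(A.T @ A + ridge * np.eye(2 * d), A.T @ b)
    return theta[:d], theta[d:]


def predict_cate(X, beta0, beta1):
    """CATE estimate as the difference of the two intermediate functions."""
    return X @ (beta1 - beta0)


if __name__ == "__main__":
    # Synthetic data (made up for illustration): randomized treatment, e = 0.5.
    rng = np.random.default_rng(0)
    n = 2000
    X = np.column_stack([np.ones(n), rng.normal(size=(n, 3))])
    w = rng.binomial(1, 0.5, size=n).astype(float)
    b0 = np.array([1.0, 2.0, -1.0, 0.5])
    delta = np.array([0.5, 0.0, 0.0, 1.0])  # true CATE coefficients
    y = X @ b0 + w * (X @ delta) + rng.normal(scale=0.1, size=n)
    y_phi = ipw_pseudo_outcome(y, w, 0.5)

    for lam in (0.0, 0.5, 1.0):
        beta0, beta1 = fit_hybrid_linear(X, w, y, y_phi, lam)
        err = predict_cate(X, beta0, beta1) - X @ delta
        print(f"lam={lam:.1f}  root-PEHE={np.sqrt(np.mean(err ** 2)):.3f}")
```

In this sketch, $\lambda$ would typically be chosen on held-out data, for example by comparing validation error against the pseudo-outcome across a grid of values. That selection procedure is also an assumption rather than something stated in the abstract.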
