A New Causal Rule Learning Approach to Interpretable Estimation of Heterogeneous Treatment Effect
Ying Wu, Hanzhong Liu, Kai Ren, Shujie Ma, Xiangyu Chang
TL;DR
This work tackles interpretable estimation of heterogeneous treatment effects in complex diseases by introducing Causal Rule Learning (CRL). CRL combines rule-based discovery of subgroup effects via a causal forest with sparse, LASSO-based rule selection (D-learning) to estimate individual treatment effects (ITEs) as a weighted linear combination of subgroup conditional average treatment effects (CATEs): $\tau(X)=\sum_{m=1}^M \beta_m \tau_m r_m(X)$, where $r_m(X)$ indicates whether rule $m$ applies to covariates $X$, $\tau_m$ is that rule's subgroup CATE, and $\beta_m$ is its weight. A structured rule analysis stage (overall, significance, and decomposition) supports interpretability and validation, allowing an individual to belong to multiple subgroups and giving nuanced explanations of ITEs. Across simulations and a real-world atrial septal defect (ASD) application, CRL demonstrates strong estimation accuracy, effective pruning of spurious rules, and the ability to reveal clinically actionable subgroups and interactions. The approach offers a scalable, interpretable tool for clinical decision support and trial design, with theoretical guarantees on convergence and robustness to correlated covariates, while acknowledging limitations such as linearity assumptions and binary treatments.
Abstract
Interpretability plays a crucial role in the application of statistical learning to estimate heterogeneous treatment effects (HTE) in complex diseases. In this study, we leverage a rule-based workflow, namely causal rule learning (CRL), to estimate and improve our understanding of HTE for atrial septal defect, addressing a question overlooked in the previous literature: what if an individual simultaneously belongs to multiple groups with different average treatment effects? The CRL process consists of three steps: rule discovery, which generates a set of causal rules with corresponding subgroup average treatment effects; rule selection, which identifies a subset of these rules to decompose individual-level treatment effects into a linear combination of subgroup-level effects; and rule analysis, which presents a detailed procedure for further analyzing each selected rule from multiple perspectives to identify the most promising rules for validation. Extensive simulation studies and real-world data analysis demonstrate that CRL outperforms other methods in providing interpretable estimates of HTE, especially when the ground truth is complex and the sample size is sufficient.
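To make the multiple-membership idea concrete, the following minimal Python sketch evaluates an individual-level effect as a weighted sum of the subgroup effects of the rules that individual satisfies. It illustrates only the final linear combination, not the rule discovery or rule selection steps. The covariate names (`age`, `bmi`), the three rules, the subgroup effects, and the weights are all hypothetical values chosen for illustration; none of them come from the paper.

```python
from typing import Callable, Dict, List, Sequence

Covariates = Dict[str, float]
Rule = Callable[[Covariates], bool]

# Hypothetical rules over hypothetical covariates (illustration only).
RULES: List[Rule] = [
    lambda x: x["age"] > 60,
    lambda x: x["bmi"] >= 30,
    lambda x: x["age"] > 60 and x["bmi"] >= 30,
]
# Assumed subgroup average treatment effects, one per rule.
SUBGROUP_EFFECTS: List[float] = [2.0, -1.0, 4.0]
# Assumed rule weights; a weight of 0.0 mimics a rule dropped by sparse selection.
WEIGHTS: List[float] = [0.5, 0.25, 0.0]


def rule_contributions(
    x: Covariates,
    rules: Sequence[Rule],
    effects: Sequence[float],
    weights: Sequence[float],
) -> List[float]:
    """Per-rule contribution: weight * subgroup effect if the rule applies, else 0."""
    return [
        w * tau if rule(x) else 0.0
        for rule, tau, w in zip(rules, effects, weights)
    ]


def estimate_ite(
    x: Covariates,
    rules: Sequence[Rule],
    effects: Sequence[float],
    weights: Sequence[float],
) -> float:
    """Individual effect as the sum of the contributions of all satisfied rules."""
    return sum(rule_contributions(x, rules, effects, weights))


# An individual who satisfies all three rules at once.
patient = {"age": 70.0, "bmi": 32.0}
print(rule_contributions(patient, RULES, SUBGROUP_EFFECTS, WEIGHTS))
print(estimate_ite(patient, RULES, SUBGROUP_EFFECTS, WEIGHTS))
```

Returning the per-rule contributions alongside the total is a deliberate choice in this sketch: it shows how much each subgroup contributes to one person's estimate, which is the kind of decomposition the rule analysis step is described as supporting.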
