Advantages and limitations in the use of transfer learning for individual treatment effects in causal machine learning

Seyda Betul Aydin, Holger Brandt

TL;DR

This paper investigates how to improve individual treatment effect (ITE) estimation in small-sample settings by transferring causal knowledge from a larger source dataset. It extends TARNet to TL-TARNet and provides theoretical bounds that decompose the target error into observable components (factual loss and distributional mismatch) plus additional terms capturing changes in treatment selection and in the causal mechanism; the causal inference task affinity (CITA) measure guides source selection. In simulations and an empirical IHDS-II application, TL-TARNet reduces bias and improves the precision of ITE estimates when target data are limited and a large unbiased source is available, particularly in non-randomized intervention settings. The work offers practical transfer-learning guidelines for causal inference in the social and behavioral sciences, emphasizing balanced representations, staged fine-tuning, and task affinity measures to enable robust causal transfer across related studies.

Abstract

Generalizing causal knowledge across diverse environments is challenging, especially when estimates from large-scale datasets must be applied to smaller or systematically different contexts, where external validity is critical. Model-based estimators of individual treatment effects (ITEs) from machine learning require large sample sizes, which limits their applicability in domains with smaller datasets, such as the behavioral sciences. We demonstrate how the estimation of ITEs with Treatment-Agnostic Representation Networks (TARNet; Shalit et al., 2017) can be improved by leveraging knowledge from source datasets and adapting it to new settings via transfer learning (TL-TARNet; Aloui et al., 2023). In simulations that vary source and target sample sizes and consider both randomized and non-randomized intervention target settings, the transfer-learning extension TL-TARNet improves upon standard TARNet, reducing ITE error and attenuating bias when a large unbiased source is available and target samples are small. In an empirical application using the India Human Development Survey (IHDS-II), we estimate the effect of mothers' firewood collection time on children's weekly study time; transfer learning pulls the target mean ITEs toward the source ITE estimate, reducing the bias present in the estimates obtained without transfer. These results suggest that transfer learning for causal models can improve the estimation of ITEs in small samples.
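
For readers unfamiliar with the quantities named in the abstract, the following is a brief sketch of the standard definitions, written in the notation of Shalit et al. (2017). The symbols $\Phi$, $h_0$, $h_1$, and $L$ are assumptions for illustration, and the paper's own notation may differ. The ITE at covariates $x$ and its TARNet estimate, built from a shared representation $\Phi$ and one outcome head per treatment arm, are

$$
\tau(x) = \mathbb{E}\left[\, Y(1) - Y(0) \mid X = x \,\right], \qquad
\hat{\tau}(x) = h_1(\Phi(x)) - h_0(\Phi(x)).
$$

Only the outcome under the received treatment $t_i$ is observed, so the network is trained on the factual loss

$$
\frac{1}{n} \sum_{i=1}^{n} L\big( h_{t_i}(\Phi(x_i)),\; y_i \big).
$$

Under this reading, the transfer-learning variant would start from parameters $(\Phi, h_0, h_1)$ fitted on the source data and then continue training on the smaller target sample with the same factual loss, rather than starting from a random initialization.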

Paper Structure

This paper contains 26 sections, 17 equations, 5 figures, 1 table.

Figures (5)

  • Figure 1: Representation of transfer learning.
  • Figure 2: Representation of transfer learning with IPMs.
  • Figure 3: Mean ITE for TARNet and TL-TARNet models (i.e., without and with transfer learning) across different source and target dataset sizes. The grey line indicates the population value ($\tau = 1$).
  • Figure 4: Mean $\varepsilon_{\text{PEHE}}$ for TARNet and TL-TARNet models (i.e., without and with transfer learning) across different source and target dataset sizes. The grey line indicates no error.
  • Figure 5: Source versus target ITE predictions for randomized (left) and non-randomized (right) intervention settings, with and without transfer learning.
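
The two quantities plotted in Figures 3 and 4 can be illustrated with a minimal Python sketch. It assumes the usual definition of $\varepsilon_{\text{PEHE}}$ as the mean squared difference between estimated and true ITEs; whether the paper reports this value or its square root is not stated here, so the `root` flag is a hypothetical convenience. The function names and toy values are illustrative only and are not taken from the paper.

```python
import math


def pehe(tau_true, tau_hat, root=True):
    """Precision in estimation of heterogeneous effects.

    Computes the mean squared difference between estimated and true ITEs.
    With root=True the square root is returned (an assumed convention).
    """
    if len(tau_true) != len(tau_hat) or len(tau_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((h - t) ** 2 for t, h in zip(tau_true, tau_hat)) / len(tau_true)
    return math.sqrt(mse) if root else mse


def mean_ite_bias(tau_true, tau_hat):
    """Difference between the mean estimated ITE and the mean true ITE."""
    return sum(tau_hat) / len(tau_hat) - sum(tau_true) / len(tau_true)


# Toy values: a constant true effect of 1, mirroring the population
# value tau = 1 marked in Figure 3. The estimates are made up.
tau_true = [1.0, 1.0, 1.0, 1.0]
tau_hat = [0.5, 1.5, 1.0, 2.0]

print(pehe(tau_true, tau_hat, root=False))  # mean squared ITE error
print(pehe(tau_true, tau_hat))              # root version
print(mean_ite_bias(tau_true, tau_hat))     # bias of the mean ITE
```

Both measures need the true ITEs, so they can be computed in simulations where the data-generating process is known, but not in an empirical application such as IHDS-II, where only one potential outcome per unit is observed.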