PITE: Multi-Prototype Alignment for Individual Treatment Effect Estimation

Fuyuan Cao, Jiaxuan Zhang, Xiaoli Li

TL;DR

PITE tackles the challenge of estimating individual treatment effects from observational data by preserving local subgroup structure through a novel multi-prototype framework. It introduces within-group prototype matching and cross-group prototype alignment, optimized with a composite loss that includes clustering, alignment, and diversity terms, plus a two-head predictor for potential outcomes. Across synthetic, IHDP, and Jobs datasets, PITE achieves superior ITE accuracy and robustness compared with 13 baselines, supported by ablation and uniformity analyses. This prototype-level approach reduces distribution shift while maintaining local structure, enabling more reliable counterfactual estimation with potential for broader applicability in personalized decision-making.

Abstract

Estimating Individual Treatment Effects (ITE) from observational data is challenging due to confounding bias. Most studies tackle this bias by balancing distributions globally, but ignore individual heterogeneity and fail to capture the local structure that represents the natural clustering among individuals, which ultimately compromises ITE estimation. While instance-level alignment methods consider heterogeneity, they similarly overlook the local structure information. To address these issues, we propose an end-to-end Multi-Prototype alignment method for ITE estimation (PITE). PITE effectively captures local structure within groups and enforces cross-group alignment, thereby achieving robust ITE estimation. Specifically, we first define prototypes as cluster centroids based on similar individuals under the same treatment. To identify local similarity and the distribution consistency, we perform instance-to-prototype matching to assign individuals to the nearest prototype within groups, and design a multi-prototype alignment strategy to encourage the matched prototypes to be close across treatment arms in the latent space. PITE not only reduces distribution shift through fine-grained, prototype-level alignment, but also preserves the local structures of treated and control groups, which provides meaningful constraints for ITE estimation. Extensive evaluations on benchmark datasets demonstrate that PITE outperforms 13 state-of-the-art methods, achieving more accurate and robust ITE estimation.
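The two steps named in the abstract, instance-to-prototype matching within a group and prototype alignment across groups, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The helper names `kmeans` and `alignment_loss`, the plain Lloyd's k-means, the greedy nearest-prototype pairing, and the squared Euclidean distance are all assumptions made for illustration; the abstract does not specify how prototypes are paired or which distance is used.

```python
import numpy as np


def kmeans(z, k, n_iter=20, seed=0):
    """Plain Lloyd's k-means (an assumed stand-in for prototype learning).

    Returns (prototypes, assignments) for latent representations z of shape (n, d).
    """
    rng = np.random.default_rng(seed)
    protos = z[rng.choice(len(z), size=k, replace=False)].copy()
    for _ in range(n_iter):
        # Instance-to-prototype matching: nearest prototype by squared distance.
        d = ((z[:, None, :] - protos[None, :, :]) ** 2).sum(axis=-1)
        assign = d.argmin(axis=1)
        for j in range(k):
            members = z[assign == j]
            if len(members) > 0:  # keep the old prototype if a cluster empties
                protos[j] = members.mean(axis=0)
    # Recompute assignments against the final prototypes.
    d = ((z[:, None, :] - protos[None, :, :]) ** 2).sum(axis=-1)
    return protos, d.argmin(axis=1)


def alignment_loss(protos_t, protos_c):
    """Hypothetical alignment term: mean squared distance from each treated
    prototype to its nearest control prototype (greedy pairing is assumed)."""
    d = ((protos_t[:, None, :] - protos_c[None, :, :]) ** 2).sum(axis=-1)
    return float(d.min(axis=1).mean())


# Toy latent representations for the treated and control groups.
rng = np.random.default_rng(0)
phi_t = rng.normal(size=(60, 4))
phi_c = rng.normal(loc=0.5, size=(80, 4))

protos_t, assign_t = kmeans(phi_t, k=3)
protos_c, assign_c = kmeans(phi_c, k=3)
loss = alignment_loss(protos_t, protos_c)
```

In an end-to-end setting such a term would be computed on learned representations and minimized jointly with the outcome loss; here it is evaluated once on fixed toy data only to show the shape of the computation.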

Paper Structure

This paper contains 20 sections, 16 equations, 6 figures, 7 tables, 1 algorithm.

Figures (6)

  • Figure 1: (a) Existing distribution-level alignment methods align the overall distributional statistics (e.g., mean) of covariates between treated and control groups but fail to preserve semantic information. (b) Our method both achieves fine-grained prototype-level alignment to reduce distribution shift and preserves the local structure of the treated and control groups.
  • Figure 2: An overview of the PITE framework. We perform prototype learning on the representations $\phi_t$ and $\phi_c$ for treated and control groups respectively via k-means to capture local structures within each group. Cross-treatment prototype alignment ($\mathcal{L}_{align}$) enforces correspondence between treated and control prototypes to reduce distribution shift. Finally, two separate neural networks, $h_{1} (\phi(x))$ and $h_{0} (\phi(x))$, are used to estimate potential outcomes under different treatments.
  • Figure 3: Visualization of representation uniformity of four typical methods on the IHDP dataset. We visualize the overall feature distributions with Gaussian kernel density estimation (KDE) in ${\mathbb{R}}^{2}$, where the color gradient represents density levels from low (blue) to high (red). The uniformity metric is computed by measuring the pairwise distances between normalized representations on the hypersphere, with lower values indicating superior uniformity.
  • Figure 4: ITE estimation performance of PITE under different parameters on the IHDP dataset.
  • Figure 5: t-SNE visualizations of the covariates as $\gamma$ varies.
  • ...and 1 more figure
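The Figure 3 caption describes a uniformity metric built from pairwise distances between normalized representations, with lower values being better. The exact formula is not given here, so the sketch below assumes one commonly used form: the log of the mean Gaussian potential over all distinct pairs, with temperature `t = 2`. The function name `uniformity` and the default `t` are illustrative choices, and the rows of `z` are assumed to be non-zero so that normalization is well defined.

```python
import numpy as np


def uniformity(z, t=2.0):
    """Assumed uniformity score: log mean of exp(-t * ||u - v||^2) over all
    distinct pairs of L2-normalized rows of z. Lower means more uniform."""
    z = z / np.linalg.norm(z, axis=1, keepdims=True)  # project onto the hypersphere
    sq = ((z[:, None, :] - z[None, :, :]) ** 2).sum(axis=-1)
    iu = np.triu_indices(len(z), k=1)  # each unordered pair once, no self-pairs
    return float(np.log(np.exp(-t * sq[iu]).mean()))


rng = np.random.default_rng(0)
spread = rng.normal(size=(200, 8))           # roughly uniform directions
collapsed = np.tile(np.ones((1, 8)), (200, 1))  # every point identical

u_spread = uniformity(spread)
u_collapsed = uniformity(collapsed)
```

Under this assumed definition, fully collapsed representations score 0 (the worst case), and representations spread over the hypersphere score below 0, which matches the caption's "lower is better" convention.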

Theorems & Definitions (1)

  • Definition 1: Prototype