PUATE: Efficient Average Treatment Effect Estimation from Treated (Positive) and Unlabeled Units

Masahiro Kato, Fumiaki Kozai, Ryo Inokuchi

TL;DR

PUATE tackles average treatment effect (ATE) estimation when treatment labels are only partially observed, framing the problem as learning from positive and unlabeled (PU) data. It derives semiparametric efficiency bounds and efficient influence functions for two data-generating processes (DGPs), the censoring setting and the case-control setting, and constructs estimators that attain these bounds. The estimators are doubly robust and use cross-fitting, which yields $\sqrt{n}$-consistency and asymptotic normality. The work connects PU learning with causal inference under missing data, and it is relevant to settings such as implicit feedback in recommender systems and incomplete treatment records in medicine, where only treated and unlabeled units are available.
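To make "doubly robust" and "cross-fitting" concrete, the following is a minimal sketch of a cross-fitted augmented inverse probability weighting (AIPW) estimator in the classical setting where every unit's treatment status is observed. It is not the PUATE estimator, which must additionally handle the unlabeled group. The simulated data, the binary covariate, the cell-mean nuisance estimators, and all function names (`simulate`, `fit_nuisances`, `aipw_crossfit`) are assumptions made for illustration only.

```python
import random
from statistics import mean


def simulate(n, seed=0):
    """Toy data with one binary covariate; the true ATE is 2.0 by construction."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = rng.randint(0, 1)
        e = 0.3 if x == 0 else 0.7           # true propensity score P(D=1 | X=x)
        d = 1 if rng.random() < e else 0     # observed treatment indicator
        y = 1.0 + x + 2.0 * d + rng.gauss(0.0, 1.0)
        data.append((x, d, y))
    return data


def fit_nuisances(train):
    """Cell-mean estimates of e(x) = P(D=1 | x) and mu_d(x) = E[Y | D=d, x]."""
    e_hat, mu_hat = {}, {}
    for x in (0, 1):
        cell = [(d, y) for (xi, d, y) in train if xi == x]
        e_hat[x] = mean(d for d, _ in cell)
        for d in (0, 1):
            mu_hat[(x, d)] = mean(y for di, y in cell if di == d)
    return e_hat, mu_hat


def aipw_crossfit(data, n_folds=2):
    """Fit nuisances on the other folds, evaluate AIPW scores on the held-out fold."""
    scores = []
    for k in range(n_folds):
        test = data[k::n_folds]
        train = [row for i, row in enumerate(data) if i % n_folds != k]
        e_hat, mu_hat = fit_nuisances(train)
        for x, d, y in test:
            m1, m0, e = mu_hat[(x, 1)], mu_hat[(x, 0)], e_hat[x]
            # Regression difference plus inverse-probability-weighted residuals.
            scores.append(m1 - m0 + d * (y - m1) / e - (1 - d) * (y - m0) / (1 - e))
    tau_hat = mean(scores)
    variance = sum((s - tau_hat) ** 2 for s in scores) / len(scores)
    std_err = (variance / len(scores)) ** 0.5
    return tau_hat, std_err


data = simulate(4000)
tau_hat, std_err = aipw_crossfit(data)
print(f"ATE estimate: {tau_hat:.3f} (standard error {std_err:.3f})")
```

The score averaged in `aipw_crossfit` stays consistent if either the propensity model or the outcome model is correct, which is the sense of double robustness. Evaluating each score with nuisances fit on different folds is what cross-fitting refers to.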

Abstract

The estimation of average treatment effects (ATEs), defined as the difference in expected outcomes between treatment and control groups, is a central topic in causal inference. This study develops semiparametric efficient estimators for ATE in a setting where only a treatment group and an unlabeled group, consisting of units whose treatment status is unknown, are observed. This scenario constitutes a variant of learning from positive and unlabeled data (PU learning) and can be viewed as a special case of ATE estimation with missing data. For this setting, we derive the semiparametric efficiency bounds, which characterize the lowest achievable asymptotic variance for regular estimators. We then construct semiparametric efficient ATE estimators that attain these bounds. Our results contribute to the literature on causal inference with missing data and weakly supervised learning.
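For orientation, the classical benchmark with fully observed treatment status can be written as follows. This is not the bound derived in the paper for the PU setting; it is the standard result under unconfoundedness and overlap, shown only to illustrate what an efficient influence function and an efficiency bound look like. The notation is assumed here and not taken from the paper: $D$ is the observed treatment indicator, $e(X) = P(D = 1 \mid X)$, $\mu_d(X) = \mathbb{E}[Y \mid D = d, X]$, and $\sigma_d^2(X) = \mathrm{Var}(Y \mid D = d, X)$.

```latex
\begin{align*}
\tau_0 &= \mathbb{E}[Y(1) - Y(0)], \\
\psi(X, D, Y) &= \mu_1(X) - \mu_0(X)
  + \frac{D\,\bigl(Y - \mu_1(X)\bigr)}{e(X)}
  - \frac{(1 - D)\,\bigl(Y - \mu_0(X)\bigr)}{1 - e(X)} - \tau_0, \\
V^{*} &= \mathbb{E}\bigl[\psi(X, D, Y)^2\bigr]
  = \mathbb{E}\!\left[\frac{\sigma_1^2(X)}{e(X)} + \frac{\sigma_0^2(X)}{1 - e(X)}
  + \bigl(\mu_1(X) - \mu_0(X) - \tau_0\bigr)^2\right].
\end{align*}
```

A regular estimator is semiparametric efficient in this benchmark when its asymptotic variance equals $V^{*}$. The paper establishes the analogous objects when only treated and unlabeled units are observed.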

Paper Structure

This paper contains 61 sections, 11 theorems, 87 equations, 4 figures, 4 tables, 2 algorithms.

Key Result

Lemma 4.1

If Assumptions asm:unconfoundedness_censoring--asm:overlap_censoring hold, then the efficient influence function is given by $\Psi^{\mathrm{cens}}(X, O, Y; \mu_{\mathrm{T}, 0}, \nu_0, \pi_0, g_0, \tau_0)$, where

Figures (4)

  • Figure 1: Illustration of the censoring and case-control settings.
  • Figure 2: Empirical distributions of ATE estimates.
  • Figure 3: Empirical distributions of ATE estimates.
  • Figure 5: Response surface A. Left: censoring setting; Right: case-control setting.

Theorems & Definitions (19)

  • Lemma 4.1
  • Theorem 4.2: Efficiency bound in the censoring setting
  • Remark: Estimation equation
  • Theorem 4.4: Consistency in the censoring setting
  • Theorem 4.7: Asymptotic normality in the censoring setting
  • Remark: Inefficiency of the Inverse Probability Weighting (IPW) estimator
  • Remark: Direct Method (DM) estimator
  • Corollary 4.9: Asymptotic normality in the censoring setting
  • Theorem 5.1: Asymptotic normality in the case-control setting (Informal)
  • Remark: Kennedy2020efficientnonparametric
  • ...and 9 more