Hierarchical and Density-based Causal Clustering

Kwangho Kim, Jisu Kim, Larry A. Wasserman, Edward H. Kennedy

TL;DR

This paper proposes simple plug-in estimators, implementable with off-the-shelf algorithms, that extend the causal clustering framework to hierarchical and density-based clustering. The goal is to identify subgroups with homogeneous treatment response and so support more nuanced and targeted interventions.

Abstract

Understanding treatment effect heterogeneity is vital for scientific and policy research. However, identifying and evaluating heterogeneous treatment effects pose significant challenges because the subgroup structure is typically unknown. Recently, a novel approach, causal k-means clustering, has emerged to assess heterogeneity of treatment effect by applying the k-means algorithm to unknown counterfactual regression functions. In this paper, we expand upon this framework by integrating hierarchical and density-based clustering algorithms. We propose plug-in estimators that are simple and readily implementable using off-the-shelf algorithms. Unlike k-means clustering, which requires the margin condition, our proposed estimators do not rely on strong structural assumptions on the outcome process. We go on to study their rate of convergence and show that, under minimal regularity conditions, the additional cost of causal clustering is essentially the estimation error of the outcome regression functions. Our findings significantly extend the capabilities of the causal clustering framework, advancing methods for identifying subgroups with homogeneous treatment response and thereby facilitating more nuanced and targeted interventions. The proposed methods also open up new avenues for clustering with generic pseudo-outcomes. We explore finite-sample properties via simulation and illustrate the proposed methods on voting and employment projection datasets.
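The plug-in idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a randomized binary treatment (so that regressing the outcome on covariates within each arm targets the counterfactual means), a simulated dataset with two subgroups, a k-nearest-neighbour regressor as a stand-in nuisance estimator, and a naive single-linkage routine with the L1 distance. All function names and simulation settings below are hypothetical.

```python
import random
from statistics import mean

rng = random.Random(0)  # fixed seed so the sketch is reproducible


def simulate(n):
    """Hypothetical data: two subgroups separated by a gap in the covariate x."""
    rows = []
    for _ in range(n):
        high = rng.random() < 0.5
        x = rng.uniform(0.6, 1.0) if high else rng.uniform(0.0, 0.4)
        a = rng.randint(0, 1)  # randomized treatment (an assumption of this sketch)
        mu0, mu1 = (1.0, 3.0) if high else (0.0, 0.0)
        y = (mu1 if a else mu0) + rng.gauss(0, 0.1)
        rows.append((x, a, y))
    return rows


def knn_regress(xs, ys, x0, k=5):
    """k-nearest-neighbour estimate of E[Y | X = x0] (stand-in nuisance estimator)."""
    nearest = sorted(range(len(xs)), key=lambda i: abs(xs[i] - x0))[:k]
    return mean(ys[i] for i in nearest)


def plug_in_means(train, query_xs):
    """Estimate the pair (mu_0(x), mu_1(x)) at each query point, arm by arm."""
    fitted = []
    for arm in (0, 1):
        xs = [x for x, a, _ in train if a == arm]
        ys = [y for _, a, y in train if a == arm]
        fitted.append([knn_regress(xs, ys, x0) for x0 in query_xs])
    return list(zip(*fitted))


def single_linkage(points, n_clusters):
    """Naive agglomerative clustering: single linkage induced by the L1 distance."""
    def dist(i, j):
        return sum(abs(u - v) for u, v in zip(points[i], points[j]))

    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        pairs = ((p, q) for p in range(len(clusters))
                 for q in range(p + 1, len(clusters)))
        # Merge the two clusters whose closest members are nearest to each other.
        p, q = min(pairs, key=lambda pq: min(
            dist(i, j) for i in clusters[pq[0]] for j in clusters[pq[1]]))
        clusters[p].extend(clusters.pop(q))
    labels = [0] * len(points)
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels


data = simulate(360)
train, held_out = data[:300], data[300:]    # sample splitting
query_xs = [x for x, _, _ in held_out]
mu_hat = plug_in_means(train, query_xs)     # estimated counterfactual mean vectors
labels = single_linkage(mu_hat, n_clusters=2)
truth = [int(x > 0.5) for x in query_xs]    # true subgroup, known only in simulation
```

The clustering step operates on the estimated vectors `mu_hat`, never on the covariates directly, which is what distinguishes causal clustering from clustering the raw data. In practice the naive routine would typically be replaced by an off-the-shelf implementation such as SciPy's `scipy.cluster.hierarchy.linkage` or scikit-learn's `DBSCAN`.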

Paper Structure

This paper contains 14 sections, 14 theorems, 80 equations, 6 figures.

Key Result

Proposition 3.1

Let $D$ denote the single, average, or complete linkage between sets of points, induced by the distance function such that $d(x,y) \lesssim \left\Vert x - y \right\Vert_{1}$. Then under Assumption A1 (sample splitting, entropy), for any two sets $S_1, S_2$ in $\{\mu_{(i)}\}$ and their estima

Figures (6)

  • Figure 1: Two instances in which the three clustering techniques result in distinct subgroups for the projected sample. The grey dotted diagonal line indicates no treatment effects.
  • Figure 2: (a), (b): The y-axis represents classification error from hierarchical (causal) clustering, where we fix $\nu=0.01, 0.1$ and vary $\alpha$. (c), (d): The y-axis represents the average of $H(\widehat{L}_{h,t},L_{h,t})$ from density-based (causal) clustering, where we fix $t=0.05, 0.1$ and vary $n$.
  • Figure 3: (a) Histogram of the true CATE in the test set. In the original study nie2017quasi, individuals with zero treatment effects are assigned to the label $L=0$. (b) The result of density-based causal clustering. Units in Cluster C1 appear to have higher baseline risk ($\mu_0$). (c) We observe that points in Clusters C1 and C2 are more concentrated around the upper right area (larger $\mu_0, \mu_1$) and the lower left area (smaller $\mu_0, \mu_1$), respectively.
  • Figure 4: The estimated causal clusters on two principal-component hyperplanes with axes representing the first and second, second and third principal components in the conditional counterfactual mean vector space, respectively.
  • Figure 5: The density plots of the pairwise CATE of six other education levels relative to the doctoral degree across clusters. We observe a substantial degree of effect heterogeneity. The red dashed vertical lines denote the zero CATE.
  • ...and 1 more figures

Theorems & Definitions (29)

  • Proposition 3.1
  • Definition 3.1: $(\alpha, \nu)$-Good Neighborhood Property for Distribution
  • Theorem 3.2
  • Definition 4.1: Hausdorff Distance
  • Definition 4.2: Level Set Stability
  • Theorem 4.1
  • Definition A.1: Property 3 in balcan2014robust
  • Theorem A.1: Theorem 11 in balcan2014robust
  • Lemma B.1
  • Proof of Lemma \ref{lem:xhatprime_prime}
  • ...and 19 more