Bayesian Supervised Causal Clustering

Luwei Wang, Nazir Lone, Sohan Seth

TL;DR

This work proposes Bayesian Supervised Causal Clustering (BSCC), which uses the treatment effect as the outcome that guides the clustering process. BSCC identifies homogeneous subgroups of individuals who are similar in both their covariate profiles and their treatment effects.

Abstract

Finding patient subgroups with similar characteristics is crucial for personalized decision-making in disciplines such as healthcare and policy evaluation. While most existing approaches rely on unsupervised clustering methods, there is a growing trend toward supervised clustering methods that identify operationalizable subgroups in the context of a specific outcome of interest. We propose Bayesian Supervised Causal Clustering (BSCC), which uses the treatment effect as the outcome that guides the clustering process. BSCC identifies homogeneous subgroups of individuals who are similar in both their covariate profiles and their treatment effects. We evaluate BSCC on simulated datasets as well as a real-world dataset from the third International Stroke Trial to assess the practical usefulness of the framework.
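To make the notion of a subgroup-level treatment effect concrete, the following is a minimal sketch and not the BSCC model itself. It assumes that cluster labels are already given and that treatment was randomized, so the difference in mean outcome between treated and control individuals within a cluster is a reasonable estimate of that cluster's average treatment effect. The function name `cluster_treatment_effects` and the toy data are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from statistics import mean


def cluster_treatment_effects(labels, treated, outcomes):
    """Difference in mean outcome (treated minus control) within each cluster.

    Assumption: treatment is randomized, so a within-cluster difference in
    means is an unbiased estimate of the cluster's average treatment effect.
    """
    # For every cluster label, collect outcomes separately per treatment arm.
    arms_by_cluster = defaultdict(lambda: {True: [], False: []})
    for z, t, y in zip(labels, treated, outcomes):
        arms_by_cluster[z][bool(t)].append(y)

    effects = {}
    for z, arms in arms_by_cluster.items():
        # Skip clusters that lack either a treated or a control individual.
        if arms[True] and arms[False]:
            effects[z] = mean(arms[True]) - mean(arms[False])
    return effects


# Hypothetical toy data: two clusters with opposite treatment effects.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
treated = [1, 1, 0, 0, 1, 1, 0, 0]
outcomes = [5.5, 4.5, 0.0, 0.0, -4.0, -6.0, 1.0, -1.0]

effects = cluster_treatment_effects(labels, treated, outcomes)
print(effects)
```

In this toy example the two clusters have effects of opposite sign, which is the kind of heterogeneity a clustering on covariates alone has no reason to separate.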

Paper Structure (25 sections, 11 equations, 14 figures, 16 tables)


Figures (14)

  • Figure 1: (a) Conceptual difference between (left) unsupervised clustering with a Gaussian Mixture Model (GMM) and (right) supervised clustering with BSCC. GMM fails to recover the subgroups relevant to treatment, whereas BSCC recovers them. (b) BSCC clusters samples based on both covariates and treatment effect in a supervised clustering set-up, whereas causal clustering and supervised learning based approaches cluster on potential outcomes or treatment effect only.
  • Figure 2: Plate diagram of BSCC with feature selection.
  • Figure 3: UMAP projection of the simulated dataset in simulation hte. Cluster assignments are given by (a) the ground truth with constant treatment effects $(0.5,5,-5,0,0)$, (b) GMM, (c) SGMM, (d) MOB, (e) BSCC, (f) CF, (g) BART, (h) DR-learner, (i) R-learner and (j) CC.
  • Figure 4: BSCC on the IST-3 dataset: (a) the estimated cluster means and (b) the corresponding ORs with $95\%$ CIs for the training set (in log scale), and (c) the empirical cluster means and (d) the corresponding ORs with $95\%$ CIs for the test set. Each radar chart axis that represents a continuous covariate is transformed back to its original scale to enhance interpretability. (e) Cluster-specific feature importance.
  • Figure B.1: simulation fs: (a) simulated dataset and (b) cluster-specific feature importance learned by BSCC.
  • ...and 9 more figures