CausalPrism: A Visual Analytics Approach for Subgroup-based Causal Heterogeneity Exploration

Jiehui Zhou, Xumeng Wang, Kam-Kwai Wong, Wei Zhang, Xingyu Liu, Juntian Zhang, Minfeng Zhu, Wei Chen

TL;DR

This work tackles heterogeneous treatment effect analysis in observational data by formulating causal subgroup discovery as a constrained multi-objective optimization problem and solving it with a heuristic genetic algorithm to yield Pareto-front subgroups described by interpretable rules. It then delivers a visual analytics prototype, CausalPrism, with three coordinated views for subgroup discovery, covariate projection, and treatment-effect validation to support interactive exploration, ranking, and explanation. Quantitative experiments show improved precision and interpretability over state-of-the-art baselines, while case studies and expert interviews demonstrate practical usability and trust in the results. The approach enables human-in-the-loop, transparent subgroup analysis with potential impact in precision medicine, marketing, and policy evaluation on observational data.

Abstract

In causal inference, estimating Heterogeneous Treatment Effects (HTEs) from observational data is critical for understanding how different subgroups respond to treatments, with broad applications such as precision medicine and targeted advertising. However, existing work on HTE, subgroup discovery, and causal visualization does not adequately address two challenges. First, the sheer number of potential subgroups and the need to balance multiple objectives (e.g., high effects and low variances) pose a considerable analytical challenge. Second, effective subgroup analysis has to follow the analysis goal specified by users and provide causal results with verification. To this end, we propose a visual analytics approach for subgroup-based causal heterogeneity exploration. Specifically, we first formulate causal subgroup discovery as a constrained multi-objective optimization problem and adopt a heuristic genetic algorithm to learn the Pareto front of optimal subgroups described by interpretable rules. Building on this model, we develop a prototype system, CausalPrism, that incorporates tabular visualization, multi-attribute rankings, and uncertainty plots to support users in interactively exploring and sorting subgroups and explaining treatment effects. Quantitative experiments validate that the proposed model can efficiently mine causal subgroups that outperform those found by state-of-the-art HTE and subgroup discovery methods, and case studies and expert interviews demonstrate the effectiveness and usability of the system. Code is available at https://osf.io/jaqmf/?view_only=ac9575209945476b955bf829c85196e9.
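
To make the notion of a Pareto front of subgroups concrete, the following minimal sketch filters a handful of candidate subgroups down to the non-dominated ones, using the two example objectives named in the abstract (high effect, low variance). It is an illustration only, not the paper's genetic algorithm: the subgroup rules, effect values, and variances are made up, the function names are hypothetical, and the constraints of the full problem are omitted.

```python
from typing import List, Tuple

# A hypothetical candidate subgroup: (rule, estimated effect, effect variance).
Subgroup = Tuple[str, float, float]


def objectives(s: Subgroup) -> Tuple[float, float]:
    """Map a subgroup to objectives that are both minimized."""
    _, effect, variance = s
    # Negate the effect so that a larger effect means a smaller objective value.
    return (-effect, variance)


def dominates(a: Subgroup, b: Subgroup) -> bool:
    """True if a is no worse than b on every objective and strictly better on one."""
    fa, fb = objectives(a), objectives(b)
    no_worse = all(x <= y for x, y in zip(fa, fb))
    strictly_better = any(x < y for x, y in zip(fa, fb))
    return no_worse and strictly_better


def pareto_front(subgroups: List[Subgroup]) -> List[Subgroup]:
    """Keep the subgroups that no other subgroup dominates."""
    return [s for s in subgroups if not any(dominates(t, s) for t in subgroups)]


# Made-up candidates for illustration.
candidates: List[Subgroup] = [
    ("age < 30", 0.8, 0.50),
    ("age in [30, 50)", 0.3, 0.10),
    ("age >= 50", 0.9, 0.10),
    ("income > 50k", 0.9, 0.40),
    ("smoker", 1.2, 0.30),
]

front = pareto_front(candidates)
for rule, effect, variance in front:
    print(f"{rule}: effect={effect}, variance={variance}")
```

Here "age >= 50" and "smoker" survive: neither beats the other on both objectives, so choosing between them is a preference decision, which is the kind of trade-off the interactive ranking is meant to support.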

Paper Structure (31 sections, 7 equations, 6 figures, 2 tables)

Figures (6)

  • Figure 1: An illustrative toy example. There is only one covariate, and the change in the outcome between the treatment and control groups can be informally thought of as the treatment effect. Subgroup 3 has a high effect and low variance, which makes it preferable to Subgroups 1 and 2.
  • Figure 2: A four-step workflow for subgroup-based causal heterogeneity exploration. (A) The model automatically mines subgroups with significant treatment effects from observational data. (B) Subgroups can be explored through tabular and multi-attribute visualizations. (C) Users can interactively analyze new subgroup hypotheses and achieve multi-criteria decision-making based on their preferences. (D) Effect interpretation based on matched units simulates the user's familiar A/B testing, aiding in result validation.
  • Figure 3: Illustration of the proposed model to discover optimal causal subgroups. (A) Schematic diagram of the Pareto front, where circles represent feasible subgroups and small objective values are preferred over large ones. The subgroups in Front 1 are not dominated by any other subgroup, and the subgroups in Front 2 are dominated only by those in Front 1. (B) Illustration of the iterative heuristic algorithm for solving the multi-objective optimization problem. In each iteration, offspring subgroups are generated from the parent subgroups, and survivors are selected by comparing first the front level and then the crowding distance.
  • Figure 4: Illustration of propensity score matching. Colors represent different covariates, and drug icons indicate treatment. Matching reduces confounding bias by finding comparable treatment and control units.
  • Figure 5: Description of a subgroup with good credit in Case 1. (A) The explanation of treatment effects shows that the subgroup has a low effect and low variance. (B) The histogram of propensity scores is balanced, and most matched pairs have a zero ITE.
  • ...and 1 more figure
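
The propensity score matching described in the Figure 4 caption can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure used in CausalPrism: propensity scores are assumed to be already estimated, matching is greedy 1:1 nearest-neighbor without replacement, and the `caliper` threshold, the function names, and all data values are hypothetical.

```python
from typing import List, Optional, Tuple

# A unit is (propensity score, observed outcome); all values here are made up.
Unit = Tuple[float, float]


def match_pairs(treated: List[Unit], control: List[Unit],
                caliper: float = 0.05) -> List[Tuple[int, int]]:
    """Greedy 1:1 nearest-neighbor matching on the propensity score.

    Each control unit is used at most once. A treated unit stays unmatched
    when its closest remaining control is farther away than the caliper.
    """
    available = list(range(len(control)))
    pairs: List[Tuple[int, int]] = []
    for t_idx, (t_score, _) in enumerate(treated):
        if not available:
            break
        best = min(available, key=lambda c: abs(control[c][0] - t_score))
        if abs(control[best][0] - t_score) <= caliper:
            pairs.append((t_idx, best))
            available.remove(best)
    return pairs


def matched_effect(treated: List[Unit], control: List[Unit],
                   pairs: List[Tuple[int, int]]) -> Optional[float]:
    """Mean outcome difference over matched pairs, or None if nothing matched."""
    if not pairs:
        return None
    diffs = [treated[t][1] - control[c][1] for t, c in pairs]
    return sum(diffs) / len(diffs)


treated_units: List[Unit] = [(0.30, 5.0), (0.60, 7.0), (0.90, 9.0)]
control_units: List[Unit] = [(0.31, 4.0), (0.58, 6.5), (0.20, 3.0)]

pairs = match_pairs(treated_units, control_units)
effect = matched_effect(treated_units, control_units, pairs)
print(pairs, effect)
```

In this toy data the third treated unit has no comparable control unit and is left unmatched, so the estimate rests only on the two comparable pairs. The per-pair outcome differences are the quantities a matched-pair view such as the one in Figure 5 would summarize.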