Assessment of evidence against homogeneity in exhaustive subgroup treatment effect plots
Björn Bornkamp, Jiarui Lu, Frank Bretz
TL;DR
This paper addresses how to formally assess evidence against homogeneity in exhaustive subgroup treatment effect plots, a visualization that shows treatment effects across many subgroups with varying sizes. It introduces a Doubly Robust (DR) learner–based approach to generate pseudo-outcomes and compute divergence-based p-values and gamma-homogeneity regions, enabling a quantitative assessment of compatibility with homogeneous effects. Through a cardiovascular case study and extensive simulations, the authors demonstrate well-calibrated inference and improved performance over simple mean-difference approaches. The method yields interpretable, graded evidence (via p-values and S-values) and can be integrated into interactive workflows to support decision-making in exploratory subgroup analyses.
Abstract
Exhaustive subgroup treatment effect plots are constructed by displaying all subgroup treatment effects of interest against subgroup sample size, providing a useful overview of the observed treatment effect heterogeneity in a clinical trial. As in any exploratory subgroup analysis, however, the observed estimates suffer from small sample sizes and multiplicity issues. To facilitate more interpretable exploratory assessments, this paper introduces a computationally efficient method to generate homogeneity regions within exhaustive subgroup treatment effect plots. Pseudo-outcomes from the Doubly Robust (DR) learner are used to estimate subgroup effects and to derive reference distributions that quantify how surprising the observed heterogeneity is under a homogeneous effects model. Explicit formulas are derived for the homogeneity region, and different methods for calculating the critical values are compared. The method is illustrated with a cardiovascular trial and evaluated via simulation, showing well-calibrated inference and improved performance over standard approaches using simple differences of observed group means.
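
To make the pseudo-outcome idea concrete, the following is a minimal sketch, assuming a binary treatment with a known randomization probability. It computes doubly robust (augmented inverse probability weighted) pseudo-outcomes, averages them within a subgroup to obtain a subgroup effect estimate, and converts a p-value to an S-value using the standard transformation -log2(p). The function names, the simulated data, and the plain least-squares outcome models are illustrative assumptions and not the authors' implementation; cross-fitting, the divergence-based p-values, and the homogeneity regions are not shown.

```python
import numpy as np


def dr_pseudo_outcomes(y, a, mu0, mu1, pi):
    """Doubly robust (AIPW) pseudo-outcomes for a binary treatment.

    y: outcomes; a: 0/1 treatment indicator; mu0, mu1: predicted outcomes
    under control and treatment; pi: probability of receiving treatment
    (assumed known by design, as in a randomized trial).
    """
    y, a, mu0, mu1 = (np.asarray(v, dtype=float) for v in (y, a, mu0, mu1))
    return (mu1 - mu0
            + a * (y - mu1) / pi
            - (1.0 - a) * (y - mu0) / (1.0 - pi))


def subgroup_effect(phi, in_subgroup):
    """Subgroup effect estimate: mean pseudo-outcome over subgroup members."""
    mask = np.asarray(in_subgroup, dtype=bool)
    return float(np.asarray(phi, dtype=float)[mask].mean())


def s_value(p):
    """S-value (surprisal) in bits: -log2(p)."""
    return float(-np.log2(p))


# Hypothetical simulated trial with a homogeneous treatment effect of 0.3.
rng = np.random.default_rng(0)
n = 400
x = rng.normal(size=n)
a = rng.binomial(1, 0.5, size=n)
y = 1.0 + 0.5 * x + 0.3 * a + rng.normal(size=n)

# Simplification: arm-specific linear outcome models fitted on all data
# (no cross-fitting), used only to illustrate the pseudo-outcome formula.
design = np.column_stack([np.ones(n), x])
beta0, *_ = np.linalg.lstsq(design[a == 0], y[a == 0], rcond=None)
beta1, *_ = np.linalg.lstsq(design[a == 1], y[a == 1], rcond=None)
phi = dr_pseudo_outcomes(y, a, design @ beta0, design @ beta1, pi=0.5)

print("overall effect estimate:", phi.mean())
print("subgroup x > 0 estimate:", subgroup_effect(phi, x > 0))
print("S-value for p = 0.05:", s_value(0.05))
```

Because each patient contributes a single pseudo-outcome, the estimate for any subgroup reduces to an average over its members, which is what keeps an exhaustive sweep over many subgroups computationally cheap in this sketch.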
