Let the Experts Speak: Improving Survival Prediction & Calibration via Mixture-of-Experts Heads

Todd Morrill; Aahlad Puli; Murad Megjhani; Soojin Park; Richard Zemel

TL;DR

This study addresses survival analysis under right-censoring, aiming to improve calibration and predictive accuracy while still clustering patients. It introduces three discrete-time deep mixture-of-experts heads (Fixed MoE, Adjustable MoE, and Personalized MoE), each trained with a multitask logistic regression loss over a discrete time grid of $m=100$ bins. Across Survival MNIST (synthetic) and the real-world SUPPORT2 and Sepsis datasets, the Personalized MoE achieves the best calibration along with competitive discrimination and Brier scores, and the gains grow with expert expressivity. Routing analyses surface clinically meaningful patient groups, results are robust to the number of experts, and the approach offers a practical framework for reasoning by analogy to similar patients in clinical decision support systems.

Abstract

Deep mixture-of-experts models have attracted considerable attention for survival analysis problems, particularly for their ability to cluster similar patients together. In practice, grouping often comes at the expense of key metrics such as calibration error and predictive accuracy. This is due to the restrictive inductive bias that mixture-of-experts imposes: predictions for an individual patient must resemble predictions for the group to which that patient is assigned. Might we be able to discover patient group structure, where it exists, while improving calibration and predictive accuracy? In this work, we introduce several discrete-time deep mixture-of-experts (MoE)-based architectures for survival analysis problems, one of which achieves all desiderata: clustering, calibration, and predictive accuracy. We show that a key differentiator among these MoEs is how expressive their experts are. We find that more expressive experts that tailor predictions per patient outperform experts that rely on fixed group prototypes.
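
To make the "fixed group prototype" idea concrete, the following is a minimal NumPy sketch of a discrete-time mixture-of-experts head with a right-censored likelihood. It is an illustration under stated assumptions, not the authors' implementation: the names `mixture_pmf` and `censored_nll` are hypothetical, the loss is a generic discrete-time negative log-likelihood rather than the multitask logistic regression loss named in the TL;DR, and a patient censored in bin `c` is assumed to have an event in bin `c` or later.

```python
import numpy as np

def softmax(z, axis=-1):
    """Numerically stable softmax along the given axis."""
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def mixture_pmf(gate_logits, expert_logits):
    """Mix K shared expert distributions per patient.

    gate_logits:   (n, K) routing scores for each patient (hypothetical router output).
    expert_logits: (K, m) one fixed prototype per expert over m time bins.
    Returns an (n, m) array; each row is a probability mass function over bins.
    """
    gates = softmax(gate_logits)          # (n, K), rows sum to 1
    expert_pmfs = softmax(expert_logits)  # (K, m), rows sum to 1
    return gates @ expert_pmfs            # convex combination of prototypes

def censored_nll(pmf, time_bin, event):
    """Mean negative log-likelihood under right-censoring.

    event == 1: the event was observed in time_bin, so use pmf[time_bin].
    event == 0: censored in time_bin, so use P(T >= time_bin)
                (assumed convention; others shift the tail by one bin).
    """
    rows = np.arange(pmf.shape[0])
    # tail[i, t] = sum of pmf[i, t:], i.e. P(T >= t)
    tail = np.cumsum(pmf[:, ::-1], axis=1)[:, ::-1]
    lik = np.where(event == 1, pmf[rows, time_bin], tail[rows, time_bin])
    return -np.mean(np.log(np.clip(lik, 1e-12, None)))

# Toy usage with random scores: n=4 patients, K=3 experts, m=100 bins.
rng = np.random.default_rng(0)
n, K, m = 4, 3, 100
pmf = mixture_pmf(rng.normal(size=(n, K)), rng.normal(size=(K, m)))
time_bin = np.array([10, 50, 99, 0])
event = np.array([1, 0, 1, 0])
loss = censored_nll(pmf, time_bin, event)
```

Because every patient's prediction here is a convex combination of the same `K` prototype distributions, the sketch exhibits the restrictive inductive bias the abstract describes. A more expressive variant, on one reading of the abstract, would let the expert distributions themselves depend on the patient's features instead of being a shared `(K, m)` table.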
