Longitudinal Risk Prediction in Mammography with Privileged History Distillation

Banafsheh Karimian; Alexis Guichemerre; Soufiane Belharbi; Natacha Gillet; Luke McCaffrey; Mohammadhadi Shateri; Eric Granger

Abstract

Breast cancer remains a leading cause of cancer-related mortality worldwide. Longitudinal mammography risk prediction models improve multi-year breast cancer risk prediction by exploiting prior screening exams. However, in real-world clinical practice, longitudinal histories are often incomplete, irregular, or unavailable due to missed screenings, first-time examinations, heterogeneous acquisition schedules, or archival constraints. The absence of prior exams degrades the performance of longitudinal risk models and limits their practical applicability. While substantial longitudinal history is available during training, prior exams are commonly absent at test time. In this paper, we address missing history at inference time and propose a longitudinal risk prediction method that uses mammography history as privileged information during training and distills its prognostic value into a student model that only requires the current exam at inference time. The key idea is a privileged multi-teacher distillation scheme with horizon-specific teachers: each teacher is trained on the full longitudinal history to specialize in one prediction horizon, while the student receives only a reconstructed history derived from the current exam. This allows the student to inherit horizon-dependent longitudinal risk cues without requiring prior screening exams at deployment. We validate our Privileged History Distillation (PHD) method on CSAW-CC, a large longitudinal mammography dataset with multi-year cancer outcomes, comparing full-history and no-history baselines to their distilled counterparts. Measured by time-dependent AUC across horizons, PHD markedly improves long-horizon prediction over no-history models and is comparable to full-history models, while using only the current exam at inference time.
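To make the horizon-specific supervision concrete, here is a minimal sketch of how a per-horizon distillation term could be written. It is an illustration under stated assumptions, not the authors' implementation: the function names, the sigmoid plus soft-target binary cross-entropy form, and the `temperature` parameter are all hypothetical, since the abstract does not specify the loss. The only idea taken from the text is that the student's prediction for horizon k is supervised by the teacher that specializes in horizon k.

```python
import math


def sigmoid(z):
    """Logistic function mapping a logit to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def soft_bce(p_teacher, p_student, eps=1e-7):
    """Binary cross-entropy of the student probability against a soft teacher target."""
    p_student = min(max(p_student, eps), 1.0 - eps)  # avoid log(0)
    return -(p_teacher * math.log(p_student)
             + (1.0 - p_teacher) * math.log(1.0 - p_student))


def per_horizon_distillation_loss(student_logits, teacher_logits, temperature=1.0):
    """Average soft-target loss over prediction horizons (assumed form).

    student_logits[k]: student logit for horizon k (current exam only).
    teacher_logits[k]: logit for horizon k from the teacher specialized in
                       horizon k (trained with full history, kept frozen).
    """
    if len(student_logits) != len(teacher_logits):
        raise ValueError("one teacher logit is expected per student horizon")
    losses = [
        soft_bce(sigmoid(t / temperature), sigmoid(s / temperature))
        for s, t in zip(student_logits, teacher_logits)
    ]
    return sum(losses) / len(losses)


# Hypothetical logits for five horizons (years 1 to 5).
teacher = [-3.0, -2.5, -2.0, -1.2, -0.8]
student = [-2.8, -2.6, -1.5, -0.2, 0.4]
loss = per_horizon_distillation_loss(student, teacher)
```

In this form the loss is smallest when the student reproduces each teacher's probability for its own horizon, so a student that drifts at long horizons is penalized by the long-horizon teachers specifically.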

Paper Structure (11 sections, 5 equations, 3 figures, 1 table)

Introduction
Related Work
Proposed Privileged History Distillation Method
Image and Exam Representation Learning:
Historical Embedding Distillation:
Longitudinal Aggregation and Risk Prediction:
Per-Horizon Logit Distillation:
Experimental Validation
Experimental Methodology:
Results and Discussion:
Conclusion

Figures (3)

Figure 1: Partial AUC at 10% FPR (pAUC@10%) for LoMaR and VMRA at 4- and 5-year horizons as a function of available screening history.
Figure 2: Proposed PHD method for longitudinal risk prediction in mammography. Visit embeddings are extracted from each exam (mammogram), and missing historical embeddings are predicted from the current exam. The generated sequence is aggregated by a longitudinal model and passed to an additive hazard layer for multi-year risk prediction, with a frozen true-history multi-teacher pathway providing per-horizon distillation.
Figure 3: a) Ablation studies showing that multi-teacher supervision (Student (5 teacher)) yields the strongest gains, particularly at the 5-year horizon. b) ROC curves for VMRA and LoMaR under varying history availability, compared with the proposed distilled-history models. Although VMRA and LoMaR performance increases with more prior exams, VMRA+PHD and LoMaR+PHD achieve the strongest sensitivity in the low-FPR regime despite operating without history, matching or exceeding the full-history models.
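
The captions above report performance in the low false-positive regime (pAUC@10% in Figure 1, low-FPR sensitivity in Figure 3). As a reading aid, the following sketch shows one common way to compute a partial AUC up to a maximum FPR. The function names are hypothetical, and the normalization (dividing the area by `max_fpr` so that a perfect ranking scores 1.0) is an assumption; the captions do not state which convention the paper uses.

```python
def roc_points(labels, scores):
    """Return ROC points (fpr, tpr), one per distinct score threshold."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative label")
    # Sort by decreasing score so each step lowers the decision threshold.
    pairs = sorted(zip(scores, labels), key=lambda p: -p[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(pairs):
        threshold = pairs[i][0]
        # Consume all examples tied at this score before emitting a point.
        while i < len(pairs) and pairs[i][0] == threshold:
            if pairs[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / n_neg, tp / n_pos))
    return points


def partial_auc(labels, scores, max_fpr=0.10):
    """Trapezoidal area under the ROC curve for FPR in [0, max_fpr],
    divided by max_fpr (assumed normalization)."""
    points = roc_points(labels, scores)
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x0 >= max_fpr:
            break
        if x1 > max_fpr:
            # Linearly interpolate the TPR at the FPR cut-off.
            y1 = y0 + (y1 - y0) * (max_fpr - x0) / (x1 - x0)
            x1 = max_fpr
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area / max_fpr


# Toy example: label 1 marks an exam followed by cancer within the horizon.
toy_labels = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05, 0.02, 0.01]
toy_pauc = partial_auc(toy_labels, toy_scores, max_fpr=0.10)
```

Because only the leftmost slice of the ROC curve counts, two models with similar full AUC can differ noticeably on this metric, which is why it suits screening settings where few false positives are tolerated.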
