Longitudinal Mammogram Risk Prediction

Batuhan K. Karaman, Katerina Dodelzon, Gozde B. Akar, Mert R. Sabuncu

TL;DR

This work addresses the prediction of future breast cancer risk by exploiting longitudinal mammographic history. It introduces LoMaR, a transformer-based model that aggregates per-visit image embeddings and cross-visit temporal information to produce a probabilistic, year-by-year cancer risk using an additive-hazard survival framework; the key equation is $P(t_{\text{cancer}}=k \mid m)=\sigma\left(B(m)+\sum_{i=1}^{k}H_i(m)\right)$. On a large Karolinska-derived dataset, LoMaR achieves state-of-the-art performance even with only the present mammogram and shows substantial gains when up to four prior annual mammograms are available, particularly at longer-term horizons. The approach demonstrates the predictive value of longitudinal imaging data and improves lesion localization through history-aware predictions, with results robust to incomplete histories and potential biases. The work provides code and model weights to promote reproducibility and practical deployment in screening programs.
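To make the key equation concrete, the following is a minimal sketch of how an additive-hazard output turns a base term and per-year increments into year-by-year risks. It is not taken from the LoMaR code: the function names and the numeric values are hypothetical, and reading $B(m)$ as a base score and $H_i(m)$ as the increment for year $i$ is an interpretation of the formula above. If the increments are non-negative (an assumption here; the summary does not say how this is enforced), the predicted risk cannot decrease as the horizon $k$ grows.

```python
import math
from typing import List


def sigmoid(x: float) -> float:
    """Logistic function, the sigma in the equation above."""
    return 1.0 / (1.0 + math.exp(-x))


def yearly_risk(base: float, hazards: List[float]) -> List[float]:
    """Return sigma(B + sum_{i=1..k} H_i) for k = 1..len(hazards).

    `base` plays the role of B(m) and `hazards` the role of H_1(m)..H_K(m);
    in a real model both would be outputs computed from the mammograms m.
    """
    risks = []
    running = base
    for h in hazards:
        running += h  # accumulate the hazard increments up to year k
        risks.append(sigmoid(running))
    return risks


# Hypothetical values chosen only for illustration.
base = -4.0
hazards = [0.5, 0.3, 0.2, 0.4, 0.1]
risks = yearly_risk(base, hazards)

for k, r in enumerate(risks, start=1):
    print(f"year {k}: risk = {r:.4f}")
```

Because each year reuses the running sum from the previous year, the model needs only one base output and one increment per horizon, rather than an independent classifier for every year.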

Abstract

Breast cancer is one of the leading causes of mortality among women worldwide. Early detection and risk assessment play a crucial role in improving survival rates. Therefore, annual or biennial mammograms are often recommended for screening in high-risk groups. Mammograms are typically interpreted by expert radiologists based on the Breast Imaging Reporting and Data System (BI-RADS), which provides a uniform way to describe findings and categorizes them to indicate the level of concern for breast cancer. Recently, machine learning (ML) and computational approaches have been developed to automate and improve the interpretation of mammograms. However, both BI-RADS and the ML-based methods focus on the analysis of data from the present and sometimes the most recent prior visit. While it is clear that temporal changes in image features of the longitudinal scans should carry value for quantifying breast cancer risk, no prior work has conducted a systematic study of this. In this paper, we extend a state-of-the-art ML model to ingest an arbitrary number of longitudinal mammograms and predict future breast cancer risk. On a large-scale dataset, we demonstrate that our model, LoMaR, achieves state-of-the-art performance when presented with only the present mammogram. Furthermore, we use LoMaR to characterize the predictive value of prior visits. Our results show that longer histories (e.g., up to four prior annual mammograms) can significantly boost the accuracy of predicting future breast cancer risk, particularly beyond the short term. Our code and model weights are available at https://github.com/batuhankmkaraman/LoMaR.

Paper Structure

This paper contains 18 sections, 2 equations, 2 figures, and 3 tables.

Figures (2)

  • Figure 1: Schematic representation of our Longitudinal Mammogram Risk model (LoMaR), where year 0 represents the current point in time.
  • Figure 2: Grad-CAM visualization of LoMaR with mammograms from five representative test subjects with (third row) and without (second row) prior mammograms.