PersonalizedUS: Interpretable Breast Cancer Risk Assessment with Local Coverage Uncertainty Quantification

Alek Fröhlich, Thiago Ramos, Gustavo Cabello, Isabela Buzatto, Rafael Izbicki, Daniel Tiezzi

TL;DR

This work tackles the challenge of accurately assessing breast lesion malignancy from ultrasound while reducing unnecessary biopsies and maintaining clinician trust. It presents PersonalizedUS, an interpretable risk predictor based on regularized logistic regression augmented with Locart-based local conformal prediction to provide personalized uncertainty and conditional coverage across lesion subgroups. The approach identifies meaningful lesion subgroups via a learned partition, yielding reliable, locally calibrated prediction sets and enabling substantial biopsy reductions (up to ~65% for BI-RADS 4a/4b lesions) with minimal to no missed cancers. Deployed as a web application across four centers, the method demonstrates strong discrimination (AUC ~0.958, AUPRC ~0.960) and practical clinical utility, with open data and code planned for release to facilitate reproducibility and broader adoption.

Abstract

Correctly assessing the malignancy of breast lesions identified during ultrasound examinations is crucial for effective clinical decision-making. However, the current "gold standard" relies on manual BI-RADS scoring by clinicians, often leading to unnecessary biopsies and a significant mental health burden on patients and their families. In this paper, we introduce PersonalizedUS, an interpretable machine learning system that leverages recent advances in conformal prediction to provide precise and personalized risk estimates with local coverage guarantees, achieving sensitivity, specificity, and predictive values above 0.9 across various threshold levels. In particular, we identify meaningful lesion subgroups where distribution-free, model-agnostic conditional coverage holds, with approximately 90% of our prediction sets containing only the ground truth in most lesion subgroups, thus explicitly characterizing for which patients the model is most suitably applied. Moreover, we make available a curated tabular dataset of 1936 biopsied breast lesions from a recent observational multicenter study and benchmark the performance of several state-of-the-art learning algorithms. We also report a successful case study of the deployed system in the same multicenter context. Concrete clinical benefits include up to a 65% reduction in requested biopsies among BI-RADS 4a and 4b lesions, with minimal to no missed cancer cases.

Paper Structure (18 sections, 2 theorems, 3 equations, 5 figures, 2 tables)

Key Result

Theorem 1

Let $\{\mathcal{X}_1, \ldots, \mathcal{X}_K\}$ be a finite partition of the lesion space. Given an exchangeable sequence $(X_i, y_i)_{i=1}^{n+1}$ and a miscoverage level $\alpha \in (0,1)$, the following holds: for all $j = 1, \ldots, K$ where $\tilde{\alpha} = \frac{\lceil(k_j + 1)\cdot\alpha\rceil}{|k_j|}$ and $k_j \coloneqq \left|\{i \in [n] \mid X_i \in \mathcal{X}_j\}\right|$.
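To make the role of the partition and the per-cell counts $k_j$ concrete, here is a minimal sketch of calibrating one conformal threshold per cell. It is an illustration under stated assumptions, not the authors' implementation: the function names `local_thresholds` and `prediction_set` are hypothetical, the nonconformity score is taken to be one minus the predicted probability of a label, and the quantile rank uses the standard split-conformal correction $\lceil (k_j + 1)(1 - \alpha) \rceil$, which may differ from the $\tilde{\alpha}$ adjustment stated in the theorem.

```python
import math


def local_thresholds(scores, cells, alpha):
    """Compute one conformal threshold per partition cell.

    scores[i] is the nonconformity score of calibration point i and
    cells[i] is the index j of the cell that contains X_i.
    ASSUMPTION: the rank ceil((k_j + 1) * (1 - alpha)) is the usual
    split-conformal correction, not necessarily the paper's alpha-tilde.
    """
    by_cell = {}
    for score, j in zip(scores, cells):
        by_cell.setdefault(j, []).append(score)

    thresholds = {}
    for j, cell_scores in by_cell.items():
        cell_scores.sort()
        k_j = len(cell_scores)  # number of calibration points in cell j
        rank = math.ceil((k_j + 1) * (1 - alpha))
        # With too few points in a cell the rank exceeds k_j; the threshold
        # is then infinite and every label is kept.
        thresholds[j] = cell_scores[rank - 1] if rank <= k_j else math.inf
    return thresholds


def prediction_set(p_malignant, cell, thresholds):
    """Return the labels whose score 1 - p(label) is within the cell threshold."""
    probs = {"benign": 1.0 - p_malignant, "malignant": p_malignant}
    t = thresholds.get(cell, math.inf)  # unseen cell: keep every label
    return {label for label, p in probs.items() if 1.0 - p <= t}


if __name__ == "__main__":
    # Toy calibration data: cell 0 is well populated, cell 1 is small.
    cal_scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.1, 0.2, 0.3, 0.4]
    cal_cells = [0] * 9 + [1] * 4
    thr = local_thresholds(cal_scores, cal_cells, alpha=0.1)
    print(thr)
    print(prediction_set(0.95, 0, thr))
    print(prediction_set(0.50, 1, thr))
```

Because each cell is calibrated only on its own points, a sparsely populated or hard cell ends up with a larger threshold and therefore larger prediction sets, which is the behaviour a local coverage guarantee is meant to deliver.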

Figures (5)

  • Figure 1: PersonalizedUS pipeline: Starting from a suspicious breast lesion identified by ultrasound (US), clinical, BI-RADS, and Doppler features are fed into a logistic regression model to estimate malignancy risk, and the most influential features are highlighted. The lesion is then passed through a decision tree whose leaves reflect difficulty levels with respect to the prediction model. Finally, a prediction set with a local conditional coverage guarantee is produced, so that lesions in more challenging leaves receive more uncertain predictions.
  • Figure 2: Results of risk estimation on the calibration set: (a) Classification metrics and (b) Calibration curve.
  • Figure 3: Lesion subgroup analysis of BI-RADS categories, malignancy, predictive accuracy, and subgroup size.
  • Figure 4: Decision tree regressor trained for predicting the classification residuals of our logistic regression model over the calibration set.
  • Figure 5: Stacked plot of (local) average set sizes.
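
The partition step described in Figures 1 and 4 (a tree fitted to the classifier's residuals, whose leaves group lesions by difficulty) can be sketched in a heavily simplified form. This is an illustration only: it uses a single-split tree on one numeric feature instead of a full decision tree regressor, and the names `absolute_residuals`, `fit_residual_stump`, and `leaf_of`, as well as the toy feature values, are hypothetical.

```python
def absolute_residuals(y_true, p_hat):
    """Absolute gap between the 0/1 label and the predicted malignancy risk."""
    return [abs(y - p) for y, p in zip(y_true, p_hat)]


def _sse(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def fit_residual_stump(feature, residuals):
    """Find the threshold on one feature that best separates residual levels.

    Candidate thresholds are midpoints between consecutive distinct feature
    values; the one with the lowest total squared error of the two leaf means
    is returned. Returns None when the feature has a single distinct value.
    """
    distinct = sorted(set(feature))
    best_threshold, best_cost = None, float("inf")
    for lo, hi in zip(distinct, distinct[1:]):
        threshold = (lo + hi) / 2.0
        left = [r for x, r in zip(feature, residuals) if x <= threshold]
        right = [r for x, r in zip(feature, residuals) if x > threshold]
        cost = _sse(left) + _sse(right)
        if cost < best_cost:
            best_threshold, best_cost = threshold, cost
    return best_threshold


def leaf_of(x, threshold):
    """Leaf index of a lesion: 0 for the left leaf, 1 for the right leaf."""
    return 0 if x <= threshold else 1


if __name__ == "__main__":
    # Toy calibration data (hypothetical): one numeric feature per lesion.
    feature = [1, 2, 3, 4, 5, 6]
    y_true = [0, 0, 0, 1, 0, 1]
    p_hat = [0.05, 0.04, 0.06, 0.60, 0.50, 0.55]
    residuals = absolute_residuals(y_true, p_hat)
    threshold = fit_residual_stump(feature, residuals)
    print(threshold, [leaf_of(x, threshold) for x in feature])
```

A real residual tree would split on several features and grow more than two leaves, but the principle is the same: lesions that land in a leaf with large residuals form a subgroup where the classifier is less reliable.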

Theorems & Definitions (2)

  • Theorem 1: Theorem 2 in Cabezas et al. (2024)
  • Theorem 2