From 2D to 3D Without Extra Baggage: Data-Efficient Cancer Detection in Digital Breast Tomosynthesis

Yen Nhi Truong Vu, Dan Guo, Sripad Joshi, Harshit Kumar, Jason Su, Thomas Paul Matthews

TL;DR

This paper tackles data scarcity in Digital Breast Tomosynthesis (DBT) by introducing M&M-3D, an extension of M&M, a 2D detector pretrained on full-field digital mammography (FFDM), that enables learnable 3D reasoning without adding new parameters. It constructs sparse 3D proposals spanning all DBT slices and repeatedly mixes these with slice-level features through six cascade heads, using malignancy-guided weighting to fuse slice information into a coherent 3D representation and derive implicit $z$-axis localization via $z_i = \arg\max_s w_{i,s}$. M&M-3D is data-efficient: it outperforms 2D projection and slice-based baselines by up to 54% in localization and 10% in classification, surpasses complex 3D reasoning variants in the low-data regime while matching them in the high-data regime, and achieves state-of-the-art results on the BCS-DBT benchmark after finetuning. Because no parameters are added, FFDM weights transfer directly, and the model mirrors radiologists’ workflows by focusing on the most suspicious slice for each finding. Overall, M&M-3D offers a data-efficient path toward unified 2D-3D learning for cancer detection in DBT.
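The implicit $z$-axis localization rule $z_i = \arg\max_s w_{i,s}$ can be illustrated with a minimal sketch. The function name `z_localize` and the toy scores are hypothetical and chosen only for illustration; the sketch assumes the finding-slice scores are already available as one list of per-slice values per proposal.

```python
def z_localize(scores):
    """Return, for each proposal, the index of its highest-scoring slice.

    scores[p][s] is the finding-slice score of proposal p on slice s
    (a hypothetical layout chosen for this sketch).
    """
    return [max(range(len(row)), key=row.__getitem__) for row in scores]


# Toy example: two proposals scored over four slices (made-up values).
w = [
    [0.10, 0.70, 0.20, 0.05],
    [0.30, 0.20, 0.90, 0.40],
]
z = z_localize(w)
print(z)  # [1, 2]
```

No extra depth-regression head is needed under this rule: the slice index falls out of scores the classifier already produces.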

Abstract

Digital Breast Tomosynthesis (DBT) enhances finding visibility for breast cancer detection by providing volumetric information that reduces the impact of overlapping tissues; however, limited annotated data has constrained the development of deep learning models for DBT. To address data scarcity, existing methods attempt to reuse 2D full-field digital mammography (FFDM) models by either flattening DBT volumes or processing slices individually, thus discarding volumetric information. Alternatively, 3D reasoning approaches introduce complex architectures that require more DBT training data. To overcome these drawbacks, we propose M&M-3D, an architecture that enables learnable 3D reasoning while remaining parameter-free relative to its FFDM counterpart, M&M. M&M-3D constructs malignancy-guided 3D features, and 3D reasoning is learned by repeatedly mixing these 3D features with slice-level information. This is achieved by modifying operations in M&M without adding parameters, thus enabling direct weight transfer from FFDM. Extensive experiments show that M&M-3D surpasses 2D projection and 3D slice-based methods by 11-54% for localization and 3-10% for classification. Additionally, M&M-3D outperforms complex 3D reasoning variants by 20-47% for localization and 2-10% for classification in the low-data regime, while matching their performance in the high-data regime. On the popular BCS-DBT benchmark, M&M-3D outperforms the previous top baseline by 4% for classification and 10% for localization.

Paper Structure

This paper contains 41 sections, 8 equations, 6 figures, 10 tables.

Figures (6)

  • Figure 1: Common DBT approaches either (left) project the input into 2D, making z-localization impossible, or (middle) process the volume slice by slice, relying on heuristics for output aggregation. M&M-3D (right) enables seamless 3D reasoning by dynamically fusing slice-level features into 3D representations, which repeatedly interact with the slices to facilitate 3D information mixing.
  • Figure 2: M&M-3D extends M&M (blue, \ref{sec:prereq}) with parameter-free 3D reasoning (yellow, \ref{sec:methods}). 3D proposals, parameterized by 3D features $\mathbf{h}_{i-1}$ and 2D extent $\mathbf{b}_{i-1}$ spanning all slices $s$, are refined using 6 cascade heads ($1 \leq i \leq 6$). These 3D features interact with 2D RoI features $\mathbf{f}_{\mathbf{b}_{i-1},s}$ across slices, producing slice-level object features $\mathbf{h}_{i,s}$ enhanced with 3D context. The classification (Cls.) module is reused to produce finding-slice scores $\mathbf{w}_{i,s}$, which are used to fuse $\mathbf{h}_{i,s}$ into refined 3D features $\mathbf{h}_{i}$ focusing on the most suspicious slices. $z$-axis localization is obtained as the slice with maximum score, i.e., $\arg\max_{s} \mathbf{w}_{i,s}$.
  • Figure 3: Qualitative results. From left to right: (1) FFDM: The finding is obscured by surrounding tissue, leading to low detection scores. (2) MIP still suffers from occlusion and introduces false positives. (3) Buda* assigns a high score to the correct finding on the most suspicious slice but also increases the number of false positives. (4) M&M-3D successfully assigns a high score to the malignant finding on the most suspicious slice while maintaining low scores in other regions.
  • Figure 4: Finding-slice scores produced by M&M-3D. For a malignant finding, we identify the highest scoring proposal $m$ that matches it, and plot the scores $\mathbf{w}_{6,s}[m]$. The red bar denotes the range of slices where the finding is visible based on ground truth annotations. Top: a small cluster of calcifications visible in only 2/16 slices, yielding a sharp localized peak in $\mathbf{w}_{6,s}[m]$. Bottom: a mass visible on 10/16 slices, yielding a broad span of elevated $\mathbf{w}_{6,s}[m]$ values. Insets show representative slices.
  • Figure 5: Comparison of M&M-3D with complex 3D reasoning variants. M&M-3D performs similarly to all alternatives in the high-data regime but outperforms them significantly in the low-data regime, illustrating its data efficiency. See Appendix \ref{apd:fig5} for figure data.
  • ...and 1 more figure
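The score-guided fusion described in the Figure 2 caption, where finding-slice scores $\mathbf{w}_{i,s}$ combine slice-level features $\mathbf{h}_{i,s}$ into a 3D feature $\mathbf{h}_{i}$, might look roughly like the sketch below. The captions do not state how the scores are normalized, so the softmax over slices used here is an assumption, and the names `softmax` and `fuse_slices` are hypothetical. The sketch handles a single proposal at a single cascade head.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_slices(slice_feats, slice_scores):
    """Fuse per-slice feature vectors into one vector for a single proposal.

    slice_feats:  list of S feature vectors, each of length D (h_{i,s}).
    slice_scores: list of S finding-slice scores (w_{i,s}).
    ASSUMPTION: weights are a softmax of the scores over slices; the
    source only says the scores are used to fuse the slice features.
    """
    weights = softmax(slice_scores)
    dim = len(slice_feats[0])
    return [
        sum(w * feat[d] for w, feat in zip(weights, slice_feats))
        for d in range(dim)
    ]


# Toy features for three slices (made-up values).
feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

uniform = fuse_slices(feats, [0.0, 0.0, 0.0])   # equal scores: plain average
peaked = fuse_slices(feats, [0.0, 50.0, 0.0])   # one dominant slice
```

With equal scores the fused vector is the plain average of the slices, while a strongly peaked score profile makes the fused vector follow the most suspicious slice. Note that a weighted sum of this kind introduces no learnable parameters, which is consistent with the parameter-free claim, though the exact operation used in the paper may differ.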