
CoMoTo: Unpaired Cross-Modal Lesion Distillation Improves Breast Lesion Detection in Tomosynthesis

Muhammad Alberb, Marawan Elbatel, Aya Elgebaly, Ricardo Montoya-del-Angel, Xiaomeng Li, Robert Martí

TL;DR

CoMoTo tackles the data scarcity challenge in digital breast tomosynthesis (DBT) lesion detection by transferring lesion-focused knowledge from unpaired mammography data. It introduces two components, Lesion-specific KD (LsKD) and Intra-modal Point Alignment (ImPA), to distill and align lesion features at the point level rather than across whole images, mitigating issues from background tissue and modality differences. The approach yields consistent gains over pretraining and image-level KD, especially in low-data settings, and ablations show that lesion point alignment and intra-modal consistency are key contributors. Practically, CoMoTo enables improved DBT detection without requiring mammography at inference, offering a scalable path to better breast cancer screening with limited DBT data.

Abstract

Digital Breast Tomosynthesis (DBT) is an advanced breast imaging modality that offers superior lesion detection accuracy compared to conventional mammography, albeit at the cost of longer reading time. Accelerating lesion detection in DBT with deep learning is hindered by limited data availability and high annotation costs. A possible solution is to leverage the information provided by a more widely available modality, such as mammography, to enhance DBT lesion detection. In this paper, we present a novel framework, CoMoTo, for improving lesion detection in DBT. Our framework leverages unpaired mammography data to enhance the training of a DBT model, improving practicality by eliminating the need for mammography during inference. Specifically, we propose two novel components, Lesion-specific Knowledge Distillation (LsKD) and Intra-modal Point Alignment (ImPA). LsKD selectively distills lesion features from a mammography teacher model to a DBT student model, disregarding background features. ImPA further enriches LsKD by ensuring the alignment of lesion features within the teacher before distilling knowledge to the student. Our comprehensive evaluation shows that CoMoTo is superior to traditional pretraining and image-level KD, improving Mean Sensitivity by 7% in the low-data setting. Our code is available at https://github.com/Muhammad-Al-Barbary/CoMoTo
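The point-level idea behind LsKD and ImPA can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the loss functions, so the mean-squared-error losses, the use of a mean teacher "prototype" to handle unpaired data, and all function names below are assumptions made purely for illustration. The linked repository holds the actual method.

```python
import numpy as np

def sample_point_features(feature_map, points):
    """Gather feature vectors at integer (row, col) points.

    feature_map: array of shape (C, H, W) from a detector encoder.
    points: integer array of shape (N, 2) holding lesion point locations.
    Returns an array of shape (N, C); background locations are never read.
    """
    rows, cols = points[:, 0], points[:, 1]
    return feature_map[:, rows, cols].T

def impa_loss(teacher_feats):
    """Hypothetical intra-modal alignment term (assumption, not from the paper).

    Pulls the teacher's lesion point features toward their own mean,
    so that lesion features within the mammography teacher agree.
    """
    prototype = teacher_feats.mean(axis=0, keepdims=True)
    return float(np.mean((teacher_feats - prototype) ** 2))

def lskd_loss(student_feats, teacher_feats):
    """Hypothetical lesion-specific distillation term (assumption).

    Because the data are unpaired, the student's lesion features are
    matched to the mean teacher lesion feature, not to a specific image.
    """
    prototype = teacher_feats.mean(axis=0, keepdims=True)
    return float(np.mean((student_feats - prototype) ** 2))
```

In this sketch only features at lesion points enter either loss, which mirrors the abstract's claim that background features are disregarded. At inference only the student would be run, so neither loss nor the teacher is needed.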

Paper Structure (11 sections, 8 equations, 3 figures, 2 tables)

Figures (3)

  • Figure 1: CoMoTo Overview. Feature maps are extracted from the encoders of modality-dedicated object detection networks. Critical mammography lesion features corresponding to target bounding boxes are aligned with each other. Subsequently, critical DBT lesion features are aligned to those of mammography. During inference, the DBT model is used alone, improving efficiency and practicality.
  • Figure 2: Qualitative assessment on test data shows that CoMoTo fits lesions more accurately than other SOTA approaches when trained on 10% of the data.
  • Figure 3: (a) Comparison of CoMoTo with SOTA at varying data ratios. (b) Effect of $\alpha$ on lesion-specific KD. (c) Effect of different distillation points. C, E, and S refer to the center, edges, and side midpoints of the lesion bounding boxes, respectively.
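The distillation points named in the Figure 3(c) caption can be made concrete with a small sketch. The caption does not define the points precisely, so two things here are assumptions: that "edges" means the four box corners, and that boxes are given as (x1, y1, x2, y2). The function name `distillation_points` is hypothetical.

```python
def distillation_points(box, kinds="CES"):
    """Return candidate distillation points for one lesion bounding box.

    box: (x1, y1, x2, y2) corner coordinates (assumed format).
    kinds: any combination of "C" (center), "E" (edges, assumed here
           to be the four corners), and "S" (side midpoints).
    """
    x1, y1, x2, y2 = box
    cx, cy = (x1 + x2) / 2, (y1 + y2) / 2
    groups = {
        "C": [(cx, cy)],
        "E": [(x1, y1), (x2, y1), (x2, y2), (x1, y2)],  # assumption: corners
        "S": [(cx, y1), (x2, cy), (cx, y2), (x1, cy)],
    }
    return [point for kind in kinds for point in groups[kind]]
```

Calling the function with `kinds="C"` yields a single point per lesion, while `kinds="CES"` yields nine, which is one way to picture the point sets compared in the ablation.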