CoMoTo: Unpaired Cross-Modal Lesion Distillation Improves Breast Lesion Detection in Tomosynthesis
Muhammad Alberb, Marawan Elbatel, Aya Elgebaly, Ricardo Montoya-del-Angel, Xiaomeng Li, Robert Martí
TL;DR
CoMoTo tackles the data scarcity challenge in digital breast tomosynthesis (DBT) lesion detection by transferring lesion-focused knowledge from unpaired mammography data. It introduces two components, Lesion-specific Knowledge Distillation (LsKD) and Intra-modal Point Alignment (ImPA), which distill and align lesion features at the point level rather than across whole images, mitigating the effects of background tissue and modality differences. The approach yields consistent gains over pretraining and image-level knowledge distillation (KD), especially in low-data settings, and ablations show that lesion point alignment and intra-modal consistency are key contributors. Practically, CoMoTo improves DBT detection without requiring mammography at inference, offering a scalable path to better breast cancer screening with limited DBT data.
Abstract
Digital Breast Tomosynthesis (DBT) is an advanced breast imaging modality that offers superior lesion detection accuracy compared to conventional mammography, albeit at the cost of longer reading time. Accelerating lesion detection from DBT using deep learning is hindered by limited data availability and high annotation costs. One possible solution is to leverage a more widely available modality, such as mammography, to enhance DBT lesion detection. In this paper, we present a novel framework, CoMoTo, for improving lesion detection in DBT. Our framework leverages unpaired mammography data to enhance the training of a DBT model, improving practicality by eliminating the need for mammography during inference. Specifically, we propose two novel components, Lesion-specific Knowledge Distillation (LsKD) and Intra-modal Point Alignment (ImPA). LsKD selectively distills lesion features from a mammography teacher model to a DBT student model, disregarding background features. ImPA further enriches LsKD by ensuring the alignment of lesion features within the teacher before distilling knowledge to the student. Our comprehensive evaluation shows that CoMoTo is superior to traditional pretraining and image-level KD, improving performance by 7% Mean Sensitivity in the low-data setting. Our code is available at https://github.com/Muhammad-Al-Barbary/CoMoTo.
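To make the idea of lesion-level (rather than image-level) distillation concrete, the following is a minimal sketch of one plausible reading of the two components. It is not taken from the CoMoTo repository: the function names (`pool_box`, `cross_modal_loss`, `intra_modal_loss`), the mean pooling of box features into a single "point", the cosine distance, and the all-pairs matching between unpaired lesions are all illustrative assumptions.

```python
import math

# Illustrative sketch only: names, pooling, and distance are assumptions,
# not the losses defined in the CoMoTo paper or code.

def pool_box(feature_map, box):
    """Average the feature vectors inside a lesion box into one 'point'.

    feature_map: grid indexed as feature_map[y][x] -> list of floats.
    box: (x0, y0, x1, y1) in grid cells, end-exclusive.
    Cells outside the box (background) are ignored entirely.
    """
    x0, y0, x1, y1 = box
    cells = [feature_map[y][x] for y in range(y0, y1) for x in range(x0, x1)]
    dim = len(cells[0])
    return [sum(c[d] for c in cells) / len(cells) for d in range(dim)]

def cosine_distance(a, b):
    """1 - cosine similarity between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b + 1e-12)

def cross_modal_loss(student_points, teacher_points):
    """LsKD-style term (assumed form): since the data are unpaired, compare
    every DBT student lesion point with every mammography teacher lesion point."""
    pairs = [(s, t) for s in student_points for t in teacher_points]
    return sum(cosine_distance(s, t) for s, t in pairs) / len(pairs)

def intra_modal_loss(points):
    """ImPA-style term (assumed form): mean pairwise distance among lesion
    points of a single modality, encouraging them to agree with each other."""
    pairs = [(points[i], points[j])
             for i in range(len(points)) for j in range(i + 1, len(points))]
    if not pairs:
        return 0.0
    return sum(cosine_distance(a, b) for a, b in pairs) / len(pairs)

# Toy usage: a 2x2 teacher feature map with 2-dimensional features.
teacher_map = [[[1.0, 0.0], [1.0, 0.0]],
               [[0.0, 1.0], [0.0, 1.0]]]
teacher_points = [pool_box(teacher_map, (0, 0, 2, 1)),   # lesion in top row
                  pool_box(teacher_map, (0, 1, 2, 2))]   # lesion in bottom row
student_points = [[1.0, 0.0]]
total = cross_modal_loss(student_points, teacher_points) + intra_modal_loss(teacher_points)
```

In this sketch the background never enters either loss, which is the property the abstract attributes to LsKD. A practical version would operate on detector feature maps and gradients in a deep learning framework.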
