Features Fusion for Dual-View Mammography Mass Detection
Arina Varlamova, Valery Belotsky, Grigory Novikov, Anton Konushin, Evgeny Sidorov
TL;DR
The paper addresses the challenge of using both views in dual-view mammography for mass detection. It introduces MAMM-Net, which fuses features from the craniocaudal (CC) and mediolateral oblique (MLO) views at the pixel level through a Fusion Layer based on deformable attention. The Fusion Layer is combined with a View-Interactive Transformer Decoder and a Lesion Linker, which establish cross-view lesion correspondences and produce malignancy predictions. The key contributions are the Fusion Layer, feature-level fusion across views, and state-of-the-art results on the DDSM dataset: $R@0.25=81.6$, $R@0.5=87.9$, and $R@1.0=90.6$ for detection, and ROC-AUC $=85.3$, sensitivity $=80.2$, and specificity $=76.2$ for malignancy classification. The approach reduces false positives while retaining high recall and mirrors radiologists’ two-view reasoning, which may make it more useful for computer-aided diagnosis.
Abstract
Detecting malignant lesions in mammography images is extremely important for early breast cancer diagnosis. In clinical practice, images are acquired from two different angles, and radiologists use information from both views together to locate the same lesion in each. For automatic detection approaches, however, such information fusion remains a challenge. In this paper, we propose a new model called MAMM-Net, which processes both mammography views simultaneously and shares information not only on an object level, as in existing works, but also on a feature level. MAMM-Net's key component is the Fusion Layer, which is based on deformable attention and designed to increase detection precision while keeping recall high. Our experiments show superior performance on the public DDSM dataset compared to the previous state-of-the-art model, and the model adds new helpful features: pixel-level lesion annotation and classification of lesion malignancy.
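
To make the idea of feature-level sharing between views concrete, the following is a minimal sketch, not the Fusion Layer itself. It assumes each view has already been encoded into a short list of feature vectors and uses plain scaled dot-product cross-attention in place of deformable attention; the function names `softmax` and `cross_view_fuse` are hypothetical and chosen only for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_view_fuse(cc_feats, mlo_feats):
    """Enrich each CC feature vector with an attention-weighted summary of MLO features.

    Assumption: plain dot-product cross-attention stands in for the
    deformable attention used by the actual model.
    """
    dim = len(cc_feats[0])
    scale = math.sqrt(dim)
    fused = []
    for q in cc_feats:
        # Similarity of this CC location to every MLO location.
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in mlo_feats]
        weights = softmax(scores)
        # Weighted average of MLO features (the cross-view context).
        context = [
            sum(w * v[i] for w, v in zip(weights, mlo_feats)) for i in range(dim)
        ]
        # Residual connection keeps the original CC information.
        fused.append([qi + ci for qi, ci in zip(q, context)])
    return fused


# Toy usage: two CC locations attend over three MLO locations.
cc = [[1.0, 0.0], [0.0, 1.0]]
mlo = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
fused_cc = cross_view_fuse(cc, mlo)
```

Swapping the arguments gives the symmetric direction, MLO features enriched with CC context. In this sketch every location attends to all locations of the other view, whereas deformable attention samples only a small set of learned offsets around a reference point.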
