Deep BI-RADS Network for Improved Cancer Detection from Mammograms

Gil Ben-Artzi, Feras Daragma, Shahar Mahpod

TL;DR

A novel multi-modal approach is introduced that combines textual BI-RADS lesion descriptors with visual mammogram content and employs iterative attention layers to effectively fuse these different modalities, significantly improving classification performance over image-only models.

Abstract

While state-of-the-art models for breast cancer detection leverage multi-view mammograms for enhanced diagnostic accuracy, they often focus solely on visual mammography data. However, radiologists also document lesion descriptors that carry additional information and can enhance mammography-based breast cancer screening. A key question is whether deep learning models can benefit from these expert-derived features. To address this question, we introduce a novel multi-modal approach that combines textual BI-RADS lesion descriptors with visual mammogram content. Our method employs iterative attention layers to effectively fuse these different modalities, significantly improving classification performance over image-only models. Experiments on the CBIS-DDSM dataset show substantial improvements across all metrics, demonstrating the contribution of handcrafted features to end-to-end models.
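To make the idea of attention-based fusion concrete, the following is a minimal sketch of a single cross-attention step in plain Python. It is not the paper's architecture: the choice of descriptor embeddings as queries and visual tokens as keys and values is only one possible configuration, and the names `cross_attention`, `descriptor_embeddings`, and `visual_tokens`, as well as the toy values, are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product attention: each query attends over all keys.

    Returns one fused vector per query, formed as a weighted average
    of the value vectors.
    """
    dim = len(keys[0])
    fused = []
    for q in queries:
        # Similarity between this query and every key, scaled by sqrt(dim).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim)
            for k in keys
        ]
        weights = softmax(scores)
        # Weighted average of the value vectors, one output dimension at a time.
        fused.append([
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ])
    return fused


# Hypothetical toy inputs (assumption: both modalities are already embedded
# in the same 2-dimensional space).
descriptor_embeddings = [[1.0, 0.0], [0.0, 1.0]]        # two lesion descriptors
visual_tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]    # three image regions

fused = cross_attention(descriptor_embeddings, visual_tokens, visual_tokens)
```

In this sketch each descriptor produces one fused vector that summarizes the image regions most similar to it. An iterative scheme would repeat such a step, feeding the fused vectors into the next layer, possibly with visual features at a different resolution.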

Paper Structure

This paper contains 22 sections, 13 equations, 3 figures, 5 tables.

Figures (3)

  • Figure 1: Our model takes as input mammograms from the MLO and CC views along with a varying number of textual descriptor classes describing one or more lesions. The multi-attention layers (grayed blocks) process these descriptors along with visual features extracted from the mammogram images at different resolutions.
  • Figure 2: ROC curve for our approach
  • Figure 3: The possible input configurations for the view attention sublayer in our multi-attention layer. Q, K, and V represent the Query, Key, and Value, respectively.