Improving the generalization of deep learning models in the segmentation of mammography images

Jan Hurtado, Joao P. Maia, Cesar A. Sierra-Franco, Alberto Raposo

TL;DR

The paper tackles the challenge of segmenting landmark structures in mammography across images from different vendors. It proposes data-centric augmentation strategies—annotation-guided image intensity manipulation and style transfer—to enrich training data and improve generalization without additional manual labeling. Through extensive experiments on GE training data and evaluation on IMS, PLANMED, and HOLOGIC datasets (including CC and DDSM cases), the methods yield improved robustness and reduced prediction uncertainty, with the combination strategy offering the most consistent performance. The findings suggest practical potential for clinical deployment by enhancing cross-vendor segmentation accuracy while lowering labeling costs.

Abstract

Mammography is the main screening method for detecting breast cancer early, which improves treatment success rates. The segmentation of landmark structures in mammography images can aid the medical assessment of cancer risk and of image acquisition adequacy. We introduce a series of data-centric strategies aimed at enriching the training data for deep learning-based segmentation of landmark structures. Our approach augments the training samples through annotation-guided image intensity manipulation and style transfer to achieve better generalization than standard training procedures. These augmentations are applied in a balanced manner to ensure the model learns to process a diverse range of images generated by equipment from different vendors while retaining its efficacy on the original data. We present extensive numerical and visual results that demonstrate the superior generalization capabilities of our methods when compared to standard training. For this evaluation, we consider a large dataset that includes mammography images generated by equipment from different vendors. Further, we present complementary results that show both the strengths and limitations of our methods across various scenarios. The accuracy and robustness demonstrated in the experiments suggest that our method is well-suited for integration into clinical practice.
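
To make the idea of annotation-guided intensity manipulation concrete, here is a minimal sketch of one possible form it could take. It is not the paper's algorithm. The label ids, the choice of a random gamma curve per region, the `gamma_range` values, and the 50% augmentation probability are all assumptions introduced for illustration. The one property it shares with the description above is that the existing annotation is reused unchanged, so no extra manual labeling is needed.

```python
import numpy as np

# Hypothetical label ids; the encoding used by the authors is not given here.
BACKGROUND, NIPPLE, PECTORAL, FIBROGLANDULAR, FATTY = 0, 1, 2, 3, 4


def manipulate_intensity(image, labels, rng, gamma_range=(0.7, 1.5)):
    """Apply an independent random gamma curve to each annotated region.

    image  : float array with values in [0, 1], shape (H, W)
    labels : integer label map of the same shape
    The gamma curve is an assumed stand-in for the paper's manipulation.
    """
    out = image.copy()
    for region in (NIPPLE, PECTORAL, FIBROGLANDULAR, FATTY):
        mask = labels == region
        if mask.any():
            gamma = rng.uniform(*gamma_range)
            out[mask] = image[mask] ** gamma
    # Background pixels are left untouched; the label map is reused as-is.
    return out


def maybe_augment(image, labels, rng, p_augment=0.5):
    """Return the original or a manipulated sample (assumed 50/50 balance)."""
    if rng.random() < p_augment:
        return manipulate_intensity(image, labels, rng), labels
    return image, labels
```

Mixing original and manipulated samples in this way is one simple reading of "applied in a balanced manner": the model keeps seeing unmodified source-vendor images while also being exposed to intensity profiles it would not otherwise encounter.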

Paper Structure

This paper contains 13 sections, 13 figures, 12 tables, 1 algorithm.

Figures (13)

  • Figure 1: Pre-processed image and its corresponding label map (ground-truth annotation). The nipple is colored in green, the pectoral muscle is colored in blue, the fibroglandular tissue is colored in magenta, the fatty tissue is colored in yellow, and the background is colored in black.
  • Figure 2: Baseline method visual results. First column: input image. Second column: ground-truth annotation. Third column: prediction. Fourth column: uncertainty map (hot color map with values in the range $[0,1]$). First row: image from GE dataset. Second row: image from IMS dataset. Third row: image from PLANMED dataset. Fourth row: image from HOLOGIC dataset.
  • Figure 3: Image manipulation example. The leftmost image is the input image $\mathbf{I}_\text{in}$. The other images are different results of applying the image manipulation algorithm.
  • Figure 4: Style transfer post-processing. First column: annotated regions, where the background is colored in black. Second column: original image. Third column: stylized image. Fourth column: post-processed stylized image.
  • Figure 5: Style transfer examples. Each row is a different case. First column: original GE image. Second column: IMS stylization results. Third column: PLANMED stylization results. Fourth column: HOLOGIC stylization results.
  • ...and 8 more figures