MAMBO: High-Resolution Generative Approach for Mammography Images

Milica Škipina, Nikola Jovišić, Nicola Dall'Asen, Vanja Švenda, Anil Osman Tur, Slobodan Ilić, Elisa Ricci, Dubravko Ćulibrk

TL;DR

MAMBO tackles the challenge of generating realistic, full-resolution mammograms with a three-stage diffusion ensemble that conditions high-resolution patch synthesis on both global and local image context. It uses separate diffusion models for global context, local context, and patch-level generation, and overlaps patches so they assemble seamlessly. This yields realistic $3840\times3840$ mammograms and supports downstream tasks such as anomaly segmentation and classification trained with additional synthetic data. Experiments on VinDr, RSNA, and InBreast report competitive generation metrics (FID, LPIPS), synthetic images that radiologists could not reliably distinguish from real ones, controllability via BI-RADS conditioning, and anomaly segmentation with IoU up to $0.423$ without pixel-level labels. The work also shows that synthetic data is useful for classification, reports no memorization of training images (a data-privacy concern), and establishes a practical high-resolution diffusion framework that could be extended with metadata-driven guidance and finer anomaly detection in the future.
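The overlap-and-assemble idea can be illustrated with a toy sketch. The summary only states that patches overlap; the uniform averaging used here, along with the names `patch_origins` and `assemble`, are assumptions made for illustration and are not taken from the MAMBO code.

```python
from typing import List, Tuple

Image = List[List[float]]


def patch_origins(size: int, patch: int, stride: int) -> List[int]:
    """Top-left coordinates of patches covering [0, size) along one axis."""
    if patch > size or stride <= 0 or stride > patch:
        raise ValueError("need 0 < stride <= patch <= size")
    origins = list(range(0, size - patch + 1, stride))
    # Add a final patch flush with the border if the stride does not land on it.
    if origins[-1] != size - patch:
        origins.append(size - patch)
    return origins


def assemble(patches: List[Tuple[int, int, Image]], height: int, width: int) -> Image:
    """Average overlapping patches into one image (assumed blending rule).

    Each entry is (row, col, pixels), with (row, col) the patch's top-left corner.
    """
    total = [[0.0] * width for _ in range(height)]
    count = [[0] * width for _ in range(height)]
    for row, col, pixels in patches:
        for dr, line in enumerate(pixels):
            for dc, value in enumerate(line):
                total[row + dr][col + dc] += value
                count[row + dr][col + dc] += 1
    if any(c == 0 for line in count for c in line):
        raise ValueError("patches do not cover the whole image")
    return [[t / c for t, c in zip(tl, cl)] for tl, cl in zip(total, count)]


# Toy usage: cut a 6x6 gradient into 4x4 patches with stride 2, then reassemble.
size, patch, stride = 6, 4, 2
source = [[float(r * size + c) for c in range(size)] for r in range(size)]
origins = patch_origins(size, patch, stride)
patches = [
    (r, c, [line[c:c + patch] for line in source[r:r + patch]])
    for r in origins
    for c in origins
]
restored = assemble(patches, size, size)
```

Because the toy patches are cut from a single image, averaging the overlaps reproduces that image exactly. With independently generated patches the overlapping regions would differ, and averaging is one simple way to smooth the seams between them.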

Abstract

Mammography is the gold standard for the detection and diagnosis of breast cancer. This procedure can be significantly enhanced with Artificial Intelligence (AI)-based software, which assists radiologists in identifying abnormalities. However, training AI systems requires large and diverse datasets, which are often difficult to obtain due to privacy and ethical constraints. To address this issue, the paper introduces MAMmography ensemBle mOdel (MAMBO), a novel patch-based diffusion approach designed to generate full-resolution mammograms. Diffusion models have shown breakthrough results in realistic image generation, yet few studies have focused on mammograms, and none have successfully generated high-resolution outputs required to capture fine-grained features of small lesions. To achieve this, MAMBO integrates separate diffusion models to capture both local and global (image-level) contexts. The contextual information is then fed into the final model, significantly aiding the noise removal process. This design enables MAMBO to generate highly realistic mammograms of up to 3840x3840 pixels. Importantly, this approach can be used to enhance the training of classification models and extended to anomaly segmentation. Experiments, both numerical and radiologist validation, assess MAMBO's capabilities in image generation, super-resolution, and anomaly segmentation, highlighting its potential to enhance mammography analysis for more accurate diagnoses and earlier lesion detection. The source code used in this study is publicly available at: https://github.com/iai-rs/mambo.

Paper Structure

This paper contains 35 sections, 5 equations, 15 figures, 9 tables, 2 algorithms.

Figures (15)

  • Figure 1: Synthetic $3840\times3840$ mammogram generated using MAMBO. Details at different resolutions correspond to the global context (whole image), local context, and individual patch. Best viewed when zoomed in.
  • Figure 2: In the three-stage approach of MAMBO, the first stage of the model generates a novel global context $\mathbf{x}_{0_G}$, which is then used to generate a set of local contexts $\mathbf{X}_{0_L}$ in the second stage, conditioned on shifted global contexts $\overrightarrow{\mathbf{X}_{0_G}}$. Synthetic global context and synthetic local contexts become the conditioning in the third stage to generate highly detailed patches $\mathbf{X}_{0_P}$, which are finally combined to obtain a high-resolution full mammogram.
  • Figure 3: Sample results achieved by MAMBO and Patch-DM. (a) Original full-resolution image. (b) Image denoised from pure noise with the MAMBO pipeline using original local and global contexts from image (a). (c) Image denoised from a partially noisy original with $t=750$ using a baseline single-channel patch-based model. (d) Image generated with MAMBO Stage 2 and Stage 3 models using the resized original image as a global context. (e) Image generated by Patch-DM at $1024\times1024$ resolution.
  • Figure 4: Qualitative results of different methods from RSNA (first row) and VinDr (second row) datasets for $16\times$ SR.
  • Figure 5: Images generated by MAMBO alongside their top-4 nearest neighbors from the training set.
  • ...and 10 more figures