
M3Dsynth: A dataset of medical 3D images with AI-generated local manipulations

Giada Zingarini, Davide Cozzolino, Riccardo Corvi, Giovanni Poggi, Luisa Verdoliva

TL;DR

This work addresses the risk of manipulated medical CT content by introducing M3Dsynth, a large-scale dataset of $8{,}577$ manipulated CT lung samples built by local inpainting of nodules with three generators (two GANs and one diffusion model). Each manipulation is confined to a fixed region of $32\times32\times32$ voxels, with an inner $16$ mm processed to inject or remove a nodule, which supports both detection and localization tasks. The authors benchmark several state-of-the-art forensic detectors and report that detectors trained on M3Dsynth remain robust across generators, with good localization by TruFor and ManTraNet and improved detection after fine-tuning. The dataset and code are publicly available as a resource for developing and benchmarking medical image forensics methods.

Abstract

The ability to detect manipulated visual content is becoming increasingly important in many application fields, given the rapid advances in image synthesis methods. Of particular concern is the possibility of modifying the content of medical images, altering the resulting diagnoses. Despite its relevance, this issue has received limited attention from the research community. One reason is the lack of large and curated datasets to use for development and benchmarking purposes. Here, we investigate this issue and propose M3Dsynth, a large dataset of manipulated Computed Tomography (CT) lung images. We create manipulated images by injecting or removing lung cancer nodules in real CT scans, using three different methods based on Generative Adversarial Networks (GAN) or Diffusion Models (DM), for a total of 8,577 manipulated samples. Experiments show that these images easily fool automated diagnostic tools. We also tested several state-of-the-art forensic detectors and demonstrated that, once trained on the proposed dataset, they are able to accurately detect and localize manipulated synthetic content, even when training and test sets are not aligned, showing good generalization ability. Dataset and code are publicly available at https://grip-unina.github.io/M3Dsynth/.

Paper Structure (6 sections, 3 figures, 3 tables)

Figures (3)

  • Figure 1: Examples of injection (top) and removal (bottom) of a large lung nodule from our dataset. Next to the pristine image, we show the manipulated versions obtained from tools based on Pix2Pix, CycleGAN and Diffusion Models.
  • Figure 2: Scheme of the manipulation process. Pre-processing (top): selection of candidate site, extraction of cubic sample, scaling to $32\times32\times32$ pixels, equalization, center masking. The input datacube feeds a GAN/DM model which generates the synthetic datacube. Post-processing (bottom): data restoration by de-equalization and inverse scaling, touch-up to improve blending into the host CT scan.
  • Figure 3: Histograms of lung nodule classification scores. Top: before manipulation, the diagnostic tool separates benign (blue) from malignant (red) nodules relatively well. Bottom: after manipulation, the removed/shrunk malignant nodules (red) have the same histogram that benign nodules had before manipulation, and vice versa.
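
The pre-processing chain named in the Figure 2 caption (cubic extraction, scaling to $32\times32\times32$, equalization, center masking) can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the authors' code: the function names are hypothetical, nearest-neighbour resampling stands in for whatever interpolation is actually used, a clipped intensity window stands in for the equalization step (whose exact form is not given here), and the 16-voxel mask side is assumed.

```python
import numpy as np

CUBE = 32  # side of the generator input, as stated in the Figure 2 caption
MASK = 16  # side of the masked center; an assumption made for this sketch


def extract_cube(scan, center, side):
    """Cut a cubic sample of the given side around `center` (z, y, x)."""
    half = side // 2
    slices = tuple(slice(c - half, c - half + side) for c in center)
    cube = scan[slices]
    if cube.shape != (side, side, side):
        raise ValueError("candidate site too close to the scan border")
    return cube


def resize_nearest(cube, out_side):
    """Nearest-neighbour rescaling to out_side^3 (stand-in for real interpolation)."""
    idx = [np.arange(out_side) * n // out_side for n in cube.shape]
    return cube[np.ix_(*idx)]


def normalize(cube, lo=-1000.0, hi=400.0):
    """Clip to an assumed intensity window and map to [0, 1] (stand-in for equalization)."""
    return (np.clip(cube, lo, hi) - lo) / (hi - lo)


def mask_center(cube, mask_side, fill=0.0):
    """Blank the central region that the generator is asked to inpaint."""
    out = cube.copy()
    start = (cube.shape[0] - mask_side) // 2
    s = slice(start, start + mask_side)
    out[s, s, s] = fill
    return out


def preprocess(scan, center, side):
    """Extract, rescale, normalize, and mask one candidate site."""
    cube = extract_cube(scan, center, side)
    cube = resize_nearest(cube, CUBE)
    cube = normalize(cube)
    return mask_center(cube, MASK)
```

Calling `preprocess(scan, (40, 40, 40), 64)` on an $80\times80\times80$ array returns a $32\times32\times32$ cube with a zeroed $16\times16\times16$ center. Post-processing would invert the normalization and the scaling before blending the generated cube back into the host scan.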