Local Lesion Generation is Effective for Capsule Endoscopy Image Data Augmentation in a Limited Data Setting

Adrian B. Chłopowiec, Adam R. Chłopowiec, Krzysztof Galus, Wojciech Cebula, Martin Tabakov

TL;DR

The paper tackles limited data and severe class imbalance in capsule endoscopy by introducing two local lesion generation techniques: PBDA, which uses Poisson Blending to composite lesions into target images, and IIDA, which fine-tunes LaMa to inpaint lesions within healthy tissue. Combined, the two methods reach a macro F1-score of 33.07% on the Kvasir Capsule Dataset, surpassing the previous best result by 7.84 percentage points; IIDA provides strong standalone improvements and PBDA adds complementary gains. The combination outperforms de novo GANs and diffusion models by restricting edits to localized regions, which keeps image labels intact. The approach is most relevant to clinical settings where data are scarce: because each synthetic lesion is generated in a specified region, augmented images come with their annotations, which improves lesion classification without extra labelling effort.

Abstract

Limited medical imaging datasets challenge deep learning models by increasing the risk of overfitting and reducing generalization, particularly in Generative Adversarial Networks (GANs), where discriminators may overfit, leading to training divergence. This constraint also impairs classification models trained on small datasets. Generative Data Augmentation (GDA) addresses this by expanding training datasets with synthetic data, although it requires training a generative model. We propose and evaluate two local lesion generation approaches to address the challenge of augmenting small medical image datasets. The first approach employs the Poisson Image Editing algorithm, a classical image processing technique, to create realistic image composites; augmentation with these composites outperforms current state-of-the-art methods. The second approach introduces a novel generative method, leveraging a fine-tuned Image Inpainting GAN to synthesize realistic lesions within specified regions of real training images. A comprehensive comparison of the two proposed methods demonstrates that effective local lesion generation in a data-constrained setting allows for reaching new state-of-the-art results in capsule endoscopy lesion classification. The combination of our techniques achieves a macro F1-score of 33.07%, surpassing the previous best result by 7.84 percentage points (p.p.) on the highly imbalanced Kvasir Capsule Dataset, a benchmark for capsule endoscopy. To the best of our knowledge, this work is the first to apply a fine-tuned Image Inpainting GAN for GDA in medical imaging, demonstrating that an image-conditional GAN can be adapted effectively to limited datasets to generate high-quality examples, facilitating effective data augmentation. Additionally, we show that combining this GAN-based approach with classical image processing techniques further improves the results.
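
Macro F1 is the unweighted mean of the per-class F1-scores, so a rare lesion class counts as much as the dominant class. The sketch below is not taken from the paper: the labels and counts are made-up toy data, and the function names are illustrative. It shows how a classifier can reach high accuracy on an imbalanced set while its macro F1 stays low.

```python
def per_class_f1(y_true, y_pred, label):
    """F1-score of one class, treating `label` as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    denom = 2 * tp + fp + fn
    # By convention here, a class with no true or predicted samples scores 0.
    return 2 * tp / denom if denom else 0.0


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all labels that occur."""
    labels = sorted(set(y_true) | set(y_pred))
    return sum(per_class_f1(y_true, y_pred, c) for c in labels) / len(labels)


def accuracy(y_true, y_pred):
    """Fraction of samples whose prediction matches the true label."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


# Toy data (an assumption for illustration): 95 normal frames, 5 lesion frames,
# and a degenerate classifier that always predicts the majority class.
y_true = ["normal"] * 95 + ["angiectasia"] * 5
y_pred = ["normal"] * 100

print("accuracy:", accuracy(y_true, y_pred))
print("macro F1:", macro_f1(y_true, y_pred))
```

In this toy case the majority-class predictor is right on 95 of 100 frames, yet the lesion class contributes an F1 of zero, which pulls the macro average below one half.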

Paper Structure

This paper contains 36 sections, 13 equations, 35 figures, 6 tables, 2 algorithms.

Figures (35)

  • Figure 1: Diagram of the proposed methods. The data preparation step is shared for PBDA and IIDA. This stage increases data variability and quality by addressing the specific characteristics of VCE data. Image pairs and ROIs obtained in this stage are subsequently used for further processing. PBDA uses image pairs and ROIs to create synthetic lesions through Poisson Blending. IIDA fine-tunes image inpainting models for each lesion class, which are then used to inpaint lesions in locations selected through the data preparation stage.
  • Figure 2: Interpretation of the introduced notation. The image definition domain $S$ represents the target image, $\Omega$ denotes the blending location, and $\partial\Omega$ is the border of that location. The function $f^*$ is defined over $S$ minus the interior of $\Omega$ and represents, e.g., the background of a target image. The function $g$ can be a lesion, and the unknown $f$, defined over $\Omega$, represents the blended pathology.
  • Figure 3: The data preparation diagram. Latent representations of images are extracted using the DINOv2-Giant model (Oquab et al., 2023). The deduplication stage increases data variability and quality by removing redundant images. Following this, image pairs and ROIs are selected for further processing.
  • Figure 4: LaMa inpainting on a Kvasir Capsule image. Left: an angiectasia sample from the Kvasir Capsule Dataset. Right: a sample generated by the pre-trained Big LaMa-Fourier model (Suvorov et al., 2022) without fine-tuning. The generated content integrates seamlessly with the original tissue, showing no visible border. The pre-trained model can thus generate realistic tissue in capsule endoscopy images and effectively remove lesions, whereas IIDA requires the opposite functionality: synthesizing lesions.
  • Figure 5: Samples generated using the PBDA pipeline, with the resulting lesions highlighted by a green bounding box. The PBDA pipeline demonstrates the ability to generate plausible lesions.
  • ...and 30 more figures
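
The notation of Figure 2 maps onto the discrete Poisson equation with Dirichlet boundary conditions: inside $\Omega$ the unknown $f$ takes its gradients from the source $g$, while on $\partial\Omega$ it must match the target $f^*$. The following is a minimal single-channel sketch under stated assumptions, not the authors' PBDA implementation: the function name `poisson_blend`, the Gauss-Seidel solver, the iteration count, and the toy images are all illustrative choices, and the pair and ROI selection of the pipeline is not covered.

```python
def poisson_blend(target, source, mask, iters=500):
    """Blend `source` (g) into `target` (f*) over the region `mask` (Omega).

    All inputs are 2D lists of equal shape. Mask pixels on the outer image
    border are ignored, so Omega never touches the edge of the image.
    """
    h, w = len(target), len(target[0])
    f = [row[:] for row in target]  # outside Omega, f stays equal to f*
    omega = [(y, x) for y in range(1, h - 1) for x in range(1, w - 1) if mask[y][x]]
    for y, x in omega:
        f[y][x] = source[y][x]  # initial guess inside Omega
    neighbours = ((-1, 0), (1, 0), (0, -1), (0, 1))
    for _ in range(iters):
        for y, x in omega:
            acc = 0.0
            for dy, dx in neighbours:
                ny, nx = y + dy, x + dx
                # Neighbour value: current estimate inside Omega, f* on the border.
                acc += f[ny][nx]
                # Guidance field: finite difference of the source g.
                acc += source[y][x] - source[ny][nx]
            f[y][x] = acc / 4.0  # Gauss-Seidel update of the 4-neighbour Laplacian
    return f


# Toy check (illustrative data): the source equals the target plus a constant
# offset, so both share the same gradients and blending should recover the target.
h = w = 7
target = [[10.0 * x + y for x in range(w)] for y in range(h)]
source = [[v + 50.0 for v in row] for row in target]
mask = [[2 <= y <= 4 and 2 <= x <= 4 for x in range(w)] for y in range(h)]

blended = poisson_blend(target, source, mask)
max_err = max(abs(blended[y][x] - target[y][x]) for y in range(h) for x in range(w))
print("max deviation from target:", max_err)
```

The toy check illustrates the property that makes the composite seamless: only the gradients of the source are transferred, so a constant intensity offset between source and target disappears after blending. For real images, OpenCV's `cv2.seamlessClone` provides an implementation of Poisson image editing; whether the authors used it is not stated here.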