Beyond a Single Mode: GAN Ensembles for Diverse Medical Data Generation

Lorenzo Tronchin, Tommy Löfstedt, Paolo Soda, Valerio Guarrasi

TL;DR

This work tackles the fidelity-diversity-efficiency trilemma in medical imaging by constructing ensembles of GANs rather than relying on a single model. It introduces a multi-objective Pareto optimization that selects a compact, non-redundant subset of GANs (varying in architecture, loss, and training iteration) to maximize both fidelity to the real data distribution and coverage of its diversity. Across three diverse medical datasets, the Pareto-derived ensemble G* consistently improves downstream diagnostic performance relative to single GANs and naive ensembles, narrowing the gap between training on real and on synthetic data. The approach uses SwAV-based embeddings to assess distribution quality and shows practical benefits for data-scarce medical contexts. Stated limitations include static ensemble usage and computational cost; future work targets dynamic, budget-aware ensembles.

Abstract

The advancement of generative AI, particularly in medical imaging, confronts the trilemma of ensuring high fidelity, diversity, and efficiency in synthetic data generation. While Generative Adversarial Networks (GANs) have shown promise across various applications, they still face challenges like mode collapse and insufficient coverage of real data distributions. This work explores the use of GAN ensembles to overcome these limitations, specifically in the context of medical imaging. By solving a multi-objective optimisation problem that balances fidelity and diversity, we propose a method for selecting an optimal ensemble of GANs tailored for medical data. The selected ensemble is capable of generating diverse synthetic medical images that are representative of true data distributions and computationally efficient. Each model in the ensemble brings a unique contribution, ensuring minimal redundancy. We conducted a comprehensive evaluation using three distinct medical datasets, testing 22 different GAN architectures with various loss functions and regularisation techniques. By sampling models at different training epochs, we crafted 110 unique configurations. The results highlight the capability of GAN ensembles to enhance the quality and utility of synthetic medical images, thereby improving the efficacy of downstream tasks such as diagnostic modelling.

Paper Structure

This paper contains 29 sections, 8 equations, 5 figures, 12 tables.

Figures (5)

  • Figure 1: An illustration of the manifold occupied by real data, $R$, alongside the manifolds covered by various GAN configurations, $S_i$. Single: Showcases three scenarios: $S_i$ entirely within $R$, indicating high fidelity; $S_i$ partially overlapping $R$, representing medium fidelity; and $S_i$ entirely outside $R$, suggesting low fidelity. Selecting a single GAN inherently yields low diversity because it covers only a limited part of the real data. Naive (A): Depicts the collective space covered by all GANs, encompassing areas both inside and outside $R$. This approach results in medium fidelity and high diversity. $G^*$: Demonstrates the space covered by an ensemble of GANs selected through the proposed optimisation method, which aims to maximise coverage of $R$ to ensure diversity while discarding GANs that fall outside $R$ to ensure fidelity, using as few GANs as possible.
  • Figure 2: This figure presents a step-by-step visualisation of the proposed methodology, comprising four main stages: Training GANs: Multiple GAN architectures $G_i \in G$ are trained on the real data, $R$. Generation: Each GAN, $G_i$, is used to generate a synthetic dataset, $S_i$. These data are evaluated with respect to their fidelity and diversity. Optimisation: Various ensembles of GANs, $G'$, are formed from the pool of trained models, followed by a multi-objective optimisation approach that maximises the closeness to the real data using $\delta$ and minimises the overlaps using $\Delta$. This results in a single best ensemble, $G^*$, which is selected from the Pareto front and generates $S^*$.
  • Figure 3: Plots of Diversity, Fidelity, and Utility using SwAV embeddings. The top row shows single GANs (grey circles), Naive (A) and Naive (R) (grey and orange stars, respectively), and the optimal ensemble $G^{*}$ (blue star). The bottom row shows downstream performance on the real test set when using the synthetic data from single GANs. Colours closer to pink or purple, and circles of smaller or larger diameter, indicate lower or higher test set performance, respectively.
  • Figure A1: Pareto plots for each dataset using SwAV as a backbone.
  • Figure A2: Pareto plots for AIforCOVID using InceptionV3 as a backbone.
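
The optimisation stage described in Figure 2 can be illustrated with a minimal sketch of Pareto-front extraction over candidate ensembles. This is not the authors' implementation: the GAN names, the per-GAN distances to the real data, the pairwise overlap scores, and the use of simple means as stand-ins for $\delta$ and $\Delta$ are all assumptions made for illustration. Both objectives are written as costs to minimise (a smaller distance means a higher closeness to the real data). The sketch stops at the Pareto front, since the rule for picking the single ensemble $G^*$ from the front is not given in this section.

```python
from itertools import combinations
from typing import Dict, FrozenSet, List, Tuple

Ensemble = Tuple[str, ...]
Point = Tuple[float, float]  # (fidelity cost, overlap cost), both minimised

# Hypothetical pool of trained GANs with toy distances to the real data
# (lower = closer to the real distribution). Values are illustrative only.
DIST_TO_REAL: Dict[str, float] = {"gan_a": 0.2, "gan_b": 0.3, "gan_c": 0.9}

# Hypothetical pairwise overlap between synthetic sets (higher = more redundant).
OVERLAP: Dict[FrozenSet[str], float] = {
    frozenset({"gan_a", "gan_b"}): 0.6,
    frozenset({"gan_a", "gan_c"}): 0.1,
    frozenset({"gan_b", "gan_c"}): 0.2,
}


def candidate_ensembles(pool: List[str]) -> List[Ensemble]:
    """Enumerate every ensemble with at least two members."""
    names = sorted(pool)
    return [c for k in range(2, len(names) + 1) for c in combinations(names, k)]


def objectives(ensemble: Ensemble) -> Point:
    """Toy stand-ins for delta (mean distance) and Delta (mean pairwise overlap)."""
    fidelity_cost = sum(DIST_TO_REAL[g] for g in ensemble) / len(ensemble)
    pairs = list(combinations(ensemble, 2))
    overlap_cost = sum(OVERLAP[frozenset(p)] for p in pairs) / len(pairs)
    return fidelity_cost, overlap_cost


def dominates(p: Point, q: Point) -> bool:
    """p dominates q if it is no worse on both costs and strictly better on one."""
    return all(a <= b for a, b in zip(p, q)) and any(a < b for a, b in zip(p, q))


def pareto_front(scored: Dict[Ensemble, Point]) -> List[Ensemble]:
    """Keep the ensembles that no other ensemble dominates."""
    return [
        e for e, p in scored.items()
        if not any(dominates(q, p) for other, q in scored.items() if other != e)
    ]


if __name__ == "__main__":
    scored = {e: objectives(e) for e in candidate_ensembles(list(DIST_TO_REAL))}
    for e in pareto_front(scored):
        print(e, scored[e])
```

In this toy pool, the ensemble of `gan_b` and `gan_c` is dropped because the ensemble of `gan_a` and `gan_c` is better on both costs; the remaining ensembles represent different fidelity-versus-overlap trade-offs. Exhaustive enumeration is only feasible for small pools, so a pool of 110 configurations would need a search heuristic rather than this brute-force loop.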