Progressive Monitoring of Generative Model Training Evolution

Vidya Prasad, Anna Vilanova, Nicola Pezzotti

TL;DR

This work tackles biases and inefficiencies in deep generative models by proposing a progressive analysis framework that continuously monitors high-dimensional training data via evolutionary embeddings. By extracting latent representations and data distributions at regular checkpoints and projecting them into a 2D evolutionary space (using encoders such as CLIP), the approach enables mid-training bias detection and timely interventions. Applied to AttentionGAN on CelebA hair-color transformations, the method reveals gender- and age-related biases early in training and demonstrates how data augmentation and targeted image collection can mitigate these biases, improve output realism, and reduce computational costs. The results highlight the practical impact of real-time, interpretable monitoring for fairer and more efficient generative modeling, with clear potential extensions to diffusion models and other DGMs.

Abstract

While deep generative models (DGMs) have gained popularity, their susceptibility to biases and other inefficiencies that lead to undesirable outcomes remains an issue. With their growing complexity, there is a critical need for early detection of issues to achieve desired results and optimize resources. Hence, we introduce a progressive analysis framework to monitor the training process of DGMs. Our method utilizes dimensionality reduction techniques to facilitate the inspection of latent representations, the generated and real distributions, and their evolution across training iterations. This monitoring allows us to pause and fix the training method if the representations or distributions progress undesirably. This approach allows for the analysis of a model's training dynamics and the timely identification of biases and failures, minimizing computational loads. We demonstrate how our method supports identifying and mitigating biases early in training a Generative Adversarial Network (GAN) and improving the quality of the generated data distribution.
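The pause-and-fix loop described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function `monitored_training` and its callbacks `train_step`, `extract`, and `looks_undesirable` are hypothetical names introduced here. In the paper the decision to pause is made by a person inspecting the embedding, so the automated `looks_undesirable` check is only a stand-in for that visual judgment.

```python
from typing import Callable, List, Tuple

import numpy as np

Snapshot = Tuple[int, np.ndarray]  # (iteration, extracted features)


def monitored_training(
    train_step: Callable[[int], None],
    extract: Callable[[int], np.ndarray],
    looks_undesirable: Callable[[List[Snapshot]], bool],
    total_iterations: int,
    n: int,
) -> Tuple[int, List[Snapshot]]:
    """Train, snapshot every n iterations, and stop early if flagged.

    All three callbacks are hypothetical placeholders:
      train_step        -- runs one optimisation step
      extract           -- returns representations or generated samples
      looks_undesirable -- stand-in for the analyst's visual inspection
    Returns the last completed iteration and the snapshot history.
    """
    history: List[Snapshot] = []
    for iteration in range(1, total_iterations + 1):
        train_step(iteration)
        if iteration % n == 0:
            history.append((iteration, extract(iteration)))
            if looks_undesirable(history):
                # Pause here so the data or training setup can be corrected.
                return iteration, history
    return total_iterations, history


# Toy usage: a scalar "model state" that drifts a little at every step.
state = {"x": 0.0}


def toy_step(iteration: int) -> None:
    state["x"] += 0.1


def toy_extract(iteration: int) -> np.ndarray:
    return np.full((4, 2), state["x"])


def toy_check(history: List[Snapshot]) -> bool:
    # Flag the run once the latest snapshot drifts past an arbitrary threshold.
    return float(history[-1][1].mean()) > 1.2


stopped_at, snapshots = monitored_training(toy_step, toy_extract, toy_check, 100, 5)
```

Returning the iteration at which the run stopped, together with the snapshot history, lets training be resumed from that point after the data or configuration has been adjusted.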

Paper Structure

This paper contains 10 sections, 10 figures.

Figures (10)

  • Figure 1: (a) The evolutionary embedding method proposed by EvolvED [prasad2024tree] vs. (b) independent t-SNE [van2008visualizing] embeddings per iteration on example data. Each iteration is explicitly encoded and aligned with its prior step, enabling tracing evolutions in (a).
  • Figure 2: Overview of the progressive analysis framework. As training proceeds, model representations and generated datasets are extracted every $n$ iterations. These model elements are projected to a 2D space via DR methods that explicitly encode iterations and align each with the preceding steps. This embedding is explored with images to detect undesirable progression of the model. When issues are detected, training can be paused for timely corrections and then resumed.
  • Figure 3: (i) The left panel shows the evolution of CLIP image embeddings across images. Interaction between the embeddings and images reveals that the orange points correspond to generated grey instances. These grey points are initially mixed with other hair color data until iteration 5000 (clusters a and b). They become more distinct afterwards (clusters c through f). Filtering only grey instances (ii) shows two clusters: one that includes a mix of real grey instances (cluster e) and another that is distinct (cluster f).
  • Figure 4: Aggregated loss metrics, including (from left to right) real vs. fake image losses, classification cross-entropy losses for real and fake images, and the cyclic L1 reconstruction loss across 25000 training iterations.
  • Figure 5: The real (a) and generated grey-haired images for men (c-e) and women (g-i). The evolution from iteration 5000 (c and g) to 15000 (d and h) and 25000 (e and i) reveals exacerbated effects on facial features. Initially, at iteration 5000, the hair color starts to change, but from iteration 15000, the aging effects become more pronounced. Women (g-i) exhibit more significant facial distortions.
  • ...and 5 more figures
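
The idea in Figures 1 and 2 of aligning each iteration's embedding with the preceding step can be illustrated with a small sketch. This is not the EvolvED method: it swaps in per-snapshot PCA followed by orthogonal Procrustes alignment as a simple stand-in, and it assumes that row i of every snapshot refers to the same tracked sample. The function names `pca_2d`, `align_to_previous`, and `evolutionary_embedding` are hypothetical.

```python
import numpy as np


def pca_2d(features: np.ndarray) -> np.ndarray:
    """Project the rows of `features` onto their top two principal components."""
    centered = features - features.mean(axis=0)
    # Rows of vt are the principal directions of the centered data.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:2].T


def align_to_previous(current: np.ndarray, previous: np.ndarray) -> np.ndarray:
    """Orthogonal Procrustes: rotate/reflect `current` to best match `previous`.

    Finds the orthogonal R minimising ||current @ R - previous||_F.
    """
    u, _, vt = np.linalg.svd(current.T @ previous)
    return current @ (u @ vt)


def evolutionary_embedding(snapshots):
    """Embed each snapshot in 2D, aligning every step with the one before it."""
    embedded = []
    for features in snapshots:
        coords = pca_2d(features)
        if embedded:
            coords = align_to_previous(coords, embedded[-1])
        embedded.append(coords)
    return embedded


# Toy usage: the second snapshot is the first one under a random rotation,
# so after alignment the two 2D layouts should coincide.
rng = np.random.default_rng(0)
scales = np.array([5.0, 3.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0])
base = rng.normal(size=(50, 8)) * scales
rotation, _ = np.linalg.qr(rng.normal(size=(8, 8)))
layouts = evolutionary_embedding([base, base @ rotation])
```

Without the alignment step, each snapshot's axes could flip or rotate arbitrarily between iterations, which is the instability that panel (b) of Figure 1 attributes to independent per-iteration embeddings.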