Decoding Diffusion: A Scalable Framework for Unsupervised Analysis of Latent Space Biases and Representations Using Natural Language Prompts

E. Zhixuan Zeng, Yuhao Chen, Alexander Wong

TL;DR

This paper proposes a novel framework for unsupervised exploration of diffusion latent spaces that directly leverages natural language prompts and image captions to map latent directions, providing a more scalable and interpretable understanding of the semantic knowledge encoded within diffusion models.

Abstract

Recent advances in image generation have made diffusion models powerful tools for creating high-quality images. However, their iterative denoising process makes understanding and interpreting their semantic latent spaces more challenging than those of other generative models, such as GANs. Recent methods have attempted to address this issue by identifying semantically meaningful directions within the latent space. However, they often need manual interpretation or are limited in the number of vectors that can be trained, restricting their scope and utility. This paper proposes a novel framework for unsupervised exploration of diffusion latent spaces. We directly leverage natural language prompts and image captions to map latent directions. This method allows for the automatic understanding of hidden features and supports a broader range of analysis without the need to train specific vectors. Our method provides a more scalable and interpretable understanding of the semantic knowledge encoded within diffusion models, facilitating comprehensive analysis of latent biases and the nuanced representations these models learn. Experimental results show that our framework can uncover hidden patterns and associations in various domains, offering new insights into the interpretability of diffusion model latent spaces.

Paper Structure

This paper contains 12 sections, 5 figures, 2 tables.

Figures (5)

  • Figure 1: Overview of the three experimental approaches for analyzing h-space representations. (a) One-to-one comparisons isolate the impact of adding or removing a single concept. (b) One-to-many comparisons rank multiple prompts along a semantic spectrum. (c) Clustering captures broader patterns and associations across diverse captions.
  • Figure 2: Average cosine distance between h-space vectors sampled from prompts with and without explicit gender references, grouped by gender and profession. Female representations for several professions (e.g., lawyer, pilot, construction worker) show greater deviation from the non-gendered "default" representation compared to male counterparts, indicating stronger biases favoring male stereotypes in these roles.
  • Figure 3: Correlation between the percentage of images classified as female (using CLIP) and the difference in cosine distances between gendered h-space vectors for each prompt. Prompts with higher cosine differences favouring female vectors are more likely to generate female-presenting images, indicating a strong association between h-space distances and perceived gender.
  • Figure 4: t-SNE visualization of h-space vectors sampled from the Food500-CAP dataset. Clusters are labelled by the ID assigned by HDBSCAN (McInnes and Healy, 2017). Prominent clusters include (8) square dishes; (12) pots with soup; (27) sandwiches/bread; (44) pie; (94) takeout boxes; (98) rectangular plates; (120) salads; (127) stir-fried noodles; (136) purple vegetables, especially purple cabbage; and (144) seafood boil.
  • Figure 5: (a) Default image for "a photo of food" with seed 0. (b) Image conditioned on "a photo of food" plus the average of cluster 8 from Figure 4, depicting a square plate. (c) Image conditioned on "a photo of food" plus the average of cluster 27 from Figure 4, depicting bread. (d) Image conditioned on "a photo of food" plus the average of cluster 12 from Figure 4, depicting a pot. (e) Image conditioned on "a photo of food" plus the average of clusters 8 and 12 from Figure 4, depicting a square pot.
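
The comparison summarized in Figure 2 can be illustrated with a minimal sketch. It assumes that h-space vectors have already been extracted and flattened into plain lists of floats; the toy three-dimensional vectors, the function names, and the pairwise averaging scheme are all hypothetical and are not taken from the paper, which may compute the average differently.

```python
import math
from statistics import mean


def cosine_distance(u, v):
    """Return 1 - cos(u, v) for two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def average_distance(samples_a, samples_b):
    """Average cosine distance over all pairs drawn from two sets of vectors."""
    return mean(cosine_distance(a, b) for a in samples_a for b in samples_b)


# Toy stand-ins for h-space vectors (hypothetical values, one profession).
default_vecs = [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0]]  # prompt without a gender reference
male_vecs = [[1.0, 0.1, 0.0], [0.9, 0.0, 0.1]]     # prompt with an explicit male reference
female_vecs = [[0.2, 1.0, 0.0], [0.1, 0.9, 0.2]]   # prompt with an explicit female reference

d_male = average_distance(male_vecs, default_vecs)
d_female = average_distance(female_vecs, default_vecs)

# A positive gap means the non-gendered "default" sits closer to the male vectors.
gap = d_female - d_male
```

In this toy setup the female vectors lie further from the default than the male vectors do, which is the pattern Figure 2 reports for professions such as lawyer or pilot. Real h-space vectors are much higher-dimensional, so a vectorized library would be the more practical choice there.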