Contrastive CFG: Improving CFG in Diffusion Models by Contrasting Positive and Negative Concepts

Jinho Chang, Hyungjin Chung, Jong Chul Ye

TL;DR

This work presents a novel method to enhance negative CFG guidance using contrastive loss, achieving a nearly identical guiding direction to traditional CFG for positive guidance while overcoming the limitations of existing negative guidance methods.

Abstract

As Classifier-Free Guidance (CFG) has proven effective in conditional diffusion model sampling for improved condition alignment, many applications use a negated CFG term to filter out unwanted features from samples. However, simply negating CFG guidance creates an inverted probability distribution, often distorting samples away from the marginal distribution. Inspired by recent advances in conditional diffusion models for inverse problems, here we present a novel method to enhance negative CFG guidance using contrastive loss. Specifically, our guidance term aligns or repels the denoising direction based on the given condition through contrastive loss, achieving a nearly identical guiding direction to traditional CFG for positive guidance while overcoming the limitations of existing negative guidance methods. Experimental results demonstrate that our approach effectively removes undesirable concepts while maintaining sample quality across diverse scenarios, from simple class conditions to complex and overlapping text prompts.
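To make the "negated CFG term" concrete, here is a minimal sketch of the widely used noise-prediction form of CFG and its sign-flipped variant. It does not implement the contrastive guidance proposed in the paper. The function names, the random stand-in arrays, and the scale value are illustrative assumptions, and real pipelines may use a different but equivalent parameterization of the guidance scale.

```python
import numpy as np

def cfg_noise(eps_uncond, eps_cond, scale):
    """Standard CFG: shift the unconditional prediction toward the conditional one."""
    return eps_uncond + scale * (eps_cond - eps_uncond)

def negated_cfg_noise(eps_uncond, eps_neg, scale):
    """Negated CFG: the same guidance term with its sign flipped,
    pushing the prediction away from the unwanted condition."""
    return eps_uncond - scale * (eps_neg - eps_uncond)

# Stand-in arrays for a denoiser's outputs (assumption: a real model
# would produce these from a noisy sample and a prompt or class label).
rng = np.random.default_rng(0)
eps_uncond = rng.standard_normal((4, 4))
eps_cond = rng.standard_normal((4, 4))

pos = cfg_noise(eps_uncond, eps_cond, scale=7.5)
neg = negated_cfg_noise(eps_uncond, eps_cond, scale=7.5)

# The two guidance offsets are exact mirror images around eps_uncond.
mirror_ok = np.allclose(pos - eps_uncond, -(neg - eps_uncond))
```

Because the negated term is only a mirrored copy of the positive one, nothing in it keeps the sample close to the marginal distribution, which is the limitation the abstract describes.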

Paper Structure

This paper contains 17 sections, 22 equations, 7 figures, 5 tables, 2 algorithms.

Figures (7)

  • Figure 1: Image samples generated from StableDiffusion 1.5 using positive prompts (written in black) and negative prompts (written in red), across various negative sampling methods. While some images generated by negated CFG are satisfactory, both DNG and negated CFG show limitations: DNG often fails to negate unwanted concepts effectively, and negated CFG frequently causes severe image quality degradation. In contrast, our proposed guidance term successfully removes undesirable features while preserving the quality of the output image.
  • Figure 2: Overview of guided sampling with the proposed CCFG. We pose guided sampling as an optimization problem that minimizes a contrastive loss over the positive and negative prompts. This adds no computational overhead, yet avoids the pitfalls of previous strategies such as negative CFG.
  • Figure 3: Output distribution with different negative sampling methods on a manually constructed dataset with two classes. (a) Negated CFG heavily shifts the samples from the marginal distribution. (b) DNG output still contains samples that can be regarded as the red class. (c) Our method can remove all samples that satisfy a forbidden class while preserving the unrelated distribution.
  • Figure 4: Error rate versus FID for nCFG, DNG, and CCFG on class-removal negative sampling, on the MNIST and CIFAR10 datasets. The numbers on the plots are the guidance scales; all three sampling methods share the same guidance scales, listed starting from the lower right.
  • Figure 5: Examples of SD1.5 generation results with positive-negative prompt pairs and CCFG to remove various features such as objects, internal bias, and potentially unsuitable contents.
  • ...and 2 more figures