
Fine-Tuning Text-To-Image Diffusion Models for Class-Wise Spurious Feature Generation

AprilPyone MaungMaung, Huy H. Nguyen, Hitoshi Kiya, Isao Echizen

TL;DR

The paper addresses the challenge of efficiently generating spurious features for robust classifier evaluation. It fine-tunes a large-scale text-to-image diffusion model (Stable Diffusion) on a small set of reference spurious images, extending DreamBooth with joint text-encoder and noise-predictor optimization. A novel spurious feature similarity loss $\mathcal{L}_{\text{(SFSL)}}$, combined with a prior preservation term, steers generated outputs toward class-wise spurious cues. Experiments on six Spurious ImageNet classes show that the generated images are spurious across multiple classifiers and visually resemble the reference spurious images, outperforming or complementing existing Spurious ImageNet data in spurious evaluation. The approach provides a scalable, controllable way to produce synthetic, cross-class spurious data, with practical implications for testing and training robust classifiers; the discussion acknowledges artifacts and context-dependent limitations.

Abstract

We propose a method for generating spurious features by leveraging large-scale text-to-image diffusion models. Although previous work detected spurious features in a large-scale dataset like ImageNet and introduced Spurious ImageNet, we found that not all spurious images are spurious across different classifiers. Moreover, although spurious images help measure a classifier's reliance on spurious features, filtering many images from the Internet to find more spurious features is time-consuming. To this end, we utilize an existing approach for personalizing large-scale text-to-image diffusion models with the available discovered spurious images and propose a new spurious feature similarity loss based on neural features of an adversarially robust model. Specifically, we fine-tune Stable Diffusion on several reference images from Spurious ImageNet with a modified objective that incorporates the proposed spurious feature similarity loss. Experimental results show that our method can generate spurious images that are consistently spurious across different classifiers. Moreover, the generated spurious images are visually similar to reference images from Spurious ImageNet.
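
The modified objective described above can be illustrated with a minimal sketch. The exact form of the spurious feature similarity loss is not given in this summary, so the version below assumes it is one minus the cosine similarity between robust-model features of a generated image and a reference spurious image. The function names, the weights `lambda_prior` and `lambda_sfsl`, and the use of plain NumPy arrays in place of real model outputs are all hypothetical choices made for illustration, not the authors' implementation.

```python
import numpy as np


def cosine_similarity(a, b, eps=1e-8):
    """Cosine similarity between two 1-D feature vectors."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + eps))


def sfsl(gen_features, ref_features):
    # ASSUMPTION: the similarity loss is 1 - cosine similarity of features
    # taken from an adversarially robust model; the paper may define it differently.
    return 1.0 - cosine_similarity(gen_features, ref_features)


def mse(pred, target):
    """Mean squared error, used here for both denoising terms."""
    pred = np.asarray(pred, dtype=float)
    target = np.asarray(target, dtype=float)
    return float(np.mean((pred - target) ** 2))


def combined_objective(noise_pred, noise, prior_noise_pred, prior_noise,
                       gen_features, ref_features,
                       lambda_prior=1.0, lambda_sfsl=1.0):
    # DreamBooth-style denoising loss on the reference (spurious) images.
    denoise = mse(noise_pred, noise)
    # Prior preservation term computed on class-prior samples.
    prior = mse(prior_noise_pred, prior_noise)
    # ASSUMPTION: the three terms are combined as a weighted sum.
    return denoise + lambda_prior * prior + lambda_sfsl * sfsl(gen_features, ref_features)
```

Under these assumptions, the similarity term is close to zero when the generated and reference features point in the same direction and grows as they diverge, so minimizing the sum pulls generations toward the reference spurious cue while the prior term discourages drift from the base model.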

Paper Structure

This paper contains 17 sections, 6 equations, 6 figures, and 3 tables.

Figures (6)

  • Figure 1: Spurious images. Some images from the Spurious ImageNet dataset (Neuhaus et al., 2023) are detected as "hummingbird" but classified as "sulphur butterfly".
  • Figure 2: Stable Diffusion fine-tuning. [V] indicates a three-character unique identifier, as in DreamBooth (Ruiz et al., 2023).
  • Figure 3: Subjective evaluation of real and generated images.
  • Figure 4: Selected examples of generated images (second row) and Spurious ImageNet images (first row). The red label gives the predicted class, and the black label gives the true subject.
  • Figure 5: Recontextualized spurious images. From left to right, images were generated with the prompts "a photo of a [V] flower on the beach", "$\ldots$ on Mount Fuji", "$\ldots$ in a garden", and "$\ldots$ in a market". Images in the first row are classified as "hummingbird"; those in the second row are not.
  • ...and 1 more figure
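
The recontextualization prompts in Figure 5 and the notion of an image being "spurious across classifiers" can be sketched as follows. This is only an illustration: the helper names (`build_prompts`, `spurious_rate`, `consistently_spurious`) are hypothetical, and the predictions are expected as plain lists of label strings rather than real classifier outputs.

```python
def build_prompts(identifier, subject, contexts):
    """Build DreamBooth-style prompts that place the subject in new contexts."""
    return [f"a photo of a {identifier} {subject} {context}" for context in contexts]


def spurious_rate(predictions, target_class):
    """Fraction of images each classifier assigns to the target (spurious) class.

    predictions: dict mapping a classifier name to a list of predicted labels,
    one label per generated image (hypothetical input format).
    """
    rates = {}
    for name, labels in predictions.items():
        hits = sum(1 for label in labels if label == target_class)
        rates[name] = hits / len(labels) if labels else 0.0
    return rates


def consistently_spurious(predictions, target_class):
    """Indices of images that every classifier assigns to the target class."""
    per_image = zip(*predictions.values())
    return [i for i, labels in enumerate(per_image)
            if all(label == target_class for label in labels)]
```

For example, `build_prompts("[V]", "flower", ["on the beach", "on Mount Fuji", "in a garden", "in a market"])` reproduces the four prompts listed in the Figure 5 caption. Counting only images that all classifiers agree on is one possible reading of "consistently spurious"; the paper's own evaluation protocol may differ.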