Scaling Concept With Text-Guided Diffusion Models

Chao Huang; Susan Liang; Yunlong Tang; Yapeng Tian; Anurag Kumar; Chenliang Xu

TL;DR

This work introduces ScalingConcept, a simple yet effective method that scales decomposed concepts up or down in real inputs without introducing new elements. It enables a variety of novel zero-shot applications across image and audio domains, including canonical pose generation and generative sound highlighting or removal.

Abstract

Text-guided diffusion models have revolutionized generative tasks by producing high-fidelity content from text descriptions. They have also enabled an editing paradigm where concepts can be replaced through text conditioning (e.g., a dog to a tiger). In this work, we explore a novel approach: instead of replacing a concept, can we enhance or suppress the concept itself? Through an empirical study, we identify a trend where concepts can be decomposed in text-guided diffusion models. Leveraging this insight, we introduce ScalingConcept, a simple yet effective method to scale decomposed concepts up or down in real input without introducing new elements. To systematically evaluate our approach, we present the WeakConcept-10 dataset, where concepts are imperfect and need to be enhanced. More importantly, ScalingConcept enables a variety of novel zero-shot applications across image and audio domains, including tasks such as canonical pose generation and generative sound highlighting or removal.
